The State of Benchmarking in Node.js
Benchmarking becomes more important as we build more and more applications and tooling for runtimes like Node.js and Bun. This article looks at macro and micro benchmarking, and explores the options we can use today. It includes code examples and a CodeSandbox to try out in your own applications.
Macro benchmarking
Benchmarking a part of your application code is an important scenario. While running a real-world application, how many times are expensive functions called and how much time is spent in each? These are essential metrics for any CPU intensive code such as bundlers, compilers, linters, formatters, and so on.
Not many of those tools use the node:perf_hooks module, even though much of this native module has been available since Node.js v8.5.0 (released over 6 years ago). This includes performance.now(), performance.timerify() and PerformanceObserver, and the built-in module has been improved and extended ever since. It allows you to integrate all sorts of performance timings right into your application.
Ecosystem
There aren’t that many libraries or runners on top of the Node.js built-ins. I sure hope I’m missing something, but here are some data points at the time of writing:
- 1 package found for npmjs.com/search?q=timerify
- 4 packages found for npmjs.com/search?q=PerformanceObserver
- 32 hits for npmjs.com/search?q=perf_hooks
- a few hits for twitter.com/search?q=perf_hooks
I think there’s room for tooling in this space to reduce boilerplate and improve accessibility. For instance, it would help a lot if we could import a utility to wrap functions in any application, and render or return metrics about the wrapped functions as needed.
Example: PerformanceObserver for functions
Let’s look at an example which logs each recorded function invocation with a PerformanceObserver instance:
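A minimal observer.mjs along these lines could look as follows; the function body is just a placeholder, any function works:

```javascript
import { performance, PerformanceObserver } from 'node:perf_hooks';

// Subscribe to 'function' entries; each call to a timerified function
// records one entry, which we log and then stop observing.
const observer = new PerformanceObserver(list => {
  for (const entry of list.getEntries()) {
    console.log(entry);
  }
  observer.disconnect();
});
observer.observe({ entryTypes: ['function'] });

// A placeholder function to measure
function myFunctionUnderTest() {
  return [...Array(1000).keys()].reduce((sum, n) => sum + n, 0);
}

// timerify() returns a wrapped version that records an entry per call
const fn = performance.timerify(myFunctionUnderTest);

fn();
fn();
fn();
```

Note that entries are delivered to the observer callback asynchronously, after the wrapped calls have returned.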
This will log three PerformanceEntry objects, and one of the properties is duration:
$ node observer.mjs
PerformanceNodeEntry {
name: 'myFunctionUnderTest',
entryType: 'function',
startTime: 20.211291000247,
duration: 0.02987500000745058,
detail: []
}
PerformanceNodeEntry {
name: 'myFunctionUnderTest',
entryType: 'function',
startTime: 20.426791000179946,
duration: 0.0017919996753335,
detail: []
}
PerformanceNodeEntry {
name: 'myFunctionUnderTest',
entryType: 'function',
startTime: 20.432208000682294,
duration: 0.0006669992581009865,
detail: []
}
Now we have a basis to record the number of calls to a function and the duration of each call.
Note: this article focuses on the function performance entry type. Other valid entryTypes include mark, measure, http, net and dns. See the Node.js docs on perf_hooks for more details and examples.
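As a quick illustration of another entry type, here’s a sketch that uses mark() and measure() to record a measure entry between two points (the mark names and the busy loop are arbitrary):

```javascript
import { performance, PerformanceObserver } from 'node:perf_hooks';

// Subscribe to 'measure' entries and log each one
const observer = new PerformanceObserver(list => {
  for (const { name, duration } of list.getEntries()) {
    console.log(`${name}: ${duration.toFixed(2)}ms`);
  }
  observer.disconnect();
});
observer.observe({ entryTypes: ['measure'] });

performance.mark('work-start');
let total = 0;
for (let i = 0; i < 1e6; i++) total += i; // stand-in for real work
performance.mark('work-end');
performance.measure('work', 'work-start', 'work-end');
```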
Example: timerify application code
Expanding on this idea, I was hoping there would be utilities to make this easier and more accessible, but unfortunately I didn’t find much.
Since Node.js provides the building blocks, I created this Performance.js class in Knip last year. I’ve been meaning to turn this into a separate module to be published, but haven’t gotten around to it.
For this article I created a modified version to try and play with. The code is in this CodeSandbox.
Run the demo from the terminal inside the sandbox:
Here’s the gist of it, again with the PerformanceObserver class and performance.timerify() function as the main building blocks:
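The real Performance.js class has more to it; the following is a rough, simplified sketch of the idea. Method names and the output format are approximations, and console.table stands in for the custom table rendering shown in the output below:

```javascript
import { performance, PerformanceObserver } from 'node:perf_hooks';

// Sketch: wrap functions with timerify() when enabled, collect their
// entries with a PerformanceObserver, and report aggregated stats per function.
export class Performance {
  constructor(isEnabled) {
    this.isEnabled = isEnabled;
    this.entries = [];
    if (isEnabled) {
      this.startTime = performance.now();
      this.observer = new PerformanceObserver(list => {
        this.entries.push(...list.getEntries());
      });
      this.observer.observe({ entryTypes: ['function'] });
    }
  }

  // No-op when disabled, so there is no overhead without the flag
  timerify(fn) {
    return this.isEnabled ? performance.timerify(fn) : fn;
  }

  getStats(durations) {
    const sorted = [...durations].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    const median = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    return {
      size: sorted.length,
      min: sorted[0],
      max: sorted[sorted.length - 1],
      median,
      sum: sorted.reduce((acc, n) => acc + n, 0),
    };
  }

  async report() {
    if (!this.isEnabled) return;
    // Entries arrive asynchronously; give the observer a moment to flush
    await new Promise(resolve => setTimeout(resolve, 100));
    const byName = new Map();
    for (const { name, duration } of this.entries) {
      byName.set(name, [...(byName.get(name) ?? []), duration]);
    }
    const rows = [...byName].map(([name, durations]) => {
      const { size, min, max, median, sum } = this.getStats(durations);
      return { name, size, min: min.toFixed(2), max: max.toFixed(2), median: median.toFixed(2), sum: sum.toFixed(2) };
    });
    console.table(rows);
    console.log(`Total running time: ${Math.round(performance.now() - this.startTime)}ms`);
    this.observer.disconnect();
  }
}
```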
And here’s how to use it in any real-world application:
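In sketch form it could look like this; fnA and fnB are stand-ins for real application code, and a small inline collector replaces the helper class so the example runs on its own:

```javascript
import { performance, PerformanceObserver } from 'node:perf_hooks';

// Wrap functions only when the --performance flag is passed
const isEnabled = process.argv.includes('--performance');
const timerify = fn => (isEnabled ? performance.timerify(fn) : fn);

// Inline collector; a real setup would aggregate and render stats
const durations = new Map();
const observer = isEnabled
  ? new PerformanceObserver(list => {
      for (const { name, duration } of list.getEntries()) {
        durations.set(name, [...(durations.get(name) ?? []), duration]);
      }
    })
  : null;
observer?.observe({ entryTypes: ['function'] });

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Stand-ins for expensive application functions
const fnA = timerify(async function fnA(ms) { await sleep(ms); });
const fnB = timerify(async function fnB(ms) { await sleep(ms); });

await fnA(100);
await fnA(200);
await fnA(300);
await fnB(500);

if (observer) {
  // Entries arrive asynchronously; print them once they have been flushed
  setTimeout(() => {
    for (const [name, list] of durations) {
      console.log(name, list.map(d => `${d.toFixed(2)}ms`).join(' '));
    }
    observer.disconnect();
  }, 100);
}
```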
After running this, here’s some example output:
$ node performance.mjs --performance
Name size min max median sum
---- ---- ------ ------ ------ ------
fnA 3 101.18 300.59 200.70 602.47
fnB 1 502.24 502.24 502.24 502.24
Total running time: 804ms
The functions are only wrapped when using the --performance flag. Without the flag, the functions are not wrapped and there is no overhead.
Micro benchmarking
Benchmarking arbitrary code in isolation is important too. Sometimes you want to benchmark and compare two or more ways to do the same thing: paste in some code, let it run for a bit, and see the results. There are plenty of options to do this in a browser, but what about Node.js and other runtimes?
We have options like console.time() and performance.now(), but there’s some boilerplate and ceremony involved to get results.
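For illustration, a do-it-yourself version with performance.now() might look like this; the function under test and the run count are arbitrary:

```javascript
import { performance } from 'node:perf_hooks';

// The do-it-yourself ceremony: time a hot loop manually and derive ops/sec
const concat = (a, b) => `${a} ${b}`;

const runs = 1_000_000;
const start = performance.now();
for (let i = 0; i < runs; i++) concat('foo', 'bar');
const elapsed = performance.now() - start;

console.log(`${runs} runs in ${elapsed.toFixed(2)}ms`);
console.log(`~${Math.round(runs / (elapsed / 1000)).toLocaleString()} ops/sec`);
```

This works, but it says nothing about warmup, outliers, or statistical significance, which is exactly the ceremony we’d rather outsource.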
And we shouldn’t have to worry about things like process isolation, state resets between runs, external conditions, turbulence, and aggregating numbers to yield statistically significant results.
For some more serious benchmarking, we’ll need something better.
Ecosystem
Node.js was pretty close to having a built-in node:benchmark module. In November 2023, a pull request to add an experimental node:benchmark module to Node.js core was opened. And closed, after an interesting debate.
This leaves us with a diverse set of packages for micro benchmarking in Node.js:
- node-bench - the effort that led to this PR, currently in active development, looking for feedback and ideas; aims to be the foundation of node:benchmark
- Benchmark.js - still good and widely used
- isitfast - not production-ready yet, but innovative and promising
- cronometro - runs tests in isolated worker threads
- Tinybench - also works in the browser (like Benchmark.js)
- mitata - fast and accurate (used by Bun and Deno)
If you need something production-ready today, Benchmark.js is a good choice. It’s battle-tested and versatile. However, its latest release was 6 years ago and the repository has been archived (as of 2024-04-14).
The other options are all worth checking out. Consult the overview table and benchmarks-comparisons that Vinicius Lourenço put together for more details.
For the record, Deno has a built-in benchmark runner.
A CLI for such tools would be great. Have some code in a file and let a CLI tool import and benchmark it. Much like aforementioned tools, but move the API from runtime to CLI.
Pitfalls
Before we continue, here’s the mandatory warning to not forget about the pitfalls of micro benchmarking:
- Running code in isolation means missing real-world context and different compiler optimizations. For various reasons, the same code may have different performance characteristics when running in the context of a real-world application.
- Micro benchmarking is often associated with premature optimization. Don’t lose sight of the big picture!
Example: string concatenation
Let’s look at an example. We want to know which function is the fastest, and by how much. The following functions do the same thing:
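The function bodies below are assumptions; the names plus, join and concat match the benchmark results further down:

```javascript
// Three ways to build the same string from two parts
const plus = (a, b) => a + ' ' + b;
const join = (a, b) => [a, b].join(' ');
const concat = (a, b) => ''.concat(a, ' ', b);

console.log(plus('foo', 'bar'), join('foo', 'bar'), concat('foo', 'bar'));
// → foo bar foo bar foo bar
```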
Benchmark.js
Benchmark.js is great and battle-tested software, despite the fact that its last publish was in early 2017, when it was tested on Node.js versions 10 and 11.
Let’s create a test suite to benchmark and compare three string concatenation alternatives:
Running this on my machine gives the following output:
$ node benchmark.mjs
plus x 20,171,223 ops/sec ±0.41% (94 runs sampled)
join x 10,288,969 ops/sec ±0.19% (101 runs sampled)
concat x 17,782,613 ops/sec ±0.18% (98 runs sampled)
Fastest is plus
Clear output. All options are fast, but we have a winner.
Tinybench
Tinybench is the new kid on the block. You can use it stand-alone, and it also comes shipped with Vitest.
The API of Tinybench is similar to Benchmark.js:
Running this gives the following output:
$ node tinybench.mjs
┌─────────┬───────────┬──────────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name │ ops/sec │ Average Time (ns) │ Margin │ Samples │
├─────────┼───────────┼──────────────┼────────────────────┼──────────┼─────────┤
│ 0 │ 'plus' │ '13,188,219' │ 75.8252486995182 │ '±0.61%' │ 6594110 │
│ 1 │ 'join' │ '7,958,935' │ 125.64493939618565 │ '±0.54%' │ 3979468 │
│ 2 │ 'concat' │ '11,681,195' │ 85.60767506752819 │ '±0.91%' │ 5840598 │
└─────────┴───────────┴──────────────┴────────────────────┴──────────┴─────────┘
A CLI, maybe?
Wouldn’t it be convenient if we could just export our functions from a file:
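For example, a string-concat.js that does nothing but export the candidates (bodies are assumptions):

```javascript
// string-concat.js: only export the functions to compare
export const plus = (a, b) => a + ' ' + b;
export const join = (a, b) => [a, b].join(' ');
export const concat = (a, b) => ''.concat(a, ' ', b);
```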
And point our imaginary bench CLI tool at this file:
$ bench string-concat.js
plus x 20,171,223 ops/sec ±0.41% (94 runs sampled)
join x 10,288,969 ops/sec ±0.19% (101 runs sampled)
concat x 17,782,613 ops/sec ±0.18% (98 runs sampled)
Fastest is plus
And, maybe, one day:
$ node --bench string-concat.js
Conclusion
Although the building blocks are there, I think there’s room for tooling to make our lives easier, especially in the area of macro benchmarking.
When it comes to micro benchmarking, it feels a bit odd to recommend a package last updated in 2017 (Benchmark.js). Let’s watch this space!
This concludes my perspective on the current state of benchmarking in Node.js, at the end of 2023. Do you agree?