Overview
Tracing inherently incurs a performance cost.
We benchmarked our Node SDK under several scenarios, both for transparency
and to guide our own improvement efforts, especially since we are still in
beta.
The benchmark scripts for the Node SDK can be found
here.
Methodology
The benchmarks are fairly simple.
The main work is performed by an Express server that we start before the
benchmarks run.
Some endpoints compute hashes repeatedly, some just sit and wait, and others
send and receive large bodies.
We have three benchmark files: SDK disabled, SDK enabled, and SDK enabled with
transforms.
These run one after the other, fully isolated from each other, so we don’t
have to worry about the SDK’s import hooks polluting other tests.
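One way to get this kind of isolation is to run each benchmark file in its own Node.js process. The sketch below assumes that approach; the source does not say how the isolation is actually achieved, and `runIsolated` and the file names in the comment are hypothetical.

```javascript
const { spawnSync } = require("node:child_process");

// Run each benchmark file in its own Node.js process, one at a time,
// so module-level hooks installed during one run cannot leak into the next.
// Assumption: process-level isolation; the real scripts may do this differently.
function runIsolated(files, nodeArgs = []) {
  return files.map((file) => {
    // spawnSync blocks until the child exits, which keeps the runs sequential.
    const result = spawnSync(process.execPath, [...nodeArgs, file], {
      encoding: "utf8",
    });
    return { file, exitCode: result.status, stdout: result.stdout };
  });
}

// Hypothetical file names, one per configuration:
// runIsolated([
//   "bench-sdk-disabled.js",
//   "bench-sdk-enabled.js",
//   "bench-sdk-transforms.js",
// ]);
```

Because every file gets a fresh process, nothing patched at import time in one configuration is visible to the next one.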
Each benchmark file then runs the same set of tasks.
We have two main scenarios to consider:
- High CPU — we compute hashes in a loop to max out CPU utilization
- High I/O — we wait on a timer simulating pending I/O
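The two scenarios can be approximated with a pair of small functions. This is a minimal sketch, not the actual benchmark code: the function names, the choice of SHA-256, and the parameters are assumptions made for illustration.

```javascript
const { createHash } = require("node:crypto");

// CPU-bound task: hash the previous digest over and over so the event loop
// stays busy. SHA-256 and the iteration count are illustrative choices.
function hashRepeatedly(seed, iterations) {
  let digest = seed;
  for (let i = 0; i < iterations; i++) {
    digest = createHash("sha256").update(digest).digest("hex");
  }
  return digest;
}

// I/O-bound task: resolve after a timer fires, standing in for a pending
// network or disk call. No CPU work happens while waiting.
function simulateIo(delayMs) {
  return new Promise((resolve) => setTimeout(() => resolve(delayMs), delayMs));
}
```

In a server, each function would sit behind its own endpoint, so the load generator can target CPU-heavy and wait-heavy routes separately.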
We run the benchmarks like so:
- Run the benchmarks with memory monitoring off, since memory monitoring
(process.memoryUsage) has some performance impact due to traversing pages.
- Run the benchmarks with memory monitoring on, keeping only the memory
figures and discarding the other results from that run.
- Run the benchmarks with memory monitoring off but with transforms on to get
the impact of transforms.
The following results were obtained on an M1 MacBook Pro.
Memory
| Configuration | Avg RSS Δ | Max RSS Δ |
| --- | --- | --- |
| SDK Active | 55.49 MB | 63.31 MB |
| SDK Active w/ Transforms | 54.63 MB | 60.94 MB |
Memory is quite hard to measure exactly due to garbage collection and timing
issues, so take this with a grain of salt.
It was also measured with the built-in Node.js process.memoryUsage function,
whose figures may differ from what system tools report (for example, top).
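As an illustration of how RSS figures like these can be collected, the sketch below samples `process.memoryUsage().rss` on an interval while a task runs. It is a sketch under stated assumptions: the source does not describe how its deltas were computed, so `measureRssDelta`, the sampling interval, and the choice of a pre-task baseline within a single process are all hypothetical.

```javascript
// Sample resident set size (RSS) while a task runs and report the change
// relative to a baseline taken just before the task starts.
// Assumption: the baseline is measured in the same process, right before the task.
async function measureRssDelta(task, intervalMs = 10) {
  const baseline = process.memoryUsage().rss;
  const samples = [];
  const timer = setInterval(
    () => samples.push(process.memoryUsage().rss),
    intervalMs
  );
  try {
    await task();
  } finally {
    clearInterval(timer);
  }
  // Take one final reading so there is always at least one sample.
  samples.push(process.memoryUsage().rss);

  const deltas = samples.map((rss) => rss - baseline);
  const toMb = (bytes) => bytes / (1024 * 1024);
  return {
    avgRssDeltaMb: toMb(deltas.reduce((a, b) => a + b, 0) / deltas.length),
    maxRssDeltaMb: toMb(Math.max(...deltas)),
    sampleCount: deltas.length,
  };
}
```

Garbage collection can run between samples, so repeated runs of the same task will not return identical numbers.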
CPU
| Workload | Throughput Δ | Tail Latency Δ | User CPU Δ |
| --- | --- | --- | --- |
| CPU-bound | -9.9% | -4.3% | +0.2% |
| I/O-bound | -0.6% | -2.0% | +9.5% |
| Transform endpoints | -1.1% | +1.0% | -0.1% |
During the CPU-bound test, CPU contention is expected to skew the results.
The I/O-bound case is likely more representative of real-world usage.
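For readers reproducing the table, the percentages can be read as relative changes against the SDK-disabled run. The helpers below are a sketch under that assumption: the exact formula and the percentile used for tail latency are not stated in the source, and `percentDelta` and `percentile` are hypothetical names.

```javascript
// Relative change of a metric versus the SDK-disabled baseline, in percent.
// Assumption: delta = (withSdk - baseline) / baseline * 100.
function percentDelta(baseline, withSdk) {
  if (baseline === 0) throw new RangeError("baseline must be non-zero");
  return ((withSdk - baseline) / baseline) * 100;
}

// Nearest-rank percentile, e.g. p = 99 for a tail-latency figure.
// Assumption: which percentile the benchmarks report is not documented.
function percentile(values, p) {
  if (values.length === 0) throw new RangeError("values must not be empty");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Made-up numbers: 1000 requests/s without the SDK, 901 requests/s with it.
console.log(percentDelta(1000, 901).toFixed(1)); // prints "-9.9"
```

A negative throughput delta means fewer requests served per second with the SDK enabled, while a negative latency delta means faster responses.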
Transforms show negligible impact when only a small number are registered,
and their impact is expected to scale linearly with the number of transforms
registered.
Conclusion
In production use you may see a small hit to performance with the SDK enabled,
but in our testing this hit was minimal.
Also note that results will vary based on CPU, OS, and system state.
If you encounter any performance-related issues with our SDK, please feel free
to contact us or raise an issue on GitHub.