# Benchmarks
These are local wall-clock reference measurements for the Gaussian SCM query suite. The absolute numbers are machine-specific; the relative ordering is the useful part. Warm timings reuse an already prepared model. Cold timings include fresh model preparation before each query.
The native-port table uses the shared continuous oracle suite. Every port matched the oracle within tolerance for associational, interventional, effect, and counterfactual query results.
| Language | Warm ms | Cold ms | vs Python cold |
|---|---|---|---|
| C++ | 0.0010 | 0.0036 | 7.53x |
| Rust | 0.0151 | 0.0102 | 2.67x |
| Ruby | 0.0281 | 0.0133 | 2.04x |
| Go | 0.0189 | 0.0138 | 1.97x |
| TypeScript | 0.0017 | 0.0192 | 1.42x |
| Python | 0.0012 | 0.0272 | 1.00x |
| Swift | 0.0293 | 0.0377 | 0.72x |
| Lua | 0.0519 | 0.0446 | 0.61x |
| C# | 0.0029 | 0.0509 | 0.53x |
| Java | 0.0172 | 0.0726 | 0.37x |
| Octave | 0.0988 | 0.0829 | 0.33x |
| R | 0.2009 | 0.2123 | 0.13x |
| Julia | 0.0200 | 4.3262 | 0.01x |
The oracle suite is intentionally small, so sub-0.01 ms differences should be read as local microbenchmark noise unless they also hold on larger generated Gaussian sweeps. The cold column is the safer cross-language comparison because it includes model setup and avoids over-weighting tiny hot-path calls.
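The warm/cold split above can be sketched as a small timing harness. This is an illustrative protocol only: `build_model` and `run_query` are hypothetical placeholders for each port's real entry points, not actual py-scm API names.

```python
# Hypothetical sketch of the warm/cold timing protocol, assuming
# placeholder functions build_model() and run_query() stand in for
# a port's real model-preparation and query calls.
import time

def build_model():
    # Placeholder: graph construction plus Gaussian parameter setup.
    return {"ready": True}

def run_query(model):
    # Placeholder: one oracle-suite query against a prepared model.
    return 0.0

def time_cold(n=100):
    """Cold: include fresh model preparation before every query."""
    start = time.perf_counter()
    for _ in range(n):
        model = build_model()
        run_query(model)
    return (time.perf_counter() - start) / n * 1000  # ms per query

def time_warm(n=100):
    """Warm: prepare the model once, then reuse it for every query."""
    model = build_model()
    start = time.perf_counter()
    for _ in range(n):
        run_query(model)
    return (time.perf_counter() - start) / n * 1000  # ms per query
```

The harness reports per-query means in milliseconds, matching the units in the table above.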
## R bnlearn Comparison
The R comparison uses bnlearn as an independent Gaussian reference over
backdoor, mediated, joint-intervention, and collider-conditioned models.
py-scm matches bnlearn at floating-point precision across all checked
query families.
| Query type | Metrics checked | Worst absolute difference | Result |
|---|---|---|---|
| Associational | 824 | 5.329e-15 | Exact |
| Interventional | 38 | 4.885e-15 | Exact |
| Causal effect | 24 | 1.776e-15 | Exact |
| Counterfactual | 19 | 1.776e-15 | Exact |
The runtime table reports ranges across benchmark cases. Speedup is
bnlearn time divided by py-scm time, so values above 1.0x favor
py-scm. A zero-rounded bnlearn prior timing is excluded from the hot
associational speedup range.
| Query type | Mode | py-scm ms | bnlearn ms | Speedup |
|---|---|---|---|---|
| Associational | hot | 0.000345-0.000751 | 0.000-0.041 | 53.25x-78.38x |
| Associational | cold | 0.006287-0.026034 | 0.264-0.477 | 13.72x-44.63x |
| Interventional | hot | 0.000383-0.000610 | 0.050-0.150 | 98.77x-391.31x |
| Interventional | cold | 0.025948-0.046683 | 0.300-1.150 | 9.14x-31.64x |
| Causal effect | hot | 0.001225-0.001415 | 0.150-0.250 | 112.15x-204.08x |
| Causal effect | cold | 0.031248-0.052408 | 0.400-0.650 | 10.50x-13.79x |
| Counterfactual | hot | 0.002350-0.002987 | 0.600-1.300 | 205.71x-461.54x |
| Counterfactual | cold | 0.029754-0.049667 | 0.900-1.600 | 26.43x-33.24x |
The main reading is that py-scm agrees with bnlearn to numerical
precision and remains materially faster on the same Gaussian workloads, both
for repeated queries and for cold first-hit queries.