# Benchmarks

These are local wall-clock reference measurements for the Gaussian SCM query suite. The absolute numbers are machine-specific; the relative ordering is the useful part. Warm timings reuse an already prepared model. Cold timings include fresh model preparation before each query.
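The warm/cold split can be sketched as a small timing harness. This is a minimal illustration, not the suite's actual code; `prepare_model` and `run_query` are hypothetical stand-ins for the real model build and query steps.

```python
import time

def prepare_model():
    # Hypothetical stand-in for building a Gaussian SCM.
    return {"coefs": [0.5, 1.2, -0.3]}

def run_query(model):
    # Hypothetical stand-in for evaluating one query.
    return sum(model["coefs"])

def mean_ms(fn, repeats=200):
    # Average wall-clock time per call, in milliseconds.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) * 1000.0 / repeats

# Warm: the model is prepared once, outside the timed region.
model = prepare_model()
warm_ms = mean_ms(lambda: run_query(model))

# Cold: fresh model preparation is included in every timed iteration.
cold_ms = mean_ms(lambda: run_query(prepare_model()))

print(f"warm {warm_ms:.6f} ms, cold {cold_ms:.6f} ms")
```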

The native-port table uses the shared continuous oracle suite. Every port matched the oracle within tolerance on associational, interventional, causal-effect, and counterfactual queries.

## Continuous Query Runtime By Port

| Language   | Warm ms | Cold ms | vs Python cold |
|------------|---------|---------|----------------|
| C++        | 0.0010  | 0.0036  | 7.53x          |
| Rust       | 0.0151  | 0.0102  | 2.67x          |
| Ruby       | 0.0281  | 0.0133  | 2.04x          |
| Go         | 0.0189  | 0.0138  | 1.97x          |
| TypeScript | 0.0017  | 0.0192  | 1.42x          |
| Python     | 0.0012  | 0.0272  | 1.00x          |
| Swift      | 0.0293  | 0.0377  | 0.72x          |
| Lua        | 0.0519  | 0.0446  | 0.61x          |
| C#         | 0.0029  | 0.0509  | 0.53x          |
| Java       | 0.0172  | 0.0726  | 0.37x          |
| Octave     | 0.0988  | 0.0829  | 0.33x          |
| R          | 0.2009  | 0.2123  | 0.13x          |
| Julia      | 0.0200  | 4.3262  | 0.01x          |

The oracle suite is intentionally small, so sub-0.01 ms differences should be read as local microbenchmark noise unless they also hold on larger generated Gaussian sweeps. The cold column is the safer cross-language comparison because it includes model setup and avoids over-weighting tiny hot-path calls.
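One way to build those larger Gaussian sweeps is to sample random linear-Gaussian SCMs of growing size. The sketch below is a hypothetical generator under assumed conventions (lower-triangular coefficients encode a DAG in topological order), not the suite's own code:

```python
import numpy as np

def random_linear_gaussian_scm(n, edge_prob=0.3, seed=0):
    # A strictly lower-triangular coefficient matrix is a DAG whose
    # variables are already in topological order.
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < edge_prob
    coefs = np.tril(rng.normal(size=(n, n)) * mask, k=-1)
    noise_scale = rng.uniform(0.5, 1.5, size=n)
    return coefs, noise_scale

def sample(coefs, noise_scale, m, seed=1):
    # Ancestral sampling: each variable is a linear function of its
    # (already-sampled) parents plus independent Gaussian noise.
    rng = np.random.default_rng(seed)
    n = coefs.shape[0]
    x = np.zeros((m, n))
    for i in range(n):
        x[:, i] = x @ coefs[i] + noise_scale[i] * rng.normal(size=m)
    return x

coefs, noise = random_linear_gaussian_scm(20)
data = sample(coefs, noise, 1000)
print(data.shape)  # (1000, 20)
```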

## R bnlearn Comparison

The R comparison uses bnlearn as an independent Gaussian reference over backdoor, mediated, joint-intervention, and collider-conditioned models. py-scm matches bnlearn at floating-point precision across all checked query families.
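For intuition about what both tools must reproduce, the backdoor case has closed-form answers in a linear-Gaussian model. The sketch below uses made-up coefficients and plain arithmetic (it is not py-scm's or bnlearn's API): with Z → X, Z → Y, X → Y and standard-normal noise, the interventional and associational means of Y differ exactly by the backdoor path through Z.

```python
# Backdoor model: Z ~ N(0,1), X = a*Z + eps_x, Y = b*X + c*Z + eps_y,
# with eps_x, eps_y ~ N(0,1). Coefficients are illustrative only.
a, b, c = 0.8, 1.5, -0.7
x0 = 2.0

# Interventional: do(X=x0) severs the Z -> X edge, and E[Z] = 0,
# so E[Y | do(X=x0)] = b * x0.
do_mean = b * x0

# Associational: regressing Y on X also picks up the backdoor path,
# slope = Cov(X, Y) / Var(X) = b + c*a / (a^2 + 1).
var_x = a**2 + 1.0
cov_xy = b * var_x + c * a
obs_mean = (cov_xy / var_x) * x0

print(do_mean, obs_mean)  # the gap is the confounding bias
```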

### Accuracy Against R bnlearn

| Query type     | Metrics compared | Worst absolute difference | Result |
|----------------|------------------|---------------------------|--------|
| Associational  | 824              | 5.329e-15                 | Exact  |
| Interventional | 38               | 4.885e-15                 | Exact  |
| Causal effect  | 24               | 1.776e-15                 | Exact  |
| Counterfactual | 19               | 1.776e-15                 | Exact  |

The runtime table reports ranges across benchmark cases. Speedup is bnlearn time divided by py-scm time, so values above 1.0x favor py-scm. A zero-rounded bnlearn prior timing is excluded from the hot associational speedup range.
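That convention (per-case ratios, zero-rounded bnlearn timings dropped) can be expressed as a short helper. This is a sketch of the stated rule, with illustrative numbers rather than the published measurements:

```python
def speedup_range(bnlearn_ms, pyscm_ms):
    # Speedup = bnlearn time / py-scm time, per benchmark case.
    # Cases where bnlearn's timer rounded to zero are excluded,
    # since 0 / t would report a meaningless 0x (or inf inverted).
    ratios = [b / p for b, p in zip(bnlearn_ms, pyscm_ms) if b > 0]
    return min(ratios), max(ratios)

# Illustrative inputs: the first bnlearn timing rounded to zero.
lo, hi = speedup_range([0.000, 0.041, 0.030],
                       [0.0005, 0.0007, 0.0004])
print(f"{lo:.2f}x-{hi:.2f}x")  # range over the two non-zero cases
```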

### Runtime Against R bnlearn

| Query type     | Mode | py-scm ms         | bnlearn ms  | Speedup         |
|----------------|------|-------------------|-------------|-----------------|
| Associational  | hot  | 0.000345-0.000751 | 0.000-0.041 | 53.25x-78.38x   |
| Associational  | cold | 0.006287-0.026034 | 0.264-0.477 | 13.72x-44.63x   |
| Interventional | hot  | 0.000383-0.000610 | 0.050-0.150 | 98.77x-391.31x  |
| Interventional | cold | 0.025948-0.046683 | 0.300-1.150 | 9.14x-31.64x    |
| Causal effect  | hot  | 0.001225-0.001415 | 0.150-0.250 | 112.15x-204.08x |
| Causal effect  | cold | 0.031248-0.052408 | 0.400-0.650 | 10.50x-13.79x   |
| Counterfactual | hot  | 0.002350-0.002987 | 0.600-1.300 | 205.71x-461.54x |
| Counterfactual | cold | 0.029754-0.049667 | 0.900-1.600 | 26.43x-33.24x   |

The main takeaway is that py-scm agrees with bnlearn to numerical precision and remains materially faster on the same Gaussian workloads, both for repeated hot queries and for cold first-hit queries.