py-scm

Common

Common type definitions and interfaces used across the package.

class pyscm.common.IReasoningModel

Bases: ABC

Interventional model.

abstract iquery(Y: str, X: Dict[str, float], samples=1000) float

Conducts an interventional query: E[Y|do(X=x)].

Parameters
  • Y – Target variable.

  • X – Interventions, mapping each intervened variable to its value.

  • samples – Number of samples.

Returns

E[Y|do(X=x)].

abstract samples(size=1000) DataFrame

Samples data from the marginals.

Parameters

size – Number of samples.

Returns

Samples.

class pyscm.common.Parameters(M: Series, C: DataFrame)

Bases: object

Model parameters: the means (M) and the covariance matrix (C).

C: DataFrame
M: Series
__init__(M: Series, C: DataFrame) None

Reasoning

The reasoning engine logic is contained in this module.

class pyscm.reasoning.ReasoningModel(d: DiGraph, M: Series, C: DataFrame)

Bases: IReasoningModel

Reasoning model.

__init__(d: DiGraph, M: Series, C: DataFrame)

Constructor.

Parameters
  • d – Directed acyclic graph.

  • M – Means.

  • C – Covariance matrix.

cquery(node: Any, factual: Dict[Any, float], counterfactual: List[Dict[Any, float]], n_samples=10000) DataFrame

Conducts a counterfactual query.

Parameters
  • node – The target node (e.g. dependent variable, y).

  • factual – The factual evidence. All variables must be observed!

  • counterfactual – The counterfactual evidence. Must be parents of target node.

  • n_samples – Number of samples.

Returns

Counterfactual results.
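A counterfactual query is commonly answered in three steps: abduction, action, and prediction. The sketch below applies them to a single linear structural equation with additive noise. The linear form is an assumption suggested by get_scm returning a LinearRegression; the function name `counterfactual_y` and all coefficients are hypothetical and not part of py-scm.

```python
def counterfactual_y(coefs, intercept, factual, y_factual, counterfactual):
    """Counterfactual value of Y for Y = intercept + sum(coefs[p] * p) + U."""

    def predict(values):
        return intercept + sum(coefs[p] * values[p] for p in coefs)

    # Abduction: recover the noise term consistent with the factual evidence.
    u = y_factual - predict(factual)
    # Action: override only the parents named in the counterfactual.
    world = {**factual, **counterfactual}
    # Prediction: re-evaluate the equation while keeping the same noise.
    return predict(world) + u


# Hypothetical coefficients and evidence, for illustration only.
coefs = {"x1": 2.0, "x2": -1.0}
y_cf = counterfactual_y(
    coefs,
    intercept=0.5,
    factual={"x1": 1.0, "x2": 3.0},
    y_factual=0.0,
    counterfactual={"x1": 2.0},
)
```

Because the noise term is held fixed, raising x1 by one unit shifts the outcome by exactly its coefficient.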

equery(Y: str, X1: Dict[str, float], X2: Dict[str, float], samples=1000) float

Conducts an Average Causal Effect (ACE) query: E[Y|do(X=x1)] - E[Y|do(X=x2)].

Parameters
  • Y – Target variable.

  • X1 – First intervention (X=x1).

  • X2 – Second intervention (X=x2).

  • samples – Number of samples.

Returns

ACE.
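The ACE is the difference between two interventional expectations. As a minimal sketch, assuming a hypothetical linear model in which the expectation under do(X=x) is known in closed form, it can be computed as follows. The names `ace` and `expect_y_do` are illustrative and not part of py-scm.

```python
def ace(expect_y_do, x1, x2):
    """Average causal effect: E[Y | do(X=x1)] - E[Y | do(X=x2)]."""
    return expect_y_do(x1) - expect_y_do(x2)


def expect_y_do(x):
    # Hypothetical linear model: E[Y | do(X=x)] = 0.5 + 2.0 * x
    return 0.5 + 2.0 * x


effect = ace(expect_y_do, x1=3.0, x2=1.0)
```

In a linear model the intercept cancels, so the effect depends only on the slope and on the gap between the two intervention values.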

iquery(Y: str, X: Dict[str, float], samples=1000) Series

Conducts an interventional query: E[Y|do(X=x)].

Parameters
  • Y – Target variable.

  • X – Interventions, mapping each intervened variable to its value.

  • samples – Number of samples.

Returns

E[Y|do(X=x)].
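The samples parameter suggests a sampling-based estimate, although the exact procedure is not described here. The sketch below estimates an interventional expectation by Monte Carlo in a toy structural model with Z → X, Z → Y and X → Y. The model, its coefficients, and the names `sample_y_do_x` and `iquery_mc` are assumptions made for illustration.

```python
import random


def sample_y_do_x(x, rng):
    """Draw one Y from the mutilated model in which X is fixed to x."""
    z = rng.gauss(0.0, 1.0)  # Z keeps its own distribution
    # X's structural equation (X = 0.8 * Z + noise) is replaced by X := x,
    # so Z no longer influences X; Y = 2.0 * X + 1.5 * Z + noise is unchanged.
    return 2.0 * x + 1.5 * z + rng.gauss(0.0, 1.0)


def iquery_mc(x, samples=1000, seed=0):
    """Monte Carlo estimate of E[Y | do(X=x)]."""
    rng = random.Random(seed)
    return sum(sample_y_do_x(x, rng) for _ in range(samples)) / samples


estimate = iquery_mc(x=1.0, samples=20000)
```

Since Z has mean zero, the exact answer in this toy model is 2.0 * x, and the estimate should approach that value as the sample count grows.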

pquery(observations: Union[None, Dict[str, float]] = None) Tuple[Series, DataFrame]

Performs associational/probabilistic inference.

Denote the following.

  • \(z\) as the set of observed variables

  • \(y\) as the set of other variables

  • \(\mu\) as the vector of means
    • \(\mu_z\) as the partitioned \(\mu\) of length \(|z|\)

    • \(\mu_y\) as the partitioned \(\mu\) of length \(|y|\)

  • \(\Sigma\) as the covariance matrix
    • \(\Sigma_{yz}\) as the partitioned \(\Sigma\) of \(|y|\) rows and \(|z|\) columns

    • \(\Sigma_{zz}\) as the partitioned \(\Sigma\) of \(|z|\) rows and \(|z|\) columns

    • \(\Sigma_{yy}\) as the partitioned \(\Sigma\) of \(|y|\) rows and \(|y|\) columns

If we observe evidence \(z_e\), then the new means \(\mu_y^{*}\) and covariance matrix \(\Sigma_y^{*}\) corresponding to \(y\) are computed as follows.

  • \(\mu_y^{*} = \mu_y + \Sigma_{yz} \Sigma_{zz}^{-1} (z_e - \mu_z)\)

  • \(\Sigma_y^{*} = \Sigma_{yy} - \Sigma_{yz} \Sigma_{zz}^{-1} \Sigma_{yz}^{T}\)

Parameters

observations – Observations.

Returns

Tuple of means and covariance matrix.
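As a worked instance of the standard Gaussian conditioning update, the sketch below conditions a three-variable Gaussian on a single observed variable, so that inverting \(\Sigma_{zz}\) reduces to a scalar division. It uses plain Python lists instead of the pandas objects pquery works with. The function name `condition_on_one` and the numbers are illustrative assumptions, not part of py-scm.

```python
def condition_on_one(mu, sigma, z, z_e):
    """Condition a multivariate Gaussian on one observed variable.

    mu: list of means; sigma: covariance matrix as nested lists;
    z: index of the observed variable; z_e: its observed value.
    Returns (mu_star, sigma_star) over the remaining variables, in order.
    """
    ys = [i for i in range(len(mu)) if i != z]
    var_z = sigma[z][z]  # Sigma_zz is 1x1, so its inverse is 1 / var_z
    # mu_y* = mu_y + Sigma_yz Sigma_zz^{-1} (z_e - mu_z)
    mu_star = [mu[i] + sigma[i][z] / var_z * (z_e - mu[z]) for i in ys]
    # Sigma_y* = Sigma_yy - Sigma_yz Sigma_zz^{-1} Sigma_yz^T
    sigma_star = [
        [sigma[i][j] - sigma[i][z] * sigma[j][z] / var_z for j in ys]
        for i in ys
    ]
    return mu_star, sigma_star


mu = [1.0, 2.0, 3.0]
sigma = [
    [1.0, 0.5, 0.2],
    [0.5, 2.0, 0.4],
    [0.2, 0.4, 1.5],
]
mu_star, sigma_star = condition_on_one(mu, sigma, z=1, z_e=4.0)
```

Observing a value above the mean of the second variable pulls the means of the positively correlated variables upward and shrinks their variances.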

samples(size=1000) DataFrame

Samples data from the marginals.

Parameters

size – Number of samples.

Returns

Samples.

pyscm.reasoning.create_reasoning_model(d: Union[DiGraph, Dict[str, Any]], p: Union[Parameters, Dict[str, Any]]) ReasoningModel

Create a reasoning model.

Parameters
  • d – DAG.

  • p – Parameters.

Returns

ReasoningModel.

Associational Query

Associational query logic is contained in this module.

pyscm.associational.condition(X: List[int], Y: List[int], y: ndarray, m: ndarray, S: ndarray) Tuple[ndarray, ndarray]

Conditions X on Y; Y is the conditioning set (e.g. P(X | Y=y)).

Parameters
  • X – Indices.

  • Y – Indices.

  • y – The evidence e.g. Y=y.

  • m – Means.

  • S – Covariances.

Returns

Tuple of updated means and covariances.

pyscm.associational.make_sym_pos_semidef(S: ndarray) ndarray

Make a covariance matrix symmetric positive semidefinite.

Parameters

S – Covariances.

Returns

Symmetric positive semidefinite covariances.

Interventional Query

Interventional query logic is contained in this module.

pyscm.interventional.do_op(X: List[Any], X_val: Union[List[float], ndarray], d: DiGraph, M: Series, C: DataFrame) Tuple[DiGraph, Series, DataFrame]

Do an interventional query. All parents of each x in X are removed.

Parameters
  • X – Nodes.

  • X_val – Values corresponding to nodes.

  • d – DAG.

  • M – Means.

  • C – Covariances.

Returns

Tuple of DAG, means, and covariances.
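The graph part of this operation is often called graph surgery: every edge pointing into an intervened node is deleted. The sketch below shows only that step on a plain parent map. It does not update the means or covariances, which do_op also returns, and the name `remove_incoming` is hypothetical.

```python
def remove_incoming(parents, targets):
    """Return a copy of the parent map with all parents of `targets` removed."""
    return {
        node: set() if node in targets else set(ps)
        for node, ps in parents.items()
    }


# Toy DAG: Z -> X, Z -> Y, X -> Y, stored as node -> set of parents.
parents = {"Z": set(), "X": {"Z"}, "Y": {"X", "Z"}}
mutilated = remove_incoming(parents, targets={"X"})
```

After the surgery X has no parents, while the edges into Y are untouched and the original map is left unmodified.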

pyscm.interventional.get_causal_effect(Y: int, X: List[int], Z: List[int], x: Dict[int, float], m: ndarray, S: ndarray, samples=1000) Series

Estimate causal effect.

Parameters
  • Y – Target.

  • X – Intervened variables, e.g. do(X=x).

  • Z – Parents of X.

  • x – Intervention values, keyed by variable index.

  • m – Means.

  • S – Covariances.

  • samples – Number of samples.

Returns

Causal effect.

pyscm.interventional.remove_parents(children: List[Any], d: DiGraph, M: Series, C: DataFrame) Tuple[DiGraph, Series, DataFrame]

Creates a new model in which all parents of each child node are removed.

Parameters
  • children – Children.

  • d – DAG.

  • M – Means.

  • C – Covariances.

Returns

Tuple of new DAG, means and covariances.

Counterfactual Query

Counterfactual query logic is contained in this module.

pyscm.counterfactual.do_counterfactual(node: Any, factual: Dict[Any, float], counterfactual: List[Dict[Any, float]], d: DiGraph, M: Series, C: DataFrame, n_samples=10000) DataFrame

Estimate the counterfactual.

pyscm.counterfactual.get_scm(node: Any, d: DiGraph, M: Series, C: DataFrame, n_samples=10000) LinearRegression

Get the Structural Causal Model (SCM).

Parameters
  • node – Target node/variable.

  • d – DAG.

  • M – Means.

  • C – Covariances.

  • n_samples – Number of samples.

Returns

SCM.
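The LinearRegression return type suggests that each node is modeled as a linear function of its parents. For a node with a single parent, the slope and intercept of such an equation follow directly from the means and covariances, as the sketch below shows. The n_samples parameter suggests the library fits a regression on sampled data instead, so treat this closed form as an illustration only; the function name is hypothetical.

```python
def linear_equation_from_moments(mean_x, mean_y, var_x, cov_xy):
    """Slope and intercept of the regression of y on a single parent x."""
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical moments for one parent x and one child y.
slope, intercept = linear_equation_from_moments(
    mean_x=1.0, mean_y=5.0, var_x=2.0, cov_xy=3.0
)
```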

Sampling

Sampling logic is contained in this module.

pyscm.sampling.sample(M: Series, C: DataFrame, size=1000) DataFrame

Generates samples from a multivariate normal distribution.

Parameters
  • M – Means.

  • C – Covariances.

  • size – Number of samples.

Returns

Samples.
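One standard way to draw from a multivariate normal is to factor the covariance matrix as L Lᵀ (Cholesky) and transform independent standard normal draws. The sketch below does this with the standard library only. How py-scm samples internally is not stated here, and the names `cholesky` and `sample_mvn` are illustrative.

```python
import math
import random


def cholesky(c):
    """Lower-triangular L with L @ L.T == c (c symmetric positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def sample_mvn(m, c, size=1000, seed=0):
    """Draw `size` rows from N(m, c) as x = m + L @ e with e ~ N(0, I)."""
    rng = random.Random(seed)
    low = cholesky(c)
    n = len(m)
    rows = []
    for _ in range(size):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rows.append(
            [m[i] + sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        )
    return rows


rows = sample_mvn([1.0, -2.0], [[1.0, 0.6], [0.6, 2.0]], size=20000)
```

With enough rows, the column means and the sample covariance should be close to the inputs.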

Serde

Serialization and deserialization logic is contained in this module.

pyscm.serde.dict_to_graph(d: Dict[str, Any]) DiGraph

Convert a dictionary to a graph.

Parameters

d – Dictionary.

Returns

nx.DiGraph.

pyscm.serde.dict_to_model(data: Dict[str, Any]) ReasoningModel

Convert dictionary to model.

Parameters

data – Dictionary.

Returns

ReasoningModel.

pyscm.serde.graph_to_dict(g: networkx.classes.graph.Graph | networkx.classes.digraph.DiGraph) Dict[str, Any]

Convert graph to dictionary.

Parameters

g – nx.Graph or nx.DiGraph.

Returns

Dictionary.

pyscm.serde.model_to_dict(model: ReasoningModel) Dict[str, Any]

Convert model to dictionary.

Parameters

model – ReasoningModel.

Returns

Dictionary.

Learn

Learning algorithms are contained in this module.

class pyscm.learn.Pc(t=0.05, alpha=0.1)

Bases: object

Learns a Bayesian Belief Network using the PC-algorithm.

__init__(t=0.05, alpha=0.1)

Constructor.

Parameters
  • t – Marginal independence threshold. Values below this threshold are treated as independent.

  • alpha – Conditional independence threshold. Values below this threshold are treated as conditionally independent.

fit(X: DataFrame)

Learns the DAG.

Returns

Pc.

class pyscm.learn.Tree(alpha=0.1)

Bases: object

Learns a Bayesian Belief Network with a tree structure.

__init__(alpha=0.1)

Constructor.

Parameters

alpha – Conditional independence threshold. Values below this threshold are treated as conditionally independent.

fit(X: DataFrame)

Learns the DAG.

Returns

Tree.

pyscm.learn.compute_indep(df: DataFrame, t=0.05, alpha=0.1) DataFrame

Computes pairwise conditional independence tests.

Parameters
  • df – Data.

  • t – Threshold for marginal independence.

  • alpha – Alpha value for conditional independence.

Returns

Independence test results with columns: x, y, is_indep, p.
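For Gaussian data, a common statistic for conditional independence is the partial correlation, which measures the association between two variables after removing the linear effect of a third. Whether py-scm uses this statistic is not stated here; the sketch below only illustrates the idea on simulated data, and the function names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated common cause: z drives both x and y.
rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(5000)]
x = [zi + rng.gauss(0.0, 1.0) for zi in z]
y = [zi + rng.gauss(0.0, 1.0) for zi in z]

r_marginal = pearson(x, y)         # x and y are correlated through z
r_given_z = partial_corr(x, y, z)  # close to zero once z is accounted for
```

Here x and y are marginally dependent but conditionally independent given z, which is the pattern a skeleton-learning step relies on to drop the x–y edge.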

pyscm.learn.get_local_models(Xy: DataFrame) DataFrame

Learns all local models.

Parameters

Xy – Data.

Returns

DataFrame with the results of the local models, which provide hints on edge orientation.

pyscm.learn.identify_v_structures(d: DiGraph) List[Tuple[str, str, str]]

Identifies all v-structures (colliders) in the DAG.

Parameters

d – DAG.

Returns

List of collider configurations.
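A v-structure is a pair of non-adjacent nodes that share a common child. The sketch below finds such triples from a plain edge list. The ordering of the returned tuples, (parent, child, parent), is an assumption, as is the name `find_v_structures`.

```python
from itertools import combinations


def find_v_structures(edges):
    """Return (x, z, y) triples where x -> z <- y and x, y are not adjacent."""
    parents = {}
    adjacent = set()
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
        adjacent.add(frozenset((u, v)))
    found = []
    for child in sorted(parents):
        for x, y in combinations(sorted(parents[child]), 2):
            if frozenset((x, y)) not in adjacent:
                found.append((x, child, y))
    return found


# Toy DAG: A -> C <- B is a collider; A -> D <- C is not, since A and C are adjacent.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("A", "D")]
colliders = find_v_structures(edges)
```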

pyscm.learn.learn_local_model(Xy: DataFrame, y_col: str) Dict[str, Any]

Learn a local model from the data with respect to the y variable.

Parameters
  • Xy – Data.

  • y_col – y variable.

Returns

Dictionary of results of local model.

pyscm.learn.learn_skeleton(df: DataFrame, t=0.05, alpha=0.1) Graph

Learns the skeleton of the graph.

Parameters
  • df – Data.

  • t – Threshold for marginal independence.

  • alpha – Alpha value for conditional independence.

Returns

Undirected graph (skeleton).
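Skeleton learning in the PC family starts from a complete undirected graph and removes an edge whenever the two endpoints test as independent, first marginally and then given other variables. The sketch below shows that loop with a caller-supplied independence test and conditioning sets of size zero or one only. It is a simplification (the full PC algorithm restricts conditioning sets to current neighbours and grows their size), and all names are hypothetical.

```python
from itertools import combinations


def learn_skeleton_sketch(nodes, is_indep):
    """Start from the complete graph and drop edges between independent pairs.

    is_indep(x, y, cond) -> bool is a caller-supplied independence test,
    where cond is a tuple of conditioning variables (empty for marginal).
    """
    edges = {frozenset(p) for p in combinations(nodes, 2)}
    for x, y in combinations(nodes, 2):
        others = [n for n in nodes if n not in (x, y)]
        cond_sets = [()] + [(n,) for n in others]
        if any(is_indep(x, y, cond) for cond in cond_sets):
            edges.discard(frozenset((x, y)))
    return edges


def oracle(x, y, cond):
    # Independence facts of the chain A -> B -> C: only A and C given B.
    return {x, y} == {"A", "C"} and cond == ("B",)


skeleton = learn_skeleton_sketch(["A", "B", "C"], oracle)
```

For the chain, the A–C edge is removed and the two true adjacencies remain.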

pyscm.learn.learn_tree_structure(df: DataFrame) Graph

Learn a tree structure.

Parameters

df – Data.

Returns

Tree.
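The algorithm behind this function is not described here. A common way to learn a tree structure, in the style of Chow-Liu, is to weight every pair of variables by an association measure and keep a maximum-weight spanning tree. The sketch below implements that step with Kruskal's algorithm; the weights and the name `max_spanning_tree` are illustrative assumptions.

```python
def max_spanning_tree(nodes, weights):
    """Kruskal's algorithm on edges sorted by descending weight.

    weights maps (u, v) pairs to an association strength.
    """
    root = {n: n for n in nodes}

    def find(n):
        while root[n] != n:
            root[n] = root[root[n]]  # path halving
            n = root[n]
        return n

    tree = []
    for (u, v), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:  # adding the edge does not create a cycle
            root[ru] = rv
            tree.append((u, v))
    return tree


# Hypothetical pairwise association strengths.
weights = {("A", "B"): 0.9, ("B", "C"): 0.7, ("A", "C"): 0.6, ("C", "D"): 0.4}
tree = max_spanning_tree(["A", "B", "C", "D"], weights)
```

The weaker A–C edge is skipped because A and C are already connected through B.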

pyscm.learn.learn_v_structures(u: Graph, df: DataFrame) DiGraph

Learns the v-structures (colliders) of the graph.

Parameters
  • u – Undirected graph (skeleton).

  • df – Data.

Returns

Directed acyclic graph (DAG).

pyscm.learn.orient_by_inference(u: Graph, d: DiGraph, p: DataFrame) Tuple[Graph, DiGraph]

Orient edges by inference.

Parameters
  • u – Undirected graph (skeleton).

  • d – DAG.

  • p – Local model edge hints.

Returns

Tuple of undirected graph and DAG.

pyscm.learn.orient_by_models(u: Graph, d: DiGraph, p: DataFrame) Tuple[Graph, DiGraph]

Orient edges by using hints based on local models.

Parameters
  • u – Undirected graph (skeleton).

  • d – DAG.

  • p – Local model edge hints.

Returns

Tuple of undirected graph and DAG.