py-scm
Common
Common type definitions and interfaces used across the package.
- class pyscm.common.IReasoningModel
Bases:
ABC
Interventional model.
- abstract iquery(Y: str, X: Dict[str, float], samples=1000) float
Conducts an interventional query: E[Y=y|do(X=x)].
- Parameters
Y – Target variable.
X – Interventions, mapping each intervened variable to its value.
samples – Number of samples.
- Returns
E[Y=y|do(X=x)].
- abstract samples(size=1000) DataFrame
Samples data from the marginals.
- Parameters
size – Number of samples.
- Returns
Samples.
Reasoning
The reasoning engine logic is contained in this module.
- class pyscm.reasoning.ReasoningModel(d: DiGraph, M: Series, C: DataFrame)
Bases:
IReasoningModel
Reasoning model.
- __init__(d: DiGraph, M: Series, C: DataFrame)
Constructor.
- Parameters
d – Directed acyclic graph.
M – Means.
C – Covariance matrix.
- cquery(node: Any, factual: Dict[Any, float], counterfactual: List[Dict[Any, float]], n_samples=10000) DataFrame
Conducts a counterfactual query.
- Parameters
node – The target node (e.g. dependent variable, y).
factual – The factual evidence. All variables must be observed!
counterfactual – The counterfactual evidence. Must be parents of target node.
n_samples – Number of samples.
- Returns
Counterfactual results.
- equery(Y: str, X1: Dict[str, float], X2: Dict[str, float], samples=1000) float
Conducts an Average Causal Effect (ACE) query: E[Y=y|do(X=x1)] - E[Y=y|do(X=x2)].
- Parameters
Y – Target variable.
X1 – X1 (X=x1).
X2 – X2 (X=x2).
samples – Number of samples.
- Returns
ACE.
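To make the ACE definition concrete, the sketch below estimates it by simulation on a hypothetical linear Gaussian model (Z -> X, Z -> Y, X -> Y). It does not call pyscm; the coefficients `B_X` and `B_Z` and the helpers `expected_y_do_x` and `ace` are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical structural equations (illustrative, not part of pyscm):
#   Z ~ N(0, 1)
#   X = 2.0 * Z + noise          (this equation is discarded under do(X=x))
#   Y = B_X * X + B_Z * Z + noise
B_X, B_Z = 1.5, 0.5


def expected_y_do_x(x: float, samples: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[Y | do(X=x)] for the toy model above."""
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, 1.0, size=samples)      # Z keeps its own distribution
    noise = rng.normal(0.0, 1.0, size=samples)  # exogenous noise on Y
    y = B_X * x + B_Z * z + noise               # X is held fixed at x
    return float(y.mean())


def ace(x1: float, x2: float, samples: int = 1000) -> float:
    """Average causal effect: E[Y | do(X=x1)] - E[Y | do(X=x2)]."""
    return expected_y_do_x(x1, samples) - expected_y_do_x(x2, samples)


ace_estimate = ace(1.0, 0.0, samples=10_000)
print(ace_estimate)
```

Both estimates reuse the same seed, so the sampling noise cancels in the difference and the result equals `B_X * (x1 - x2)` up to floating-point rounding.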
- iquery(Y: str, X: Dict[str, float], samples=1000) Series
Conducts an interventional query: E[Y=y|do(X=x)].
- Parameters
Y – Target variable.
X – Interventions, mapping each intervened variable to its value.
samples – Number of samples.
- Returns
E[Y=y|do(X=x)].
- pquery(observations: Union[None, Dict[str, float]] = None) Tuple[Series, DataFrame]
Performs associational/probabilistic inference.
Denote the following.
\(z\) as the set of observed variables
\(y\) as the set of all other variables
\(\mu\) as the vector of means
\(\mu_z\) as the partition of \(\mu\) of length \(|z|\)
\(\mu_y\) as the partition of \(\mu\) of length \(|y|\)
\(\Sigma\) as the covariance matrix
\(\Sigma_{yz}\) as the partition of \(\Sigma\) with \(|y|\) rows and \(|z|\) columns
\(\Sigma_{zz}\) as the partition of \(\Sigma\) with \(|z|\) rows and \(|z|\) columns
\(\Sigma_{yy}\) as the partition of \(\Sigma\) with \(|y|\) rows and \(|y|\) columns
If we observe evidence \(z_e\), then the new means \(\mu_y^{*}\) and covariance matrix \(\Sigma_y^{*}\) corresponding to \(y\) are computed as follows.
\(\mu_y^{*} = \mu_y + \Sigma_{yz} \Sigma_{zz}^{-1} (z_e - \mu_z)\)
\(\Sigma_y^{*} = \Sigma_{yy} - \Sigma_{yz} \Sigma_{zz}^{-1} \Sigma_{yz}^{T}\)
- Parameters
observations – Observations.
- Returns
Tuple of means and covariance matrix.
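As a worked illustration of multivariate Gaussian conditioning, the following NumPy sketch applies the standard formulas (conditional mean `mu_y + S_yz S_zz^-1 (z_e - mu_z)`, conditional covariance `S_yy - S_yz S_zz^-1 S_yz^T`) to a two-variable example. The function name `condition_gaussian` and its index-based arguments are assumptions for this sketch and are not pyscm's API.

```python
import numpy as np


def condition_gaussian(mu, sigma, z_idx, z_e):
    """Condition a multivariate normal on the evidence z = z_e.

    mu, sigma : mean vector and covariance matrix of the joint distribution
    z_idx     : positions of the observed variables
    z_e       : observed values, in the same order as z_idx
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z_e = np.asarray(z_e, dtype=float)
    y_idx = [i for i in range(len(mu)) if i not in z_idx]

    s_yy = sigma[np.ix_(y_idx, y_idx)]
    s_yz = sigma[np.ix_(y_idx, z_idx)]
    s_zz = sigma[np.ix_(z_idx, z_idx)]

    # gain = S_yz S_zz^{-1}, computed with a linear solve instead of an inverse
    gain = np.linalg.solve(s_zz, s_yz.T).T

    mu_star = mu[y_idx] + gain @ (z_e - mu[z_idx])
    sigma_star = s_yy - gain @ s_yz.T
    return mu_star, sigma_star


mu = [1.0, 2.0]
sigma = [[4.0, 2.0], [2.0, 3.0]]
# Observe the second variable at 3.0 and update the first one.
mu_star, sigma_star = condition_gaussian(mu, sigma, z_idx=[1], z_e=[3.0])
print(mu_star, sigma_star)  # approximately [1.667] and [[2.667]]
```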
- samples(size=1000) DataFrame
Samples data from the marginals.
- Parameters
size – Number of samples.
- Returns
Samples.
- pyscm.reasoning.create_reasoning_model(d: Union[DiGraph, Dict[str, Any]], p: Union[Parameters, Dict[str, Any]]) ReasoningModel
Create a reasoning model.
- Parameters
d – DAG.
p – Parameters.
- Returns
ReasoningModel.
Associational Query
Associational query logic is contained in this module.
- pyscm.associational.condition(X: List[int], Y: List[int], y: ndarray, m: ndarray, S: ndarray) Tuple[ndarray, ndarray]
Conditions X on Y; Y is the conditioning set (e.g. P(X | Y=y)).
- Parameters
X – Indices.
Y – Indices.
y – The evidence e.g. Y=y.
m – Means.
S – Covariances.
- Returns
Tuple of updated means and covariances.
- pyscm.associational.make_sym_pos_semidef(S: ndarray) ndarray
Make a covariance matrix symmetric positive semidefinite.
- Parameters
S – Covariances.
- Returns
Symmetric positive semidefinite covariances.
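One common way to repair a covariance matrix that has drifted away from symmetry or positive semidefiniteness is to symmetrize it and clip negative eigenvalues to zero. The sketch below shows that approach; it is an assumption about how such a repair can be done, not a description of what `make_sym_pos_semidef` does internally, and `nearest_sym_psd` is a hypothetical name.

```python
import numpy as np


def nearest_sym_psd(S, eps=0.0):
    """Symmetrize S and clip its eigenvalues at eps (eigenvalue clipping)."""
    S = np.asarray(S, dtype=float)
    sym = (S + S.T) / 2.0                 # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)
    eigvals = np.clip(eigvals, eps, None)  # remove negative eigenvalues
    out = (eigvecs * eigvals) @ eigvecs.T  # V diag(w) V^T
    return (out + out.T) / 2.0            # remove rounding asymmetry


bad = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues 3 and -1
fixed = nearest_sym_psd(bad)
print(fixed)
```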
Interventional Query
Interventional query logic is contained in this module.
- pyscm.interventional.do_op(X: List[Any], X_val: Union[List[float], ndarray], d: DiGraph, M: Series, C: DataFrame) Tuple[DiGraph, Series, DataFrame]
Applies the do-operator for an interventional query. All parents of each x in X are removed.
- Parameters
X – Nodes.
X_val – Values corresponding to nodes.
d – DAG.
M – Means.
C – Covariances.
- Returns
Tuple of DAG, means, and covariances.
- pyscm.interventional.get_causal_effect(Y: int, X: List[int], Z: List[int], x: Dict[int, float], m: ndarray, S: ndarray, samples=1000) Series
Estimate causal effect.
- Parameters
Y – Target.
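X – Indices of the intervened variables, e.g. the X in do(X=x).
Z – Indices of the parents of X.
x – Intervention values keyed by index, e.g. the x in do(X=x).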
m – Means.
S – Covariances.
samples – Number of samples.
- Returns
Causal effect.
- pyscm.interventional.remove_parents(children: List[Any], d: DiGraph, M: Series, C: DataFrame) Tuple[DiGraph, Series, DataFrame]
Creates a new model by removing all parents of each child node.
- Parameters
children – Children.
d – DAG.
M – Means.
C – Covariances.
- Returns
Tuple of new DAG, means and covariances.
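The graph side of this operation (often called graph surgery) can be sketched without any dependencies. The example below represents the DAG as a plain list of (parent, child) tuples rather than a networkx DiGraph, and it leaves out the update of the means and covariances; `cut_incoming_edges` is a hypothetical helper, not a pyscm function.

```python
from typing import Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]  # (parent, child)


def cut_incoming_edges(edges: Iterable[Edge], targets: Iterable[Hashable]) -> List[Edge]:
    """Return a new edge list with every edge into a target node removed."""
    targets = set(targets)
    return [(parent, child) for parent, child in edges if child not in targets]


edges = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
# Intervening on X disconnects it from its parent Z; outgoing edges stay.
mutilated = cut_incoming_edges(edges, ["X"])
print(mutilated)  # [('Z', 'Y'), ('X', 'Y')]
```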
Counterfactual Query
Counterfactual query logic is contained in this module.
- pyscm.counterfactual.do_counterfactual(node: Any, factual: Dict[Any, float], counterfactual: List[Dict[Any, float]], d: DiGraph, M: Series, C: DataFrame, n_samples=10000) DataFrame
Estimate the counterfactual.
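For a linear structural equation, a counterfactual is usually computed in three steps: abduction (recover the noise term from the factual evidence), action (replace parent values with the counterfactual ones), and prediction (re-evaluate the equation with the recovered noise). The sketch below follows those steps for a single target node; the function `linear_counterfactual` and the coefficients are assumptions for illustration, and pyscm's own implementation may differ.

```python
from typing import Dict, List


def linear_counterfactual(
    intercept: float,
    coefs: Dict[str, float],
    factual: Dict[str, float],
    target: str,
    counterfactuals: List[Dict[str, float]],
) -> List[float]:
    """Counterfactual values of `target` under y = intercept + sum(b * parent) + u."""
    # Abduction: the noise u that explains the factual observation.
    predicted = intercept + sum(b * factual[p] for p, b in coefs.items())
    u = factual[target] - predicted

    results = []
    for cf in counterfactuals:
        # Action: override only the parents named in the counterfactual.
        parents = {p: cf.get(p, factual[p]) for p in coefs}
        # Prediction: same equation, same noise, new parent values.
        results.append(intercept + sum(b * parents[p] for p, b in coefs.items()) + u)
    return results


# Hypothetical equation: y = 1 + 2*x1 + 3*x2 + u
results = linear_counterfactual(
    intercept=1.0,
    coefs={"x1": 2.0, "x2": 3.0},
    factual={"x1": 1.0, "x2": 1.0, "y": 7.0},
    target="y",
    counterfactuals=[{"x1": 2.0}, {"x2": 0.0}],
)
print(results)  # [9.0, 4.0]
```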
- pyscm.counterfactual.get_scm(node: Any, d: DiGraph, M: Series, C: DataFrame, n_samples=10000) LinearRegression
Get the Structural Causal Model (SCM).
- Parameters
node – Target node/variable.
d – DAG.
M – Means.
C – Covariances.
n_samples – Number of samples.
- Returns
SCM.
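The `n_samples` parameter and the `LinearRegression` return type suggest that the structural equation for a node is obtained by sampling from the Gaussian and regressing the node on its parents. The sketch below illustrates that idea with NumPy least squares standing in for scikit-learn; the means, covariances, and the helper `fit_linear_scm` are assumptions for illustration.

```python
import numpy as np

# Hypothetical joint Gaussian over (X1, X2, Y), consistent with
# Y = 3 + 2*X1 - 1*X2 + noise, where the noise has variance 0.25.
mean = np.array([0.0, 1.0, 2.0])
cov = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 1.0, -1.0],
    [2.0, -1.0, 5.25],
])


def fit_linear_scm(mean, cov, target, parents, n_samples=10_000, seed=0):
    """Sample from N(mean, cov) and regress the target column on its parents."""
    rng = np.random.default_rng(seed)
    data = rng.multivariate_normal(mean, cov, size=n_samples)
    # Design matrix with a leading column of ones for the intercept.
    design = np.column_stack([np.ones(n_samples), data[:, parents]])
    beta, *_ = np.linalg.lstsq(design, data[:, target], rcond=None)
    return beta[0], beta[1:]


intercept, coefs = fit_linear_scm(mean, cov, target=2, parents=[0, 1])
print(intercept, coefs)  # close to 3 and [2, -1]
```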
Sampling
Sampling logic is contained in this module.
- pyscm.sampling.sample(M: Series, C: DataFrame, size=1000) DataFrame
Generates samples from a multivariate normal distribution.
- Parameters
M – Means.
C – Covariances.
size – Number of samples.
- Returns
Samples.
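Sampling from a multivariate normal can be reduced to sampling independent standard normals and applying a Cholesky factor of the covariance. The sketch below shows that construction with NumPy; it returns a plain array rather than a DataFrame, and `sample_mvn` is a hypothetical name rather than the library's function.

```python
import numpy as np


def sample_mvn(mean, cov, size=1000, seed=0):
    """Draw `size` rows from N(mean, cov) via x = mean + L z, with cov = L L^T."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    chol = np.linalg.cholesky(cov)  # requires a positive definite covariance
    z = rng.standard_normal((size, len(mean)))
    return mean + z @ chol.T


mvn_mean = [1.0, -2.0]
mvn_cov = [[2.0, 0.6], [0.6, 1.0]]
draws = sample_mvn(mvn_mean, mvn_cov, size=20_000)
print(draws.mean(axis=0))
print(np.cov(draws, rowvar=False))
```

The empirical mean and covariance of `draws` should approach `mvn_mean` and `mvn_cov` as `size` grows.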
Serde
Serialization and deserialization logic is contained in this module.
- pyscm.serde.dict_to_graph(d: Dict[str, Any]) DiGraph
Convert a dictionary to a graph.
- Parameters
d – Dictionary.
- Returns
nx.DiGraph.
- pyscm.serde.dict_to_model(data: Dict[str, Any]) ReasoningModel
Convert dictionary to model.
- Parameters
data – Dictionary.
- Returns
ReasoningModel.
- pyscm.serde.graph_to_dict(g: networkx.classes.graph.Graph | networkx.classes.digraph.DiGraph) Dict[str, Any]
Convert graph to dictionary.
- Parameters
g – nx.Graph or nx.DiGraph.
- Returns
Dictionary.
- pyscm.serde.model_to_dict(model: ReasoningModel) Dict[str, Any]
Convert model to dictionary.
- Parameters
model – ReasoningModel.
- Returns
Dictionary.
Learn
Learning algorithms are contained in this module.
- class pyscm.learn.Pc(t=0.05, alpha=0.1)
Bases:
object
Learns a Bayesian Belief Network using the PC-algorithm.
- __init__(t=0.05, alpha=0.1)
Constructor.
- Parameters
t – Marginal independence threshold. Values below this threshold are treated as independent.
alpha – Conditional independence threshold. Values below this threshold are treated as conditionally independent.
- fit(X: DataFrame)
Learns the DAG.
- Returns
Pc.
- class pyscm.learn.Tree(alpha=0.1)
Bases:
object
Learns a Bayesian Belief Network with a tree structure.
- __init__(alpha=0.1)
Constructor.
- Parameters
alpha – Conditional independence threshold. Values below this threshold are treated as conditionally independent.
- fit(X: DataFrame)
Learns the DAG.
- Returns
Tree.
- pyscm.learn.compute_indep(df: DataFrame, t=0.05, alpha=0.1) DataFrame
Computes pairwise conditional independence tests.
- Parameters
df – Data.
t – Threshold for marginal independence.
alpha – Alpha value for conditional independence.
- Returns
Independence test results with columns: x, y, is_indep, p.
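For Gaussian data, one standard way to score marginal and conditional dependence is through correlations and partial correlations, where the latter can be read off the precision (inverse covariance) matrix. The sketch below illustrates the idea on a chain X -> Z -> Y; whether pyscm uses these exact statistics is an assumption, and the helper names and the thresholding step are hypothetical.

```python
import numpy as np


def correlations(cov):
    """Pearson correlation matrix derived from a covariance matrix."""
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)


def partial_correlations(cov):
    """Partial correlation of each pair given all remaining variables."""
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pc = -precision / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc


# Covariance implied by the chain X -> Z -> Y with unit-variance noise:
# X = e1, Z = X + e2, Y = Z + e3 (variable order: X, Z, Y).
chain_cov = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 3.0],
])

corr = correlations(chain_cov)
pcorr = partial_correlations(chain_cov)

t, alpha = 0.05, 0.1  # thresholds named after the documented parameters
marginally_indep = abs(corr[0, 2]) < t          # X and Y are correlated
conditionally_indep = abs(pcorr[0, 2]) < alpha  # X and Y given Z
print(marginally_indep, conditionally_indep)
```

Here X and Y are marginally dependent but become independent once Z is conditioned on, which is the pattern a skeleton-learning step uses to drop the X-Y edge.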
- pyscm.learn.get_local_models(Xy: DataFrame) DataFrame
Learns all local models.
- Parameters
Xy – Data.
- Returns
DataFrame with the results of the local models, which serve as hints for edge orientation.
- pyscm.learn.identify_v_structures(d: DiGraph) List[Tuple[str, str, str]]
Identifies all v-structures (colliders) in the DAG.
- Parameters
d – DAG.
- Returns
List of collider configurations.
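A v-structure is a pair of non-adjacent nodes that share a common child (a -> c <- b). The sketch below finds them in a DAG given as a list of (parent, child) tuples; it is a dependency-free illustration, and both the helper name `find_v_structures` and the (parent, child, parent) ordering of the returned tuples are assumptions rather than the library's documented output.

```python
from itertools import combinations
from typing import List, Tuple


def find_v_structures(edges: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (a, c, b) for every collider a -> c <- b with a and b non-adjacent."""
    parents = {}
    adjacent = set()
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
        adjacent.add(frozenset((parent, child)))

    found = []
    for child, pas in parents.items():
        for a, b in combinations(sorted(pas), 2):
            # Parents joined by an edge do not form a v-structure.
            if frozenset((a, b)) not in adjacent:
                found.append((a, child, b))
    return found


dag_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("A", "D")]
colliders = find_v_structures(dag_edges)
print(colliders)  # [('A', 'C', 'B')]
```

Node D also has two parents (A and C), but they are adjacent, so it is not reported.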
- pyscm.learn.learn_local_model(Xy: DataFrame, y_col: str) Dict[str, Any]
Learn a local model from the data with respect to the y variable.
- Parameters
Xy – Data.
y_col – y variable.
- Returns
Dictionary of results of local model.
- pyscm.learn.learn_skeleton(df: DataFrame, t=0.05, alpha=0.1) Graph
Learns the skeleton of the graph.
- Parameters
df – Data.
t – Threshold for marginal independence.
alpha – Alpha value for conditional independence.
- Returns
Undirected graph (skeleton).
- pyscm.learn.learn_tree_structure(df: DataFrame) Graph
Learn a tree structure.
- Parameters
df – Data.
- Returns
Tree.
- pyscm.learn.learn_v_structures(u: Graph, df: DataFrame) DiGraph
Learns the v-structures (colliders) of the graph.
- Parameters
u – Undirected graph (skeleton).
df – Data.
- Returns
Directed acyclic graph (DAG).
- pyscm.learn.orient_by_inference(u: Graph, d: DiGraph, p: DataFrame) Tuple[Graph, DiGraph]
Orient edges by inference.
- Parameters
u – Undirected graph (skeleton).
d – DAG.
p – Local model edge hints.
- Returns
Tuple of undirected graph and DAG.
- pyscm.learn.orient_by_models(u: Graph, d: DiGraph, p: DataFrame) Tuple[Graph, DiGraph]
Orient edges by using hints based on local models.
- Parameters
u – Undirected graph (skeleton).
d – DAG.
p – Local model edge hints.
- Returns
Tuple of undirected graph and DAG.