Overview
Causal discovery involves identifying causal relationships from data. This process uncovers the underlying causal structure without assuming prior knowledge of the direction or nature of causality. It typically employs statistical and computational methods to determine which variables influence others, providing a foundation for further causal analysis.
The key objective of causal discovery is often to retrieve and visualise a causal graph from observational data. In general, the causal graph for a specific problem can be obtained by relying on:
Domain knowledge: in some cases, there are well-known principles and mechanisms governing the processes under investigation. We might be able to draw causal graphs relying on the expertise of practitioners or on the findings reported in previous studies.
Data-driven methods: when we cannot assume prior knowledge about the process, we can obtain a causal graph using causal discovery algorithms such as the ones discussed in this chapter.
Hybrid approaches: in many cases, the two aforementioned approaches are complementary. For example, we could:
Start with domain knowledge: if experts can only outline an incomplete causal graph, we can include this knowledge in the causal discovery algorithm to discover the missing relationships.
Start with the data: we might derive a tentative causal graph from a causal discovery algorithm and then iteratively validate and refine it through consultation with engineers or practitioners.
With regard to data-driven methods, there are two main approaches:
Independence-based causal discovery: this class of methods involves analysing basic DAG structures like chains and forks to interpret how variables influence each other. This approach relies on statistical tests to identify conditional independencies among variables, which can then be used to infer the structure of the causal graph. Methods such as the PC algorithm [SGS01], the Fast Causal Inference (FCI) algorithm [], or the PCMCI algorithm [] fall into this category. One of the main drawbacks of these methods is that, in many cases, they can only identify the Markov equivalence class of the graph. This means that while we can identify sets of relationships that are consistent with the observed independencies, we cannot definitively determine the direction of causality between all the variables. For instance, if we have only two variables that are statistically dependent, these methods cannot tell whether the first causes the second or the second causes the first, because both graphs imply the same (empty) set of independencies.
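To make the idea of a Markov equivalence class concrete, the following is a minimal sketch using only NumPy on simulated data. It assumes linear relationships with Gaussian noise, and the helper `partial_corr` is a hypothetical name introduced here for illustration (it is not part of any causal discovery package). A chain X → Y → Z and a fork X ← Y → Z both make X and Z dependent marginally but independent once we condition on Y, so independence tests alone cannot tell the two graphs apart.

```python
import numpy as np


def partial_corr(a, b, given):
    """Correlation between a and b after linearly regressing out `given`.

    Hypothetical helper for illustration; valid as an independence check
    only under the linear-Gaussian assumption used in this sketch.
    """
    design = np.column_stack([np.ones_like(given), given])
    res_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    res_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return np.corrcoef(res_a, res_b)[0, 1]


rng = np.random.default_rng(0)
n = 20_000

# Chain: X -> Y -> Z
x_chain = rng.normal(size=n)
y_chain = 0.8 * x_chain + rng.normal(size=n)
z_chain = 0.8 * y_chain + rng.normal(size=n)

# Fork: X <- Y -> Z
y_fork = rng.normal(size=n)
x_fork = 0.8 * y_fork + rng.normal(size=n)
z_fork = 0.8 * y_fork + rng.normal(size=n)

# Marginal dependence between X and Z, and dependence given Y
chain_marginal = np.corrcoef(x_chain, z_chain)[0, 1]
chain_partial = partial_corr(x_chain, z_chain, y_chain)
fork_marginal = np.corrcoef(x_fork, z_fork)[0, 1]
fork_partial = partial_corr(x_fork, z_fork, y_fork)

print(f"chain: corr(X, Z) = {chain_marginal:.3f}, corr(X, Z | Y) = {chain_partial:.3f}")
print(f"fork:  corr(X, Z) = {fork_marginal:.3f}, corr(X, Z | Y) = {fork_partial:.3f}")
```

In both cases the marginal correlation is clearly nonzero while the partial correlation is close to zero. The two graphs share the same conditional independencies, which is why an independence-based method would return them as members of one equivalence class.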
Semi-parametric causal discovery: this approach involves making some assumptions about the functional form of the relationships between variables but remains flexible by not fully specifying a parametric model. Methods like the Linear Non-Gaussian Acyclic Model (LiNGAM) [SHHyvarinen+06] are examples of this approach. Going beyond the Markov equivalence class comes at the expense of making more specific assumptions about the functional forms of relationships between variables. These assumptions allow for a more detailed discovery of the causal graph, often identifying specific causal directions rather than just equivalence classes. By assuming certain non-linearities or specific distributions of errors, semi-parametric methods can exploit asymmetries in the data that reveal the direction of causal effects. This approach is more powerful in that it can often discern the actual causal structure rather than just a set of possible structures. However, the trade-off is that these methods require stronger assumptions about the nature of the data and the relationships involved, which may not always hold true.
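The asymmetry that semi-parametric methods exploit can be illustrated with a small NumPy sketch on simulated data. This is not the LiNGAM algorithm itself: the function names are hypothetical, and the correlation between squared values is used here only as a crude stand-in for the independence measures that real implementations rely on. The idea is to fit a linear regression in both directions and check whether the residual looks independent of the regressor. With non-Gaussian noise this holds only in the true causal direction, whereas with Gaussian noise both directions look equally good.

```python
import numpy as np


def ols_residual(target, regressor):
    """Residual of a simple least-squares fit of target on regressor."""
    slope, intercept = np.polyfit(regressor, target, 1)
    return target - (slope * regressor + intercept)


def squared_corr(u, v):
    """Correlation between u**2 and v**2.

    Crude dependence score (an assumption of this sketch): it can be
    nonzero even when u and v are uncorrelated.
    """
    return np.corrcoef(u**2, v**2)[0, 1]


def direction_scores(x, y):
    """Dependence between residual and regressor for x -> y and y -> x."""
    forward = squared_corr(ols_residual(y, x), x)   # candidate model x -> y
    backward = squared_corr(ols_residual(x, y), y)  # candidate model y -> x
    return forward, backward


rng = np.random.default_rng(0)
n = 20_000

# True model x -> y with non-Gaussian (uniform) cause and noise
x_u = rng.uniform(-1, 1, n)
y_u = x_u + rng.uniform(-1, 1, n)
fwd_u, bwd_u = direction_scores(x_u, y_u)

# Same structure with Gaussian cause and noise
x_g = rng.normal(size=n)
y_g = x_g + rng.normal(size=n)
fwd_g, bwd_g = direction_scores(x_g, y_g)

print(f"uniform:  forward = {fwd_u:.3f}, backward = {bwd_u:.3f}")
print(f"gaussian: forward = {fwd_g:.3f}, backward = {bwd_g:.3f}")
```

In the uniform case the forward score is near zero while the backward score is clearly different from zero, which points to x → y. In the Gaussian case both scores are near zero, so the direction cannot be recovered this way. This is the reason the linear approach requires non-Gaussian noise.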
In the upcoming chapters, we focus primarily on semi-parametric approaches, inspired by the methodologies presented in Statistical Causal Discovery: LiNGAM Approach. Most of the examples presented here have been implemented using the LiNGAM Python package [IIZ+23].
Content of Causal Discovery Chapters
| Chapter | Description |
|---|---|
| Linear Models | How to retrieve the causal graph from the data, if we can assume a linear model with non-Gaussian noise for the data-generating process. |
| Nonlinear Models | Extends the LiNGAM approach to nonlinear functional forms and accommodates Gaussian noise. |
| Time Series Models | Extends the LiNGAM approach by providing a method for detecting contemporaneous and lagged effects in time series data. |
| Structural Breaks | Highlights the challenges of identifying causal graphs in dynamic and evolving environments. |