A 6 Minute Introduction to Causal AI

Quickly gain an understanding of how modern AI systems fail, and how causality can help

Modern AI systems have made it easy to tackle many problems previously thought out of reach of computers. You may have heard of some of these successes, such as:

GPT-3: Generates paragraphs of human-like text based upon any initial prompt you provide it.
AlphaFold: Predicts how proteins take shape in 3D space. A true breakthrough in modern biology.
DALL-E 2: Creates incredibly detailed and realistic images from text descriptions.

These systems are so good that some have even convinced people working on them that they are sentient.

However, despite the successes, many of these systems can be thought of as technological parrots. Parrots can mimic their owners, but do not have a true awareness of what they are saying, nor why they are saying it.

Similarly, modern AI systems can mimic the patterns they have learnt from previous data without having the true context of the problem being solved, or understanding why a given prediction is returned. Modern AI systems are parrots at massive scale (GPT-3 was trained on approximately 3 billion web pages) and with huge societal implications.

The end result of this parroting is that modern AI systems suffer from the following three B’s:

Blind
Biased
Brittle

These three B’s mean that modern AI systems are poorly suited to the nuanced, complex and high-risk applications to which they are being applied. Let’s explore how a causal approach can help.

Blind

Modern AI systems are blind to the type of relationship between data points and lack context on the problems which they are being used to solve.

To illustrate this, consider the relationship between years of experience and income. Typically, someone’s experience is correlated with their income: more experience trends with a higher salary. This is also true in reverse: a higher income trends with more experience. You can call this two-way correlation an associational relationship.

*Figure 1: Graph showing the positive correlation between years of experience and income.*

The other type of relationship is a causal one. In this case, a change in one variable causes a change in another. Someone’s income is caused by their years of experience*. Unlike the associational relationship, causality is one-way: an individual’s experience isn’t caused by the income they earn.

Causal techniques provide you with the tools to separate association from causation. By intervening on the system and setting someone’s experience to a given value, you can observe how this would change their income. Using interventions, you can determine the type of relationship between experience and income (causal or associational) and the direction in which it flows (experience causes income). You can think of interventions as a way of answering certain types of “what if” questions: what if I were 45 instead of 31, how much would I earn?
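
The difference between observing and intervening can be sketched with a small simulation. The toy model below, in which income depends on experience plus noise, and all of its coefficients are assumptions made for illustration; they are not data from this article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def simulate(do_experience=None, do_income=None):
    """Sample from a toy model in which experience causes income.

    Passing do_experience or do_income overrides that variable for
    everyone, which mimics an intervention.
    """
    experience = rng.uniform(0, 40, size=n)  # years of experience
    if do_experience is not None:
        experience = np.full(n, float(do_experience))
    # Assumed structural equation: income is generated from experience.
    income = 20_000 + 1_500 * experience + rng.normal(0, 5_000, size=n)
    if do_income is not None:
        income = np.full(n, float(do_income))
    return experience, income

# Observation: correlation is symmetric, so it cannot reveal direction.
exp_obs, inc_obs = simulate()
corr = np.corrcoef(exp_obs, inc_obs)[0, 1]

# Intervening on experience shifts mean income.
_, inc_low = simulate(do_experience=5)
_, inc_high = simulate(do_experience=25)
income_shift = inc_high.mean() - inc_low.mean()

# Intervening on income leaves mean experience essentially unchanged.
exp_poor, _ = simulate(do_income=30_000)
exp_rich, _ = simulate(do_income=90_000)
experience_shift = exp_rich.mean() - exp_poor.mean()

print(f"correlation: {corr:.2f}")
print(f"income shift from do(experience): {income_shift:,.0f}")
print(f"experience shift from do(income): {experience_shift:.2f}")
```

In this sketch the correlation is strong in both directions, but only the intervention on experience moves the other variable. That asymmetry is what identifies the direction of the causal relationship.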

Modern AI systems are very good at identifying associations in the data and these relationships are fundamental to their success. However, because these systems have traditionally been blind to causality, they have repeatedly learnt misleading associations from the data. These misleading associations, or spurious correlations, can be pernicious and harmful to AI systems.

Intuitively, a correlation is spurious when we do not expect it to hold in the future in the same way it held in the past. You can find a great list of spurious correlations here. The elimination of spurious correlations is the basis of the randomised controlled trial, the scientific gold standard for testing a hypothesis.

Causal AI is powerful because it allows you to identify and eliminate spurious correlations using the existing observed data, without the need to run a controlled trial.
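
One common way to remove a spurious correlation from observational data is to adjust for the variable that creates it. The sketch below assumes a hypothetical setup in which temperature drives both ice cream sales and sunburn cases, with no causal link between the two. The variables and coefficients are invented for illustration, and the adjustment only works here because the confounder is measured.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical confounder that drives both observed variables.
temperature = rng.normal(20, 5, size=n)
ice_cream = 3.0 * temperature + rng.normal(0, 5, size=n)
sunburn = 2.0 * temperature + rng.normal(0, 5, size=n)  # no ice_cream term

def ols(y, *columns):
    """Ordinary least squares; returns the slopes (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

# Regressing sunburn on ice cream alone picks up the spurious link.
naive_slope = ols(sunburn, ice_cream)[0]

# Adding the confounder as a regressor removes it.
adjusted_slope = ols(sunburn, ice_cream, temperature)[0]

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

The naive slope is clearly positive even though ice cream has no effect on sunburn in this model, while the adjusted slope is close to zero.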

Biased

Spurious correlations are everywhere and are regularly learnt by modern AI systems. These correlations frequently introduce harmful bias.

To illustrate how causal techniques can help, let’s extend the income prediction example considered before, by adding a number of other variables which are shown in Table 1.

*Table 1: Further data collected to help predict the income of different people.*

Due to historical biases within the observed data shown in Table 1, AI systems trained on it learn to associate the female sex with lower income. To ensure your model generates useful and safe predictions, this bias needs to be controlled for.

Causal techniques allow the creation of a causal diagram which shows the relationships between the variables. Each arrow within this diagram demonstrates how one variable causally impacts another, e.g. experience has a causal effect on income. This allows you to explicitly represent the biases within the data.

*Figure 2: Example of a simplified causal diagram mapping the relationships between the variables shown in Table 1 above.*
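
A causal diagram of this kind can be held in code as a directed graph. The sketch below is a minimal illustration: the variable names and arrows are hypothetical stand-ins, not the actual contents of Table 1 or Figure 2. It lists every directed path from sex to income, which makes explicit the routes through which bias can reach the prediction.

```python
# Hypothetical causal diagram: each key lists the variables it directly affects.
causal_diagram = {
    "sex": ["occupation", "income"],
    "nationality": ["education", "income"],
    "education": ["occupation", "income"],
    "occupation": ["income"],
    "experience": ["income"],
    "income": [],
}

def directed_paths(graph, source, target, path=None):
    """Return every directed path from source to target in an acyclic graph."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    paths = []
    for child in graph[source]:
        paths.extend(directed_paths(graph, child, target, path))
    return paths

paths_from_sex = directed_paths(causal_diagram, "sex", "income")
for p in paths_from_sex:
    print(" -> ".join(p))
# sex -> occupation -> income
# sex -> income
```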

Once you have a causal diagram which you believe accurately represents how the data are related to one another, it can be manipulated to control for a range of different factors, including the elimination of bias.

One manipulation could be an intervention on the sex of the German female engineer to see how that would impact their income. Alternatively, by controlling for sex, you can remove the influence of sex from the causal diagram. The result is an unbiased estimate of the effect of the other factors on income.

*Figure 3: A manipulated causal diagram, having controlled for sex, removing its influence on the other data points.*
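
Both manipulations can be sketched on synthetic data. The data-generating model below is an assumption made for illustration (a pay penalty and an experience gap that both depend on sex); it is not the data in Table 1. The intervention shown holds experience fixed, so it measures only the direct effect of sex on income in this toy model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Assumed model: sex affects both experience (historically) and pay.
female = rng.integers(0, 2, size=n)  # 1 = female, 0 = male
experience = rng.normal(15, 5, size=n) - 4 * female
noise = rng.normal(0, 3_000, size=n)

def income_model(female, experience, noise):
    """Toy structural equation for income (coefficients are invented)."""
    return 30_000 + 1_000 * experience - 8_000 * female + noise

income = income_model(female, experience, noise)

def ols(y, *columns):
    """Ordinary least squares; returns the slopes (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

# Ignoring sex mixes the pay penalty into the experience estimate.
naive_effect = ols(income, experience)[0]

# Controlling for sex recovers roughly the assumed 1,000 per year.
adjusted_effect = ols(income, experience, female)[0]

# Intervention: set sex to male for the women, holding experience and
# the noise term fixed, then compare with their observed income.
women = female == 1
income_if_male = income_model(0, experience[women], noise[women])
mean_gain = (income_if_male - income[women]).mean()

print(f"naive effect of experience:    {naive_effect:,.0f}")
print(f"adjusted effect of experience: {adjusted_effect:,.0f}")
print(f"mean income change under do(sex = male): {mean_gain:,.0f}")
```

The naive estimate overstates the return on experience because it absorbs part of the pay gap, while the adjusted estimate does not.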

Brittle

Modern AI systems are delicate, requiring careful fine-tuning to ensure they are configured correctly. Despite being trained on vast amounts of data, they can still fail in ways which are surprising or trivial from a human perspective. Figure 4 shows how an image processing algorithm fails to recognise a cow when it is on the beach, as opposed to in a field. This is despite the image classifier having been shown thousands of images of cows during its training.

*Figure 4: When applying a modern object detector to images of cows, it struggles to identify them when the background is not green, demonstrating how these algorithms can fail.* *Source*.

For the types of modern AI systems referred to in this blog, the ability to reliably predict on unseen and unfamiliar data is commonly referred to as generalisation. Causal machine learning approaches generalisation differently, because both the observed data and the corresponding causal diagram are considered (see Figure 5 below).

*Figure 5: Visualising how multiple causal diagrams can describe the same data. Mathematically, all of these causal diagrams are valid descriptions of the data; intuitively, however, only the diagram on the far left is correct.*

Therefore, causal models attempt to generalise from behaviour under one set of conditions, to behaviour under another set. Causal models should be selected based upon criteria which test their stability to changing conditions, e.g. when interventions are performed. Scientists follow this mantra when performing controlled trials to identify causal relationships.
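
The idea of selecting a model by its stability under changing conditions can be illustrated with a simplified invariance check. This is a toy sketch under stated assumptions, not a specific published algorithm and not the method behind Figure 4. The features `shape` and `background`, the two environments, and all coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def make_environment(n, background_strength):
    """Toy data: 'shape' causes the label score, while 'background' is a
    by-product of the score whose strength depends on the environment."""
    shape = rng.normal(0, 1, size=n)
    score = 2.0 * shape + rng.normal(0, 1, size=n)
    background = background_strength * score + rng.normal(0, 1, size=n)
    return shape, background, score

def slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Two environments that differ only in how background relates to the label.
fields = make_environment(20_000, background_strength=1.0)
beaches = make_environment(20_000, background_strength=-0.5)

shape_slopes = [slope(env[0], env[2]) for env in (fields, beaches)]
background_slopes = [slope(env[1], env[2]) for env in (fields, beaches)]

# A stable (small) gap suggests a relationship that survives the change.
shape_gap = abs(shape_slopes[0] - shape_slopes[1])
background_gap = abs(background_slopes[0] - background_slopes[1])

print(f"shape slope gap across environments:      {shape_gap:.2f}")
print(f"background slope gap across environments: {background_gap:.2f}")
```

In this sketch the relationship between shape and the label is nearly identical in both environments, while the relationship with background changes sign. A selection criterion based on stability would therefore prefer the model built on shape.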

The result is that causal models are more robust to changing conditions in the real world and can adapt more rapidly to dramatic shifts in the data. These advantages have led AI researchers to begin embedding these notions of generalisation, taken from causal AI, into the systems they are building.

Conclusion

This was a brief introduction to causal AI, discussing some of the advantages it brings and how these can help to overcome the blind, biased, and brittle nature of modern AI algorithms.

Here are the three key takeaways:

Causal Diagrams: By specifying the relationships between the observed data you can get a better understanding of the problem domain, reduce bias, and perform manipulations on the diagram to simulate a broad range of situations.
Manipulation: Scenarios can be modelled and what-if questions can be answered by manipulating causal models. These manipulations allow for deeper exploration of the problem, and provide the ability to answer questions about hypothetical situations.
Generalisation: Causal AI generalises better to unseen data as it is built to adapt to changing environments, and not solely changing data.

Notes

*Clearly, more goes into determining someone’s income level. These factors would be built out in a more complete model (see the section “Biased”), but to keep things simple initially we’ll only consider years of experience as impacting income, with the remaining factors kept as hidden variables.

A 6 Minute Introduction to Causal AI was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

from Towards Data Science - Medium https://ift.tt/0RUakPA
via RiYo Analytics
