Automation using Causal Inference Modeling System (CIMS) will exponentially accelerate science and the development of longevity technology.


For centuries, the 'database' of human knowledge has been stored in books, journals and in the brains of living scientists.

This database-driven process of doing science has been time-consuming and costly. Roughly, the steps are:

  1. Query the database, find knowledge gaps. (Preferably important ones, or those funding agencies find interesting).
  2. Design an experiment or study to fill the knowledge gap.
  3. Apply for funding.
  4. If accepted, gather resources and execute the experiment or study.
  5. Record the results in a paper.
  6. Have the paper peer-reviewed.
  7. Publish it back to the "database" for others to query.

Causal inference models, championed by Turing Award winner Judea Pearl, can formally represent scientific knowledge and data. More detail on this is provided in the FAQ that follows.

How can this process be improved by digitizing the most current and useful knowledge from the "database" into a Causal Inference Model?

Consider each step:

  1. Finding important gaps in our knowledge: FULLY AUTOMATABLE
    To find knowledge gaps, query the model with an important goal such as "How can we cure diabetes?" The result will be a specification of why the inference engine could not continue, i.e., the missing parts of the model that, if filled in, would allow the question to be answered.
    TIME SAVED: From weeks or months down to minutes.
  2. Design an experiment or study to fill the knowledge gap: MOSTLY AUTOMATABLE
    As models become more complete, designing experiments will be easier to automate. But having an interactive model will make the process much easier.
    TIME SAVED: From weeks or months down to hours.
  3. Apply for funding: BEING STREAMLINED
    Organizations such as Research Hub are tackling this issue. If funding is pre-allocated to specific knowledge gaps, the time savings could be great.
    TIME SAVED: Varies.
  4. Execute the experiment or study: PARTLY AUTOMATABLE
    If the necessary causal models already exist, many experiments can be done quickly in silico. Meta-studies may be completely automated.
    TIME SAVED: Months to years may be saved.
  5. Record the results: NOT AUTOMATED BUT EASIER
    Since a template for the experiment already exists, recording the results in the model will be much simpler.
    TIME SAVED: Varies.
  6. Peer-review: FULLY AUTOMATABLE
    The results entered will be automatically "peer-reviewed".
    TIME SAVED: From weeks or months down to minutes.
  7. Publish: JUST CLICK
    TIME SAVED: From weeks or months down to seconds.
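The gap-finding in step 1 above can be sketched with a toy causal graph. Everything here is a hypothetical illustration, not an existing CIMS API: the variable names, the model encoding, and the `find_gaps` helper are all invented for this sketch. The idea is that a query walks backward from a goal variable and reports every cause whose mechanism has not yet been specified.

```python
# Toy sketch of automated knowledge-gap detection (hypothetical, not a real CIMS API).
# The model maps each variable to (list of direct causes, mechanism or None).
model = {
    "blood_glucose":      (["insulin_response", "diet"], "known"),
    "insulin_response":   (["beta_cell_function"], None),  # mechanism unknown -> gap
    "beta_cell_function": ([], None),                      # unexplained root -> gap
    "diet":               ([], "known"),
}

def find_gaps(model, goal):
    """Walk backward from `goal`; collect variables whose mechanism is missing."""
    gaps, seen, stack = [], set(), [goal]
    while stack:
        var = stack.pop()
        if var in seen:
            continue
        seen.add(var)
        causes, mechanism = model[var]
        if mechanism is None:
            gaps.append(var)
        stack.extend(causes)
    return gaps

print(find_gaps(model, "blood_glucose"))
# ['insulin_response', 'beta_cell_function']
```

A real system would return far richer output (which interventions or measurements would fill each gap), but the core operation is this kind of traversal over an explicit causal structure, which is why it can run in minutes rather than weeks.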

This automation of most of the scientific process will result in a substantial acceleration of progress. However, there is a more significant speed-up:

Many of the parts of the process we are automating, such as designing experiments and peer review, are the parts that required scientists to hold PhDs. Automating them reduces bias and allows thousands more people to chip away at the knowledge gaps. With an integrated, global-scale, decentralized project manager, anyone who wants to help cure a disease or work on other projects can do so.


This FAQ tree will let you drill down from highly conceptual answers to the technical details as needed.

Q: What are Causal Inference Models? Can they really model complex science and data?

Causal Inference Models: Level 1 answer:

One cannot talk about causal models without talking about Judea Pearl.

Judea Pearl won the Turing Award for developing a calculus for probabilistic and causal reasoning, including his invention of Bayesian networks. Since then, he has moved beyond Bayesian networks in favor of causal inference models.

He also won the Frontiers of Knowledge Award for "laying the foundations of modern artificial intelligence, so computer systems can process uncertainty and relate causes to effects".

You can learn how causal models work and how they can model reality by reading his book "The Book of Why" (2018) or by watching this Microsoft Plenary video.

Q: How do Causal Inference Models compare with Statistical Models and Symbolic Models?

Causal Inference Models: Level 2 answer:


|  | Causal Inference Models | Statistical Models | Symbolic Models |
|---|---|---|---|
| Ontology | A graph of causal relations | Probability expressions | Structured sentences |
| Handling of time | Handles time well, even at various scales | Models two points: before and after. Time must be interpreted by a human and is not easily automatable | Terrible at representing time |
| Application to individual cases and the real world | Handles individual cases well and easily refers to the real world | Can't handle individual cases at all; human interpretation needed to apply to the real world | Can't handle statistics well; human interpretation needed to apply to the real world |


Causal Inference Models: Level 3 answer:

The knowledge store we are using in this project uses a language for making models called Proteus.

What does Proteus look like?

"JSON + Time + sparse lists + references/expressions + abstractions"

Given the "mathiness" of the papers on causal models, it might seem that Causal Inference Models will be hard to understand. Actually, they can be quite intuitive. Causal Inference Models, as we are using them, are created and stored as text files or streams. Let us quickly develop some intuitions about them!

Start by considering how JSON can be used to store the state-at-an-instant of anything. Even quantum systems.

Here is a JSON example from the web:

    {
        "participant": {
            "name": "rose",
            "age": "17",
            "status": "enrolled"
        }
    }
Here is the same thing in Proteus:
    participant: { 
        name: rose, 
        age:  17, 
        status: enrolled
    }

The obvious difference is that the quotes are gone. Proteus does have strings, but here each item can come from an abstraction, or "class", whose behaviour or meaning is defined elsewhere.

Notice that these models record something about "rose" at an instant in time. They do not capture that Rose may later change her name, grow older, or become 'unenrolled'. Nor do they represent her past.

To represent how a system changes over time, we use lists that record how its state has changed. That raises another problem: such a list would be far too big if it had to record every state an object has ever had. So we use sparse lists, recording only the changes. We also populate the lists with expressions and references instead of literal values, making them Turing complete. In the time lists, expressions refer to previous values so that state updates correctly through time. Representing how a system would change over time in different situations is where Causal Inference Models really shine.
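The idea of sparse, expression-valued time lists can be sketched in Python. This is an illustration of the concept only, not actual Proteus syntax or an actual Proteus interpreter: states are recorded only at times when something changes, and a recorded value may be an expression over the previous value rather than a literal.

```python
# Sketch of a sparse time list (illustrative only, not actual Proteus).
# Keys are the times at which the state changed; values are either literals
# or expressions (here, functions of the previous value).
timeline = {
    0:  {"name": "rose", "age": 17, "status": "enrolled"},
    12: {"age": lambda prev: prev + 1},   # birthday: derived from the prior value
    30: {"status": "graduated"},          # sparse: only the changed field appears
}

def state_at(timeline, t):
    """Replay changes up to time t, resolving expressions against the prior state."""
    state = {}
    for time in sorted(timeline):
        if time > t:
            break
        for key, value in timeline[time].items():
            state[key] = value(state[key]) if callable(value) else value
    return state

print(state_at(timeline, 20))
# {'name': 'rose', 'age': 18, 'status': 'enrolled'}
```

Note how the entries at times 12 and 30 record only what changed, and the age entry is an expression over the previous value rather than a literal; that is the "sparse lists + references/expressions" part of the Proteus summary above.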

"Proteus = JSON + Time + sparse lists + references/expressions + abstractions" captures the basic intuition about how Proteus-based causal models look.


You can make this faster!

Coming soon