Foundations of CIT - Cognitive Information Theory

Foundations of CITRI

Artificial meets Cognitive Intelligence

Complexity

Definition

Information that can be accurately encoded and whose internal relationships can be determined. Rich, structured, extractable.

AI excels here. When the world is fully structured and compressible, AI outperforms humans. Chess is the canonical example — every legal move is known.

Human

Humans are weaker at raw complexity processing. We make arithmetic errors. We are not the right tool for brute-force pattern extraction.

The ‘good case’ for modern AI. Chess, controlled vision, language prediction.

Entropy

Definition

Information that cannot be accurately encoded or predicted. Small variations can radically alter meaning, as in chaos theory.

AI is largely blind to entropy. It continues producing confident answers even when its model no longer fits reality — ‘confidently wrong.’

Human

Humans have evolved biological mechanisms for entropic uncertainty. We slow down, seek new information, defer decisions when conditions are unpredictable.

The Tesla/lorry collision. The green fence. Distribution shift in real-world deployments.

Ambiguity

Definition

Information that, once encoded, can be interpreted in more than one way, either because the data is insufficient or because too much detail overlaps.

AI is structurally incapable of supporting ambiguous reasoning: it “collapses” to a single choice, often with a high degree of confidence, which leaves AI systems confidently wrong. Because that confidence is high, confidence thresholds cannot be used to detect ambiguity.

Human

Humans can hold several candidate interpretations at once, and often use a POSIT (assume something is correct) and REMIT (reverse it if proven wrong) logic structure to handle ambiguity.

When asked to recognise handwritten digits, AI will confidently select 1 for an image that even humans would struggle to classify as a 1 or a 7.
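The POSIT and REMIT structure described for human reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `AmbiguousReading`, its fields, and the fallback order are hypothetical and are not taken from any CIT publication. The point it shows is that the alternatives stay available after a choice is made, rather than being discarded.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class AmbiguousReading:
    """Hypothetical holder for several interpretations of one input."""
    candidates: List[str]
    posited: Optional[str] = None
    ruled_out: Set[str] = field(default_factory=set)

    def posit(self, choice: str) -> str:
        """POSIT: provisionally assume one open candidate is correct."""
        if choice not in self.candidates or choice in self.ruled_out:
            raise ValueError(f"{choice!r} is not an open candidate")
        self.posited = choice
        return choice

    def remit(self) -> Optional[str]:
        """REMIT: the posited choice was proven wrong, so reverse it
        and fall back to the next candidate that is still open."""
        if self.posited is not None:
            self.ruled_out.add(self.posited)
        remaining = [c for c in self.candidates if c not in self.ruled_out]
        self.posited = remaining[0] if remaining else None
        return self.posited


# The handwritten digit that could be a 1 or a 7.
reading = AmbiguousReading(candidates=["1", "7"])
first = reading.posit("1")   # assume the digit is a 1
second = reading.remit()     # later context contradicts it: fall back to "7"
```

A classifier that returns only its top label has no equivalent of `remit`: once it has collapsed to "1", the alternative is gone.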

The Journey to Cognitive Information Theory

01 — Introduction

The Cognitive Information Theory Research Institution (CITRI) was born from a journey that began not in the world of artificial intelligence research, but in the rigorous and unforgiving domain of aviation and avionics computing. This document traces the path that led Ken Wenger and Damian Fozard from the development of safety-critical avionics software to the creation of a new theoretical framework for understanding cognition itself.

02 — Origins in Avionics

Where the problem was first encountered

We were working in aviation and avionics computing when we first encountered AI, and we approached it from the perspective of how to execute AI software reliably in avionics systems. Avionics is governed by strict standards, including DO-178C, which prescribe the properties of avionics software and set standards for how it must be specified, designed, implemented, and tested to ensure it is safe for use.

There is a mindset that develops working in this space. As software developers, you come to focus on developing software that is deterministic — you can predict exactly what the software will do at all times — and bounded in time and space, meaning it will respond on time and will not exhaust system resources.

AI models are non-deterministic in the DO-178C sense, and this created a number of obstacles to the adoption of AI in avionics. While at CoreAVI, Ken and I led the development of SafeAI, a CoreAVI platform that runs AI models deterministically with bounded time and space. SafeAI addressed how AI is executed, not how AI reaches its decisions, and so sits outside the scope of CITRI’s research. The decision-making problem of why and when AI models go wrong remained open, and became the question that CITRI was eventually formed to answer.

In avionics, determinism is validated by considering the possible range of inputs and testing them against the range of outputs. With an AI model, the range of inputs is so vast and varied that you cannot define them easily, rendering traditional methods of validation ineffective. We knew that if AI was going to work in high-reliability systems, a different method of defining the range of operation and safety for AI models was needed.
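To give a sense of the scale involved, here is a minimal sketch contrasting a function whose whole input range can be tested with the input space of a single small image. The function `saturating_add_u8` and the 28×28, 8-bit greyscale image size are assumptions chosen for illustration; neither comes from DO-178C or from any avionics system.

```python
from itertools import product


def saturating_add_u8(a: int, b: int) -> int:
    """Hypothetical bounded function: add two 8-bit values, clamp at 255."""
    return min(a + b, 255)


# Exhaustive validation: every input pair is checked against the output range.
cases = 0
for a, b in product(range(256), repeat=2):
    assert 0 <= saturating_add_u8(a, b) <= 255
    cases += 1

# Input space of one small greyscale image (assumed 28x28 pixels, 8 bits each).
image_inputs = 256 ** (28 * 28)
image_digits = len(str(image_inputs))

print(f"bounded function: {cases} cases, all tested")
print(f"one small image:  a count with {image_digits} decimal digits")
```

The bounded function has 65,536 input pairs and can be checked completely. The image has more distinct inputs than can ever be enumerated, which is why range-based validation stops being practical.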

Standard reference

DO-178C is the primary software development standard for airborne systems. It defines five software levels (A–E) based on the severity of failure conditions and prescribes verification rigour accordingly. Level A software — where failure could cause a catastrophic accident — demands the highest degree of determinism and traceability.

Three Pillars of CIT

Complexity: Statistical learning over large, well-defined datasets. Where AI excels.

Entropy: Uncertainty and information decay. Where AI is structurally blind.

Ambiguity: Multiple valid interpretations. Requires epistemic reasoning.

Standards Referenced

DO-178C: Airborne software, safety levels A–E

TP/FP: ML precision definition, data-fit only

CIT-P: CIT precision, intent-fit, including OOD

03 — A Different Perspective

Precision as a metric: two definitions, two world views

We noticed that very few people were focusing on AI precision in terms of how consistently and accurately AI met the intent of its use-case. It is important to distinguish between what most Machine Learning data scientists mean by precision, and what precision means in Cognitive Information Theory — because the difference reveals fundamentally different world views.

Precision in ML is mathematically defined as TP / (TP + FP), where TP is the number of true positives and FP is the number of false positives. It measures what fraction of the model’s positive predictions are correct when evaluated against a test dataset. For CIT, this is only a subset of precision. We consider not simply the data fit, but how well the model fits the intent of the system. This means CIT is also concerned with what happens when the model is confronted with input it has not been trained on. We do not assume that AI models are complete; quite the opposite. We know that models can never be complete for anything other than the most bounded of problems.
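The ML definition is easy to compute, and a small sketch shows why a good score on a test dataset says little about unseen inputs. The helper name `precision` and all of the predictions and labels below are invented for illustration; they are not results from any real model.

```python
def precision(predictions, labels, positive=1):
    """ML precision: TP / (TP + FP) for the given positive class."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == positive and y == positive)
    fp = sum(1 for p, y in pairs if p == positive and y != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Illustrative test dataset, drawn from the training distribution.
test_preds = [1, 1, 1, 0, 0, 1, 0, 1]
test_labels = [1, 1, 1, 0, 0, 1, 0, 0]

# Illustrative inputs the model was never trained on.
unseen_preds = [1, 1, 1, 1]
unseen_labels = [0, 0, 1, 0]

on_test = precision(test_preds, test_labels)        # 4 TP, 1 FP
on_unseen = precision(unseen_preds, unseen_labels)  # 1 TP, 3 FP

print(f"precision on test dataset: {on_test:.2f}")
print(f"precision on unseen input: {on_unseen:.2f}")
```

Only the first figure is normally reported. The CIT view of precision is concerned with the second as well, because the deployed system will meet both kinds of input.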

ML / AI precision

Definition: TP / (TP + FP), how often the model avoids false positives on a test dataset.

Assumes the model’s training distribution is representative of deployment conditions.

Validated by test dataset. Better data → higher precision.

CIT precision

Definition: How well the model fits the intent of the system, including behaviour on unseen inputs.

Assumes the model will encounter inputs outside its training distribution; this is the norm, not the exception.

Validated by reasoning about information structure. Identifies where data alone cannot solve the problem.

In ML/AI, the prevailing belief is that better training and more data will eventually produce models good enough for most applications. For some applications this has proven true. In the realm of high-reliability software, however, AI models are still a long way from meeting the required standard. CIT allows us to determine where data modelling and training are likely to solve a problem with a high degree of precision, and where such assumptions will fail.

04 — Key Milestones

Over a decade of systematic enquiry

There have been a number of key moments on the way to creating Cognitive Information Theory: mostly insights, and sometimes just clues, about the nature of AI model “thinking.” For over ten years, we have systematically tried to understand why AI models make mistakes, and the complexity involved in moving from processing information (the foundation of the Information Age) to understanding information (the foundation of the Autonomous Age).

2020

Autonomous Vehicles Silicon Valley conference

The Tesla and green fence incidents reframe the question from “how to improve data” to “how to detect when a model is not working.”

2020–21

Error clustering theory developed and proven

Mathematical proof that errors cluster at prediction boundaries. Vision paper published.

March 2022

The entropy insight

At a conference in Germany, a pivotal discussion about AI consciousness leads to the overnight formulation of the entropy hypothesis: cognition is about handling the uncertainty of entropy, and AI models are blind to it.

2023+

Cognitive Information Theory formalised

Complexity, Entropy, and Ambiguity established as the three pillars. CITRI founded to extend the framework beyond AI into neuroscience, philosophy, and cognitive science.

05 — Lessons from Autonomous Vehicles

Two collisions and a different question

In 2020, we attended the Autonomous Vehicles Silicon Valley conference in San Francisco. A panel discussed recent experiences with autonomous vehicle failures that had surprised experts. The most well-known example involved a Tesla driving on the highway using Full Self-Drive that collided with an articulated lorry, killing the driver. The trailer was metallic and reflective, and blended with the sky behind it — rendering the lorry effectively invisible to the autonomous driving software.

A second example was shared: an autonomous test vehicle that collided with a green fence. The colour of the fence resembled shrubbery. The software perceived the shape of a fence but the colour of a shrub — and concluded it was neither. The green fence became invisible because it could not be recognised as a known object.

“While many people saw this as a challenge to make the data better, we took a different view: how to detect when an AI model was not working.”

Damian Fozard & Ken Wenger

These anecdotes, combined with our knowledge of the mathematical underpinnings of AI, led us to theorise that errors would likely cluster in AI models around the boundaries of predictions.
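The intuition behind that hypothesis can be illustrated with a toy simulation. This is not the method from the Vision paper. It is a minimal sketch in which a one-dimensional classifier with a fixed threshold sees a noisy signal, and every constant in it (`BOUNDARY`, `NEAR`, `NOISE_SD`, the sample count) is an assumption chosen for illustration.

```python
import random

random.seed(0)

BOUNDARY = 0.0   # assumed decision boundary of the toy classifier
NEAR = 0.2       # assumed half-width of the "near the boundary" band
NOISE_SD = 0.1   # assumed noise between the observed input and the truth

near_errors = 0
far_errors = 0
for _ in range(10_000):
    x = random.uniform(-1.0, 1.0)
    # The true class depends on a noisy version of what the model observes.
    true_label = (x + random.gauss(0.0, NOISE_SD)) > BOUNDARY
    predicted = x > BOUNDARY
    if predicted != true_label:
        if abs(x - BOUNDARY) < NEAR:
            near_errors += 1
        else:
            far_errors += 1

print(f"errors near the boundary: {near_errors}")
print(f"errors elsewhere:         {far_errors}")
```

The band near the boundary covers only a fifth of the input range, yet in this setup it contains the large majority of the errors. That concentration is what makes the region around a prediction boundary a natural place to look for failures.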

06 — Error Clustering and Its Implications

What error clusters revealed about real-world information

We were able to prove that errors cluster at prediction boundaries. The results of using clustering as an error detection technique are published in our Vision paper. Two immediate questions followed: first, were the clusters suitable mechanisms for reducing errors in AI models? Second, what caused the error clusters, and could we learn anything about real-world information as a result?

The answer to both questions is yes. The clustering technique has since been applied in our commercial work in vision AI, where products built on these insights perform measurably better in production. CITRI’s interest, however, is in the second question: what error clusters revealed about the structure of real-world information — and how those findings exposed the limitations of current information theories when applied to AI.

Two questions from the clustering work

1. Were the clusters suitable mechanisms for reducing errors in AI models? Yes. The technique has since been applied in our commercial work in vision AI, where it produces measurable improvements in production.

2. What caused the error clusters? The answer pointed to structural properties of real-world information that current information theories did not account for.

07 — The Entropy Insight

A sleepless night in Germany

In March 2022, as we were moving from research to product development, we were attending a conference in Germany. Ken posed a question regarding whether AI could ever be conscious and what that would mean. At the time, there was considerable discussion about generative AI models showing signs of self-awareness. This pivotal discussion moved our thinking from AI specifically to cognition — what does it mean to understand something? We knew that AI did not understand what it was processing — not in the sense that a human understands — but what was the key difference?

I did not sleep that night, and by morning, I was convinced I knew the answer: Entropy.

My background in cryptography meant that I had spent years developing and evaluating pseudo-random and random data sources for cryptographic systems. I had an appreciation for the nuance of entropic mathematics. That morning I formulated a simple hypothesis: cognition is about handling the uncertainty of entropy, and AI models are blind to entropy.

It was such a simple concept that at first it seemed to offer no real insights. But as time went on, our commercial work in vision AI gave us a real-world setting in which to observe the cognitive impact of entropy on decision-making, and made a second source of uncertainty visible: ambiguity. With that, we had the foundations of Cognitive Information Theory.

Pillar 01

Complexity

The domain of statistical learning. Where current AI is powerful. Pattern recognition over large, well-defined datasets.

Pillar 02

Entropy

The domain of uncertainty and decay. Where AI is blind. Information that may not be fully observable or may not persist.

Pillar 03

Ambiguity

Where multiple valid interpretations exist. Neither purely complex nor purely entropic. Requires epistemic reasoning — not pattern matching.

It turned out that this simple view of the physics of information was incredibly powerful in understanding the process of cognition. It helped us understand how human thinking, self-awareness, and consciousness evolved — as well as identifying the limiting factors in both generative AI and classification models.

08 — Why CITRI

A foundation that reaches beyond AI

We created the Cognitive Information Theory Research Institution because we realised that CIT reaches far beyond AI. It provides new perspectives for evolution, neuroscience, philosophy, and even psychology. It offers deep insight into the processes of cognition. Understanding is fundamental to our world view — and having a strong mathematical basis on which to anchor cognitive research is vital.

We offer CIT as part of that foundation. It builds on the work of others in the fields of information theory, decision theory, and AI, and presents a fundamental “physics” of cognitive information on which we can all build.

Continue exploring

The theory is open. The research is published.

Every claim in Cognitive Information Theory is grounded in observable evidence and published for scrutiny. Browse the full paper archive or review the source literature that underpins the framework.