Initiative 02

AI & Human Cognition

Does the output carry the same cognitive weight?

Benchmarks measure whether a model's output is correct. They rarely measure whether a person experiences that output the way they would experience the original. This initiative works at the intersection of AI and neuroscience, using brain-predictive models to estimate how machine output actually lands in a human mind.

Why it matters

Two translations can both be "accurate" and yet activate the brain in very different ways. A subtitle that is technically correct but cognitively heavier changes the viewer's experience. By grounding evaluation in predicted neural activation rather than surface similarity, we can measure meaning in a way that is closer to how people actually process it.

How we approach it

  • Map audio onto roughly 20,000 cortical vertices using Meta FAIR's TRIBE v2, trained on fMRI from 720 people.
  • Score divergence with the Pearson correlation between the predicted activation for the original and for the translation.
  • Ship it as tooling developers can run in a CI pipeline, not just a research paper.
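
The scoring step above can be sketched in a few lines. This is a minimal illustration, not Langdrift's actual implementation: it assumes each audio file has already been reduced to one predicted activation value per cortical vertex, and the names pearson and drift_score, along with the choice to report drift as 1 - r, are assumptions made for the example.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length activation vectors."""
    if len(x) != len(y):
        raise ValueError("activation vectors must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two vertices")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def drift_score(original: Sequence[float], translated: Sequence[float]) -> float:
    """Hypothetical drift score: 1 - r (assumed mapping, not from the source).

    0 means the predicted activation patterns match exactly,
    larger values mean the translation diverges more.
    """
    return 1.0 - pearson(original, translated)


if __name__ == "__main__":
    # Toy vectors standing in for roughly 20,000 per-vertex predictions.
    original = [0.12, 0.40, 0.33, 0.05, 0.27]
    translated = [0.10, 0.38, 0.41, 0.02, 0.20]
    print(f"drift: {drift_score(original, translated):.3f}")
```

Because Pearson correlation ignores overall scale and offset, a score like this responds to changes in the spatial pattern of activation rather than its overall magnitude. A CI check could then fail a build when the score crosses a threshold the team has chosen.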

Project in this initiative

Langdrift

Active

Detects cognitive drift across translations using brain-predicted neural activation. Run langdrift run original.mp3 translation.mp3 and get a drift score back.

TypeScript · Python · Modal

The throughline

Langdrift is the first probe into a bigger question: can we evaluate any AI output by how it resonates in the brain, not just whether it matches a reference string?
