
Turing Test

Last reviewed: April 2026

A test proposed by Alan Turing in 1950 in which a human evaluator tries to distinguish between a machine and a human based on conversation alone.

The Turing test is a benchmark proposed by mathematician Alan Turing in 1950 to evaluate whether a machine can exhibit intelligent behaviour indistinguishable from a human. In its simplest form, a human evaluator has text conversations with both a human and a machine without knowing which is which, then tries to tell them apart.

The original proposal

Turing's 1950 paper "Computing Machinery and Intelligence" asked: "Can machines think?" Rather than debating the philosophical definition of thinking, Turing proposed a practical test. If a machine could consistently fool human evaluators into thinking they were talking to another human, it could be considered to exhibit intelligence equivalent to a human in that context.

How the test works

  1. A human evaluator communicates via text with two unseen participants: one human and one machine.
  2. The evaluator asks whatever questions they want.
  3. Both participants try to convince the evaluator they are human.
  4. After the conversation, the evaluator guesses which is the machine.
  5. If the machine fools the evaluator often enough, it "passes" the test. (Turing himself suggested that an interrogator misidentifying the machine more than 30% of the time after five minutes of questioning would be a meaningful threshold.)
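The protocol above can be sketched as a small simulation. This is an illustrative harness, not any standard implementation: `evaluator`, `human_reply`, and `machine_reply` are hypothetical callables standing in for the three participants.

```python
import random

def run_trial(evaluator, human_reply, machine_reply, questions):
    """One round of the imitation game: the evaluator questions two
    unseen participants and then guesses which one is the machine."""
    # Randomly assign the machine to slot "A" or "B" so the evaluator
    # cannot rely on position.
    machine_slot = random.choice(["A", "B"])
    transcript = {"A": [], "B": []}
    for q in questions:
        for slot in ("A", "B"):
            reply = machine_reply(q) if slot == machine_slot else human_reply(q)
            transcript[slot].append((q, reply))
    guess = evaluator(transcript)  # evaluator returns "A" or "B"
    return guess != machine_slot   # True if the machine fooled the evaluator

def pass_rate(n_trials, **kwargs):
    """Fraction of trials in which the machine was mistaken for the human."""
    return sum(run_trial(**kwargs) for _ in range(n_trials)) / n_trials
```

Running many trials and reporting `pass_rate` mirrors how "fools the evaluator a significant percentage of the time" would be measured in practice.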

Has any AI passed the Turing test?

This depends on how strictly you define "passing." Modern LLMs can maintain convincing human-like conversation on many topics and have fooled evaluators in casual interactions. However:

  • In controlled academic settings with sophisticated evaluators, current AI still reveals itself through various tells.
  • Short, casual conversations are easier to pass than extended, probing ones.
  • The test's conditions (how long the conversation lasts, how skilled the evaluator is) dramatically affect results.

Criticisms of the Turing test

  • It tests imitation, not intelligence: A machine might mimic human conversation without any genuine understanding.
  • It is subjective: Results depend heavily on the evaluator's skill and expectations.
  • Narrow scope: It only tests conversational ability, not the full range of intelligent behaviour.
  • Moving goalposts: As AI systems come closer to passing, critics raise the bar, arguing that passing the test is not sufficient evidence of intelligence.

The Turing test's legacy

Despite its limitations, the Turing test fundamentally shaped how we think about AI. It shifted the question from the unmeasurable ("can machines think?") to the observable ("can machines behave indistinguishably from humans?"). Modern AI evaluation has evolved beyond the Turing test, but its core insight remains foundational: evaluate behaviour, not philosophical claims.

Modern alternatives

Today, AI capabilities are evaluated through specialised benchmarks: MMLU for knowledge, GSM8K for maths, HumanEval for coding, and many others. These provide more precise, reproducible measurements than the subjective Turing test.
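Benchmarks like these typically report a single reproducible number: the fraction of items a model answers correctly. A minimal sketch of that scoring loop, where the item format and the toy `model` callable are illustrative rather than any benchmark's actual API:

```python
def benchmark_accuracy(model, items):
    """Score a model on a benchmark as the fraction of items answered correctly.

    `items` is a list of (prompt, expected_answer) pairs and `model` is any
    callable mapping a prompt string to an answer string; both are
    illustrative stand-ins, not a real benchmark's schema.
    """
    correct = sum(model(prompt).strip() == expected for prompt, expected in items)
    return correct / len(items)

# Toy GSM8K-style arithmetic items and a toy "model" that evaluates them
# exactly (eval is fine here because the prompts are our own fixed strings).
toy_items = [("2 + 2 = ?", "4"), ("10 / 2 = ?", "5")]
toy_model = lambda prompt: str(int(eval(prompt.rstrip(" =?"))))
```

Unlike a Turing-test verdict, the same model and item set always produce the same score, which is what makes such benchmarks reproducible.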


Why This Matters

The Turing test remains culturally significant as the most famous benchmark in AI history. Understanding it helps you contextualise the ongoing debate about AI intelligence and appreciate why modern evaluation focuses on specific capabilities rather than broad claims of human-equivalence.
