
Foundation Model

Last reviewed: April 2026

A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks — GPT, Claude, and Llama are all foundation models.

A foundation model is a large AI model trained on broad, diverse data at massive scale that serves as the base for many different applications. The term was coined by Stanford researchers in 2021 to describe models like GPT, Claude, Llama, and Gemini that are not built for one task but can be adapted to thousands.

What makes a model "foundational"

  • Scale — trained on enormous datasets (trillions of tokens of text, billions of images)
  • Breadth — not specialised for any single task but capable across many
  • Adaptability — can be customised for specific applications through prompting, fine-tuning, or other techniques
  • Emergent capabilities — displays abilities that were not explicitly trained for, such as reasoning, translation, or code generation

The paradigm shift

Before foundation models, the AI workflow was: collect task-specific data → train a task-specific model → deploy for that one task. This meant building a separate model for each application.

Foundation models invert this: train one massive general model → adapt it to many tasks with minimal effort. This is why a single model like Claude can write marketing copy, debug code, analyse data, and answer questions — all without separate training for each.
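The inverted workflow can be sketched in a few lines of Python. This is purely illustrative: `call_model` is a stand-in for whatever inference API you actually use, and the model name and prompt templates are made up for the example.

```python
# Illustrative sketch of the foundation-model workflow: one general
# model, adapted to different tasks purely through prompting.
# `call_model` is a placeholder, not a real library call.

BASE_MODEL = "general-foundation-model"  # hypothetical model name

# Task-specific behaviour comes from instructions, not separate models.
TASK_PROMPTS = {
    "summarise": "Summarise the following text in one sentence:\n",
    "translate": "Translate the following text into French:\n",
    "debug": "Find and explain the bug in this code:\n",
}

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real model call; just echoes its input here."""
    return f"[{model}] response to: {prompt[:40]}..."

def run_task(task: str, user_input: str) -> str:
    # The same base model handles every task; only the prompt changes.
    prompt = TASK_PROMPTS[task] + user_input
    return call_model(BASE_MODEL, prompt)
```

Under the old paradigm, each key in `TASK_PROMPTS` would instead have mapped to a separately trained model; here they all route to the same base model.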

Key foundation models

  • GPT-4, GPT-4o (OpenAI) — text and multimodal generation
  • Claude (Anthropic) — text generation with emphasis on safety and helpfulness
  • Gemini (Google) — multimodal understanding and generation
  • Llama (Meta) — open-weight text generation
  • Stable Diffusion — image generation
  • Whisper (OpenAI) — speech recognition

Risks and considerations

  • Centralisation — a few organisations control the most capable models, creating dependency
  • Homogenisation — when everyone uses the same foundation models, their biases and limitations propagate everywhere
  • Cost — training foundation models costs millions of dollars, limiting who can create them
  • Opacity — the training data and processes of commercial foundation models are often not fully disclosed

Why This Matters

Foundation models are the platform layer of modern AI — nearly every AI application you use is built on one. Understanding this helps you evaluate the rapidly evolving landscape of AI products and make strategic decisions about which models and providers to build on for your organisation.
