Foundation Model
A large-scale model trained on broad data that can be adapted to many downstream tasks, such as GPT-4, Claude, or Gemini.
Foundation models are large-scale AI systems trained on broad, diverse data that serve as a base for many downstream applications. The term, coined in 2021 by Stanford's Center for Research on Foundation Models (CRFM), emphasizes that these models are general-purpose starting points, not task-specific solutions.
Examples include GPT-4, Claude, Gemini, LLaMA, and Stable Diffusion. They're trained once at massive expense, then fine-tuned, prompted, or deployed for countless specific tasks.
The foundation model paradigm has reshaped AI economics. A handful of labs train these models; everyone else builds on top. It's the modern version of the "pretrain-then-fine-tune" approach taken to extreme scale.
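The pretrain-then-adapt pattern described above can be sketched in miniature. The toy code below is purely illustrative, not a real foundation model: a frozen random projection stands in for a pretrained feature extractor, and "fine-tuning" trains only a small task-specific head on top of it, which is one common way such models are adapted cheaply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the "foundation" stage: a frozen feature extractor whose
# weights would, in practice, come from large-scale pretraining.
W_base = rng.normal(size=(4, 16))

def features(x):
    # Frozen base model: a fixed nonlinear projection of the raw input.
    return np.tanh(x @ W_base)

# Downstream task: tiny synthetic regression data (hypothetical task).
X = rng.normal(size=(64, 4))
y = (X[:, 0] - X[:, 1]).reshape(-1, 1)

# "Fine-tuning": train only the small task head; base weights stay fixed.
head = np.zeros((16, 1))
lr = 0.1
H = features(X)  # features are computed once since the base is frozen
for _ in range(200):
    pred = H @ head
    grad = H.T @ (pred - y) / len(X)  # MSE gradient w.r.t. head weights
    head -= lr * grad

final_loss = float(np.mean((H @ head - y) ** 2))
baseline_loss = float(np.mean(y ** 2))  # loss of the untrained (zero) head
```

Freezing the base and training only a head mirrors the economics noted above: the expensive computation happens once, and each downstream task pays only for a lightweight adaptation step.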