Foundation Model
A large-scale model trained on broad data that can be adapted to many downstream tasks, such as GPT-4, Claude, or Gemini.
Foundation models are large-scale AI systems trained on broad, diverse data that serve as a base for many downstream applications. The term, coined in 2021 by Stanford's Center for Research on Foundation Models (CRFM), emphasizes that these models are general-purpose starting points, not task-specific solutions.
Examples include GPT-4, Claude, Gemini, LLaMA, and Stable Diffusion. They're trained once at massive expense, then fine-tuned, prompted, or deployed for countless specific tasks.
The foundation model paradigm has reshaped AI economics. A handful of labs train these models; everyone else builds on top. It's the modern version of the "pretrain-then-fine-tune" approach taken to extreme scale.
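The pretrain-then-adapt pattern described above can be sketched in miniature. The toy code below is purely illustrative, not a real foundation model: a frozen random projection stands in for a pretrained feature extractor, and "fine-tuning" trains only a small task-specific head on top of it, which is one common way such models are adapted cheaply.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the "foundation" stage: a frozen feature extractor whose
# weights would, in practice, come from large-scale pretraining.
W_base = rng.normal(size=(4, 16))

def features(x):
    # Frozen base model: a fixed nonlinear projection of the raw input.
    return np.tanh(x @ W_base)

# Downstream task: tiny synthetic regression data (hypothetical task).
X = rng.normal(size=(64, 4))
y = (X[:, 0] - X[:, 1]).reshape(-1, 1)

# "Fine-tuning": train only the small task head; base weights stay fixed.
head = np.zeros((16, 1))
lr = 0.1
H = features(X)  # features are computed once since the base is frozen
for _ in range(200):
    pred = H @ head
    grad = H.T @ (pred - y) / len(X)  # MSE gradient w.r.t. head weights
    head -= lr * grad

final_loss = float(np.mean((H @ head - y) ** 2))
baseline_loss = float(np.mean(y ** 2))  # loss of the untrained (zero) head
```

Freezing the base and training only a head mirrors the economics noted above: the expensive computation happens once, and each downstream task pays only for a lightweight adaptation step.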