
Large Language Model

A massive AI model trained on text data that can generate, summarize, translate, and reason about language.

A Large Language Model (LLM) is a transformer-based AI model trained on massive amounts of text — often hundreds of billions of words from the web, books, and code. LLMs learn statistical patterns in language, enabling them to predict what token comes next with stunning coherence and knowledge breadth.

Modern LLMs like GPT-4, Claude, and Gemini can write essays, answer questions, summarize documents, write code, translate languages, and reason through complex problems. Their capabilities emerge from scale: more data, more parameters, and more compute lead to qualitatively new abilities.

How they work: LLMs predict the next token in a sequence, one at a time. Despite this simple objective, training on enough text produces models that appear to reason, plan, and understand context.
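The next-token loop above can be sketched with a toy stand-in for a trained model. This is only an illustration of the generation procedure, not a real LLM: the hand-written probability table below substitutes for the network, and greedy decoding (always picking the most probable token) substitutes for sampling.

```python
# Toy illustration of next-token generation (not a real LLM).
# A model maps the tokens so far to a probability distribution over the
# next token; generation repeats this one step at a time until an
# end-of-sequence token appears. The table below is a made-up stand-in
# for a trained transformer.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.9, "ran": 0.1},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"<eos>": 1.0},
    "ran": {"<eos>": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS.get(tokens[-1], {"<eos>": 1.0})
        # Greedy decoding: always take the most probable next token.
        next_tok = max(probs, key=probs.get)
        if next_tok == "<eos>":
            break
        tokens.append(next_tok)
    return " ".join(tokens)

print(generate("the cat"))  # the cat sat
```

A real LLM conditions on the entire context (not just the last token) and produces the distribution with a transformer, but the outer loop is exactly this.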

Key LLM Concepts

  • Pre-training — trained on vast web text to predict next tokens
  • Fine-tuning / RLHF — aligned to follow instructions helpfully and safely
  • Context window — how much text the model can "see" at once
  • Temperature — controls randomness of outputs

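Temperature, the last concept above, can be made concrete with a small sketch. Logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution toward the top token and high temperatures flatten it. The logit values here are arbitrary examples.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample an index from softmax(logits / temperature).

    Lower temperature -> sharper distribution (more deterministic);
    higher temperature -> flatter distribution (more random).
    """
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling from the categorical distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
# At a very low temperature, sampling is effectively argmax.
print(sample_with_temperature(logits, temperature=0.01))  # 0
```

At `temperature=1.0` the sampler draws from the unmodified softmax, which is why higher settings produce more varied (and occasionally less coherent) output.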
LLMs are the foundation of most modern AI assistants and developer tools. They can be accessed via APIs, run locally with quantization, or fine-tuned on custom data with techniques like LoRA. The field is moving extremely fast — capabilities that seemed impossible in 2022 are routine in 2026.
