AI Glossary
100 essential AI & machine learning terms explained clearly, from neural networks to RAG, LLMs, and beyond.
Artificial Intelligence
Core Concepts: The simulation of human intelligence by machines, enabling computers to learn, reason, and make decisions.
AI Agent
Applications: An AI system that can perceive its environment, make decisions, use tools, and take autonomous actions to achieve a goal.
Attention Mechanism
Models & Architecture: A neural network technique that lets a model focus on the most relevant parts of an input when producing an output.
Activation Function
Models & Architecture: A mathematical function applied inside neural networks that introduces nonlinearity and lets models learn complex patterns.
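A minimal sketch of two common activation functions in plain NumPy (the function names are illustrative, not tied to any particular framework):

```python
import numpy as np

def relu(x):
    # ReLU keeps positive values and zeroes out negatives, supplying the
    # nonlinearity that lets stacked layers model more than one linear map.
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid squashes any real number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
print(sigmoid(np.array([0.0])))          # [0.5]
```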
AI Alignment
Safety & Alignment: The effort to make AI systems behave in ways that are helpful, safe, and consistent with human goals and values.
Autoencoder
Models & Architecture: A neural network that learns to compress data into a low-dimensional representation and reconstruct it back to the original.
Adversarial Attack
Safety & Alignment: An input crafted to fool a machine learning model into making a wrong prediction, often imperceptible to humans.
Backpropagation
Training & Learning: The algorithm used to compute how much each model parameter contributed to the error so the network can update itself during training.
Batch Size
Training & Learning: The number of training examples processed together in one forward/backward pass of the model.
BERT
Models & Architecture: Bidirectional Encoder Representations from Transformers, a Google-built language model that reads text in both directions.
Batch Normalization
Training & Learning: A layer that normalizes activations across a batch to stabilize and speed up training of deep neural networks.
Context Window
Inference & Generation: The maximum amount of text (measured in tokens) that an AI model can process in a single interaction.
CLIP
Models & Architecture: A vision-language model that learns shared representations of images and text so they can be compared in the same embedding space.
Computer Vision
Core Concepts: The field of AI focused on enabling machines to interpret and understand images and video.
Convolutional Neural Network
Models & Architecture: A neural network architecture specialized for processing grid-like data such as images using convolutional filters.
Chain of Thought
Techniques & Methods: A prompting technique that asks an LLM to reason step by step before giving a final answer, improving complex reasoning.
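A sketch of what a chain-of-thought prompt might look like; the exact wording is illustrative, not a prescribed template:

```python
prompt = (
    "Q: A cafe sold 23 coffees in the morning and 18 in the afternoon. "
    "Each coffee costs $4. How much revenue did the cafe make?\n"
    "Think step by step before giving the final answer."
)
# A model prompted this way will typically produce intermediate steps, e.g.:
#   23 + 18 = 41 coffees sold in total.
#   41 * 4 = 164 dollars of revenue.
#   Final answer: $164.
```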
Chunking
RAG & Retrieval: The process of splitting long documents into smaller pieces that fit into a language model's context window.
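A minimal fixed-size chunker with overlap, written in plain Python as an illustration (real pipelines often split on sentence or section boundaries instead):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Slide a window of chunk_size characters across the text, stepping
    # back by `overlap` so context isn't lost at chunk boundaries.
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

document = "some long document text " * 200  # stand-in for a real document
pieces = chunk_text(document, chunk_size=500, overlap=50)
```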
Cosine Similarity
RAG & Retrieval: A metric that measures similarity between two vectors based on the cosine of the angle between them, commonly used for embeddings.
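In code, cosine similarity is just a normalized dot product; a minimal NumPy version with made-up vectors:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product divided by the product of the vector lengths:
    # 1.0 means same direction, 0.0 means orthogonal (unrelated).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = np.array([0.2, 0.7, 0.1])
doc_vec = np.array([0.3, 0.6, 0.2])
print(cosine_similarity(query_vec, doc_vec))  # ~0.97, i.e. very similar
```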
Cross-Attention
Models & Architecture: An attention mechanism where queries from one sequence attend to keys and values from a different sequence.
Cross-Validation
Evaluation & Metrics: A technique for evaluating model performance by splitting data into multiple folds and testing on each fold in turn.
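One way to run it with scikit-learn; the iris dataset and logistic-regression model below are stand-ins for any data and model:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: train on 4 folds, test on the held-out fold,
# and rotate so every fold serves as the test set exactly once.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 folds
```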
Constitutional AI
Safety & Alignment: An alignment technique developed by Anthropic where an AI model critiques and revises its own outputs using a set of principles.
Deep Learning
Core Concepts: A subset of machine learning using multi-layered neural networks to learn complex patterns from large datasets.
Diffusion Model
Models & Architecture: An AI model that generates images or other media by learning to reverse a gradual noise-adding process.
Distillation
Techniques & Methods: A technique for transferring knowledge from a large "teacher" model to a smaller "student" model that can run faster and cheaper.
Dropout
Training & Learning: A regularization technique that randomly deactivates a fraction of neurons during training to prevent co-adaptation and overfitting.
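A from-scratch sketch of what a dropout layer does during training (deep learning frameworks provide this as a built-in layer):

```python
import numpy as np

def dropout(activations: np.ndarray, p: float = 0.5) -> np.ndarray:
    # Randomly zero out a fraction p of the activations, then scale the
    # survivors by 1/(1-p) so the expected magnitude is unchanged (the
    # "inverted dropout" convention). Applied only during training.
    mask = np.random.rand(*activations.shape) > p
    return activations * mask / (1.0 - p)

h = np.ones((2, 4))
print(dropout(h, p=0.5))  # roughly half the entries zeroed, the rest scaled to 2.0
```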
Data Augmentation
Training & Learning: Techniques that expand training data by creating modified versions of existing examples, like rotating images or paraphrasing text.
Embedding
Models & Architecture: A numerical vector representation of text, images, or other data that captures semantic meaning in a high-dimensional space.
Epoch
Training & Learning: One complete pass through the entire training dataset during model training.
Explainability
Safety & Alignment: The degree to which humans can understand why a model made a particular prediction or decision.
Fine-tuning
Training & Learning: Training a pre-trained model further on a smaller, task-specific dataset to adapt it for a particular use case.
Function Calling
Applications: A capability that lets language models request structured tool or API calls instead of only generating plain text.
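A sketch of how a tool might be described to a model. The shape below follows the JSON-Schema style several LLM APIs use, but exact field names vary by provider, and get_weather is a hypothetical tool:

```python
get_weather_tool = {
    "name": "get_weather",  # hypothetical tool name
    "description": "Get the current weather for a city.",
    "parameters": {  # JSON Schema describing the arguments
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}
# Instead of plain text, the model can reply with a structured call like
#   {"name": "get_weather", "arguments": {"city": "Lisbon"}}
# which the application executes before passing the result back to the model.
```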
Few-Shot Learning
Applications: A prompting or training approach where a model is shown a small number of examples before handling a new task.
F1 Score
Evaluation & Metrics: A classification metric that combines precision and recall into a single score, balancing false positives and false negatives.
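The underlying formula as a small Python function (the counts in the example are illustrative):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp)
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn)
    # F1 is the harmonic mean, so it stays low unless both are high.
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=80, fp=20, fn=40))  # precision 0.8, recall ~0.67, F1 ~0.73
```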
FlashAttention
Inference & Optimization: A memory-efficient attention algorithm that speeds up transformer training and inference by avoiding materialization of the full attention matrix.
Foundation Model
Core Concepts: A large-scale model trained on broad data that can be adapted to many downstream tasks; examples include GPT-4, Claude, and Gemini.
Generative AI
Core Concepts: AI that can create new content (text, images, audio, video, and code) rather than just classifying or predicting.
Gradient Descent
Training & Learning: An optimization method that updates model parameters in the direction that most reduces prediction error.
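A one-parameter toy example of the update rule; the loss function and learning rate are illustrative:

```python
# Minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
learning_rate = 0.1

for step in range(100):
    gradient = 2 * (w - 3)         # d(loss)/dw at the current w
    w -= learning_rate * gradient  # step against the gradient

print(w)  # converges toward 3.0, the minimum of the loss
```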
Generative Adversarial Network
Models & Architecture: A generative model architecture where two neural networks compete: a generator creates data and a discriminator tries to detect fakes.
Inference
Inference & Generation: The process of using a trained AI model to generate predictions, classifications, or responses on new input data.
In-Context Learning
Applications: The ability of a model to learn patterns from instructions and examples provided inside the current prompt without updating its weights.
Instruction Tuning
Training & Learning: A fine-tuning approach where a model is trained on many instruction-and-response examples to improve its ability to follow user requests.
Knowledge Graph
Data: A structured representation of entities and the relationships between them, used to organize and reason over information.
KV Cache
Inference & Optimization: A memory structure that stores previously computed attention keys and values, allowing LLMs to generate tokens without recomputing them from scratch.
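A toy sketch of the idea: each generation step appends its new key and value instead of recomputing all past ones. Real implementations cache per layer and per attention head; the shapes here are deliberately simplified:

```python
import numpy as np

d = 8
k_cache, v_cache = [], []

def attend(q, new_k, new_v):
    k_cache.append(new_k)        # cache this token's key...
    v_cache.append(new_v)        # ...and value for all future steps
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ q / np.sqrt(d)  # attention of q over every cached key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V           # output attends over all tokens so far

for step in range(5):            # one call per generated token
    q, k, v = np.random.randn(3, d)
    out = attend(q, k, v)        # old keys/values are reused, not recomputed
```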
Large Language Model
Models & Architecture: A massive AI model trained on text data that can generate, summarize, translate, and reason about language.
Loss Function
Training & Learning: A mathematical measure of how wrong a model's predictions are during training.
LoRA
Training & Learning: Low-Rank Adaptation, a parameter-efficient fine-tuning method that updates a small set of low-rank matrices instead of the full model.
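The core idea in a few lines of NumPy; the dimensions and rank are illustrative:

```python
import numpy as np

d, r = 1024, 8                     # model dim and LoRA rank, with r << d

W = np.random.randn(d, d)          # frozen pretrained weight (never updated)
A = np.random.randn(r, d) * 0.01   # small trainable matrix
B = np.zeros((d, r))               # starts at zero, so training begins at W

def forward(x):
    # The effective weight is W + B @ A, but only A and B (2*d*r values)
    # are trained instead of all d*d values of W.
    return x @ (W + B @ A).T

print(f"full params: {d*d:,}, LoRA params: {2*d*r:,}")  # 1,048,576 vs 16,384
```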
Learning Rate
Training & Learning: A hyperparameter that controls how much the model weights change with each update during training.
LSTM
Models & Architecture: Long Short-Term Memory, a type of recurrent neural network designed to learn long-range dependencies in sequential data.
Layer Normalization
Training & Learning: A normalization technique that stabilizes training by normalizing activations across features within each sample.
Machine Learning
Core Concepts: A subset of AI where systems learn from data to improve performance without being explicitly programmed.
Multimodal AI
Core Concepts: AI systems that can process and generate multiple types of data, such as text, images, audio, and video, in a unified model.
Multi-Head Attention
Models & Architecture: A transformer technique that runs multiple attention operations in parallel so the model can capture different kinds of relationships at once.
Model Weights
Models & Architecture: The learned parameter values in a neural network that determine how input signals are transformed into outputs.
Model Context Protocol
AI Agents: An open protocol for connecting AI assistants to external tools, data sources, and systems in a standardized way.
Mixture of Experts
Models & Architecture: An architecture where a gating network routes each input to a small subset of specialized sub-models (experts), enabling massive parameter counts efficiently.
Model Card
Safety & Alignment: A standardized document that describes a model's purpose, capabilities, limitations, training data, and intended use.
Prompt Engineering
Applications: The practice of crafting inputs to AI models to elicit better, more accurate, or more useful outputs.
Parameters
Models & Architecture: The learned numerical values inside a neural network that store what the model has learned from training data.
Prompt Chaining
Applications: A workflow pattern where multiple prompts are linked together so the output of one step becomes the input to the next.
Pre-trained Model
Training & Learning: A model that has already been trained on broad data and can then be adapted or used for downstream tasks.
Perplexity
Evaluation & Metrics: A metric that measures how well a language model predicts text; lower perplexity means better predictions.
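Perplexity is the exponential of the average negative log-likelihood per token; a minimal computation with made-up probabilities:

```python
import math

# Probabilities the model assigned to each actual next token (illustrative).
token_probs = [0.25, 0.5, 0.125, 0.5]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(perplexity)  # ~3.36; a model guessing among k equally likely tokens scores k
```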
Positional Encoding
Models & Architecture: A technique for injecting information about token positions into transformer models, which otherwise have no notion of order.
Pretraining
Training & Learning: The initial training phase where a model learns general patterns from large amounts of raw data before being fine-tuned for specific tasks.
Prompt Injection
Safety & Alignment: An attack where malicious instructions in user input override an AI system's original instructions.
QLoRA
Training & Learning: A LoRA-based fine-tuning method that combines low-rank adapters with quantized base models to reduce memory requirements even further.
Quantization
Inference & Generation: A technique that reduces model size and inference cost by storing weights and activations with lower numerical precision.
Retrieval-Augmented Generation
Applications: A technique that enhances LLM outputs by first retrieving relevant documents from an external knowledge base before generating a response.
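A self-contained toy of the retrieve-then-generate flow; the word-count "embeddings" below stand in for a real embedding model and vector database:

```python
import numpy as np

docs = [
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
    "The Louvre museum is also in Paris.",
]
vocab = sorted({w.lower().strip(".") for d in docs for w in d.split()})

def embed(text: str) -> np.ndarray:
    # Toy embedding: a vector of word counts over the corpus vocabulary.
    words = [w.lower().strip(".") for w in text.split()]
    return np.array([words.count(w) for w in vocab], dtype=float)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Score every document against the query and keep the top k.
    q = embed(query)
    scores = [q @ embed(d) for d in docs]
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

question = "What city is the Eiffel Tower in?"
context = " ".join(retrieve(question))
prompt = f"Answer using only this context: {context}\nQuestion: {question}"
# `prompt` would now be sent to an LLM, which answers grounded in the context.
print(prompt)
```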
RLHF
Training & Learning: Reinforcement Learning from Human Feedback, a training technique that aligns AI models with human preferences using human ratings.
Reinforcement Learning
Training & Learning: A machine learning paradigm where an agent learns to take actions in an environment to maximize cumulative reward.
Recurrent Neural Network
Models & Architecture: A neural network architecture with connections that loop back, allowing it to process sequences and maintain memory of past inputs.
RAG Pipeline
RAG & Retrieval: The end-to-end architecture for retrieval-augmented generation, from query through retrieval to final LLM response.
Regularization
Training & Learning: Techniques that prevent models from overfitting training data by penalizing complexity or introducing noise.
Residual Connection
Models & Architecture: A shortcut that adds a layer's input to its output, enabling much deeper networks by preserving gradient flow.
Reranking
RAG & Retrieval: A second-stage retrieval step that reorders initial search results using a more accurate but slower model to improve relevance.
Self-Attention
Models & Architecture: A form of attention where each token in a sequence looks at every other token in the same sequence to build context-aware representations.
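Scaled dot-product self-attention in a few lines of NumPy; this is a single head with no learned projections, so it is purely illustrative:

```python
import numpy as np

def self_attention(X: np.ndarray) -> np.ndarray:
    # X: (seq_len, d) token vectors. In a real transformer, Q, K, and V
    # are separate learned projections of X; here we use X directly.
    Q = K = V = X
    d = X.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # how much each token attends to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V             # context-aware representation per token

tokens = np.random.randn(4, 8)     # 4 tokens, dimension 8
print(self_attention(tokens).shape)  # (4, 8)
```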
System Prompt
Applications: A high-priority instruction that sets the role, behavior, constraints, and goals for an AI model within an application.
Semantic Search
Data: A search approach that finds results based on meaning and intent rather than exact keyword matches.
Supervised Learning
Training & Learning: A machine learning approach where models learn from labeled input-output pairs to predict outcomes on new data.
Self-Supervised Learning
Training & Learning: A form of learning where the model creates its own labels from raw data, enabling training on massive unlabeled datasets.
Speculative Decoding
Inference & Optimization: An inference acceleration technique where a small draft model predicts multiple tokens that a larger model then verifies in parallel.
Synthetic Data
Training & Learning: Artificially generated data used to train or evaluate AI models, often created by other models or simulations.
Transformer
Models & Architecture: The neural network architecture behind most modern AI; it uses attention mechanisms to process sequences in parallel.
Tokenization
Language & Text: The process of splitting text into smaller units (tokens) that a language model can process.
Temperature
Inference & Generation: A parameter that controls the randomness of an AI model's outputs; lower values are more deterministic, higher values more creative.
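Temperature rescales the model's logits before the softmax; a minimal sketch with made-up logits:

```python
import numpy as np

def softmax_with_temperature(logits: np.ndarray, temperature: float) -> np.ndarray:
    # Dividing logits by T < 1 sharpens the distribution (more deterministic);
    # T > 1 flattens it (more random, often described as more creative).
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.5])
print(softmax_with_temperature(logits, 0.5))  # peaked: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: more even probabilities
```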
Token
Language & Text: The basic unit of text processed by a language model, often representing a word, subword, punctuation mark, or symbol.
Text-to-Image
Applications: AI generation that creates images from natural language prompts.
Text-to-Video
Applications: AI generation that creates video clips from natural language prompts.
Transfer Learning
Training & Learning: The practice of reusing knowledge from a model trained on one task to accelerate learning on a different but related task.
Training Data
Training & Learning: The dataset used to teach a machine learning model the patterns it needs to make predictions or generate outputs.
Tree of Thoughts
Techniques & Methods: An advanced reasoning technique where the model explores multiple reasoning paths in a tree structure before choosing the best.
Top-K Sampling
Inference & Optimization: A text generation strategy that restricts sampling to the K most likely next tokens at each step.
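A sketch over a toy next-token distribution (the tokens and probabilities are illustrative):

```python
probs = {"the": 0.4, "a": 0.3, "cat": 0.15, "dog": 0.1, "zebra": 0.05}

# Top-K with K=2: keep the 2 most likely tokens, then renormalize and sample.
top2 = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:2])
total = sum(top2.values())
print({t: p / total for t, p in top2.items()})  # {'the': ~0.57, 'a': ~0.43}
```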
Top-P Sampling
Inference & Optimization: A text generation strategy that samples from the smallest set of tokens whose cumulative probability exceeds P.
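The same toy distribution filtered by cumulative probability instead of a fixed count:

```python
probs = {"the": 0.4, "a": 0.3, "cat": 0.15, "dog": 0.1, "zebra": 0.05}

# Top-P (nucleus) with P=0.8: take tokens in probability order until their
# cumulative mass reaches 0.8, then renormalize and sample from that set.
kept, cumulative = {}, 0.0
for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    kept[token] = p
    cumulative += p
    if cumulative >= 0.8:
        break
total = sum(kept.values())
print({t: p / total for t, p in kept.items()})  # keeps 'the', 'a', 'cat'
```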