Skip to main content
Back to Insights
11 min read

AI Terms to Know

A practical glossary of AI terms organized by topic, from core concepts and prompting to model architecture and training.

Core Concepts

The essential building blocks of AI that every user should understand.

Artificial General Intelligence (AGI)

A theoretical form of AI that could understand, learn, and apply intelligence across any domain at a level equal to or exceeding human capability. Unlike today's narrow AI systems (which excel at specific tasks), AGI would generalize across all cognitive tasks. While no AGI system exists today, it remains a central topic in AI safety discussions and long-term research goals.

Bias (Algorithmic Bias)

Systematic and unfair patterns in AI outputs caused by imbalances or assumptions in training data, model design, or evaluation criteria. Bias can lead to discriminatory outcomes in areas like hiring, lending, and content moderation. Understanding and mitigating AI bias is essential for deploying these systems responsibly.

Generative AI (Gen AI)

A type of AI that can create new, original content like text, images, audio, and video by learning patterns from existing data.

Hallucination

When an AI model generates incorrect or fabricated information that is not based on its training data.

Jagged Intelligence New in 2025

A term coined by Andrej Karpathy describing how LLMs simultaneously exhibit polymath-level sophistication in some domains while failing at tasks that seem trivial to humans. This unevenness is not a bug to be fixed but a structural consequence of how models are optimized. They spike in capability near domains targeted during training and remain surprisingly weak elsewhere.

LLM capability by domain:

Code Writing Counting Analysis Math Spatial

Large Language Model (LLM)

A type of artificial intelligence trained on massive amounts of text data to understand, generate, and respond to human language. See also Small Language Models (SLMs), their more compact counterparts optimized for efficiency and edge deployment.

Machine Learning (ML)

The broader field of computer science that enables systems to learn and improve from data without being explicitly programmed for every scenario. Machine learning is the umbrella discipline that encompasses deep learning, neural networks, and the large language models used in generative AI. When people say "AI," they are often referring to machine learning systems.

Multimodal

Describes AI models capable of processing and generating multiple types of data, such as text, images, audio, and video, within a single system. Multimodal models like GPT-4o, Gemini, and Claude can understand an image and respond with text, or take voice input and produce written output, enabling more natural and versatile interactions.

Neural Network

A computing architecture inspired by the human brain, consisting of interconnected layers of nodes (neurons) that process information by passing signals and adjusting connection weights during training. Neural networks are the foundation for deep learning and the transformer architecture that powers modern LLMs like GPT, Claude, and Gemini.

Simplified neural network:

Input
Hidden
Output

Prompt

The instruction, question, or input provided by a user to guide the AI's response.

Token

The fundamental unit of data that a model processes, which can be a word, part of a word, or a character. LLMs break down text into these tokens to understand and generate human language, with each unique token being assigned a specific numerical ID.

Word-level tokens (6):

The cat sat on the mat

Sub-word tokens (3):

un believ ably

Prompting & Context

How to communicate with and provide information to AI models effectively.

Chain-of-Thought (Reasoning)

The step-by-step logical process where AI models break down complex problems to arrive at conclusions. Advanced prompting techniques like Chain-of-Thought, Tree-of-Thought, and Graph-of-Thoughts explicitly structure this reasoning process, improving model performance on analytical tasks by 2-3× compared to direct answers.

Context Degradation

The phenomenon where an LLM's ability to accurately recall and utilize information decreases as more tokens fill the context window. Also known as context rot, this means that information placed earlier in a long conversation or document may be partially forgotten or deprioritized by the model, making context a finite resource with diminishing returns.

Recall accuracy over context length:

Start End of context

Context Engineering New in 2025

The practice of carefully managing what information is provided to an LLM in each interaction, including retrieval, filtering, structuring, and prioritizing context. Distinct from prompt engineering (which focuses on how you phrase instructions), context engineering focuses on ensuring the model has the right information to work with. Emerged as a critical discipline as AI applications moved into production.

Grounding

The practice of connecting AI outputs to verifiable, authoritative sources to improve accuracy and reduce hallucinations. Grounding techniques include retrieval augmented generation (RAG), citation generation, and fact-checking against known databases. A grounded response is one that can point to specific evidence supporting its claims.

Prompt Injection

An adversarial technique where malicious input is crafted to override or manipulate an LLM's system-level instructions, causing the model to ignore its intended behavior and perform unintended actions. A critical security concept for anyone building AI-powered applications, as it represents one of the primary attack vectors against LLM systems.

Semantic Understanding

The ability to grasp meaning and relationships between concepts beyond literal text matching. In AI contexts, semantic understanding allows models to comprehend intent, context, and connections between ideas. Semantic HTML and structured data help both search engines and AI models better interpret content meaning.

Temperature

A parameter that controls the randomness of an AI model's output. Lower temperature values (closer to 0) produce more deterministic, focused responses, while higher values introduce more variety and creativity. Adjusting temperature is one of the most common ways to tune AI behavior for different use cases, from factual Q&A to creative writing.

0.0 0.5 1.0
Focused Balanced Creative

AI Applications

Real-world implementations and use cases of AI technology.

Agentic AI New in 2025

Describes AI systems and workflows where models autonomously plan, execute multi-step tasks, use tools, and make decisions with minimal human intervention. While an AI agent is a specific system, "agentic" describes the broader paradigm shift toward AI that acts rather than just responds. 2025 saw agentic AI become the dominant industry theme, with frameworks like MCP, A2A, and ACP emerging to support it.

AI Agent

Autonomous system that perceives, reasons, acts, and observes to achieve goals.

Deepfake

AI-generated or AI-manipulated media (video, audio, images) designed to realistically depict people saying or doing things they never actually did. Created using deep learning techniques, deepfakes raise significant concerns around misinformation, fraud, and identity theft, making media literacy and verification tools increasingly important.

Embedding

A numerical representation of text (or other data) as a dense vector of numbers that captures its semantic meaning. Embeddings allow AI systems to measure how similar two pieces of content are by comparing their vectors. They are the foundational technology behind semantic search, recommendation systems, and retrieval augmented generation (RAG).

Generative (or Answer) Engine Optimization (GEO or AEO)

A marketing-based term meant to apply SEO (Search Engine Optimization) like practices to digital content so that AI-powered search tools can more easily cite, summarize, and synthesize it into direct answers.

Guardrails New in 2025

Safety mechanisms, filters, and constraints built into AI systems to prevent harmful, off-topic, or policy-violating outputs. Guardrails can include content filtering, topic restrictions, output validation, and automated monitoring. The term became ubiquitous in enterprise AI during 2025 as organizations sought to deploy AI responsibly at scale.

Retrieval Augmented Generation (RAG)

Enhances LLM prompts by retrieving relevant context from a vector database.

Prompt
Retrieve (Context)
Prompt + Context
Response

Synthetic Data

Artificially generated data that mimics the statistical properties of real-world data without containing actual personal or sensitive information. Used to train AI models when real data is scarce, expensive, or privacy-restricted, synthetic data is also increasingly used to augment training datasets and test AI systems under controlled conditions.

Vector Database

Stores data as numerical vectors, enabling semantic similarity searches.

Vibe Coding New in 2025

A term coined by Andrej Karpathy describing the practice of building software by describing what you want in natural language rather than writing traditional code. Enabled by AI coding assistants, vibe coding democratizes software development by allowing non-programmers to create functional applications through conversation with an AI.

Standards & Protocols

Interoperability standards that enable AI systems to work together.

Agentic Commerce Protocol (ACP) New in 2025

Standard for programmatic commerce flows between buyers, AI agents, and businesses.

Agent2Agent (A2A) New in 2025

Provides a language for agent interoperability regardless of agent frameworks or vendors.

Model Context Protocol (MCP)

Standardizes how LLMs connect and interact with external data sources and tools.

Model Architecture

How AI models are structured and designed to process information.

Context Window

The maximum amount of text (measured in tokens) that an AI model can process in a single interaction. This includes both the input prompt and the generated response. Larger context windows allow models to handle longer documents and maintain coherent conversations over more exchanges.

Context window capacity:

0 tokens 128K limit
Prompt Response

Distillation (Knowledge Distillation) New in 2025

A technique where a smaller "student" model is trained to replicate the behavior and capabilities of a larger "teacher" model. Distillation enables the creation of compact, efficient models that retain much of the larger model's quality at a fraction of the computational cost. The technique gained mainstream attention in early 2025 when DeepSeek demonstrated competitive performance through distillation of larger models.

Teacher
70B params
Student
7B params

~90% quality at 10x lower cost

Foundation Model

A large-scale AI model trained on broad, diverse data that can be adapted for various downstream tasks. Foundation models (like GPT-4, Claude, or Gemini) serve as the base for specialized applications through fine-tuning or prompting, rather than being built for a single specific purpose.

Inference

The process of running a trained AI model to generate an output from a given input. Every time you send a prompt and receive a response, you are performing inference. Inference speed, cost, and efficiency are key factors in deploying AI at scale, and much of the industry's optimization work focuses on making inference faster and cheaper.

Mixture of Experts (MoE)

A model architecture where multiple specialized sub-networks (called "experts") are contained within a single model, with only a subset activated for any given input. MoE enables models to have massive total parameter counts while keeping inference efficient and cost-effective, delivering frontier-level quality at significantly lower computational cost than dense models of equivalent capability.

Input
Router
E1 ✓
E2
E3 ✓
E4
E5
E6

2 of 6 experts active per input

Open Source vs Open Weight

A key distinction in how AI models are shared. Open weight models (like Meta's Llama) release trained model weights for public use but withhold training data, code, and methodology. Truly open source AI shares everything. Most models marketed as "open source" are actually open weight, making this distinction important for evaluating transparency and reproducibility claims.

Open Source
Open Weight
Weights
Training data
Code
Methodology

Small Language Models (SLMs)

More compact language models, typically under 10 billion parameters, designed for efficiency, edge deployment, or domain-specific tasks. SLMs serve as practical complements to their larger counterparts (LLMs), offering faster inference, lower costs, and the ability to run on local hardware, making AI more accessible and deployable in resource-constrained environments.

Test-Time Compute New in 2025

A scaling approach where additional computational resources are spent during inference (when the model generates a response) rather than only during training. By allowing models to "think longer" through extended reasoning traces, test-time compute provides a new lever for improving AI capability. This is the mechanism behind reasoning models like OpenAI's o1 and o3 series.

Transformer

The neural network architecture that underpins virtually all modern large language models. Introduced in the 2017 paper "Attention Is All You Need," the transformer uses a mechanism called self-attention to process all parts of an input simultaneously rather than sequentially, enabling models to capture long-range relationships in text. GPT, Claude, Gemini, and LLaMA are all built on transformer architectures.

Training & Optimization

How AI models learn and improve their capabilities.

Fine-Tuning

The process of adapting a pre-trained model to perform a more specific task or domain.

Pre-training

The initial phase of training a large language model on massive amounts of unlabeled data from diverse sources (websites, books, articles) to learn general language patterns, facts, and reasoning capabilities. This foundational training occurs before any task-specific fine-tuning.

Reinforcement Learning from Human Feedback (RLHF)

A training method to align an AI with human preferences.

RLVR (Reinforcement Learning from Verifiable Rewards) New in 2025

A training methodology that emerged as the most consequential technical development of 2025. RLVR adds a fourth stage to the LLM training pipeline (after pre-training, supervised fine-tuning, and RLHF) where models learn reasoning by training against automatically verifiable rewards in domains like math and code. This approach allows models to spontaneously develop strategies that resemble human reasoning.

LLM training pipeline:

Pre-training
SFT
RLHF
RLVR

Fourth stage added to improve reasoning

Key Takeaway

Understanding AI terminology helps you communicate effectively about AI capabilities and limitations. Whether you're just beginning to use AI tools or building AI-powered systems, these terms provide the foundation for deeper learning and more productive conversations about artificial intelligence. If you're just getting started with AI, consider reading Prompting Fundamentals: The GCSE Framework to learn how to apply these concepts in practice.

Sources & Further Reading

If you'd like to explore more AI concepts and deepen your understanding: