AI Basics April 20, 2026 12 min read

50 AI terms to know before your first rollout

A practical FastAI glossary covering tokens, RAG, agents, guardrails, and AGI before you launch your first AI workflow.

Use this glossary to align your team on the language of AI before kickoff. If you are discussing agents, sales automation, integrations, or model choices, these are the terms worth knowing first.

Glossary

50 core concepts worth knowing before you discuss architecture, rollout scope, or budget.

Neural network

A model inspired by the way neurons pass signals to each other. It learns from examples and powers most modern AI systems.

LLM (Large Language Model)

A neural network trained on massive amounts of text. It predicts the next token well enough to hold conversations, draft content, and analyze documents.

Parameter

A numerical weight inside the model that stores what it learned. More parameters can mean more capability, but they also make the model heavier and more expensive to run.
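
As a rough illustration of why parameter count drives cost, the sketch below estimates how much memory the weights alone occupy. The function name is hypothetical, and the estimate ignores activations, caches, and runtime overhead.

```python
def weights_size_gb(num_parameters: int, bits_per_parameter: int) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_parameters * bits_per_parameter / 8
    return bytes_total / 1e9


# A 7-billion-parameter model stored at 16 bits per parameter
size_16_bit = weights_size_gb(7_000_000_000, 16)
# The same model stored at 4 bits per parameter
size_4_bit = weights_size_gb(7_000_000_000, 4)
```

Under these assumptions the 16-bit version needs about 14 GB for weights and the 4-bit version about 3.5 GB, which is one reason smaller or compressed models are cheaper to serve.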

Token

The basic text unit a model works with. API pricing, limits, and context windows are usually measured in tokens.
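
To make the link between tokens and budget concrete, here is a minimal sketch of a cost estimate. Both constants are assumptions: four characters per token is only a rough rule of thumb for English text, and the price is a made-up figure. Real tokenizers and real price lists will give different numbers.

```python
import math

CHARS_PER_TOKEN = 4          # assumption: rough rule of thumb for English text
PRICE_PER_1K_TOKENS = 0.002  # assumption: hypothetical price in USD


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(text: str) -> float:
    """Approximate the cost of sending this text to a model."""
    return estimate_tokens(text) / 1000 * PRICE_PER_1K_TOKENS
```

For planning purposes, an estimate like this is usually enough to compare scenarios; for billing, use the token counts your provider reports.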

Context window

The maximum amount of text, measured in tokens, that a model can take into account in a single request. Once the window is full, earlier parts of the conversation or document have to be dropped or summarized.
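
One common way to handle a full window is to keep only the most recent messages that fit. The sketch below shows that idea under a loud assumption: it counts one word as one token, which real tokenizers do not do. The function name is hypothetical.

```python
def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages whose combined size fits the budget."""
    kept = []
    used = 0
    # Walk from the newest message back to the oldest
    for message in reversed(messages):
        cost = len(message.split())  # assumption: one word = one token
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    # Restore chronological order
    return list(reversed(kept))
```

With a budget of 3 and the history `["a b c", "d e", "f"]`, the oldest message is the one that gets dropped.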

Prompt

The instruction or request you send to the model. The clearer the context, format, and goal, the more reliable the result.

System prompt

The base instruction that defines the model's role and boundaries before the conversation starts. It shapes the default tone, limits, and behavior.

Temperature

A setting that controls how varied the response should be. Lower temperature makes answers more predictable, while higher values add creativity and more risk.
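
Under the hood, temperature is typically applied by dividing the model's raw scores (logits) before they are turned into probabilities. The sketch below uses made-up scores for three candidate tokens to show the effect; the function name is illustrative.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(logits, 0.5)  # low temperature
varied = softmax_with_temperature(logits, 2.0)   # high temperature
```

At the lower temperature the top candidate takes most of the probability; at the higher one the probabilities move closer together, so less likely tokens are sampled more often.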

Hallucinations

Cases where the model confidently invents wrong facts, references, or conclusions. That is why critical information should always be checked against an external source.

Training

The process where a model learns from a huge dataset and updates its parameters. It is usually the most expensive and compute-heavy stage in building a large model.

Fine-tuning

Additional training on top of an existing model for a narrow task or domain. It helps the model match your tone, categories, and workflows more closely.

RAG (Retrieval-Augmented Generation)

An approach where the system retrieves relevant material from an external knowledge base and passes it to the model before it answers. It reduces hallucinations and lets the system use up-to-date information.
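
The retrieve-then-answer flow can be sketched in a few lines. This toy version ranks documents by shared words instead of embeddings, and it stops at building the prompt rather than calling a model. All names and sample documents are hypothetical.

```python
def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question."""
    question_words = set(question.lower().split())

    def overlap(document: str) -> int:
        return len(question_words & set(document.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Put the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


knowledge_base = [
    "Refunds are processed within 14 days.",
    "Our office is in Berlin.",
]
prompt = build_prompt("How long do refunds take?", knowledge_base)
```

The resulting prompt contains the refund policy but not the office address, so the model answers from the relevant source instead of guessing.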

API (Application Programming Interface)

A way to connect a model to your product through code. APIs let AI run inside a CRM, website, messenger, or internal tool instead of only in a public chat UI.

Inference

The moment when the model applies what it learned and generates an answer to your request. In cloud APIs, this is usually what you are paying for.

Multimodality

A model's ability to work with more than text, such as images, audio, or video. It enables workflows where AI can read screenshots, listen to voice notes, and answer in text.

Embeddings

A numerical representation of text or another object in semantic space. Embeddings power semantic search, recommendations, and RAG pipelines.
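
A standard way to compare two embeddings is cosine similarity, which measures how closely two vectors point in the same direction. The three-number vectors below are invented for illustration; real embeddings come from a model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return a score near 1 for similar directions and near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings, made up for this example
invoice = [0.9, 0.1, 0.0]
receipt = [0.8, 0.2, 0.1]
holiday = [0.0, 0.1, 0.9]

similar_score = cosine_similarity(invoice, receipt)
unrelated_score = cosine_similarity(invoice, holiday)
```

Here the invoice and receipt vectors score much higher with each other than either does with the holiday vector, which is the behavior semantic search relies on.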

Open-source vs closed-source models

Closed models are consumed as a service, while open models can be downloaded and run on your own infrastructure. The right choice depends on quality, privacy, and budget constraints.

AI agents

Systems that do more than answer in text: they can search for data, call tools, update a CRM, or hand work off to another step. That makes them operational executors rather than simple chatbots.

Chain of thought

An approach where the model breaks a difficult problem into intermediate steps. It is especially useful for logic, calculations, and multi-step analysis.

Benchmark

A standardized test used to compare models. It helps estimate general capability, but it does not replace evaluation on your actual use cases.

Transformer

The architecture behind modern language models. Its key idea is attention, which helps the model track relationships between distant parts of a sequence.

Diffusion model

A model type commonly used for image generation. It learns to reconstruct an image from noise step by step, which is why it can create new visuals from scratch.

Pre-training

The initial large-scale stage where the model absorbs general knowledge from huge datasets. After pre-training, it knows a lot but is not yet aligned to a specific product.

RLHF (Reinforcement Learning from Human Feedback)

A method where humans rate model outputs and the system learns to prefer answers that are more helpful and safer. It makes the behavior feel more aligned with user expectations.

Distillation

The process of transferring behavior from a larger model into a smaller one. It gives you a cheaper and faster model while preserving much of the original quality.

Quantization

A way to compress a model by using lower-precision numbers. It shrinks the footprint and speeds up execution while often keeping quality at an acceptable level.
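
A minimal sketch of the idea, assuming the simplest scheme (symmetric 8-bit quantization with one shared scale): each weight is stored as a small integer and converted back when needed. Production toolchains use more elaborate schemes, and the function names here are illustrative.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in the range -127..127 using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.03]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

The restored values are close to the originals but not identical. That small rounding error is the quality tradeoff the definition refers to.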

Latent space

The internal representation space a model uses for data. Objects with similar meaning end up closer together there than unrelated ones.

Zero-shot and few-shot

Two ways to steer the model without retraining. Zero-shot relies on the instruction alone, while few-shot adds a few examples inside the prompt. A handful of examples is often enough for the model to infer the expected format.
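
The difference is easiest to see in how the prompt is assembled. In this sketch the `Input:` and `Output:` labels are just one possible layout, and the task and examples are made up.

```python
def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], new_input: str) -> str:
    """Combine the task, any worked examples, and the new input into one prompt."""
    parts = [task]
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    # Leave the final output empty for the model to complete
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(parts)


task = "Classify the message as LEAD or SUPPORT."
examples = [
    ("I want a quote for 50 seats.", "LEAD"),
    ("My login stopped working.", "SUPPORT"),
]

few_shot = build_few_shot_prompt(task, examples, "Can I book a demo?")
zero_shot = build_few_shot_prompt(task, [], "Can I book a demo?")
```

Passing an empty list of examples produces the zero-shot version of the same prompt.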

Prompt engineering

The practice of designing prompts so the model answers more accurately, concisely, or in the structure you need. It matters most when you want reliable output without manual cleanup.

Role

The label that tells the model who a message comes from: system, user, or assistant. Understanding roles helps you build conversations and products with more predictable behavior.
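
Many chat APIs represent a conversation as a list of messages, each tagged with a role. The sketch below follows that common shape; the exact field names vary by provider, so treat `role` and `content` as assumptions, and `add_message` as a hypothetical helper.

```python
VALID_ROLES = {"system", "user", "assistant"}


def add_message(conversation: list[dict], role: str, content: str) -> list[dict]:
    """Append a message after checking that the role is a known one."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role}")
    conversation.append({"role": role, "content": content})
    return conversation


conversation = []
add_message(conversation, "system", "You are a concise support assistant.")
add_message(conversation, "user", "Where is my order?")
add_message(conversation, "assistant", "Could you share your order number?")
```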

Stop sequence

A string that ends generation as soon as the model produces it. It is used to control response length or make sure output ends at the right point.
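
Hosted APIs usually apply stop sequences on the server. The same effect can be sketched locally by cutting text at the first stop sequence that appears; the function name is hypothetical.

```python
def apply_stop_sequences(text: str, stop_sequences: list[str]) -> str:
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw_output = "Answer: 42\nUser: and what about tomorrow?"
clean_output = apply_stop_sequences(raw_output, ["\nUser:"])
```

In this example the model started writing the user's next turn, and the stop sequence removes everything from that point on.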

Streaming

A mode where the answer arrives piece by piece instead of all at once at the end. It improves perceived speed and lets you start handling the output earlier.
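
From the consumer's side, streaming looks like iterating over pieces as they arrive. The generator below simulates that with a fixed string; a real client would read the pieces from a network response instead.

```python
from typing import Iterator


def stream_answer(answer: str, chunk_size: int = 5) -> Iterator[str]:
    """Yield the answer in small pieces instead of returning it all at once."""
    for start in range(0, len(answer), chunk_size):
        yield answer[start:start + chunk_size]


received = []
for piece in stream_answer("Your order has shipped.", chunk_size=6):
    # Each piece can be shown to the user as soon as it arrives
    received.append(piece)

full_answer = "".join(received)
```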

Function calling (tool use)

A mechanism that lets the model request specific tools instead of only generating text. This is what enables AI to search, calculate, update records, and trigger external actions.
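
In practice the model does not run anything itself. It returns a structured request, and your code decides whether and how to execute it. The sketch below assumes the request arrives as JSON with `name` and `arguments` fields, which is a common but not universal shape. The tool and its reply are hypothetical.

```python
import json


def get_order_status(order_id: str) -> str:
    """Hypothetical tool; a real one would query your order system."""
    return f"Order {order_id} is shipped"


# Only tools listed here can be called by the model
TOOLS = {"get_order_status": get_order_status}


def run_tool_call(raw_call: str) -> str:
    """Parse the model's request and run the matching tool."""
    call = json.loads(raw_call)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError(f"Unknown tool: {call['name']}")
    return tool(**call["arguments"])


model_request = '{"name": "get_order_status", "arguments": {"order_id": "A17"}}'
tool_result = run_tool_call(model_request)
```

Keeping an explicit registry of allowed tools means the model can only trigger actions you have approved.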

In-context learning

The model's ability to infer a pattern directly from the current prompt without retraining. That is why a few strong examples can dramatically improve the answer.

Overfitting

A situation where the model memorizes the training data too closely and performs worse on new cases. That makes the system brittle outside familiar patterns.

Dataset

The collection of data used to train or evaluate a model. The quality of that dataset directly affects the quality of the resulting system.

Data labeling

The manual or semi-automated annotation that tells the model what matters in the data. Without good labeling, it is hard to get stable performance on a practical task.

Toxicity

Offensive, harmful, or otherwise undesirable content in model outputs. Teams reduce it through training, moderation, and external filtering.

Guardrails

A set of constraints and checks that keep the system inside acceptable boundaries. Guardrails help prevent policy violations, dangerous output, and process failures.
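
One simple form of guardrail is a check that runs on every model output before it reaches the user. The two rules below (no email addresses, a length limit) are arbitrary examples chosen for illustration, and the email pattern is deliberately simplified.

```python
import re

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # simplified pattern
MAX_LENGTH = 500  # assumption: an arbitrary limit for this example


def check_output(text: str) -> list[str]:
    """Return a list of problems; an empty list means the output may be sent."""
    problems = []
    if EMAIL_PATTERN.search(text):
        problems.append("contains an email address")
    if len(text) > MAX_LENGTH:
        problems.append("too long")
    return problems
```

A non-empty result can trigger a retry, a fallback reply, or a handoff to a human, depending on how strict the process needs to be.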

Jailbreak

An attempt to bypass a model's restrictions with a clever prompt. In production systems, it is a serious risk that should shape your safety design.

Vector

A numerical array that represents the meaning of an object in a way that can be compared mathematically. Vectors are the foundation of semantic search.

Vector database

A storage system optimized for similarity search over vectors. It is what lets RAG systems quickly find the most relevant chunks of knowledge.
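
Conceptually, a similarity search compares the query vector with every stored vector and returns the closest matches. The brute-force sketch below shows only that idea; real vector databases use specialized indexes to avoid scanning everything. The index contents are made up.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Score how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(index: dict[str, list[float]], query: list[float], top_k: int = 2) -> list[str]:
    """Return the ids of the stored vectors most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(item[1], query),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Toy index: document id -> made-up two-dimensional vector
index = {
    "pricing": [1.0, 0.0],
    "refunds": [0.0, 1.0],
    "billing": [0.9, 0.1],
}
closest = search(index, [1.0, 0.0])
```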

Chunking

The process of splitting long text into smaller pieces for indexing and retrieval. Good chunking has a direct impact on how useful the supplied context will be.
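
A common starting point is fixed-size chunks with some overlap, so a sentence cut at a boundary still appears whole in a neighboring chunk. The sketch below splits by words for simplicity; the sizes are arbitrary defaults, not recommendations.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


chunks = chunk_words("a b c d e f g", chunk_size=4, overlap=2)
```

With these settings the seven-word text becomes three chunks, each sharing two words with the one before it.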

Pipeline

A chain of sequential steps that data or a task moves through. In AI products, pipelines connect retrieval, context preparation, model calls, and post-processing.
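
In code, a pipeline can be as simple as a list of functions where each one receives the previous step's output. Every step below is a hypothetical stand-in, including `call_model`, which returns a canned string instead of calling a real model.

```python
def clean_input(question: str) -> str:
    """Remove stray whitespace from the user's question."""
    return question.strip()


def add_context(question: str) -> str:
    """Stand-in for retrieval: attach placeholder context to the question."""
    return f"Context: (retrieved text goes here)\nQuestion: {question}"


def call_model(prompt: str) -> str:
    """Stand-in for a model call."""
    return f"Answer based on: {prompt}"


def post_process(answer: str) -> str:
    """Final cleanup before the answer is shown."""
    return answer.strip()


def run_pipeline(steps, value):
    """Pass the value through each step in order."""
    for step in steps:
        value = step(value)
    return value


result = run_pipeline([clean_input, add_context, call_model, post_process], "  What is RAG?  ")
```

Keeping each step as its own function makes it easier to test, replace, or log one stage without touching the rest.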

Orchestration

The coordination of multiple steps, agents, and tools as one system. Orchestration handles sequencing, context handoff, and the final assembled output.

Local model

A model that runs on your own machine or server instead of a third-party cloud API. It can help with privacy, but it requires your own infrastructure and often involves quality tradeoffs.

GPU (Graphics Processing Unit)

A processor originally designed for graphics that can run massive numbers of calculations in parallel. GPUs are the standard hardware for training and serving neural networks efficiently.

Cloud AI

Using the model as a remote service over the internet. This removes infrastructure work on your side, but it ties you to an external provider and its pricing.

AI wrapper

A product that uses someone else's model through an API and adds its own interface, logic, data, and workflow. Many practical AI businesses are built exactly this way.

AGI (Artificial General Intelligence)

The hypothetical stage where AI could solve a broad range of intellectual tasks at human level or beyond. Current models are strong in narrow classes of work, but they are not AGI.

FastAI

Want to launch an AI agent without wasting weeks on theory?

We can show you how to move lead qualification, customer replies, and repetitive workflows into a production AI agent.
