AI Terms for Developers: What is an LLM?

You've probably used an LLM today, whether through ChatGPT, Claude, or an AI coding assistant. But what actually is an LLM?

The answer is more ordinary than the hype suggests. An LLM (large language model) is a program that predicts text. You give it some words, it gives you the words that should come next. That's the whole job.

The plain English version

An LLM is the engine behind ChatGPT, Claude, Gemini, and most of the other AI tools you've seen developers ship. Strip away the chat interface and the marketing, and what you're left with is a system that does one thing: takes text in, predicts what text should follow.

The “large” part refers to two things: how much text the model was trained on, and how many internal parameters it uses to make those predictions. Both numbers are absurd. Modern LLMs are trained on trillions of words' worth of text pulled from books, articles, code, and the open web. The model itself contains billions of parameters, often tens or hundreds of billions, which are internal weights that shape every prediction it makes.

What's surprising is that “predict the next word” is enough to do almost everything you've seen AI do. Answer questions. Write code. Summarise documents. Translate languages. The model isn't running separate systems for each of those tasks; it's the same prediction engine applied to different inputs.
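
To make that concrete, here is a minimal sketch in which four tasks differ only in the prompt text. `buildRequest` and the `tasks` list are hypothetical helpers, the request shape is a simplified assumption that mirrors the API call shown later in this article, and nothing is sent over the network.

```typescript
// Hypothetical, simplified request shape for illustration only.
type Role = "user" | "assistant"

interface ChatRequest {
  model: string
  max_tokens: number
  messages: { role: Role; content: string }[]
}

// One builder for every task: only the prompt text changes.
function buildRequest(prompt: string): ChatRequest {
  return {
    model: "claude-sonnet-4-6", // model name taken from this article's own example
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  }
}

// Made-up prompts, one per task named above.
const tasks: [string, string][] = [
  ["answer", "What does HTTP status 404 mean?"],
  ["code", "Write a TypeScript function that reverses a string."],
  ["summarise", "Summarise this paragraph in one sentence: (paragraph here)"],
  ["translate", "Translate 'good morning' into French."],
]

for (const [task, prompt] of tasks) {
  const request = buildRequest(prompt)
  console.log(task, "->", request.messages[0].content)
}
```

Every request uses the same model and the same settings. The only thing that varies is the text you hand over.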

Your phone's autocomplete predicts the next word in a text message based on what you usually type. An LLM does the same thing, except it was trained on a much, much larger dataset, and it's predicting much longer continuations. Same mechanism, different scale.
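
The autocomplete idea fits in a few lines. The sketch below is a toy, assuming a tiny made-up corpus and simple word-pair counting; real LLMs learn neural-network weights rather than counting pairs, so treat this as an illustration of "predict what comes next", not of how an LLM is built. The names `train`, `predictNext`, and `complete` are invented for this example.

```typescript
type PairCounts = Map<string, Map<string, number>>

// Count which word follows which in the training text.
function train(corpus: string): PairCounts {
  const counts: PairCounts = new Map()
  const words = corpus.toLowerCase().split(/\s+/).filter(Boolean)
  for (let i = 0; i < words.length - 1; i++) {
    const followers = counts.get(words[i]) ?? new Map<string, number>()
    followers.set(words[i + 1], (followers.get(words[i + 1]) ?? 0) + 1)
    counts.set(words[i], followers)
  }
  return counts
}

// Return the most frequently seen follower of `word`, if there is one.
function predictNext(counts: PairCounts, word: string): string | undefined {
  const followers = counts.get(word.toLowerCase())
  if (!followers) return undefined
  let best: string | undefined
  let bestCount = 0
  for (const [candidate, count] of followers) {
    if (count > bestCount) {
      best = candidate
      bestCount = count
    }
  }
  return best
}

// Extend a prompt one predicted word at a time.
function complete(counts: PairCounts, prompt: string, maxWords: number): string {
  const out = prompt.split(/\s+/).filter(Boolean)
  if (out.length === 0) return prompt
  for (let i = 0; i < maxWords; i++) {
    const next = predictNext(counts, out[out.length - 1])
    if (next === undefined) break
    out.push(next)
  }
  return out.join(" ")
}

const toyModel = train("see you soon . see you later . see you soon")
console.log(predictNext(toyModel, "you")) // "soon" (seen twice, versus "later" once)
console.log(complete(toyModel, "see", 2)) // "see you soon"
```

Generating a longer continuation is just the same prediction step run in a loop, each time feeding the output so far back in as input.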

In practice

When you call an AI API, you send the model some text, and it sends back its predicted continuation. That's the whole interaction.

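Here's what that looks like with Anthropic's TypeScript SDK, which by default reads your API key from the ANTHROPIC_API_KEY environment variable: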

import Anthropic from "@anthropic-ai/sdk"

const client = new Anthropic()

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Explain TypeScript generics in one sentence." },
  ],
})

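// The response is a list of content blocks; the first text block holds the predicted continuation.
// Checking the block type keeps the TypeScript compiler happy, since not every block has a `text` field.
const first = message.content[0]
if (first.type === "text") {
  console.log(first.text)
}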

The pattern doesn't change: text in, text out. What changes is how you shape the call. Parameters like model, messages, and max_tokens let you shape the input and constrain the output:

Parameter	What it does	Why you care
`model`	Which LLM to use	Different models, different costs, different capabilities
`messages`	An array of role/content turns	Your prompt goes here. So does chat history
`max_tokens`	Upper limit on output length, in tokens	Output tokens are billed (so are input tokens), so this caps what a single response can cost
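
As a rough sketch of what `max_tokens` means for your code: the Anthropic Messages API response includes a `stop_reason` field, which is `"max_tokens"` when the output was cut off at the limit, and a `usage` object with input and output token counts. The interface below is a simplified slice of that response, the helper names are invented for this example, and the prices are placeholders rather than real rates.

```typescript
// Simplified slice of a Messages API response: only the fields used here.
interface ResponseSummary {
  stop_reason: string | null
  usage: { input_tokens: number; output_tokens: number }
}

// True when generation stopped because it hit the max_tokens ceiling.
function wasTruncated(response: ResponseSummary): boolean {
  return response.stop_reason === "max_tokens"
}

// Placeholder prices in dollars per million tokens (assumption, not a real rate card).
const PRICE_PER_MILLION = { input: 3, output: 15 }

// Estimate the cost of one call from its token counts.
function estimateCostUsd(response: ResponseSummary): number {
  const { input_tokens, output_tokens } = response.usage
  return (
    (input_tokens * PRICE_PER_MILLION.input +
      output_tokens * PRICE_PER_MILLION.output) / 1e6
  )
}

// Hand-written sample, not real API output.
const sample: ResponseSummary = {
  stop_reason: "max_tokens",
  usage: { input_tokens: 20, output_tokens: 1024 },
}

console.log(wasTruncated(sample)) // true
console.log(estimateCostUsd(sample).toFixed(6)) // "0.015420"
```

If `wasTruncated` returns true, the answer you received is incomplete, so either raise the limit or ask for a shorter response.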

Once you've got the shape, the rest of the API is just naming.
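
The "chat history" row deserves a quick illustration. The model keeps no memory between calls, so a conversation is just a `messages` array that grows and gets sent again in full each time. This is a minimal sketch; `Turn` and `addTurn` are hypothetical names, and the assistant reply is a placeholder rather than real model output.

```typescript
type Role = "user" | "assistant"

interface Turn {
  role: Role
  content: string
}

// Return a new conversation with one more turn; the original array is not mutated.
function addTurn(conversation: Turn[], role: Role, content: string): Turn[] {
  return [...conversation, { role, content }]
}

let conversation: Turn[] = []
conversation = addTurn(conversation, "user", "Explain TypeScript generics in one sentence.")
// In a real app this text would come from the model's response.
conversation = addTurn(conversation, "assistant", "(model reply goes here)")
conversation = addTurn(conversation, "user", "Now show a short example.")

// `conversation` is what you would pass as `messages` on the next call.
console.log(conversation.map((turn) => `${turn.role}: ${turn.content}`).join("\n"))
```

From the model's point of view, a long chat is still one piece of input text followed by one prediction.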

What people get wrong

The most common misconception is that LLMs understand what they say. They don't. They don't reason about facts or check their sources. They predict the most likely next words given what came before. Most of the time, those words happen to be correct, which is why the output feels intelligent.

This is also why LLMs sometimes confidently invent things that aren't true. The model isn't lying; it's predicting. Sometimes the most plausible-sounding continuation is also wrong.
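
A toy version of that failure, assuming a deliberately skewed, invented corpus and simple word counting (real models are far more sophisticated, but the underlying pressure is similar). The function names are made up for this sketch.

```typescript
// Count which words follow `context` across a set of sentences.
function followerCounts(sentences: string[], context: string): Map<string, number> {
  const counts = new Map<string, number>()
  for (const sentence of sentences) {
    const words = sentence.toLowerCase().split(/\s+/).filter(Boolean)
    for (let i = 0; i < words.length - 1; i++) {
      if (words[i] === context) {
        counts.set(words[i + 1], (counts.get(words[i + 1]) ?? 0) + 1)
      }
    }
  }
  return counts
}

// Pick the follower with the highest count.
function mostLikely(counts: Map<string, number>): string | undefined {
  let best: string | undefined
  let bestCount = 0
  for (const [word, count] of counts) {
    if (count > bestCount) {
      best = word
      bestCount = count
    }
  }
  return best
}

// Invented training text: the wrong answer appears more often than the right one.
const trainingText = [
  "the capital of australia is sydney",
  "the capital of australia is sydney",
  "the capital of australia is canberra",
]

console.log(mostLikely(followerCounts(trainingText, "is"))) // "sydney": most frequent, not most true
```

Nothing in the predictor checks whether the answer is right. It only reports what it saw most often, which is why confident output still needs verifying.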

A note on images and files

The “text in, text out” framing is a useful starting point, but modern LLMs accept more than just text. Many current models can take images and PDFs as input alongside your prompt, and some accept audio too, for tasks such as describing a photo, summarising a document, or transcribing a meeting.

The mental model still holds, though. The model converts whatever you send, whether text, images, or files, into the same internal representation, then predicts what text should come next. For most models, output is still text. The input got wider.
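
In API terms, "wider input" usually means the `content` of a message becomes a list of blocks instead of a single string. The sketch below builds a user message with one image block and one text block, modelled on the Anthropic Messages API's base64 image format. The types are simplified, `imageQuestion` is a hypothetical helper, the image data is a placeholder, and other providers use different shapes.

```typescript
// Simplified content-block types, modelled on the Anthropic Messages API.
type ImageMediaType = "image/jpeg" | "image/png" | "image/gif" | "image/webp"

type ContentBlock =
  | { type: "text"; text: string }
  | {
      type: "image"
      source: { type: "base64"; media_type: ImageMediaType; data: string }
    }

interface UserMessage {
  role: "user"
  content: ContentBlock[]
}

// Hypothetical helper: pair one base64-encoded image with a text prompt.
function imageQuestion(
  base64Data: string,
  mediaType: ImageMediaType,
  prompt: string,
): UserMessage {
  return {
    role: "user",
    content: [
      { type: "image", source: { type: "base64", media_type: mediaType, data: base64Data } },
      { type: "text", text: prompt },
    ],
  }
}

// Placeholder bytes, not a real image.
const placeholderImage = "aGVsbG8="
const imageMessage = imageQuestion(placeholderImage, "image/png", "Describe this photo.")
console.log(JSON.stringify(imageMessage, null, 2))
```

The message still goes in the same `messages` array as before, and the reply still comes back as text.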

The takeaway

An LLM is a prediction engine for text. Everything else, whether chat, code, summaries, or reasoning, is what happens when you point that engine at the right input.

What's next

This is the first article in the AI Terms for Developers series: a connected map of the concepts you'll meet most often when building with AI.

At Mezie Labs, we teach the mechanics first, because building with AI gets much easier once the fundamentals click.

Next up: Tokens.