AI Terms for Developers: What is a token?

AI Terms for Developers: What is a token?

Chimezie Enyinnaya
Chimezie Enyinnaya June 3, 2026

A token is the unit AI models actually work in. It’s roughly a word, sometimes part of one, sometimes a punctuation mark, sometimes a single space.

You think you're sending words. You're not. You're sending tokens.

The plain English version

Computers don't read text the way you do. Before an AI model can predict anything, your input has to be broken into pieces it can process. Those pieces are called tokens. The process of splitting text into tokens is called tokenisation, and every model uses a small program called a tokeniser to do it.

Sometimes a token is a whole word. “Cat” is one token. So is “and.” But longer or less common words get split. “Tokenisation” might be three tokens. “Antidisestablishmentarianism” might be eight. Punctuation, spaces, and emojis all count too.

Think of tokens as the Lego bricks of text. You see a word; the model sees the pieces it's built from.

Different AI providers use different tokenisers. The same sentence might be 12 tokens for one model and 14 for another. The differences are small in practice, but they exist.

The thing to internalise: tokens are the unit. Not characters. Not words. Tokens.

In practice

You'll see tokens in three places when you build with AI APIs.

  • Pricing: AI APIs charge per token. Usually a fraction of a cent for input (what you send), more for output (what the model generates). A short request might cost a thousandth of a cent. A long one with a verbose response might cost meaningfully more.

  • Limits: Every model has a maximum number of tokens it can handle in a single request. This is the context window, and it covers your input plus the model's output combined.

  • Response metadata: When the API replies, it tells you exactly how many tokens were used:

const message = await client.messages.create({ ... })
                                              
console.log(message.usage)
// { input_tokens: 12, output_tokens: 48 }

You can log this, alert on it, or charge users accordingly.

A rough rule of thumb: 1 token is about 4 characters of English text, that is, roughly ¾ of a word. A 1,000-word document is around 1,300 tokens. For exact billing, read message.usage.

What people get wrong

The most common misconception is that a token is always a word. It mostly isn't. “Hello, world!” is four tokens: Hello, ,, world, !, not two. Underestimating this is how you blow through context windows faster than expected.

The other trap is forgetting that tokens count both ways. Your input and the model's output count against the same context window and the same bill. A long response from the model can cost you more than the prompt that triggered it.

The takeaway

A token is the unit AI models work in. You think in words, the model thinks in tokens, and the gap between those two things is where pricing surprises live.

What's next

This is the second article in the AI Terms for Developers series: a connected map of the concepts you'll meet most often when building with AI. Tokens are how LLMs actually work under the hood — the unit they're predicting.

Next up: Prompts.