Someone asked me this week how much AI tokens cost. And the honest answer is: it depends. Which I know is a frustrating answer, but bear with me – it’s actually quite interesting once you understand what’s going on under the hood.
SUMMARY
What you pay for AI depends on which AI you’re using, how long your question is, and how long its answer is. Output costs more than input, premium models cost more than budget ones, and the jury is still out on whether being polite costs you more or not. For personal use, don’t worry too much about tokens; for commercial or software use, be direct and concise in your prompts.
So what even is a token?
AI doesn’t read your messages the way you do. It breaks everything down into little chunks called tokens, and that’s what you’re actually paying for. As a rough guide, 1,000 tokens is apparently about 750 words. So the word “hello” is one token, but “tokenization” might get split into two. Punctuation counts. Spaces count. It all adds up.
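If you want a feel for this without a real tokenizer, a common rule of thumb for English is roughly four characters per token. Here’s a quick sketch of that heuristic – the helper name and the ratio are my assumptions, and the tokenizers providers actually use will count differently:

```python
# Rough back-of-envelope token estimate. Real tokenizers differ,
# but ~4 characters per token is a common rule of thumb for English.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

print(estimate_tokens("hello"))         # a short word is about one token
print(estimate_tokens("word " * 750))   # ~750 words lands near 1,000 tokens
```

It’s crude, but it lines up with the 1,000-tokens-per-750-words guide above.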
The reason this matters is that you’re charged for both what you send and what the AI sends back. And the response – the output – almost always costs more than your question. Generating text requires more computational work than reading it, so most providers charge around 3–5x more for output tokens than input tokens.
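To see how that asymmetry shapes a bill, here’s a toy cost calculation. The rates are made up for illustration (they’re not any provider’s real prices) – the point is just that a long answer to a short question gets billed mostly on the output side:

```python
# Hypothetical rates, for illustration only: $1 per million input tokens,
# $4 per million output tokens (output at 4x input, within the 3-5x range above).
INPUT_RATE = 1.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 4.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A short question (50 tokens) with a long answer (2,000 tokens):
# the output side is $0.008 of the $0.00805 total.
print(f"${request_cost(50, 2000):.5f}")
```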
Why does the price vary so much between AIs?
Different AI models charge very different rates. Budget-friendly options like Google’s Gemini Flash can cost as little as $0.08 per million input tokens, while premium models like GPT-5 (which arguably do more in-depth “thinking”) can run to $15 per million – and that’s before you get to the output costs on top. So the AI you choose makes a huge difference, and so does what you’re asking it to do. A simple question costs a lot less than asking it to write you a 2,000-word report, because the output is so much longer.
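Using those two input rates, the gap on a single 1,000-token prompt looks like this – a back-of-envelope sketch that ignores output costs entirely:

```python
# Input-token cost of the same 1,000-token prompt on two models,
# using the per-million input rates quoted above.
GEMINI_FLASH_RATE = 0.08 / 1_000_000  # dollars per input token
PREMIUM_RATE = 15.00 / 1_000_000      # dollars per input token

prompt_tokens = 1_000
budget_cost = prompt_tokens * GEMINI_FLASH_RATE
premium_cost = prompt_tokens * PREMIUM_RATE

print(f"budget: ${budget_cost:.6f}, premium: ${premium_cost:.6f}")
print(f"premium is {premium_cost / budget_cost:.1f}x the price")  # 187.5x
```

Fractions of a cent either way on one prompt – but multiply by thousands of requests and the model choice dominates the bill.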
Tom – doing very technical coding – is always complaining his free limit on ChatGPT runs out quickly, whilst I can bounce social media content ideas around all day and not hit my limit.
It’s also worth knowing that different AIs count tokens differently. The same sentence can produce a different number of tokens depending on which model you’re using, which makes comparison a bit tricky.
Should you be polite to AI?
I’ve heard before that being polite to AI costs more – it takes more tokens, so if you’re on a free tier, you’re costing the company giving you access more by being polite. So one day I asked ChatGPT (on the free tier) and it said I should do whatever I was comfortable with, which for me, is being polite. I can’t be abrupt! It feels too rude.
And really, OpenAI (the makers of ChatGPT) don’t care – they’re not going to give me more as a result, they’ll just stop me sooner (once I’ve exhausted my free allowance).
But does it actually cost more? This is where it gets a bit murky. I found two sources (1 and 2) that flatly contradict each other – one says polite prompts use more tokens, another says impolite ones do. They’re both citing what appears to be the same research and somehow arriving at opposite conclusions, which is either a lesson in how statistics get mangled on the Internet, or someone copied someone else’s homework and got it backwards. Either way, I’m not going to pretend I know which one is right!
What they do seem to agree on is that the difference is around 14 tokens per request. And really, at an individual level, that’s nothing – we’re talking about a few words’ worth of processing. It only becomes a talking point at the kind of scale OpenAI operates at, where even tiny inefficiencies across millions of users add up to real money.
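To put that scale argument in numbers, here’s a quick sketch. The 14 tokens per request comes from the sources above; the $1-per-million rate and the 100 million requests a day are my assumptions for illustration, not OpenAI’s figures:

```python
# How 14 extra tokens per request adds up at scale.
EXTRA_TOKENS = 14
RATE = 1.00 / 1_000_000          # assumed $1 per million input tokens
requests_per_day = 100_000_000   # hypothetical daily request volume

per_request = EXTRA_TOKENS * RATE           # a hundredth of a hundredth of a cent
daily_extra = per_request * requests_per_day

print(f"${daily_extra:,.0f} per day")  # 14 * 100M * $1/M = $1,400 per day
```

Invisible on any one request, real money by the end of the year.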
That’s why Sam Altman (the head of OpenAI) was quoted a while ago as asking people not to be polite. He apparently wanted people to condense their prompts.
An example of how to save tokens might be changing:
“Can you please help me write a short summary of this article? Thank you so much!”
to something more like:
“Summarise this article.”
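Running the same rough four-characters-per-token estimate over those two prompts suggests the polite version uses a few times as many tokens – a heuristic, not a real tokenizer count:

```python
# Rough ~4-characters-per-token estimate of the two example prompts.
polite = "Can you please help me write a short summary of this article? Thank you so much!"
direct = "Summarise this article."

for label, prompt in [("polite", polite), ("direct", direct)]:
    tokens = max(1, round(len(prompt) / 4))
    print(f"{label}: ~{tokens} tokens")
```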
The practical takeaway is this: if you’re using a free tier, being chatty might burn through your allowance fractionally faster – in whichever direction the truth falls. And if you’re a developer running AI at scale, it’s worth being concise regardless of manners. But for the rest of us just using it day to day, I say speak however feels natural. I’ll be keeping my pleases and thank yous!