Hallucinations
An LLM can produce text that sounds completely confident and fluent but is simply false. This is called a hallucination: the model generates content that isn't grounded in real facts or in the input you gave it. Hallucinations are one of the most important limitations to understand when using LLMs, because a wrong answer often looks just as polished as a right one.
💡 In one line: A hallucination is when an LLM confidently states something false or made-up — because it predicts plausible text, not verified truth.
What is a Hallucination?
It's when an LLM generates false, fabricated, or unsupported information and presents it as if it were true. This can be a wrong fact, an invented detail, a fake citation, or a claim that contradicts the source you provided.
Why Do Hallucinations Happen?
The root cause is how LLMs work: they predict the most plausible next token, not the true one. They're optimised for fluent, likely-sounding text — not for verified accuracy. An LLM has:
- No built-in fact-checking.
- No database lookup by default.
- No reliable sense of "I don't know" unless it is trained or prompted to express one.
So when its knowledge is missing or fuzzy, it fills the gap with something that sounds right.
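
The gap-filling behaviour described above can be illustrated with a toy sketch in Python. The prompt, the candidate tokens, and the probabilities are invented for illustration and are not taken from any real model; the only point is that decoding chooses by probability, with no step that checks truth.

```python
import random

# Hypothetical next-token probabilities for the prompt
# "The capital of Australia is". All numbers are made up for illustration.
NEXT_TOKEN_PROBS = {
    "Sydney": 0.55,     # plausible and frequently mentioned, but false
    "Canberra": 0.40,   # the true answer
    "Melbourne": 0.05,  # also false
}


def greedy_pick(probs):
    """Return the most probable token (greedy decoding)."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Neither function consults any source of truth: both only look at probabilities.
print("greedy:", greedy_pick(NEXT_TOKEN_PROBS))
rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
print("samples:", [sample_pick(NEXT_TOKEN_PROBS, rng) for _ in range(5)])
```

In this made-up table, greedy decoding returns "Sydney" because it is the most probable token, even though it is false. Nothing in either function can tell the difference.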
Types of Hallucinations
| Type | What it means | Example |
|---|---|---|
| Factuality | Contradicts real-world facts | A wrong date, name, or statistic |
| Faithfulness | Strays from the provided input | Adds claims not in a document it's summarising |
| Fabrication | Invents things that don't exist | A fake citation, quote, or URL |
Why Are They So Confident?
By default, LLMs don't express genuine uncertainty. Because generation is so fluent, a false statement looks just as authoritative as a true one — there's no visible "confidence flag" warning you. This is what makes hallucinations dangerous: they're convincing.
Common Examples
- Inventing a research paper or citation that doesn't exist.
- Stating a wrong date or statistic with total confidence.
- Adding details not present in a document you asked it to summarise.
- Making up a plausible-but-fake API, function, or product name.
How to Reduce Hallucinations
| Technique | How it helps |
|---|---|
| Grounding / RAG | Give the model real source documents to answer from |
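| Lower temperature | Reduces sampling randomness, so fewer unlikely inventions; does not fix errors the model already rates as likely |
| Ask for sources | Encourages it to cite and exposes weak claims, but the citations themselves can be fabricated, so check them |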
| Allow "I don't know" | Prompt it to admit uncertainty instead of guessing |
| Restrict to provided info | "Answer only from the text below" |
| Verify outputs | Always check facts, numbers, and citations |
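
Three rows of the table (grounding, allowing "I don't know", and restricting to provided info) can be combined in a single prompt. The sketch below is one possible way to do that. The function name `build_grounded_prompt` and the exact instruction wording are assumptions made for illustration, and the sample source text is invented. No model is called here, only the prompt string is built.

```python
def build_grounded_prompt(question, source_text):
    """Wrap a question in instructions that restrict the model to source_text.

    Hypothetical helper: the instruction wording is illustrative, not a
    tested or recommended template.
    """
    return (
        "Answer only from the text below.\n"
        "If the text does not contain the answer, reply exactly: I don't know.\n"
        "Quote the sentence that supports your answer.\n\n"
        f"Text:\n{source_text.strip()}\n\n"
        f"Question: {question.strip()}\n"
    )


# Invented example document and question.
source = "The library was founded in 1921. It moved to its current building in 1958."
prompt = build_grounded_prompt("When was the library founded?", source)
print(prompt)
```

A prompt like this narrows what the model is asked to do, but it does not guarantee a faithful answer, so the output still needs checking against the source.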
Newer models and fine-tuning reduce hallucinations — but don't eliminate them.
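
To see what the "Lower temperature" technique does mechanically, here is a minimal sketch of temperature-scaled softmax, the standard way sampling temperature reshapes a next-token distribution. The logits are invented for illustration and the function name is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing each by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Invented logits for three candidate tokens; the first is the model's favourite.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)
default = softmax_with_temperature(logits, 1.0)
high = softmax_with_temperature(logits, 2.0)

for name, probs in [("T=0.5", low), ("T=1.0", default), ("T=2.0", high)]:
    print(name, [round(p, 3) for p in probs])
```

Lowering the temperature concentrates probability on the token the model already prefers. If that token is wrong, a lower temperature makes the wrong answer more consistent, not more correct.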
Why It Matters
- Never trust LLM output blindly, especially for high-stakes uses (medical, legal, financial) or anywhere factual accuracy matters.
- Always verify important facts, figures, and references.
- Hallucinations are a core reason RAG and tool use exist — to ground the model in real, checkable information.
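
One small piece of the verification step can be automated when the model was asked to quote its source. The sketch below is a crude check under stated assumptions: the helper names are hypothetical, the example texts are invented, and only straight double quotes are recognised. It flags quoted passages that do not appear in the source. A passage that passes is not thereby proven true.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def find_unsupported_quotes(answer, source_text):
    """Return quoted passages in answer that do not appear verbatim in source_text."""
    quotes = re.findall(r'"([^"]+)"', answer)
    normalized_source = normalize(source_text)
    return [quote for quote in quotes if normalize(quote) not in normalized_source]


# Invented source document and model answer.
source = "The library was founded in 1921. It moved to its current building in 1958."
answer = 'The library was "founded in 1921" and "rebuilt after a fire in 1940".'

print(find_unsupported_quotes(answer, source))
# prints ['rebuilt after a fire in 1940']
```

The second quote is flagged because nothing in the source mentions a fire. Paraphrased claims, numbers, and external citations still need a human or a separate lookup to verify.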
Summary
- A hallucination is confident, fluent, but false output.
- It happens because LLMs predict plausible text, not verified truth — with no built-in fact-checking.
- Types include factual errors, unfaithfulness to the input, and fabricated citations or details.
- They're dangerous because they look authoritative with no uncertainty signal.
- Reduce them with grounding/RAG, lower temperature, asking for sources, permission to say "I don't know", and verification, and always double-check important outputs.