Definition

Hallucination

Hallucination is when an LLM produces confident output that is factually wrong, fabricated, or inconsistent with its source material.

The classic example is citing a paper that doesn't exist or calling a library function with a plausible-but-imaginary signature. In coding, hallucinations often show up as invented APIs, wrong argument orders, or references to packages that aren't installed.
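
Here is the pattern in miniature, using only the Python standard library so the failure is reproducible; the `sort` keyword is the invented part:

```python
import json

data = {"b": 1, "a": 2}
print(json.dumps(data, sort_keys=True))  # real keyword: works

# A plausible-but-imaginary signature: json.dumps has no `sort`
# keyword, so this raises TypeError the moment it runs.
print(json.dumps(data, sort=True))
```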

Why it matters

Hallucination is the primary reliability failure mode of AI tools. For agentic coding, a hallucinated import or API call usually breaks immediately when the agent runs the tests — which is actually the saving grace: tight feedback loops catch hallucinations fast. Tools like Claude Code, Codex CLI, Qwen Code, and Kimi CLI all self-correct when a test or compiler rejects the hallucinated output.

The dangerous cases are hallucinations the system doesn't catch: plausible wrong docs, invented facts, or silently incorrect logic that happens to pass the tests you wrote. Review habits matter.
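
A toy illustration of the silent case: the function below is wrong for unsorted input, but the one test it ships with happens to pass.

```python
def median(xs):
    # Plausible but subtly wrong: silently assumes xs is sorted.
    return xs[len(xs) // 2]

assert median([1, 2, 3]) == 2   # the test you wrote: passes
print(median([3, 1, 2]))        # prints 1; the true median is 2
```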

How it works

Hallucination is a natural consequence of how LLMs work: they predict the next token from training-data patterns, without a built-in mechanism to distinguish "I know this" from "this sounds like the kind of thing that would go here." The model has no ground-truth database — only weights.
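
A minimal sketch of why that matters, with made-up logits rather than real model output: the softmax turns raw scores into a confident-looking distribution whether or not the top-scoring continuation actually exists.

```python
import math

def softmax(logits):
    # Convert raw scores into a probability distribution.
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

# Hypothetical scores for the token after "np.linalg.": the model
# ranks what *sounds* right, and "fast_inverse" is not a real function.
logits = {"inv": 4.1, "solve": 3.8, "fast_inverse": 4.5}
probs = softmax(logits)
print(max(probs, key=probs.get))  # "fast_inverse", stated confidently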

Factors that increase hallucination:

  • Long context windows filled with loosely related material
  • Prompts that ask for specifics (names, dates, exact APIs) the model doesn't know
  • Low-quality or contradictory training data in the domain
  • Very small or heavily quantized models

Factors that reduce it:

  • RAG with authoritative sources
  • Tool use so the model can check facts instead of guessing
  • Strong system prompts telling the model to say "I don't know"
  • Fast verification loops (run tests, lint, type-check); see the sketch after this list
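
The last item is the one agentic coding tools lean on hardest. Below is a minimal sketch of such a loop; `generate()` is a hypothetical stand-in for the LLM call, and the rest is just pytest plus retry.

```python
import subprocess

def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call that returns code."""
    raise NotImplementedError

def verified_generate(prompt: str, path: str, max_tries: int = 3) -> bool:
    feedback = ""
    for _ in range(max_tries):
        code = generate(prompt + feedback)
        with open(path, "w") as f:
            f.write(code)
        # Run the test suite; hallucinated imports and signatures
        # surface here as ImportError/TypeError in the output.
        result = subprocess.run(
            ["python", "-m", "pytest", "-x", "-q"],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return True
        # Feed the failure back so the next attempt can correct it.
        feedback = "\n\nTests failed:\n" + result.stdout + result.stderr
    return False
```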

How it's used (managing it)

Practical mitigation in agentic coding:

  • Let the agent run tests and iterate — hallucinated APIs fail the build or the test run
  • Use read_file and grep tools aggressively so the model cites real code
  • Require citations — "quote the exact line" forces the model to verify (a minimal checker is sketched after this list)
  • In plan mode, have the model state what it will check before editing
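
To make the citation rule concrete, here is one way to enforce it, a sketch assuming the model's quoted lines arrive as plain strings; `citations_verified`, `model_quotes`, and `reject_edit` are hypothetical names, not any particular tool's API.

```python
from pathlib import Path

def citations_verified(quoted_lines: list[str], file_path: str) -> bool:
    """Return True only if every quoted line exists verbatim in the file.

    Whitespace is stripped so indentation differences don't cause
    false rejections; a quote that isn't found suggests the model
    hallucinated the code it claims to be editing.
    """
    real_lines = {line.strip() for line in Path(file_path).read_text().splitlines()}
    return all(q.strip() in real_lines for q in quoted_lines)

# Usage: gate the agent's proposed edit on its own citations.
# if not citations_verified(model_quotes, "src/app.py"):
#     reject_edit("quoted code not found; re-read the file first")
```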

See /blog/catching-llm-hallucinations-in-code.

FAQ

Do frontier models still hallucinate?

Yes, just less. Modern Claude, GPT, and Qwen models hallucinate much less on well-known topics but still confidently invent details in long-tail domains.

Is hallucination fixable?

Not completely — it's rooted in how LLMs generate. But it's manageable. Combining retrieval, tools, and test-driven feedback pushes observable hallucination rates to low single-digit percentages on many coding workloads.

Related terms