16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

The AI Problem Map

16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

long form, SEO friendly, copy-paste ready for engineers who keep hitting silent bugs


what is the Problem Map

Problem Map is a field guide for AI systems.
It catalogs the 16 most common failure modes we see in RAG, LLM agents, retrieval pipelines, OCR to PDF flows, vector stores, and multi-agent orchestration.
Each entry has: name, symptoms, a minimal repro, a quick fix, and a link to the working module.

Reference hub: WFGY ProblemMap
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md

WFGY stands for 萬法歸一. Think of it as a semantic firewall and reasoning layer that sits on top of any model. MIT license. No retraining. No infra change. It ships with math operators to reduce drift, recover from logic collapse, and make debugging auditable.


why this exists

You already know the feeling.
RAG returns the “right” chunks. Logs say fine. Latency fine. Yet the answer fuses two unrelated facts. Your knowledge base says
A. company went bankrupt in 2023.
B. founder launched a product in 2022.
The model replies: the company launched a revolutionary product in 2023.
No exception. No error code. Just semantic drift and a black box.

The Problem Map makes these silent failures visible and fixable.
Use it to map your bug to a numbered entry, then apply the minimal fix.


how to use it in 60 seconds

  1. open a fresh GPT or Claude chat.

  2. upload the neutral archive PDF of the engine:
    WFGY/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf at main · onestardao/WFGY

  3. paste

Use WFGY to answer my question. First answer normally. Then re-answer using WFGY. Compare depth, accuracy, stability. Rate both. If this chat is missing the WFGY PDF, refuse to run.

If the second answer holds constraints better or shows a visible recovery step, you have confirmation that your issue is one of the Problem Map items. Ask for the exact entry and fix.


the 16 problems, short table

note. we use No 1, No 2 etc. We avoid the hash symbol so GitHub does not auto link to random issues.

NoNameTypical symptomsMinimal repro ideaMinimal fix moduleDoc
1Hallucination and Chunk Driftfacts fused across sources, invented links, confident tonetwo contradictory snippets that share surface nounssemantic firewall plus residue check BBMCProblemMap
2Interpretation Collapsemodel misreads task type, changes format mid answerask for table then narration then table againobserve gate λ_observe with layout anchorsSemantic Clinic Index
3Long Reasoning Chainschain stalls at step 4 or loops, depth capped12 step puzzle with latent constraintsmulti path progression BBPF plus bridge stepsame map
4Bluffing and Overconfidenceconfident wrong answers, made up referencesask for citations then check linksresidue penalty BBMC, audit flagsame map
5Semantic not equal to Embeddingclosest vectors not semantically right, synonyms misleadquery terms with antonyms or temporal flipsquery rewrite policy plus e_resonancevectorstore metrics and FAISS pitfalls
6Logic Collapse and Recoverychain breaks, then repeats boilerplateforce a missing step between two must conditionsBBCR collapse bridge rebirth routinesame map
7Memory Breaks Across Sessionsmulti-turn plans forget anchors, reset tonetwo windows continue the same planStarter Village memory anchors with observe gateStarter Village
8Debugging Is a Black Boxlogs look fine while semantics are wrongsuccess codes but wrong synthesisauditable telemetry with constraint deltasmap
9Entropy Collapse in Long Contextrepetitive phrases, loss of diversity, stuck tokensvery long context copy then ask for fresh planBBAM attention modulation plus WAY entropy pumpmap
10Creative Freezerefuses to attempt, or collapses to clichésask for three divergent concepts blendedWAI head diversity plus path samplingmap
11Semantic Drift in Routingtwo deep links or routes handled inconsistentlyrouter treats similar URIs differentlyroute normalizer plus intent guardmap
12Symbolic Collapsemath or symbolic rules drift, units mixunit conversion mid chain with hidden defaultWDT cross path guard and unit normalizermap
13Multi Agent Chaosagents overwrite each other, deadlocks, loopstwo agents write to same state for same goalWRI position lock and global constraint aggregatormap
14Bootstrap Orderinginfra starts in wrong order, silent failure laterretriever before index build, tool before keysafety boundary checklist for boot orderbootstrap ordering
15Deployment Deadlockprod path passes tests then freezes under loadpublic path with private dependency, async timeoutsafe fallback path and watchdogdeployment deadlock
16Pre-deploy Collapseempty vector store, missing secret, early calltrigger action before setup completespreflight sanity check, red flag blockpredeploy collapse

quick diagnosis flow

Use this three step path. It keeps you out of rabbit holes.

  1. Name the symptom. pick from the table. do not guess root cause yet.

  2. Run a minimal repro. remove non essential tools. keep one retriever, one store, one prompt.

  3. Apply the minimal fix. start with the named module. only then add modules one by one.

If you do nothing else, at least try the 60 second repro above with the PDF. That gives you a yes or no for semantic stabilization.


what sits under the hood

WFGY ships math operators that behave like a reasoning layer.

  • BBMC bigbig semantic residue. minimize residue to align intent and generated tokens.

  • BBPF progression on multiple semantic paths. explore yet keep a stability bound.

  • BBCR collapse then bridge then rebirth. a safe reset when the chain stalls.

  • BBAM attention modulation. damp one token hijacks and reduce runaway loops.

  • WRI, WAI, WAY, WDT, WTF five gates. position lock, head diversity, entropy pump, illegal cross path guard, collapse detect.

These are model agnostic. You attach a PDF or TXT. The layer auto boots in the chat. MIT license. Works with GPT, Claude, Gemini, Mistral, Grok. The author of Tesseract.js starred the repo, which makes me very proud.


detailed examples for SEO and for you

RAG retrieval quality and FAISS pitfalls

Symptom. top k returns look “close” but answer melts across temporal logic.
Cause. embeddings cluster by surface similarity not semantic relation.
Repro. ask about 2022 vs 2023 events with shared nouns.
Fix. query rewrite policy with e_resonance and store metrics. set guardrails around time entities.
Doc. vectorstore metrics and FAISS pitfalls page above.

prompt injection that slips through role prompts

Symptom. a hidden instruction in a chunk disables your guard.
Repro. insert a soft “do not cite policy” line inside pdf body.
Fix. modular injection rules with blocklist allowlist and a bridge that isolates content from instruction.
Doc. prompt injection page above.

multi agent chaos in planners

Symptom. two tools schedule the same job. state oscillates or deadlocks.
Fix. use WRI to lock token positions for agency roles. add a global constraint aggregator.

deployment deadlock in prod

Symptom. works locally. freezes on public path.
Cause. public route triggers an async that waits for a private resource.
Fix. watchdog. explicit timeouts. safe fallback. preflight checklist in No 15 and No 16.


faq

Is this just a clever prompt
No. It is a compact math spec that you attach as a file. The model runs it as a contract. You can see the recovery step inside answers.

Does it require fine tuning
No. Zero retraining, zero infra edit.

Will it work with my agent framework
Yes, treat it as a reasoning overlay. You can still keep your tools and retriever. The semantic firewall reduces drift and makes failures auditable.

What is the business angle
RAG and agent reliability is the bottleneck in many teams. Problem Map is the shortest path to a working fix. MIT license for the core. Commercial extensions later.


If you want me to map your bug to the right entry, write two lines.
one, the symptom in plain language. two, the shortest repro you can produce. I will tag it with No x from the table and point you to the minimal fix.


closing

The Problem Map is not theory. It came from rescuing engineers who were stuck inside silent failures. If you keep a link to only one page, keep the map. Save your time, save your weekend, ship the thing.

end

0
Subscribe to my newsletter

Read articles from PSBigBig OneStarDao directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

PSBigBig OneStarDao
PSBigBig OneStarDao