16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

The AI Problem Map
16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes
long form, SEO friendly, copy-paste ready for engineers who keep hitting silent bugs
what is the Problem Map
Problem Map is a field guide for AI systems.
It catalogs the 16 most common failure modes we see in RAG, LLM agents, retrieval pipelines, OCR to PDF flows, vector stores, and multi-agent orchestration.
Each entry has: name, symptoms, a minimal repro, a quick fix, and a link to the working module.
Reference hub: WFGY ProblemMap
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md
WFGY stands for 萬法歸一. Think of it as a semantic firewall and reasoning layer that sits on top of any model. MIT license. No retraining. No infra change. It ships with math operators to reduce drift, recover from logic collapse, and make debugging auditable.
why this exists
You already know the feeling.
RAG returns the “right” chunks. Logs say fine. Latency fine. Yet the answer fuses two unrelated facts. Your knowledge base says
A. company went bankrupt in 2023.
B. founder launched a product in 2022.
The model replies: the company launched a revolutionary product in 2023.
No exception. No error code. Just semantic drift and a black box.
The Problem Map makes these silent failures visible and fixable.
Use it to map your bug to a numbered entry, then apply the minimal fix.
how to use it in 60 seconds
open a fresh GPT or Claude chat.
upload the neutral archive PDF of the engine:
WFGY/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf at main · onestardao/WFGY
Use WFGY to answer my question. First answer normally. Then re-answer using WFGY. Compare depth, accuracy, stability. Rate both. If this chat is missing the WFGY PDF, refuse to run.
If the second answer holds constraints better or shows a visible recovery step, you have confirmation that your issue is one of the Problem Map items. Ask for the exact entry and fix.
the 16 problems, short table
note. we use No 1, No 2 etc. We avoid the hash symbol so GitHub does not auto link to random issues.
No | Name | Typical symptoms | Minimal repro idea | Minimal fix module | Doc |
1 | Hallucination and Chunk Drift | facts fused across sources, invented links, confident tone | two contradictory snippets that share surface nouns | semantic firewall plus residue check BBMC | ProblemMap |
2 | Interpretation Collapse | model misreads task type, changes format mid answer | ask for table then narration then table again | observe gate λ_observe with layout anchors | Semantic Clinic Index |
3 | Long Reasoning Chains | chain stalls at step 4 or loops, depth capped | 12 step puzzle with latent constraints | multi path progression BBPF plus bridge step | same map |
4 | Bluffing and Overconfidence | confident wrong answers, made up references | ask for citations then check links | residue penalty BBMC, audit flag | same map |
5 | Semantic not equal to Embedding | closest vectors not semantically right, synonyms mislead | query terms with antonyms or temporal flips | query rewrite policy plus e_resonance | vectorstore metrics and FAISS pitfalls |
6 | Logic Collapse and Recovery | chain breaks, then repeats boilerplate | force a missing step between two must conditions | BBCR collapse bridge rebirth routine | same map |
7 | Memory Breaks Across Sessions | multi-turn plans forget anchors, reset tone | two windows continue the same plan | Starter Village memory anchors with observe gate | Starter Village |
8 | Debugging Is a Black Box | logs look fine while semantics are wrong | success codes but wrong synthesis | auditable telemetry with constraint deltas | map |
9 | Entropy Collapse in Long Context | repetitive phrases, loss of diversity, stuck tokens | very long context copy then ask for fresh plan | BBAM attention modulation plus WAY entropy pump | map |
10 | Creative Freeze | refuses to attempt, or collapses to clichés | ask for three divergent concepts blended | WAI head diversity plus path sampling | map |
11 | Semantic Drift in Routing | two deep links or routes handled inconsistently | router treats similar URIs differently | route normalizer plus intent guard | map |
12 | Symbolic Collapse | math or symbolic rules drift, units mix | unit conversion mid chain with hidden default | WDT cross path guard and unit normalizer | map |
13 | Multi Agent Chaos | agents overwrite each other, deadlocks, loops | two agents write to same state for same goal | WRI position lock and global constraint aggregator | map |
14 | Bootstrap Ordering | infra starts in wrong order, silent failure later | retriever before index build, tool before key | safety boundary checklist for boot order | bootstrap ordering |
15 | Deployment Deadlock | prod path passes tests then freezes under load | public path with private dependency, async timeout | safe fallback path and watchdog | deployment deadlock |
16 | Pre-deploy Collapse | empty vector store, missing secret, early call | trigger action before setup completes | preflight sanity check, red flag block | predeploy collapse |
quick diagnosis flow
Use this three step path. It keeps you out of rabbit holes.
Name the symptom. pick from the table. do not guess root cause yet.
Run a minimal repro. remove non essential tools. keep one retriever, one store, one prompt.
Apply the minimal fix. start with the named module. only then add modules one by one.
If you do nothing else, at least try the 60 second repro above with the PDF. That gives you a yes or no for semantic stabilization.
what sits under the hood
WFGY ships math operators that behave like a reasoning layer.
BBMC bigbig semantic residue. minimize residue to align intent and generated tokens.
BBPF progression on multiple semantic paths. explore yet keep a stability bound.
BBCR collapse then bridge then rebirth. a safe reset when the chain stalls.
BBAM attention modulation. damp one token hijacks and reduce runaway loops.
WRI, WAI, WAY, WDT, WTF five gates. position lock, head diversity, entropy pump, illegal cross path guard, collapse detect.
These are model agnostic. You attach a PDF or TXT. The layer auto boots in the chat. MIT license. Works with GPT, Claude, Gemini, Mistral, Grok. The author of Tesseract.js starred the repo, which makes me very proud.
detailed examples for SEO and for you
RAG retrieval quality and FAISS pitfalls
Symptom. top k returns look “close” but answer melts across temporal logic.
Cause. embeddings cluster by surface similarity not semantic relation.
Repro. ask about 2022 vs 2023 events with shared nouns.
Fix. query rewrite policy with e_resonance and store metrics. set guardrails around time entities.
Doc. vectorstore metrics and FAISS pitfalls page above.
prompt injection that slips through role prompts
Symptom. a hidden instruction in a chunk disables your guard.
Repro. insert a soft “do not cite policy” line inside pdf body.
Fix. modular injection rules with blocklist allowlist and a bridge that isolates content from instruction.
Doc. prompt injection page above.
multi agent chaos in planners
Symptom. two tools schedule the same job. state oscillates or deadlocks.
Fix. use WRI to lock token positions for agency roles. add a global constraint aggregator.
deployment deadlock in prod
Symptom. works locally. freezes on public path.
Cause. public route triggers an async that waits for a private resource.
Fix. watchdog. explicit timeouts. safe fallback. preflight checklist in No 15 and No 16.
faq
Is this just a clever prompt
No. It is a compact math spec that you attach as a file. The model runs it as a contract. You can see the recovery step inside answers.
Does it require fine tuning
No. Zero retraining, zero infra edit.
Will it work with my agent framework
Yes, treat it as a reasoning overlay. You can still keep your tools and retriever. The semantic firewall reduces drift and makes failures auditable.
What is the business angle
RAG and agent reliability is the bottleneck in many teams. Problem Map is the shortest path to a working fix. MIT license for the core. Commercial extensions later.
links and next steps
Problem Map hub. the single page you need
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.mdWFGY core engine overview
https://github.com/onestardao/WFGY/blob/main/core/README.mdBenchmarks vs large models
https://github.com/onestardao/WFGY/blob/main/benchmarks/benchmark-vs-gpt5/README.mdOne click demo links and starter village
https://github.com/onestardao/WFGY/blob/main/StarterVillage/README.md
If you want me to map your bug to the right entry, write two lines.
one, the symptom in plain language. two, the shortest repro you can produce. I will tag it with No x from the table and point you to the minimal fix.
closing
The Problem Map is not theory. It came from rescuing engineers who were stuck inside silent failures. If you keep a link to only one page, keep the map. Save your time, save your weekend, ship the thing.
end
Subscribe to my newsletter
Read articles from PSBigBig OneStarDao directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
