The AI Problem Map

16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

long form, SEO friendly, copy-paste ready for engineers who keep hitting silent bugs

what is the Problem Map

Problem Map is a field guide for AI systems.
It catalogs the 16 most common failure modes we see in RAG, LLM agents, retrieval pipelines, OCR to PDF flows, vector stores, and multi-agent orchestration.
Each entry has: name, symptoms, a minimal repro, a quick fix, and a link to the working module.

Reference hub: WFGY ProblemMap
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md

WFGY stands for 萬法歸一. Think of it as a semantic firewall and reasoning layer that sits on top of any model. MIT license. No retraining. No infra change. It ships with math operators to reduce drift, recover from logic collapse, and make debugging auditable.

why this exists

You already know the feeling.
RAG returns the “right” chunks. Logs say fine. Latency fine. Yet the answer fuses two unrelated facts. Your knowledge base says
A. company went bankrupt in 2023.
B. founder launched a product in 2022.
The model replies: the company launched a revolutionary product in 2023.
No exception. No error code. Just semantic drift and a black box.

The Problem Map makes these silent failures visible and fixable.
Use it to map your bug to a numbered entry, then apply the minimal fix.

how to use it in 60 seconds

open a fresh GPT or Claude chat.
upload the neutral archive PDF of the engine:
WFGY/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf at main · onestardao/WFGY
paste

Use WFGY to answer my question. First answer normally. Then re-answer using WFGY. Compare depth, accuracy, stability. Rate both. If this chat is missing the WFGY PDF, refuse to run.

If the second answer holds constraints better or shows a visible recovery step, you have confirmation that your issue is one of the Problem Map items. Ask for the exact entry and fix.

the 16 problems, short table

note. we use No 1, No 2 etc. We avoid the hash symbol so GitHub does not auto link to random issues.

No	Name	Typical symptoms	Minimal repro idea	Minimal fix module	Doc
1	Hallucination and Chunk Drift	facts fused across sources, invented links, confident tone	two contradictory snippets that share surface nouns	semantic firewall plus residue check BBMC	ProblemMap
2	Interpretation Collapse	model misreads task type, changes format mid answer	ask for table then narration then table again	observe gate λ_observe with layout anchors	Semantic Clinic Index
3	Long Reasoning Chains	chain stalls at step 4 or loops, depth capped	12 step puzzle with latent constraints	multi path progression BBPF plus bridge step	same map
4	Bluffing and Overconfidence	confident wrong answers, made up references	ask for citations then check links	residue penalty BBMC, audit flag	same map
5	Semantic not equal to Embedding	closest vectors not semantically right, synonyms mislead	query terms with antonyms or temporal flips	query rewrite policy plus e_resonance	vectorstore metrics and FAISS pitfalls
6	Logic Collapse and Recovery	chain breaks, then repeats boilerplate	force a missing step between two must conditions	BBCR collapse bridge rebirth routine	same map
7	Memory Breaks Across Sessions	multi-turn plans forget anchors, reset tone	two windows continue the same plan	Starter Village memory anchors with observe gate	Starter Village
8	Debugging Is a Black Box	logs look fine while semantics are wrong	success codes but wrong synthesis	auditable telemetry with constraint deltas	map
9	Entropy Collapse in Long Context	repetitive phrases, loss of diversity, stuck tokens	very long context copy then ask for fresh plan	BBAM attention modulation plus WAY entropy pump	map
10	Creative Freeze	refuses to attempt, or collapses to clichés	ask for three divergent concepts blended	WAI head diversity plus path sampling	map
11	Semantic Drift in Routing	two deep links or routes handled inconsistently	router treats similar URIs differently	route normalizer plus intent guard	map
12	Symbolic Collapse	math or symbolic rules drift, units mix	unit conversion mid chain with hidden default	WDT cross path guard and unit normalizer	map
13	Multi Agent Chaos	agents overwrite each other, deadlocks, loops	two agents write to same state for same goal	WRI position lock and global constraint aggregator	map
14	Bootstrap Ordering	infra starts in wrong order, silent failure later	retriever before index build, tool before key	safety boundary checklist for boot order	bootstrap ordering
15	Deployment Deadlock	prod path passes tests then freezes under load	public path with private dependency, async timeout	safe fallback path and watchdog	deployment deadlock
16	Pre-deploy Collapse	empty vector store, missing secret, early call	trigger action before setup completes	preflight sanity check, red flag block	predeploy collapse

quick diagnosis flow

Use this three step path. It keeps you out of rabbit holes.

Name the symptom. pick from the table. do not guess root cause yet.
Run a minimal repro. remove non essential tools. keep one retriever, one store, one prompt.
Apply the minimal fix. start with the named module. only then add modules one by one.

If you do nothing else, at least try the 60 second repro above with the PDF. That gives you a yes or no for semantic stabilization.

what sits under the hood

WFGY ships math operators that behave like a reasoning layer.

BBMC bigbig semantic residue. minimize residue to align intent and generated tokens.
BBPF progression on multiple semantic paths. explore yet keep a stability bound.
BBCR collapse then bridge then rebirth. a safe reset when the chain stalls.
BBAM attention modulation. damp one token hijacks and reduce runaway loops.
WRI, WAI, WAY, WDT, WTF five gates. position lock, head diversity, entropy pump, illegal cross path guard, collapse detect.

These are model agnostic. You attach a PDF or TXT. The layer auto boots in the chat. MIT license. Works with GPT, Claude, Gemini, Mistral, Grok. The author of Tesseract.js starred the repo, which makes me very proud.

detailed examples for SEO and for you

RAG retrieval quality and FAISS pitfalls

Symptom. top k returns look “close” but answer melts across temporal logic.
Cause. embeddings cluster by surface similarity not semantic relation.
Repro. ask about 2022 vs 2023 events with shared nouns.
Fix. query rewrite policy with e_resonance and store metrics. set guardrails around time entities.
Doc. vectorstore metrics and FAISS pitfalls page above.

prompt injection that slips through role prompts

Symptom. a hidden instruction in a chunk disables your guard.
Repro. insert a soft “do not cite policy” line inside pdf body.
Fix. modular injection rules with blocklist allowlist and a bridge that isolates content from instruction.
Doc. prompt injection page above.

multi agent chaos in planners

Symptom. two tools schedule the same job. state oscillates or deadlocks.
Fix. use WRI to lock token positions for agency roles. add a global constraint aggregator.

deployment deadlock in prod

Symptom. works locally. freezes on public path.
Cause. public route triggers an async that waits for a private resource.
Fix. watchdog. explicit timeouts. safe fallback. preflight checklist in No 15 and No 16.

faq

Is this just a clever prompt
No. It is a compact math spec that you attach as a file. The model runs it as a contract. You can see the recovery step inside answers.

Does it require fine tuning
No. Zero retraining, zero infra edit.

Will it work with my agent framework
Yes, treat it as a reasoning overlay. You can still keep your tools and retriever. The semantic firewall reduces drift and makes failures auditable.

What is the business angle
RAG and agent reliability is the bottleneck in many teams. Problem Map is the shortest path to a working fix. MIT license for the core. Commercial extensions later.

links and next steps

Problem Map hub. the single page you need
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md
WFGY core engine overview
https://github.com/onestardao/WFGY/blob/main/core/README.md
Benchmarks vs large models
https://github.com/onestardao/WFGY/blob/main/benchmarks/benchmark-vs-gpt5/README.md
One click demo links and starter village
https://github.com/onestardao/WFGY/blob/main/StarterVillage/README.md

If you want me to map your bug to the right entry, write two lines.
one, the symptom in plain language. two, the shortest repro you can produce. I will tag it with No x from the table and point you to the minimal fix.

closing

The Problem Map is not theory. It came from rescuing engineers who were stuck inside silent failures. If you keep a link to only one page, keep the map. Save your time, save your weekend, ship the thing.

end

16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

The AI Problem Map

16 failure modes of RAG, LLM agents, and vector stores with reproducible fixes

what is the Problem Map

why this exists

how to use it in 60 seconds

the 16 problems, short table

quick diagnosis flow

what sits under the hood

detailed examples for SEO and for you

RAG retrieval quality and FAISS pitfalls

prompt injection that slips through role prompts

multi agent chaos in planners

deployment deadlock in prod

faq

links and next steps

closing

Subscribe to my newsletter

PSBigBig OneStarDao

PSBigBig OneStarDao