{"results":[{"id":"evidence-beliefs-ablation","text":"Beliefs alone outperform beliefs + expert prompt: Opus 100% vs 94.2% (+5.8pp), Sonnet 94.2% vs 91.8% (+2.4pp). Adding expert prompt hurts — agent trusts its 'expertise' instead of consulting the knowledge base","truth_value":"IN","justification_count":0,"dependent_count":2,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"evidence-expert-vs-baseline","text":"Expert-service with EEM scores 88% A-grade vs agents-python 33% on same 50 questions, 15x faster","truth_value":"IN","justification_count":0,"dependent_count":1,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"evidence-retraction-rate","text":"13-37% of derived beliefs are retracted per review round across multiple expert KBs. Self-correction works — the system finds and removes its own errors","truth_value":"IN","justification_count":0,"dependent_count":1,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"expert-pipeline","text":"Expert pipeline: chunk source material → propose beliefs → human accepts → derive connections → review derivations → export. Value accrues at each stage, with derive producing new knowledge (connections the source doesn't make explicit)","truth_value":"IN","justification_count":1,"dependent_count":1,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"expert-prompt-paradox","text":"Telling an agent it is an expert reduces belief utilization. The humble generic prompt produces better results because the agent consults the knowledge base instead of trusting its 'expertise'","truth_value":"IN","justification_count":1,"dependent_count":1,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"how-agents-use-eem","text":"LLM agents use EEM by: querying beliefs via search/show/explain before answering, citing node IDs for auditability, running derive to generate new beliefs from existing ones, running review-beliefs to self-audit, recording nogoods when contradictions appear. The agent does not need to be told it is an expert — the knowledge base speaks for itself","truth_value":"IN","justification_count":1,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""},{"id":"scale-evidence","text":"EEM scales from small domains (237 beliefs, aap-expert) to large enterprises (12,731 beliefs, redhat-expert). 40+ expert knowledge bases built across code, product, project, and domain-specific experts","truth_value":"IN","justification_count":1,"dependent_count":0,"challenges":[],"last_reviewed":null,"review_result":null,"source_type":""}],"count":7,"limit":20,"offset":0}