RAGFlow
Also known as: RAGFlow engine, infiniflow/ragflow
RAGFlow is an open-source retrieval-augmented generation (RAG) engine maintained by InfiniFlow; it had accumulated over 70,000 GitHub stars by early 2026. The core problem it solves: most RAG setups fail on messy real-world documents. Standard chunking (splitting a document into fixed-size text blocks) loses context when it cuts across tables, diagrams, or dense structured layouts. RAGFlow instead parses documents with awareness of their structure, preserving the meaning of tables, scanned pages, slides, and complex formatted files before chunking, which improves retrieval accuracy and reduces hallucinated answers.
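To make the chunking contrast concrete, here is a minimal sketch in Python. It is not RAGFlow's parser: the function names, the block dictionaries with "type" and "text" keys, and the sample document are all hypothetical, and the sketch assumes an upstream layout parser has already split the document into typed blocks.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Naive chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def structure_aware_chunks(blocks: list[dict], max_chars: int) -> list[str]:
    """Pack whole blocks into chunks and never split a table.

    `blocks` is a hypothetical parser output: dicts with "type" and "text".
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0

    def flush() -> None:
        nonlocal current, current_len
        if current:
            chunks.append("\n".join(current))
            current, current_len = [], 0

    for block in blocks:
        text = block["text"]
        if block["type"] == "table":
            # A table becomes its own chunk, even if it exceeds max_chars.
            flush()
            chunks.append(text)
            continue
        if current and current_len + len(text) > max_chars:
            flush()
        current.append(text)
        current_len += len(text)
    flush()
    return chunks


# Hypothetical sample document: paragraph, table, paragraph.
TABLE = "Region | Q1 | Q2\nEMEA | 10 | 12\nAPAC | 8 | 9"
BLOCKS = [
    {"type": "paragraph", "text": "Quarterly revenue by region."},
    {"type": "table", "text": TABLE},
    {"type": "paragraph", "text": "APAC grew more slowly."},
]

naive = fixed_size_chunks("\n".join(b["text"] for b in BLOCKS), size=40)
aware = structure_aware_chunks(BLOCKS, max_chars=40)
```

With a 40-character window, the naive splitter cuts the table across two chunks, so no single chunk holds every row together with the header. The structure-aware version keeps the table in one retrievable unit.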
Beyond document parsing, RAGFlow ships a full visual workflow editor for building agent pipelines; hybrid retrieval that combines vector search (semantic similarity) with BM25 (keyword matching) and custom re-ranking; and built-in citation tracing, so every answer can be linked back to its source passage. It also integrates with MCP servers and external tools, making it usable as the knowledge retrieval layer inside a larger multi-agent system, not just a standalone chat-with-documents product.
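The idea behind hybrid retrieval can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RAGFlow's implementation or API: the function names are hypothetical, the embedding vectors are hand-made toy values, and the fusion rule (a weighted sum of min-max normalized scores, controlled by an assumed `alpha` parameter) is only one of several common choices.

```python
import math
from collections import Counter


def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(values: list[float]) -> list[float]:
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_terms, query_vec, docs, doc_vecs, alpha=0.5) -> list[int]:
    """Return document indices ordered by fused score, best first.

    alpha weights the semantic signal; (1 - alpha) weights BM25.
    """
    keyword = min_max(bm25_scores(query_terms, docs))
    semantic = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * s + (1 - alpha) * k for s, k in zip(semantic, keyword)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)


# Toy corpus; the 2-d "embeddings" are made up for illustration.
DOCS = [
    "invoice payment terms net 30".split(),
    "how to settle a bill on time".split(),
    "office holiday schedule".split(),
]
DOC_VECS = [[0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
QUERY_TERMS = ["payment", "terms"]
QUERY_VEC = [1.0, 0.0]

ranking = hybrid_rank(QUERY_TERMS, QUERY_VEC, DOCS, DOC_VECS)
```

In this toy corpus the second document shares no keywords with the query, so BM25 alone scores it the same as the unrelated third document. The vector signal lifts it above the third document in the fused ranking. A re-ranking stage would then reorder the top fused candidates with a stronger model.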
For builders, RAGFlow is most relevant when the use case is enterprise knowledge bases, internal document Q&A, compliance-sensitive retrieval, or any context where grounded, verifiable answers matter more than fluent generation. It is self-hostable via Docker, which matters for teams with data sovereignty requirements. In March 2026, InfiniFlow also released an official RAGFlow skill for OpenClaw, making it accessible as a plug-in knowledge layer for personal AI agent setups.