How It Works
๐
1. Load Docs
Paste any text documents
โ๏ธ
2. Chunk
Split into overlapping segments
โก
3. BM25 Search
Okapi BM25 lexical retrieval
๐ค
4. Generate
Claude Haiku with source citations
Documents
No documents yet.
Add documents and run a query to see the RAG pipeline in action.
Under the hood:
Retrieval: Okapi BM25 (kโ=1.5, b=0.75) โ the same algorithm powering Elasticsearch and Apache Lucene. Production systems often layer neural reranking (cross-encoders) on top of BM25.
Chunking: Paragraph-aware splitting with character-level fallback, ~400-char chunks. Overlap prevents context loss at boundaries.
Generation: Claude Haiku 3.5 with source-citation-enforced system prompt. Zero hallucination risk from knowledge not present in retrieved chunks.