The Most Popular Architectures in the Enterprise RAG Challenge

Here is a brief summary of what people used during Enterprise RAG Challenge round 2. It is based on an analysis of the 55 architecture descriptions that teams submitted.

🤗 Thanks to everyone who took part and filled them in! 🤗

Key Takeaways

- RAG is near-universal. Almost every approach tries to solve the “long PDF → targeted answer” problem by chunking, storing embeddings, retrieving relevant sections, then letting the model “read” only those sections.
- Structured prompts (with JSON/Pydantic) were popular to ensure consistent outputs, particularly for numeric or Boolean questions that required a definite format.
- Chain-of-thought or multi-step reasoning is common, sometimes with multiple LLM calls for expansions, validations, or final re-checks.
- Performance and cost trade-offs surfaced: several teams used “fast & cheap” LLMs for search or chunk-labelling, then a heavier model (e.g., GPT-4o) for final answers.

Most submissions combined:

- Document parsing (Docling, PyMuPDF, or similar),
- Vector or keyword-based retrieval (FAISS, Qdrant, BM25, etc.),
- Iterative LLM-based reasoning (chain-of-thought or agent-like flows),
- Structured response schemas (Pydantic or JSON).

Despite the variety of LLM families (OpenAI GPT-4o variants, Llama, Gemini, Qwen, DeepSeek, IBM Granite, Microsoft phi, etc.), the underlying RAG pipeline structure remained strikingly consistent: parse PDFs, embed or index them, fetch relevant chunks, and prompt an LLM to produce carefully formatted answers.

As for how well all these architectures actually performed in the competition, we will find out this Friday.

Yours, @llm_under_hood 🤗
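The common pipeline described above (chunk, index, retrieve, prompt for a formatted answer) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any team's actual implementation. It uses only the Python standard library: a small hand-written BM25 scorer stands in for FAISS, Qdrant, or an embedding model, and a dataclass with manual checks stands in for a Pydantic schema. The function names (`chunk_text`, `bm25_rank`, `build_prompt`, `parse_answer`) and the JSON field names `value` and `references` are hypothetical, and the LLM call itself is deliberately left out.

```python
import json
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Union


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, chunks: List[str], k1: float = 1.5, b: float = 0.75) -> List[int]:
    """Return chunk indices sorted by Okapi BM25 score, best first."""
    if not chunks:
        return []
    docs = [tokenize(c) for c in chunks]
    avgdl = sum(len(d) for d in docs) / len(docs)
    # Document frequency: in how many chunks each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((len(docs) - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)


def build_prompt(question: str, chunks: List[str], top_ids: List[int], kind: str) -> str:
    """Put only the retrieved chunks in front of the model and pin the output format."""
    context = "\n\n".join(f"[{i}] {chunks[i]}" for i in top_ids)
    return (
        f"Answer using only the context below.\n\n{context}\n\n"
        f"Question: {question}\n"
        f'Reply with JSON: {{"value": <{kind}>, "references": [<chunk ids>]}}'
    )


@dataclass
class Answer:
    value: Union[float, bool, str]
    references: List[int]


def parse_answer(raw: str, kind: str) -> Answer:
    """Validate the model's JSON reply against the expected answer kind."""
    data = json.loads(raw)
    value = data["value"]
    if kind == "boolean" and not isinstance(value, bool):
        raise ValueError("expected a boolean value")
    # bool is a subclass of int in Python, so reject it explicitly for numbers.
    if kind == "number" and (isinstance(value, bool) or not isinstance(value, (int, float))):
        raise ValueError("expected a numeric value")
    return Answer(value=value, references=[int(i) for i in data.get("references", [])])
```

In use, one would call `chunk_text` on the parsed PDF text, pass the question and chunks to `bm25_rank`, feed the top few indices to `build_prompt`, send the prompt to whichever model is chosen, and run the reply through `parse_answer`. Rejecting malformed replies at that last step is what makes a retry or a second "re-check" call straightforward to add.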
