nannib/nbmultirag

🏗️ Framework · nannib

A bilingual RAG framework for chatting with multimedia documents, including audio, video, and images processed with OCR.

nbmultirag is an open-source framework that extends text-based retrieval-augmented generation (RAG) to multimodal sources. Built with Python and Streamlit, it simplifies the pipeline for ingesting, processing, and querying heterogeneous data types. It handles multimedia inputs by applying OCR to image-based documents and using specialized extractors for audio and video content. Its modular architecture allows the retrieval and generation stages to be customized, and it integrates with various LLM backends, so developers can deploy chatbots that use context drawn from different media formats. Typical uses include a research assistant that analyzes video transcripts or a document management system that indexes scanned images. Bilingual support makes it accessible to both English- and Italian-speaking developers and end users.
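
The idea of routing heterogeneous files to type-specific extractors can be illustrated with a minimal sketch. This is not nbmultirag's actual API: the `EXTRACTORS` registry, the `register` decorator, the `ingest` function, and the extension lists are all hypothetical names chosen for illustration, and the OCR and transcription steps are replaced by placeholders that return a labeled string.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry (not part of nbmultirag): maps a lowercase file
# extension to a function that turns a file into plain text.
EXTRACTORS: Dict[str, Callable[[Path], str]] = {}


def register(*extensions: str):
    """Decorator that registers an extractor for one or more extensions."""
    def wrap(func: Callable[[Path], str]) -> Callable[[Path], str]:
        for ext in extensions:
            EXTRACTORS[ext.lower()] = func
        return func
    return wrap


@register(".txt", ".md")
def extract_text(path: Path) -> str:
    # Plain-text documents are read directly.
    return path.read_text(encoding="utf-8")


@register(".png", ".jpg")
def extract_image(path: Path) -> str:
    # Placeholder: a real pipeline would run an OCR engine here.
    return f"[OCR text for {path.name}]"


@register(".mp3", ".wav", ".mp4")
def extract_media(path: Path) -> str:
    # Placeholder: a real pipeline would transcribe the audio track here.
    return f"[transcript for {path.name}]"


def ingest(path: Path) -> str:
    """Pick an extractor by file extension and return the extracted text."""
    ext = path.suffix.lower()
    try:
        extractor = EXTRACTORS[ext]
    except KeyError:
        raise ValueError(f"no extractor registered for {ext!r}")
    return extractor(path)
```

With this layout, supporting a new media type means registering one more function, and the retrieval and generation stages only ever see plain text. For example, `ingest(Path("talk.mp4"))` is routed to the placeholder transcription function, while an unregistered extension raises a `ValueError`.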

💡Highlights

├─Multimodal RAG (Audio, Video, OCR)
├─Bilingual support (EN/IT)
└─Streamlit-based UI integration

🎯For

├─AI Developers
└─Data Engineers

🔗Links

└─GitHub Repository