quarkiverse/quarkus-docling

🏗️ Framework · quarkiverse

A Quarkus extension for document processing and layout-aware PDF parsing in Java-based RAG pipelines.

The quarkus-docling extension connects Quarkus applications to advanced document processing. Built on top of the Docling library, it handles document ingestion, layout analysis, and text extraction directly within a Quarkus application. This matters most for enterprise retrieval-augmented generation (RAG) applications, where parsing multi-column layouts, tables, and nested document structures from PDFs is a common bottleneck. Key features include automated document parsing, support for multiple file formats, and integration with existing Quarkus AI components. By hiding the details of document transformation, the extension lets developers focus on building agents and search systems rather than maintaining low-level parsing logic. It is designed for cloud-native deployment, so document processing can scale horizontally alongside other microservices in a Java-based infrastructure.

💡Highlights

├─Native Quarkus integration
├─Advanced PDF layout parsing
└─Optimized for RAG pipelines

🎯For

├─Java Developers
├─AI Engineers
└─Enterprise Architects

🔗Links

└─GitHub Repository