
data-hunters/metadata-digger
📦 Open Source Project · data-hunters
A scalable Big Data tool for automated metadata extraction, deep learning enrichment, and analysis.
Metadata-digger is a tool for high-throughput metadata extraction and enrichment. It is written in Scala and runs on Apache Spark, so extraction can be distributed and scaled to large volumes of files in big data pipelines. Its core task is parsing Exif data from images and other media. A modular architecture lets deep learning models, such as image classifiers or object detectors, run as enrichment steps inside the same processing flow. Key features include support for formats such as CSV and JSON, handling of GPS-tagged media, and integration with search engines such as Solr for indexing and analysis. By combining conventional metadata extraction with model-based enrichment, it helps organizations turn raw file dumps into structured, searchable records.
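The extract, enrich, and serialize flow described above can be pictured with a minimal sketch in plain Python. This is not Metadata-digger's API or its Spark code: the tag names, the `enrich` and `run` helpers, and the `stub_classifier` stand-in for a deep learning model are all hypothetical and exist only to illustrate the shape of the pipeline.

```python
import json

# Hypothetical, already-extracted metadata records.
# Real tag names depend on the extractor and the file format.
RECORDS = [
    {
        "path": "img/a.jpg",
        "Exif.Model": "X100",
        "GPS.Latitude": (52, 13, 48.0),  # degrees, minutes, seconds
        "GPS.LatitudeRef": "N",
    },
    {"path": "img/b.jpg", "Exif.Model": "D750"},  # no GPS tags
]


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter
    into signed decimal degrees (south and west are negative)."""
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def stub_classifier(path):
    """Placeholder for a deep learning model (assumption: a real
    enrichment step would return predicted labels for the file)."""
    return ["unlabeled"]


def enrich(record, classify):
    """Return a copy of the record with derived fields added."""
    out = dict(record)
    if "GPS.Latitude" in record:
        ref = record.get("GPS.LatitudeRef", "N")
        out["lat"] = round(dms_to_decimal(record["GPS.Latitude"], ref), 6)
    out["labels"] = classify(record["path"])
    return out


def run(records, classify=stub_classifier):
    """Enrich every record and serialize it as one JSON document per file,
    a shape that a search index could ingest."""
    return [json.dumps(enrich(r, classify), sort_keys=True) for r in records]


if __name__ == "__main__":
    for line in run(RECORDS):
        print(line)
```

In a Spark-based tool the per-record `enrich` step would typically be applied in parallel across partitions of files; the sketch keeps it as a simple loop so the data flow stays visible.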
💡 Highlights
- Scalable Spark-based processing
- Deep learning metadata enrichment
- Exif, GPS, and JSON support
🎯 For
- Data Engineers
- OSINT Researchers
- AI/ML Practitioners