
jorgemunozl/Synthetic-Data
📦 Open Source Project · jorgemunozl
An agentic synthetic data generator that orchestrates LLM workflows to produce structured flowcharts and Mermaid diagrams.
The jorgemunozl/Synthetic-Data repository is a framework for generating synthetic data, with a focus on visual workflow representation. At its core, the project uses LangGraph to manage multi-step agentic workflows, so each generation run proceeds through explicit, ordered stages. It calls OpenAI's LLM API to interpret requirements and generate context-aware content.
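To make the idea of a multi-step workflow concrete, here is a minimal sketch in plain Python. It does not use LangGraph or the OpenAI API and is not taken from the repository: `WorkflowState`, `plan_steps`, `validate_steps`, and `run_workflow` are hypothetical names, and the "planner" is a string split standing in for an LLM call. It only illustrates the pattern of passing a shared state through ordered steps.

```python
from typing import Callable, TypedDict


class WorkflowState(TypedDict, total=False):
    requirement: str   # free-text process description supplied by the caller
    steps: list[str]   # ordered step labels produced by the planning step
    valid: bool        # whether the plan can be drawn as a flowchart


def plan_steps(state: WorkflowState) -> WorkflowState:
    # Stand-in for an LLM call (assumption): split the description on " then ".
    parts = [p.strip() for p in state["requirement"].split(" then ")]
    return {**state, "steps": [p for p in parts if p]}


def validate_steps(state: WorkflowState) -> WorkflowState:
    # A flowchart needs at least two steps to contain an edge.
    return {**state, "valid": len(state.get("steps", [])) >= 2}


def run_workflow(
    requirement: str,
    nodes: list[Callable[[WorkflowState], WorkflowState]],
) -> WorkflowState:
    # Run each step in order, feeding it the state returned by the previous one.
    state: WorkflowState = {"requirement": requirement}
    for node in nodes:
        state = node(state)
    return state


result = run_workflow(
    "receive order then check stock then ship",
    [plan_steps, validate_steps],
)
print(result["steps"], result["valid"])
```

In a graph-based orchestrator each function would typically be registered as a node, with edges deciding which node runs next. The linear loop here is the simplest case of that pattern.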
Key technical features include a FastAPI-based backend that exposes endpoints for integration into existing data pipelines. The system converts abstract process descriptions into Mermaid.js syntax, so generated workflows can be visualized immediately. This is useful for teams that need to document processes, simulate business logic, or create training datasets for visual reasoning models. The asynchronous architecture is intended to keep data generation tasks responsive, which suits automated documentation and synthetic dataset creation.
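As an illustration of the conversion to Mermaid syntax, the sketch below renders a small structured flowchart as Mermaid text. It is an assumption about what such a step could look like, not the repository's code: `to_mermaid` and its parameters are hypothetical, and the input here is hand-written rather than LLM-generated.

```python
def to_mermaid(
    nodes: dict[str, str],
    edges: list[tuple[str, str]],
    direction: str = "TD",
) -> str:
    """Render node labels and directed edges as a Mermaid flowchart."""
    lines = [f"flowchart {direction}"]
    for node_id, label in nodes.items():
        # Mermaid uses entity codes such as #quot; for quotes inside labels.
        safe_label = label.replace('"', "#quot;")
        lines.append(f'    {node_id}["{safe_label}"]')
    for src, dst in edges:
        # Reject edges that point at undeclared nodes.
        if src not in nodes or dst not in nodes:
            raise ValueError(f"edge {src} --> {dst} references an unknown node")
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


diagram = to_mermaid(
    {"A": "Receive order", "B": "Check stock", "C": "Ship"},
    [("A", "B"), ("B", "C")],
)
print(diagram)
```

The returned string can be pasted into any Mermaid renderer. Validating edges before emitting them keeps malformed diagrams out of a generated dataset.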
💡 Highlights
- LangGraph-based agent orchestration
- Automated Mermaid diagram generation
- FastAPI integration for pipelines
🎯 For
- Data Engineers
- AI Developers
- Technical Writers