
kennethleungty/Data-Centric-AI-Competition
📦 Open Source Project · kennethleungty
Top 5% solution code for Andrew Ng's Data-Centric AI Competition, focusing on data quality over model architecture.
The Data-Centric AI Competition challenged participants to improve model performance by changing only the dataset, with the model architecture held fixed. This repository documents the techniques used to reach the top 5% of the leaderboard: data cleaning pipelines, systematic error analysis, and strategies for handling noisy labels. By recording the iterative process of data refinement, the project shows how much data quality matters in deep learning workflows. The code is implemented in Jupyter Notebooks, which makes it accessible to data scientists and researchers who want to see how a dataset can be improved systematically to raise model performance without increasing model complexity.
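As a rough illustration of what handling noisy labels can look like, the sketch below flags examples whose given label disagrees with a model's prediction and also receives a low predicted probability. This is not taken from the repository: the function name `flag_suspect_labels`, the `max_given_prob` threshold, and the toy probabilities are all assumptions made for the example, and flagged items would still need manual review.

```python
from typing import List, Sequence


def flag_suspect_labels(
    labels: Sequence[int],
    probs: Sequence[Sequence[float]],
    max_given_prob: float = 0.3,  # hypothetical threshold, tune per dataset
) -> List[int]:
    """Return indices of examples whose given label looks doubtful.

    An example is flagged when the model's top class differs from the
    given label AND the probability assigned to the given label is
    below max_given_prob.
    """
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")

    suspects = []
    for i, (label, row) in enumerate(zip(labels, probs)):
        # Index of the class with the highest predicted probability.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label and row[label] < max_given_prob:
            suspects.append(i)
    return suspects


if __name__ == "__main__":
    # Toy data: 4 examples, 3 classes (made up for illustration).
    labels = [0, 1, 2, 1]
    probs = [
        [0.90, 0.05, 0.05],  # agrees with label 0
        [0.80, 0.10, 0.10],  # label 1 gets only 0.10: flagged
        [0.20, 0.30, 0.50],  # agrees with label 2
        [0.45, 0.40, 0.15],  # disagrees, but label 1 still gets 0.40
    ]
    print(flag_suspect_labels(labels, probs))  # prints [1]
```

The probabilities would normally come from a model evaluated on held-out folds, so that it has not memorized the labels being checked.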
💡 Highlights
- Top 5% competition ranking
- Data-centric methodology focus
- Jupyter Notebook implementation
🎯 For
- Data Scientists
- Machine Learning Engineers