EleutherAI/pythia-70m-deduped
🧠 AI Model · EleutherAI
70M-parameter GPT-NeoX model trained on the deduplicated version of The Pile, intended for interpretability research.
Pythia-70M-deduped belongs to the Pythia suite, a family of models EleutherAI created for scientific research on language model training. It is a 70-million-parameter transformer with the GPT-NeoX architecture, trained on the deduplicated version of The Pile (EleutherAI/the_pile_deduplicated). The release is fully open: weights, training code, and data details are all public. Compared against a counterpart trained without deduplication, it is particularly useful for studying how deduplication affects learning, memorization, and generalization. Its small size makes extensive experimentation feasible on a single GPU, which keeps it accessible for academic research.
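A minimal sketch of loading the model for a quick experiment, assuming the Hugging Face `transformers` and `torch` packages are installed. The `generate` helper and its parameters are illustrative names, not part of any official API; only the model ID comes from this page.

```python
MODEL_ID = "EleutherAI/pythia-70m-deduped"


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Load the model and greedily continue `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require
    # torch/transformers until it is actually called.
    import torch
    from transformers import AutoTokenizer, GPTNeoXForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = GPTNeoXForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding keeps runs repeatable
            pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad warning
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The Pile is")` downloads the weights on first use and then runs on CPU. This is a base language model, so expect a plain text continuation rather than an answer to an instruction.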
💡 Highlights
- 70M parameters, GPT-NeoX architecture
- Trained on the deduplicated Pile dataset
- Open-source for reproducibility
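As a rough check on the "70M" figure, the parameter count of a GPT-NeoX-style model can be tallied by hand. The hyperparameters below (padded vocabulary of 50304, hidden size 512, 6 layers, 4x feed-forward width, untied input and output embeddings) are assumptions taken from the published Pythia configuration rather than from this page, and `gpt_neox_param_count` is a hypothetical helper.

```python
def gpt_neox_param_count(vocab_size: int, hidden: int, layers: int, ffn_mult: int = 4) -> int:
    """Count learnable parameters of a GPT-NeoX-style decoder (sketch)."""
    ffn = ffn_mult * hidden

    # Input and output embeddings are separate (untied) matrices.
    embeddings = 2 * vocab_size * hidden

    # Per-layer weights, each linear map with a bias vector.
    qkv = hidden * 3 * hidden + 3 * hidden      # fused query/key/value projection
    attn_out = hidden * hidden + hidden         # attention output projection
    mlp_up = hidden * ffn + ffn                 # feed-forward expansion
    mlp_down = ffn * hidden + hidden            # feed-forward contraction
    layer_norms = 2 * (2 * hidden)              # two LayerNorms, scale + shift each
    per_layer = qkv + attn_out + mlp_up + mlp_down + layer_norms

    final_norm = 2 * hidden                     # LayerNorm before the output head
    # Rotary position embeddings add no learnable parameters.
    return embeddings + layers * per_layer + final_norm


# Assumed Pythia-70M hyperparameters (not stated on this page).
total = gpt_neox_param_count(vocab_size=50304, hidden=512, layers=6)
non_embedding = total - 2 * 50304 * 512
print(f"total: {total:,}  non-embedding: {non_embedding:,}")
```

Under these assumptions the two embedding matrices account for most of the total, which is worth keeping in mind when comparing this model against larger members of the suite.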
🎯 For
- AI researchers
- Interpretability community
- ML students