
the-crypt-keeper/can-ai-code
🏗️ Framework · the-crypt-keeper
Self-evaluating coding interview for AI models using LLMs and transformers.
A self-evaluating interview system for AI coders. It runs coding tasks from benchmarks like HumanEval against various LLM backends (Llama.cpp, Transformers, LangChain) and automatically scores their performance. The project provides a standardized, repeatable way to measure the code generation abilities of different models, with results displayed in a clean interface.
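The run-and-score loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code or API: the task format, `fake_model`, `score_task`, and `run_interview` are all hypothetical names chosen for this example, and the "model" is a stand-in that returns canned source instead of calling a real LLM backend.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical task format: a prompt plus (args, expected) checks.
TASKS: List[dict] = [
    {
        "name": "add",
        "prompt": "Write a function add(a, b) that returns the sum of a and b.",
        "checks": [((1, 2), 3), ((-1, 1), 0)],
    },
]


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM backend; returns canned source code."""
    return "def add(a, b):\n    return a + b\n"


def score_task(task: dict, generate: Callable[[str], str]) -> Tuple[int, int]:
    """Generate code for one task and count how many checks it passes."""
    total = len(task["checks"])
    source = generate(task["prompt"])
    namespace: dict = {}
    try:
        # NOTE: exec() here is unsandboxed; a real harness should isolate
        # untrusted model output (for example in a container).
        exec(source, namespace)
        func = namespace[task["name"]]
    except Exception:
        # Code that fails to compile or define the function scores zero.
        return 0, total
    passed = 0
    for args, expected in task["checks"]:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # A runtime error counts as a failed check.
    return passed, total


def run_interview(
    tasks: List[dict], generate: Callable[[str], str]
) -> Dict[str, Tuple[int, int]]:
    """Score every task and return {task name: (passed, total)}."""
    return {task["name"]: score_task(task, generate) for task in tasks}


if __name__ == "__main__":
    print(run_interview(TASKS, fake_model))
```

Swapping `fake_model` for a function that calls a real backend would leave the scoring path unchanged, which is the property that makes results comparable across models.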
💡Highlights
- 600+ GitHub stars
- Supports Llama.cpp, Transformers, LangChain
- Automated HumanEval scoring
🎯For
- AI researchers
- ML engineers
- Developers evaluating LLMs