the-crypt-keeper/can-ai-code

🏗️ Framework · the-crypt-keeper

Self-evaluating coding interview for AI models using LLMs and transformers.

A self-evaluating interview system for AI coders. It runs coding tasks from benchmarks like HumanEval against several LLM backends (llama.cpp, Transformers, LangChain) and scores the generated code automatically. The project provides a standardized, repeatable way to compare the code generation abilities of different models, with results displayed in a clean interface.
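The run-then-score loop described above can be illustrated with a minimal sketch. This is not the project's actual code or API: `fake_model`, `score_solution`, and `run_interview` are hypothetical names invented for illustration, and the "model" is a stub that returns a fixed answer rather than a real LLM backend.

```python
from typing import Callable

# Each test case is (positional arguments, expected return value).
Case = tuple[tuple, object]


def score_solution(source: str, func_name: str, cases: list[Case]) -> float:
    """Execute generated source and return the fraction of cases passed."""
    namespace: dict = {}
    try:
        # Assumption: trusted input. A real harness should sandbox this step.
        exec(source, namespace)
    except Exception:
        return 0.0  # code that does not even load scores zero
    func = namespace.get(func_name)
    if not callable(func):
        return 0.0
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed case
    return passed / len(cases) if cases else 0.0


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM backend; ignores the prompt."""
    return "def add(a, b):\n    return a + b\n"


def run_interview(model: Callable[[str], str], prompt: str,
                  func_name: str, cases: list[Case]) -> float:
    """Ask the model for code, then score it against the test cases."""
    return score_solution(model(prompt), func_name, cases)


if __name__ == "__main__":
    cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    print(run_interview(fake_model, "Write add(a, b).", "add", cases))
```

Swapping `fake_model` for a function that calls a real backend would leave the scoring logic unchanged, which is the sense in which such a harness is repeatable across models.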

💡Highlights

├─600+ GitHub stars
├─Supports llama.cpp, Transformers, LangChain
└─Automated HumanEval scoring

🎯For

├─AI researchers
├─ML engineers
└─Developers evaluating LLMs

🔗Links

└─GitHub Repository