Skills tagged with "evaluation"
Build evaluation frameworks that measure agent system performance through multi-dimensional rubrics and continuous testing
Implement comprehensive evaluation strategies for LLM applications using automated metrics