All about evaluating Large language models
The emergence of LLMs has opened up an avenue for tackling problems that were earlier thought impossible. The plethora of LLM-based applications is proof of this. But one question still remains a mystery: how to effectively evaluate LLM-based appli...
explodinggradients.com · 11 min read
Shivam Sharma
PhD student
Great piece, thanks! One clarification, though: you've shown "# {'ragas_score': 0.860, 'context_relavency': 0.817, # 'factuality': 0.892, 'answer_relevancy': 0.874}" as part of your ragas.py GitHub snippet.
Does ragas indeed compute a "factuality" score as well? I could see all the others in the source code, but not this one.