Discussion on "MT-Bench: LLM-as-a-judge benchmark" | Hashnode

Vikas Bhandary

Personal blog

Nov 14, 2025

MT-Bench: LLM-as-a-judge benchmark

With the rapid growth of research on large language models (LLMs), we now have a diverse array of models capable of performing a wide range of tasks. Current LLM benchmarks focus only on closed-ended questions with short responses. ...

vikasbhandary.com.np · 3 min read

Responses

No responses yet.