Can Large Language Models be Trusted for Evaluation? Meet SCALEEVAL: An Agent-Debate-Assisted Meta-Evaluation Framework that Leverages the Capabilities of Multiple Communicative LLM Agents – MarkTechPost
Can Large Language Models be Trusted for Evaluation? Meet SCALEEVAL: An Agent-Debate-Assisted Meta-Evaluation Framework that Leverages the Capabilities of Multiple Communicative LLM Agents – MarkTechPost