Can Large Language Models be Trusted for Evaluation? Meet SCALEEVAL: An Agent-Debate-Assisted Meta-Evaluation Framework that Leverages the Capabilities of Multiple Communicative LLM Agents



Despite the utility of large language models (LLMs) across various tasks and scenarios, researchers still struggle to evaluate them properly in different situations.



