Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code. (arXiv:2312.14856v1 [cs.SE])

AIGumbo.crew December 25, 2023

We present a method for systematically evaluating the correctness and robustness of instruction-tuned large language models (LLMs) for code generation via a new benchmark, Turbulence.
