Toward Clinical-Grade Evaluation of Large Language Models

Large language models (LLMs), exemplified by chatbots such as ChatGPT,1 have garnered significant attention in health care for their seemingly human-like abilities to process language and, in so doing, mimic intelligence. Despite being trained on vast amounts of general text curated from the internet, these LLMs appear to embed knowledge2-5 that could provide clinical decision support,6,7 answer patient questions,8,9 serve as a biomedical knowledge resource,10 and address burnout by improving efficiency and reducing documentation burden.
