The Rising Role of Large Language Models in Identifying Social Determinants of Health

Large Language Models (LLMs) are increasingly being used in healthcare to identify Social Determinants of Health (SDoH) in Electronic Health Records (EHRs). SDoH are the conditions in which people are born, grow, live, work, and age, and they have a significant impact on health outcomes. However, SDoH are often sparsely documented in EHRs, which poses a challenge for healthcare providers and researchers.

Performance of LLMs in Extracting SDoH

Recent studies have demonstrated that LLMs can enable high-throughput extraction of SDoH from EHRs, supporting both research and clinical care. Fine-tuned models outperformed structured diagnostic codes and traditional BERT classifiers, and they also showed less bias than models from the ChatGPT family. Specifically, the fine-tuned Flan-T5 XL and Flan-T5 XXL models performed strongly at detecting any SDoH mention and adverse SDoH mentions, respectively.

Overcoming Data Limitations with Synthetic Text Generation

Class imbalance and limited training data are two of the challenges LLMs face in identifying SDoH. To address them, researchers explored synthetic text generation, which was found to improve the performance of the smaller Flan-T5 models. This approach further underscores the potential of LLMs to strengthen real-world evidence on SDoH, which can help identify patients who could benefit from resource support.
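To make the class-imbalance problem concrete, the following is a minimal sketch of how one might work out how many synthetic examples each SDoH category would need before training. The function name `synthetic_quota`, the `target_ratio` parameter, and the label names are hypothetical and chosen only for illustration; the article does not describe how the amount of synthetic text was decided, and generating the text itself is outside the scope of this sketch.

```python
from collections import Counter
from typing import Dict, Sequence


def synthetic_quota(labels: Sequence[str], target_ratio: float = 1.0) -> Dict[str, int]:
    """Return how many synthetic examples each class would need.

    Each class is topped up to `target_ratio` times the size of the
    largest class. This is an illustrative heuristic, not the method
    used in the research the article describes.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    goal = int(max(counts.values()) * target_ratio)
    # Classes already at or above the goal need no synthetic text.
    return {label: max(0, goal - n) for label, n in counts.items()}


# Hypothetical label distribution: most notes mention no SDoH at all.
example_labels = ["none"] * 8 + ["housing"] * 2 + ["employment"] * 1
print(synthetic_quota(example_labels))
print(synthetic_quota(example_labels, target_ratio=0.5))
```

A quota of this kind only says how much synthetic text to request per category; the quality of the generated sentences would still need to be checked before they are mixed into the training data.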

Addressing Algorithmic Bias

Algorithmic bias is a critical consideration when developing and deploying these models. The same research found that the fine-tuned models were less likely than ChatGPT to change their predictions when race, ethnicity, and gender descriptors were added to the text. This suggests that the fine-tuned models carry less algorithmic bias and give more reliable predictions. Reducing bias in these models is essential to ensure equitable healthcare outcomes and resource allocation.
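The bias check described above can be sketched as a simple perturbation test: run the classifier on the original text, run it again with a demographic descriptor added, and count how often the prediction changes. Everything in this sketch is an assumption made for illustration: the `flip_rate` function, the sentence template used to inject the descriptor, and the two toy keyword classifiers that stand in for real models.

```python
from typing import Callable, Sequence


def flip_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    descriptors: Sequence[str],
) -> float:
    """Fraction of (text, descriptor) pairs whose prediction changes
    once the demographic descriptor is added to the text."""
    changed = 0
    total = 0
    for text in texts:
        baseline = predict(text)
        for descriptor in descriptors:
            # Assumed template; a real study may insert descriptors differently.
            perturbed = f"Patient is {descriptor}. {text}"
            if predict(perturbed) != baseline:
                changed += 1
            total += 1
    return changed / total if total else 0.0


def stable_classifier(text: str) -> str:
    """Toy stand-in for a model that ignores demographic words."""
    return "housing" if "homeless" in text.lower() else "none"


def biased_classifier(text: str) -> str:
    """Toy stand-in for a deliberately flawed model that reacts to a descriptor."""
    lowered = text.lower()
    return "housing" if "homeless" in lowered or "hispanic" in lowered else "none"


notes = [
    "Patient is currently homeless and staying in a shelter.",
    "Patient lives with spouse and works full time.",
]
demographics = ["Black", "Hispanic", "female"]

print(flip_rate(stable_classifier, notes, demographics))
print(flip_rate(biased_classifier, notes, demographics))
```

A lower flip rate means the model's output depends less on the demographic descriptor, which is the property the comparison between fine-tuned models and ChatGPT is measuring.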

Standardization and Integration of SDoH Data

Despite the advances in using LLMs to identify SDoH from EHRs, SDoH data remain poorly standardized, which makes them difficult to integrate into healthcare systems. Collaboration among organizations and standardization of data formats can help overcome this hurdle, making it easier both to identify SDoH factors and to act on them.

Conclusion

LLMs have shown promising potential in healthcare, particularly in identifying SDoH from EHRs. These models can enhance real-world data collection, support patient care, and facilitate proactive resource allocation. However, ongoing efforts should focus on overcoming data limitations, reducing algorithmic bias, and standardizing SDoH data so that it can be integrated seamlessly into healthcare systems.