Uncategorized

The Breakthrough of Large Language Models Release for Medical Applications: 1-Year Timeline and Perspectives



  • Ouyang L, Wu J, Jiang X, Almeida, Wainwright C, Mishkin P, Zhang C, Agarwal S, Slama K. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems. 2022; 35:730–744.


    Google Scholar
     

  • Kalyan KS, Rajasekharan A, Sangeetha S. Ammu: a survey of transformer-based biomedical pretrained language models. Journal of biomedical informatics. 2022;126:103982.

    Article 
    PubMed 

    Google Scholar
     

  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention Is All You Need. 2017. arXiv:1706.03762.

  • Open AI. ChatGPT release note. Available at: https://help.openai.com/en/articles/6825453-chatgpt-release-notes#h_4799933861 Last Accessed: December 22, 2023.

  • Tian S, Jin Q, Yeganova L, Lai P-T, Zhu Q, Chen X, Yang X, Chen, Kim W, Comeau DC, Islamaj R, Kapoor A, Gao X, Lu Z. Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health- arXiv:2306.10070. (2023).

  • Radford A, Narasimhan K. Improving Language Understanding by Generative Pre-Training. 2018. https://api.semanticscholar.org/CorpusID:49313245.

  • Cao Z, Wong K, Lin CT. Weak Human Preference Supervision for Deep Reinforcement Learning. IEEE Trans Neural Netw Learn Syst. 2021;32(12):5369–5378. doi: https://doi.org/10.1109/TNNLS.2021.3084198.

    Article 
    PubMed 

    Google Scholar
     

  • Rafailov R, Sharma A, Mitchell E, Ermon S, Manning CD, Finn C. Direct Preference Optimization: Your Language Model is Secretly a Reward Model. arXiv:2305.18290 (2023).

  • Gao L, Biderman S, Black S, Golding L, Hoppe T, Foster C, Phang J, He H, Thite A, Nabeshima N, Presser S, Leahy C. The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv:2101.00027 (2020).

  • Meta AI Request Form. Available at: https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform Last Accessed: December 22, 2023.

  • Li Y, Li Z, Zhang K, Dan R, Jiang S, Zhang Y. ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. Cureus. 2023;15(6):e40895. doi: https://doi.org/10.7759/cureus.40895.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Microsoft Bing Blog. Available at: https://blogs.bing.com/search/november-2023/our-vision-to-bring-microsoft-copilot-to-everyone-and-more. Last Accessed: December 24, 2023.

  • ZDNET Information. Available at: https://www.zdnet.com/article/what-is-copilot-formerly-bing-chat-heres-everything-you-need-to-know/. Last Accessed: December 24, 2023.

  • Avanade Insight. Available at: https://www.avanade.com/en/blogs/avanade-insights/health-care/ai-copilot. Last Accessed: December 24, 2023.

  • OpenAI. GPT-4 Technical Report. arXiv:2303.08774 (2023).

  • The decoder. Available at: https://the-decoder.com/gpt-4-architecture-datasets-costs-and-more-leaked/ Last Accessed: December 22, 2023

  • Lee P, Bubeck S, Petro J. Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. N Engl J Med. 2023;388(13):1233–1239. doi: https://doi.org/10.1056/NEJMsr2214184.

    Article 
    PubMed 

    Google Scholar
     

  • Bhayana R, Bleakney RR, Krishna S. GPT-4 in Radiology: Improvements in Advanced Reasoning. Radiology. 2023;307(5):e230987. doi: https://doi.org/10.1148/radiol.230987.

    Article 
    PubMed 

    Google Scholar
     

  • Jang D, Yun TR, Lee CY, Kwon YK, Kim CE. GPT-4 can pass the Korean National Licensing Examination for Korean Medicine Doctors. PLOS Digit Health. 2023;2(12):e0000416. doi: https://doi.org/10.1371/journal.pdig.0000416.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Guerra GA, Hofmann H, Sobhani S, Hofmann G, Gomez D, Soroudi D, Hopkins BS, Dallas J, Pangal DJ, Cheok S, Nguyen VN, Mack WJ, Zada G. GPT-4 Artificial Intelligence Model Outperforms ChatGPT, Medical Students, and Neurosurgery Residents on Neurosurgery Written Board-Like Questions. World Neurosurg. 2023;179:e160-e165. doi: https://doi.org/10.1016/j.wneu.2023.08.042.

    Article 
    PubMed 

    Google Scholar
     

  • Scheschenja M, Viniol S, Bastian MB, Wessendorf J, König AM, Mahnken AH. Feasibility of GPT-3 and GPT-4 for in-Depth Patient Education Prior to Interventional Radiological Procedures: A Comparative Analysis. Cardiovasc Intervent Radiol. 2023 Oct 23. doi: https://doi.org/10.1007/s00270-023-03563-2.

  • Spies NC, Hubler Z, Roper SM, Omosule CL, Senter-Zapata M, Roemmich BL, Brown HM, Gimple R, Farnsworth CW. GPT-4 Underperforms Experts in Detecting IV Fluid Contamination. J Appl Lab Med. 2023;8(6):1092–1100. doi: https://doi.org/10.1093/jalm/jfad058.

    Article 
    PubMed 

    Google Scholar
     

  • Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature. 2023;620(7972):172–180. doi: https://doi.org/10.1038/s41586-023-06291-2.

    Article 
    ADS 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Madaan A, Tandon N, Gupta P, Hallinan S, Gao L, Wiegreffe S, Alon U, Dziri N, Prabhumoye S, Yang Y, et al. Self-refine: Iterative refinement with self-feedback. arXiv preprint arXiv:2303.17651 (2023).

  • Singhal K, Tu T, Gottweis J, Sayres R, Wulczyn E, Hou L, Clark K, Pfohl S, Cole-Lewis H, Neal D, Schaekermann M, Wang A, Amin M, Lachgar S, Mansfield P, Prakash S, Green B, Dominowska E, Aguera y Arcas B, Tomasev N, Liu Y, Wong R, Semturs C, Mahdavi S. Towards Expert-Level Medical Question Answering with Large Language Models. arXiv:2305.09617v1 (2023).

  • Tu T, Azizi S, Driess D, Schaekermann M, Amin M, et al. Towards Generalist Biomedical AI. arXiv:2307.14334v1 (2023).

  • Hippocratic AI. Available at https://www.hippocraticai.com/. Last Accessed: December 24, 2023.

  • Hugging Face. MPT-B. Available at: https://huggingface.co/mosaicml/mpt-7b. Last Accessed: December 24, 2023.

  • Kauf C, Ivanova AA, Rambelli G, Chersoni E, She JS, Chowdhury Z, Fedorenko E, Lenci A. Event Knowledge in Large Language Models: The Gap Between the Impossible and the Unlikely. Cogn Sci. 2023;47(11):e13386. doi: https://doi.org/10.1111/cogs.13386.

    Article 
    PubMed 

    Google Scholar
     

  • Touvron H, Martin L, et al. LLaMA-2: Open Foundation and Fine-Tuned Chat Models. arXiv:2307.09288 (2023).

  • Ainslie J, Lee-Thorp J, de Jong M, Zemlyanskiy Y, Lebron, Sanghai S. 2023. GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4895–4901, Singapore. Association for Computational Linguistics.

  • Jiang AQ, Sablayrolles A, Mensch A, Bamford C, Singh Chaplot D, de las Casas D, Bressand F, Lengyel G, Lample G, Saulnier L, Lavaud LR, Lachaux MA, Stock P, Le Scao T, Lavril T, Wang T, Lacroix T, El Sayed W. Mistral-7B arXiv:2310.06825.

  • An END-to-END guide on how to finetune a LLM(Mistral-7B) into a Medical Chat Doctor using Huggingface. Available at: https://medium.com/@SachinKhandewal/finetuning-mistral-7b-into-a-medical-chat-doctor-using-huggingface-qlora-peft-5ce15d45f581 Last Accessed: December 22, 2023.

  • Mistral AI. Available at: https://mistral.ai/news/mixtral-of-experts/ Last Accessed: December 24, 2023.

  • Nijkamp E, Xie T, Hayashi H, Pang B, Xia C, Xing C, Vig J, Yavuz S, Laban P, Krause B, Purushwalkam S, Niu T, Kryściński W, Murakhovs’ka L, Choubey PK, Fabbri A, Liu Y, Meng R, Tu L, Bhat M, Wu C-S, Savarese S, Zhou Y, Joty S, Xiong C. XGen-7B Technical Report. arXiv:2309.03450.

  • Peng C, Yang X, Chen A, Smith KE, PourNejatian N, Costa AB, Martin C, Flores MG, Zhang Y, Magoc T, Lipori G, Mitchell DA, Ospina NS, Ahmed MM, Hogan WR, Shenkman EA, Guo Y, Bian J, Wu Y. A study of generative large language model for medical research and healthcare. NPJ Digit Med. 2023;6(1):210. doi: https://doi.org/10.1038/s41746-023-00958-w.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Cunningham H, Ewart A, Riggs L, Huben R, Sharkey R. Sparse Autoencoders Find Highly Interpretable Features in Language Models. arXiv:2309.08600 (2023).

  • Anthropic. Available at: https://www.anthropic.com/ Last Accessed: December 22, 2023.

  • Gemini Team, Google. Gemini: A Family of Highly Capable Multimodal Models. Available at: https://assets.bwbx.io/documents/users/iqjWHBFdfxIU/r7G7RrtT6rnM/v0 Last Accessed: January 31, 2023.

  • Yasunaga M, Leskovec J, Liang P. LinkBERT: Pretraining Language Models with Document Links. arXiv:2203.15827 (2022).

  • Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT – Reshaping medical education and clinical management. Pak J Med Sci. 2023;39(2):605–607. doi: https://doi.org/10.12669/pjms.39.2.7653.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Sallam M. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns. Healthcare (Basel). 2023;11(6):887. doi: https://doi.org/10.3390/healthcare11060887.

    Article 
    PubMed 

    Google Scholar
     

  • Eysenbach G. The Role of ChatGPT, Generative Language Models, and Artificial Intelligence in Medical Education: A Conversation With ChatGPT and a Call for Papers. JMIR Med Educ. 2023;9:e46885. doi: https://doi.org/10.2196/46885.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Cascella M, Cascella A, Monaco F, Shariff MN. Envisioning gamification in anesthesia, pain management, and critical care: basic principles, integration of artificial intelligence, and simulation strategies. J Anesth Analg Crit Care. 2023;3(1):33. doi: https://doi.org/10.1186/s44158-023-00118-2.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Haque A, Chowdhury N-U-R. The Future of Medicine: Large Language Models Redefining Healthcare Dynamics. TechRxiv. November 22, 2023. doi: https://doi.org/10.36227/techrxiv.24354451.v2.

  • Gurrapu S, Kulkarni A, Huang L, Lourentzou I, Batarseh FA. Rationalization for explainable NLP: a survey. Front Artif Intell. 2023;6:1225093. doi: https://doi.org/10.3389/frai.2023.1225093.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios. J Med Syst. 2023;47(1):33. doi: https://doi.org/10.1007/s10916-023-01925-4.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Birkun AA, Gautam A. Large Language Model (LLM)-Powered Chatbots Fail to Generate Guideline-Consistent Content on Resuscitation and May Provide Potentially Harmful Advice. Prehosp Disaster Med. 2023;38(6):757–763. doi: https://doi.org/10.1017/S1049023X23006568.

    Article 
    PubMed 

    Google Scholar
     

  • Zúñiga Salazar G, Zúñiga D, Vindel CL, Yoong AM, Hincapie S, Zúñiga AB, Zúñiga P, Salazar E, Zúñiga B. Efficacy of AI Chats to Determine an Emergency: A Comparison Between OpenAI’s ChatGPT, Google Bard, and Microsoft Bing AI Chat. Cureus. 2023;15(9):e45473. doi: https://doi.org/10.7759/cureus.45473.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • MIT Technology Review. Why Meta’s latest large language model survived only three days online. https://www.technologyreview.com/2022/11/18/1063487/meta-large-language-model-ai-only-survived-three-days-gpt-3-science/ Last Accessed: December 22, 2023.

  • Batarseh FA, Freeman L, Huang C-H. A survey on artificial intelligence assurance. J Big Data 2021;8,7. doi:https://doi.org/10.1186/s40537-021-00445-7.

  • Manathunga S, Hettigoda I. Aligning Large Language Models for Clinical Tasks. arXiv:2309.02884 (2023).

  • Benary M, Wang XD, Schmidt M, Soll D, Hilfenhaus G, Nassir M, Sigler C, Knödler M, Keller U, Beule D, Keilholz U, Leser U, Rieke DT. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw Open. 2023;6(11):e2343689. doi: https://doi.org/10.1001/jamanetworkopen.2023.43689.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1:206–215. doi: https://doi.org/10.1038/s42256-019-0048-x.

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Madsen A, Reddy S, Chandar S. Post-hoc Interpretability for Neural NLP: A Survey. ACM Computing Surveys. 2022;55(8):1–42. doi: https://doi.org/10.1145/3546577.

    Article 

    Google Scholar
     

  • Tran D, Liu J, Dusenberry MW, Phan D, Collier M, Ren J, Han K, Wang Z, Mariet Z, Hu H, Band N, Rudner TJG, Singhal K, Nado Z, van Amersfoort J, Kirsch A, Jenatton R, Thain N, Yuan H, Buchanan K, Murphy K, Sculley D, Gal Y. Plex: towards reliability using pretrained large model extensions. Preprint at https://doi.org/10.48550/arXiv.2207.07411 (2022).

  • Brown T, et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020;33:1877–1901.


    Google Scholar
     

  • Lester B, Al-Rfou R, Constant N. The power of scale for parameter-efficient prompt tuning. Preprint at: https://doi.org/10.48550/arXiv.2104.08691 (2021).

    Article 

    Google Scholar
     

  • Liang P. et al. Holistic evaluation of language models. Preprint at: https://doi.org/10.48550/arXiv.2211.09110 (2022).

    Article 

    Google Scholar
     

  • Hippocratic AI. Available at https://www.hippocraticai.com/. Last Accessed: December 24, 2023



  • Source link

    Leave a Reply

    Your email address will not be published. Required fields are marked *