ChatGPT or DeepSeek – Are They Useful for Diagnosing Asian Patients?

Published on 26 August 2025

Large language models (LLMs) such as ChatGPT (e.g., GPT‑4) and emerging tools like DeepSeek can support symptom interpretation and triage. However, their effectiveness for Asian patients may be limited because their training data is drawn predominantly from Western sources and clinical criteria. Variations in disease presentation and prevalence among Asian populations pose additional challenges.

Racial and Ethnic Biases in LLMs

A 2024 study published in Nature Communications found that both GPT‑3.5‑turbo and GPT‑4 exhibited biases in medical report generation, attributing certain diseases and treatment recommendations differently based on race, and assigning longer predicted hospitalisation and higher costs to White patients.

Use in Clinical Care

A 2024 Lancet Digital Health report found that GPT‑4 does not always give accurate information for different demographic or racial groups, and that its racial and gender biases may worsen health inequities.

This highlights the importance of finding appropriate uses for ChatGPT in localised Asian settings.

A useful example is the use of AI tools to write medical discharge instructions. An evaluation across different racial groups found no sentiment or stylistic differences when race/ethnicity was varied, suggesting that GPT‑4 maintains consistency in communication. These findings suggest that similar tools could be used to prepare medical discharge summaries for patients in Singapore.

Use in Health Care Settings

A Harvard‑led evaluation of GPT‑4 and another LLM (Gemini) using real-world pain cases found no differences in opioid prescribing based on patient race/ethnicity or sex across 480 cases.

However, other studies highlight continued bias: LLMs sometimes perpetuate stereotyped assumptions (e.g., on lung function, skin thickness) for Black patients, and biases based on dialect and language use have also been documented.

Disease Patterns Specific to Asian Populations

Asian patients may be at higher risk of certain conditions, such as nasopharyngeal carcinoma, may develop type 2 diabetes at a lower BMI, and may metabolise drugs differently because of pharmacogenomic differences. Such differences are often underrepresented in the datasets used to train LLMs, increasing the risk of misdiagnosis or incorrect risk assessment.

For example:

Type 2 diabetes tends to occur at lower BMI levels in East Asians, including those with normal body weight.
Nasopharyngeal carcinoma is more prevalent among Southern Chinese and Southeast Asians, yet rare in Europeans.
CYP2C19 polymorphisms affect drug metabolism in Southeast Asians, altering response to medications like clopidogrel and proton pump inhibitors.

Conclusion

Large language models (LLMs) like ChatGPT and DeepSeek can support clinical workflows by summarising symptoms and suggesting possible differential diagnoses for Asian patients.

Although their performance in controlled scenarios appears consistent across ethnic groups, inherent racial or disease-based biases remain a concern. Many Asian‑specific disease profiles may also be under-represented in AI tools, which may reduce diagnostic accuracy.

LLM-based AI tools can assist, but cannot replace, professional clinical judgement in the care of Asian patients. More research that incorporates diverse, ethnicity-specific data into AI tools should improve their usefulness as a source of healthcare information for Asian populations.

This article was produced solely for the purpose of healthcare and medical knowledge. Not all innovations are available or approved for clinical use. AsiaMD may receive financial or non-financial sponsorship from the companies or institutions involved in these innovations. However, AsiaMD does not endorse any specific products or services mentioned in the article. Use of the AsiaMD.com website is also subject to its Terms and Conditions. Please consult your healthcare professional if you need more information.