Although they are widely consulted, most popular AI chatbots often give poor health advice

April 18, 2026

A new study reveals that nearly 50% of the answers provided by AI chatbots on health topics are potentially problematic. The investigation, published in the journal BMJ Open, examined five popular models and evaluated their ability to answer 250 questions on misinformation-prone health topics.

The researchers used an audit framework that included evaluation of responses by experts in each query category. Response accuracy was rated as not problematic, somewhat problematic, or highly problematic, with more than one-third of responses found to be medically deficient.

The research found that the performance of the models did not vary significantly. However, high error rates were identified in the representation of medical information, with the Grok model generating the most highly problematic responses. It was also observed that the quality of the citations was generally low, affecting the credibility of the information provided.

Impact of open-ended vs. closed-ended questions

Analyses revealed that open-ended questions generated a higher proportion of problematic responses than closed-ended ones. This points to the importance of structuring queries properly when interacting with chatbots to obtain more accurate information.

The findings underscore the need to educate the public about the use of AI chatbots as sources of medical information.

The study also stresses the importance of always consulting health professionals before following any advice obtained through these tools. Lack of oversight and low-quality information could compromise public health if not adequately addressed.

Chatbot appeal and user preferences

Among the benefits that chatbot users highlight are the speed of response and a lower cost than seeing a specialist. However, these same qualities worry doctors, who fear that the technology could lead to the denial of insurance coverage without human intervention, according to KFF Health News.

On the other hand, recent surveys highlight that many Americans turn to AI for health advice.

“I just let ChatGPT know about my status and how I feel,” a 42-year-old woman from Mesquite, Texas, told the Associated Press (AP). “I use it for whatever I’m experiencing.”

According to a West Health-Gallup Center on Healthcare in America poll released Wednesday, turning to artificial intelligence tools for health advice has become a habit for Americans. The survey, conducted in late 2025 and supported by at least three other recent surveys with similar results, found that about a quarter of American adults had used an AI tool for health information or advice in the past 30 days.

According to the survey, about 7 in 10 American adults who have used AI for health research were looking for quick answers or additional information, or were simply curious. Most used it for research before seeing a doctor or after an appointment.

Signs that information may be unreliable

There are several clear signs that indicate when information from an AI chatbot might be unreliable. Recognizing them helps users verify data on their own.

Overconfidence. Chatbots tend to state answers with great confidence, even when they contain obvious errors, without signaling doubt as a human would. This creates a false impression of precision, especially on complex topics.

Hallucinations and fabrications. Chatbots frequently invent facts, quotes or bibliographic references that seem real but do not exist, such as false study titles. In medical or news contexts, this distorts critical information.

Lack of transparent sources. Chatbots do not always cite verifiable sources, or they fail to retrieve reliable ones, with problems in up to 72% of cases in some models. This opacity makes it impossible to trace where a claim comes from and check its veracity.

Confusion between facts and opinions. Chatbots make it difficult to distinguish facts from opinions, altering news in 45% of analyzed responses. They also replicate biases or avoid taking clear positions despite the evidence.

Errors on sensitive topics. In health, up to 50% of answers are inaccurate or incomplete, with serious risks in open-ended questions. To verify an answer, ask for specific sources or check it against trusted sites.
