Fresh juice


Google DeepMind's Med-Gemini AI outperforms GPT-4 on medical benchmarks

Researchers at Google DeepMind have unveiled Med-Gemini, a new AI model designed specifically for medical applications. According to the researchers, the model has shown remarkable proficiency in diagnosing patients from dialogue and in assisting doctors with the interpretation of various medical tests.



The findings, published on the arXiv preprint server, detail how Med-Gemini surpassed OpenAI's highly regarded GPT-4 model family across a series of rigorous tests. The result has drawn wide attention in the AI community and raised hopes for a significant shift in how AI is used in the medical sector.

At the core of Med-Gemini lies Gemini, Google DeepMind's most advanced AI model to date. Available in multiple versions, Gemini can process and interpret data in various formats, including text, images, video, and audio. Recognizing the potential of AI in medicine, Google's researchers adapted and enhanced Gemini to create Med-Gemini, a specialized version tailored for medical applications.

Training Med-Gemini was no small feat. Google used MedQA, a multilingual dataset of tens of thousands of multiple-choice questions drawn from medical licensing exams, including the United States Medical Licensing Examination (USMLE), a comprehensive test designed to evaluate the clinical skills of doctors in training. Spanning three languages (English, Simplified Chinese, and Traditional Chinese), MedQA provided a robust foundation for Med-Gemini's medical knowledge.

The researchers did not stop there. To expand the model's capabilities further, they developed two extensions to the MedQA dataset. The first, MedQA-R (R for "reasoning"), adds generated explanations of the reasoning behind each answer, known as "chains of thought." The second, MedQA-RS (RS for "reasoning and search"), lets the model run web searches and use the retrieved information to improve the accuracy of its responses.
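To make the difference between the plain, "reasoning," and "reasoning and search" variants concrete, here is a minimal Python sketch of how one such training example might be laid out. The class name `TrainingExample`, its field names, and the text layout produced by `format_example` are illustrative assumptions; the article does not describe the actual format used by the researchers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrainingExample:
    """Hypothetical layout of one multiple-choice example (not the real schema)."""
    question: str
    options: Dict[str, str]          # option label -> option text
    answer: str                      # label of the correct option
    rationale: str = ""              # present in a MedQA-R style example
    search_results: List[str] = field(default_factory=list)  # MedQA-RS style


def format_example(ex: TrainingExample) -> str:
    """Render an example as text, including only the parts that are present."""
    lines = [f"Question: {ex.question}"]
    for label in sorted(ex.options):
        lines.append(f"({label}) {ex.options[label]}")
    if ex.search_results:
        lines.append("Search results:")
        lines.extend(f"- {snippet}" for snippet in ex.search_results)
    if ex.rationale:
        lines.append(f"Reasoning: {ex.rationale}")
    lines.append(f"Answer: {ex.answer}")
    return "\n".join(lines)


# Toy example with made-up retrieved text, for illustration only.
example = TrainingExample(
    question="Deficiency of which vitamin causes scurvy?",
    options={"A": "Vitamin C", "B": "Vitamin D"},
    answer="A",
    rationale="Scurvy results from a lack of vitamin C.",
    search_results=["(placeholder snippet returned by a web search)"],
)
print(format_example(example))
```

Leaving `rationale` and `search_results` empty gives a plain MedQA-style item, filling in only `rationale` gives a "reasoning" item, and filling in both gives a "reasoning and search" item.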


Med-Gemini's results on the MedQA tests were remarkable. It achieved 91.1% accuracy, surpassing OpenAI's GPT-4 model family. The score points to strong medical knowledge and reasoning, both crucial attributes in the medical field.
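For readers unfamiliar with how such a figure is obtained: accuracy on a multiple-choice benchmark is simply the share of questions answered correctly. The Python sketch below shows the arithmetic on toy data. The function name `accuracy` and the sample answers are assumptions made for illustration; the article does not describe the researchers' actual evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the fraction of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Toy data: four questions, three answered correctly.
model_choices = ["A", "C", "B", "D"]
correct_choices = ["A", "C", "B", "A"]
print(f"{accuracy(model_choices, correct_choices):.1%}")  # prints 75.0%
```

By the same calculation, a score of 91.1% means roughly 911 correct answers for every 1,000 questions.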

To further assess how well Med-Gemini handles large volumes of intricate medical information, the researchers used a large public database of anonymized medical records. From this source they prepared approximately 200 samples, each containing between 200,000 and 700,000 words. Med-Gemini's task was to find and extract mentions of specific medical conditions and symptoms, evaluate their relevance, classify them, and determine whether each mention actually reflected the patient's own history of that condition or symptom. According to the researchers, the experiment yielded promising results.

The real test, however, came in practical trials that exercised Med-Gemini's multimodal capabilities in medical scenarios. In one instance, a patient asked the model for advice about an itchy skin growth. Med-Gemini requested an image of the growth to allow an accurate assessment. After receiving the image, the model asked follow-up questions to narrow down the diagnosis, ultimately identifying a rare lesion correctly and recommending further action.

In another trial, a doctor used Med-Gemini to interpret a chest X-ray before the radiologist's official report was available. The model not only offered an interpretation of the image but also produced a simplified version of its conclusion in language the patient could understand.

While these findings are undoubtedly encouraging, the researchers acknowledge that further research is necessary to enhance Med-Gemini's clinical conversation capabilities. The ultimate goal is to ensure the model's reliability and effectiveness across all scenarios of medical use, paving the way for a future where AI plays a pivotal role in improving healthcare outcomes.


As the medical community eagerly awaits further developments, one thing is clear: Google DeepMind's Med-Gemini AI has set a new benchmark for the integration of artificial intelligence in the realm of medicine, and its impact is poised to reverberate across the healthcare landscape for years to come.
