Generative Artificial Intelligence, or genAI, has taken the world by storm. Its potential applications across fields, including scientific communications, cannot be denied. The idea of harnessing algorithms to quickly generate beautiful visual representations is immensely appealing. But how useful is it really for medical and scientific illustration and animation? There are legitimate concerns about inaccurate AI-generated images making it past peer review and into scientific journals. Here is a recent high-profile example. Is it possible to make accurate, useful scientific images using genAI?
We’re a team that specializes in the visual communication of science and medicine, so we’re naturally curious about the possibilities and limitations of genAI in our line of work. In what follows, we share our explorations in AI image generation. We do not, however, explore the legal and ethical issues surrounding generative AI imagery in this post. That is a big topic, and one we’re discussing and monitoring as well.
GenAI for Medical Illustration?
The two frontrunners in AI image generation are currently Midjourney and DALL-E, so we decided to test them out. Excited by the potential of genAI, we started off using several prompt variations to try to get them to generate an anatomical knee:
generate a lateral view of anatomical human knee, showing meniscus, patella, and bursa
knee joint anatomy pictures generated by DALL-E 3 and Midjourney 6; the images are anatomically inaccurate
The results were simultaneously impressive and disappointing. Although each output is recognizable as a knee, neither DALL-E nor Midjourney achieved the accuracy necessary for medical illustration. On the left, DALL-E added a bunch of nonsensical labels to a knee that is far from accurate. Midjourney, on the right, did a better job, but its convincing-looking pictures don’t hold up under close inspection, and one of them is a fleshy monstrosity. In the top-right Midjourney output, the patella is fused into the femur and the meniscus is represented by a metallic-looking disc. The lateral tibial condyle appears to be cracking off, and there is a strange growth coming out of the back of the knee joint.
Real ‘looking’ is not the same as accurate
What’s concerning is that these errors are not immediately obvious to the untrained eye. Midjourney’s realistic-looking depictions can come across as perfectly accurate to a layperson, especially given that photorealism is so often conflated with scientific accuracy. It takes someone with anatomical training to spot the inaccuracies.
We then asked the genAI to produce an image of a human heart:
generate an anatomically accurate human heart
heart anatomy pictures generated by DALL-E 3 and Midjourney 6; the images are anatomically inaccurate
Again, at first glance, these outputs look pretty good. Upon closer examination, we can see that inaccuracies persist in both the DALL-E and Midjourney results. There are extra vessels sprouting from where they shouldn’t be, and DALL-E once again made up confusing labels for structures.
We ran a few more prompt variations to see how flexible the two tools are at generating alternative, non-standard views of the heart. We prompted them to show a posterior view, which they were completely unable to do. We also tried a cross-section, which they produced in a heavily stylized form. As you can see, the results were clearly anatomically incorrect:
generate a posterior view of an anatomically accurate heart
posterior heart anatomy pictures generated by DALL-E 3 and Midjourney 6; the images are anatomically inaccurate, showing an incorrect depiction of anterior heart anatomy instead
generate a realistic cross-section of a human heart showing the atria and ventricles
cross-sectional heart anatomy pictures generated by DALL-E 3 and Midjourney 6; the images are anatomically inaccurate, showing instead grotesque interpretations of a heart within a heart.
Can GenAI Do Molecular Visualization for Medical and Scientific Communications?
We also tried various molecular visualization prompts with DALL-E and Midjourney. We found that neither can comprehend molecular visualization terminology. For prompts, we used basic biomedical terms such as antibody, receptor, and lipid membrane. These produced images that were more reminiscent of alien coral reefs than of molecular biology. Although very cool at first glance, the results weren’t quite what we were looking for.
What was alarming was the confidence with which ChatGPT presented its objectively inaccurate DALL-E-generated image as fact. When we asked ChatGPT to generate HER2 receptors on a cell membrane, it added, unprompted, “Here’s the image [you asked for], designed to reflect scientific accuracy” in response to one request. This type of AI “hallucination”, or presenting incorrect information as truth, was a common occurrence as we tested scientific prompts. AI-generated images, as inaccurate as they are, might be useful for stylistic inspiration, but are not yet capable of portraying real science.
generate HER2 receptors spread on a cell membrane
an attempt to generate images based on the prompt “HER2 receptors spread on a cell membrane” by DALL-E 3 and Midjourney 6.0. The images are wildly inaccurate, looking like psychedelic protrusions reminiscent of a coral reef.
All That Glitters
While genAI presents itself as an innovative solution promising efficiency and accuracy, its current applications in medical and scientific communication are restricted by several challenges, the most significant being accuracy. Despite advancements, the current state of genAI lacks the precision required for scientific integrity. This can lead to misleading or outright incorrect representations of medical and scientific concepts, potentially compromising the understanding of stakeholders and the credibility of the information presented.
Can GenAI Take Art Direction?
Another challenge for genAI is consistently adapting to art direction. Medical illustration demands meticulous attention to detail and adherence to scientific standards, and stakeholder input often requires incremental revisions to an image. Maintaining consistency during these revisions is notoriously difficult with genAI. Because of the way diffusion model-based image generation works, each new prompt runs the risk of producing something completely different from the previous output. This inconsistency makes incorporating feedback and art direction essentially impossible with current genAI tools.
screenshot of a 3D animation showing day 6 of a chick embryo inside of an egg.
Our chicken embryo animation (© AXS Studio) shown above was created with careful scientific research to ensure accuracy, and has been viewed over 36 million times in classrooms and institutions all over the world. These were our attempts to recreate it using genAI:
Chicken embryo development images generated by DALL-E 3 and Midjourney 6.0. The DALL-E output showcases a mutant chicken with clawed hands, a rat tail, and an exposed spinal cord, while the Midjourney outputs show fully grown chicks encased in thin translucent bubbles.
Bait and Switch
Consider a scenario in which a client engages a medical animator based on the promise of generative AI in their pitch. It’s a situation that is becoming more and more common as fully rendered genAI outputs replace hand-drawn concept art in pitch decks. Excited by the prospect of cutting-edge technology, the client proceeds with this animator, only to discover that the actual execution falls short of expectations. Despite the initial allure of AI-generated imagery, the final product lacks the precision and quality needed for effective medical communication. This situation doesn’t just waste time and resources; it also undermines the credibility of the company and erodes trust in AI-driven solutions.
Staying on the Lookout
At AXS Studio, we recognize the potential of generative AI to enhance the medical science visualization workflow and to realize efficiencies that benefit clients. However, we remain vigilant in navigating its current limitations in medical and scientific communications. Rather than relying on AI-generated imagery, we prioritize a collaborative approach that leverages human expertise and technological advancements in tandem. Our team of certified medical illustrators and scientifically trained medical animators combines artistic proficiency with scientific accuracy to deliver compelling visual narratives that meet the highest standards of quality and precision. We will continue to monitor the progression of genAI, as it may one day become an effective tool in our toolkit.
If you are looking for a scientific illustration or medical animation for yourself or your company, it’s important to critically evaluate the capabilities and potential pitfalls of AI-driven solutions. By partnering with experienced and scientifically trained professionals, you can navigate the complexities of biomedical visualization with confidence, knowing that your visual communications are in capable hands.