Which AI image generator should I use?
9 leading models put to the test
It’s been over a year since my last comparative look at AI image generators. The landscape has changed dramatically since then with the likes of Reve, Seedream, Nano Banana Pro and ChatGPT Images delivering leaps forward in realism, text rendering and the ability to make fine-grained edits using natural language.
Whilst the latest versions of these and other models continue to duke it out in the arena leaderboards, what are the practical pros and cons of the leading generators and which is right for your specific use case?
I gave nine leading models the same 10 prompts, designed to test different aspects of their capability.
1. Respecting anatomy and physics
Prompt: close-up candid shot of a smiling female cyclist performing a handstand on a BMX bike whilst giving a thumbs up to camera
Whilst none of the models totally nailed this tough brief, Nano Banana Pro got the closest, with the rider plausibly balancing on her handlebars with one hand, whilst giving a thumbs up with the other. I initially thought it had messed up by only giving the bike one brake but it turns out Gemini knows more about BMXs than I do.
ChatGPT Images nailed most elements but the image lacks realism thanks to its studio-lit, gravity-defying superwoman rider. The other models all struggled with the physics of the bike and/or rider, with the wooden spoon going to FLUX.2, which generated like it’s 2024, furnishing its rider with an extra arm.
Winner: Nano Banana Pro
2. Maintaining likeness
Prompt: a three-part image showing selfies of this man 1.) as a coal miner 2.) as a fighter pilot 3.) as a lollipop man
This prompt was designed to test the models’ ability to segment an image into three and depict a consistent subject in different guises whilst maintaining likeness. I also threw in a UK colloquialism (lollipop man) to see how the different models would cope.
FLUX.2 did a decent job of maintaining my likeness, but didn’t dress me correctly for any of my chosen vocations (the giant candy lollipop is a particular doozy).
ChatGPT Images rendered the most convincing lollipop man and sign, but slightly lost my likeness with the miner and the fighter pilot.
All three of Reve’s images were close but just a little bit off (the fighter pilot sitting the wrong way round in the cockpit is particularly off-putting).
Nano Banana Pro maintained my likeness across all three images, which were all convincingly contextualised.
Winner: Nano Banana Pro
3. Rendering lots of text
Prompt: A photo of this text handwritten in chalk on a blackboard:
“The power of the less powerful begins with honesty”
“Friends, it is time for companies and countries to take their signs down”
“We are in the midst of a rupture, not a transition.”
“A world of fortresses will be poorer, more fragile and less sustainable.”
“We are no longer relying on just the strength of our values, but also on the value of our strength.”
“Middle powers must act together because if we’re not at the table, we’re on the menu.”
“Nostalgia is not a strategy.”
“The powerful have their power. But we have something too - the capacity to stop pretending, to name reality, to build our strength at home and to act together.”
Mark Carney, Davos, January 2026
I cherry-picked some quotes from Carney’s recent Davos speech, so the text couldn’t simply be parroted from training data.
FLUX.2, Reve and Seedream all accurately recreated the text but only Nano Banana Pro and ChatGPT Images rendered all but one of the quotation marks. ChatGPT Images just takes it for having the more plausible handwriting.
Winner: ChatGPT Images
4. Designing a logo
Prompt: design an original logo for hot sauce “Dan T’s Inferno”
Quite the spectrum on this one, from classic AI text mangling (Adobe Firefly and Midjourney), to strong kebab shop menu vibes (FLUX.2 and Recraft) to the perhaps overly elaborate (Nano Banana Pro and ChatGPT Images).
To be fair, there wasn’t much creative direction in my prompt.
In picking a winner, I’m torn between Ideogram’s simple (and slightly retro) mascot and typography and Nano Banana Pro’s more ornate design, which is remarkably similar to ChatGPT Images’ effort, although it looks less obviously AI-generated.
Nano Banana Pro almost loses it on the duplicate ‘inferno’ (so hot they named it twice), although the ease of removing it with a follow-up prompt negates the issue, securing another victory for Nano Banana Pro.
Winner: Nano Banana Pro
5. Annotating a diagram
Prompt: a diagram of a gibson les paul deluxe with all the key parts correctly labelled
A challenging prompt, spotlighting a common weak spot for AI image generators - accurately annotating diagrams.
Nano Banana Pro mangled the Bridge Pickup label and didn’t connect the Bridge or Pickup Selector Switch labels with the correct parts, but otherwise did a bang-up job. The drawing of the headstock is particularly impressive, nailing the Gibson branding and Les Paul’s signature.
Winner: Nano Banana Pro
6. Incorporating reference images
Prompt: proportionally size and arrange these objects on the table
I uploaded eight photos from my phone’s camera roll and asked the models to combine them into a single image, with the objects proportionally sized.
Nano Banana Pro smashed this one. Whilst it guessed wrong on a couple of relative sizes, it managed to render all seven objects on the table, including preserving the in-progress game of Scrabble.
ChatGPT Images included all seven objects but subtly distorted each of them and - less subtly - replaced the Scrabble letters with symbols.
Reve decided to box up the Scrabble, whilst Seedream rebranded Linkee and FLUX.2 swapped the hot chocolate maker for a silver spoon.
The wooden spoon, however, must go to Midjourney, which did render seven objects on a table, although they bore zero resemblance to the uploaded images.
Winner: Nano Banana Pro
7. Altering a real photo
Prompt: make the animals in this wallpaper real and 3D, emerging from the wall into the room
Another challenging brief: to take a photo of the wallpaper in my kids’ bedroom (yes, really) and bring it to life.
Midjourney, Ideogram and Firefly all diverged wildly from the uploaded image. Seedream cartoonified the animals, whilst FLUX.2 and Reve picked a few animals to apply a 3D effect to.
Nano Banana Pro showed the leopard and toucan emerging into the room, but changed the composition and left the other animals fairly 2D.
ChatGPT Images did the best job of giving the whole scene more depth.
Winner: ChatGPT Images
8. Surgical edits
Prompt: make these 3 edits to this image:
1. replace the man with a 50-something woman
2. change ‘ChatGPT’ to ‘Claude’
3. change ‘47’ to ‘48’
leave everything else unchanged
Wanting to edit some elements of an image whilst leaving others unchanged is a common scenario. This prompt tested making precise copy changes and a major visual edit.
Nano Banana Pro won this one by a country mile, acing all of the edits, whilst preserving realism and stylistic continuity.
ChatGPT Images was a close second, but failed to align/justify the altered title text, leaving the ‘3rd edition’ copy stranded on the right-hand side.
All of the other models mangled some text and bodged the subject swap, with the exception of Seedream, which attempted a more major redesign, with some success but some duplicate text (Claude, 48 and 3rd edition).
Winner: Nano Banana Pro
9. Adopting a non-photographic style and handling potential IP infringement
Prompt: an oil painting of a 50 foot statue of an animated italian plumber in Piazza della Signoria
This was a multipurpose prompt, designed to test the models’ proficiency at generating non-photographic output, their ability to integrate a fictional object into a real world setting and their approach to handling IP.
I was expecting Adobe Firefly to be the model that would refuse to generate an output for this prompt, but it was actually ChatGPT Images. Meanwhile, Firefly generated a figure with zero resemblance to any popular animated characters, thanks to its licensed training data.
Ideogram also avoided any obvious IP infringement, but failed to generate a plausible oil painting or a recognisable Piazza della Signoria.
Whilst both clearly infringed IP, Midjourney and Nano Banana Pro generated the most painterly images, with Nano Banana Pro just edging it on the strength of its resemblance to Piazza della Signoria (and for going to the effort of framing the painting and hanging it in a gallery).
Winner: Nano Banana Pro
10. Countering visual biases in training data
Prompt: photo of a glass of red wine full to the brim, next to a plate of green strawberries, underneath an antique clock showing quarter to seven
This prompt was designed to test the models’ ability to counter strong visual biases in their training data (wine glasses are rarely pictured full to the brim, strawberries are normally photographed when ripe and clocks typically show ten past ten in commercial photography).
It didn’t create the most aesthetic image, but ChatGPT Images was the only model not tripped up by any of the gotchas in this prompt.
Winner: ChatGPT Images
Conclusions
Although I was expecting Nano Banana Pro to come out on top in this test, I wasn’t expecting it to be quite such a runaway victory (7 out of 10). The December update to ChatGPT Images has made it a decent alternative and definitely worth trying when Nano Banana Pro disappoints, although it’s still painfully slow.
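For anyone who wants to check the scoreline, here is a minimal Python sketch that tallies the round winners declared in the ten sections above. The winner list is copied from this post; the names `WINNERS` and `tally` are just illustrative choices.

```python
from collections import Counter

# Round winners exactly as declared in sections 1 to 10 of this post.
WINNERS = [
    "Nano Banana Pro",  # 1. Respecting anatomy and physics
    "Nano Banana Pro",  # 2. Maintaining likeness
    "ChatGPT Images",   # 3. Rendering lots of text
    "Nano Banana Pro",  # 4. Designing a logo
    "Nano Banana Pro",  # 5. Annotating a diagram
    "Nano Banana Pro",  # 6. Incorporating reference images
    "ChatGPT Images",   # 7. Altering a real photo
    "Nano Banana Pro",  # 8. Surgical edits
    "Nano Banana Pro",  # 9. Non-photographic style and IP
    "ChatGPT Images",   # 10. Countering visual biases
]


def tally(winners):
    """Return (model, wins) pairs, most wins first."""
    return Counter(winners).most_common()


if __name__ == "__main__":
    for model, wins in tally(WINNERS):
        print(f"{model}: {wins} out of {len(WINNERS)}")
```

Only two models won a round outright, so the other seven never appear in the tally.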
Whilst some professional creatives may still favour Midjourney for its high degree of style customisation, I’d suggest Nano Banana Pro is currently the best option for most people most of the time.
Other models are worth turning to for specific use cases (e.g. Ideogram for typography, Recraft for SVGs), but Nano Banana Pro is a very strong all-rounder, with its ability to factor in real world knowledge being a real boon.
The biggest limitations of Nano Banana Pro right now are:
Its limited free availability. After 2 or 3 generations, non-paying Gemini users will be routed to regular Nano Banana (still a very capable model).
The visible watermark on images generated via Gemini (with the exception of the eye-wateringly expensive Google AI Ultra plan).
Whilst there are websites that will remove the Gemini watermark from images (I persuaded Gemini to build me a tool to do this), it’s an annoying extra workflow step, especially if you’re shelling out £18.99 a month for Google AI Pro.
If you’re generating a high volume of images that you’d like to be high resolution and watermark free, I’d suggest using a third-party platform that includes access to Nano Banana Pro (e.g. Freepik, Krea, Higgsfield, Adobe Firefly).
If you are generating via Gemini, then it’s worth stating the aspect ratio in your prompt (16:9 for landscape, 9:16 for portrait, 1:1 for square) and selecting ‘Thinking’ mode for more considered generations.
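If you build prompts programmatically, the aspect ratio tip can be sketched as a small Python helper. This is only an illustration: the function name `with_aspect_ratio` and the exact wording appended to the prompt are my assumptions, not anything Gemini requires. Only the three ratios come from the advice above.

```python
# Orientation-to-ratio mapping taken from the tip above.
ASPECT_RATIOS = {
    "landscape": "16:9",
    "portrait": "9:16",
    "square": "1:1",
}


def with_aspect_ratio(prompt: str, orientation: str) -> str:
    """Append an explicit aspect ratio to a prompt (hypothetical helper).

    The "(aspect ratio X)" phrasing is an assumed convention, not an
    official syntax.
    """
    if orientation not in ASPECT_RATIOS:
        raise ValueError(f"unknown orientation: {orientation!r}")
    return f"{prompt.rstrip()} (aspect ratio {ASPECT_RATIOS[orientation]})"


if __name__ == "__main__":
    print(with_aspect_ratio("photo of a glass of red wine full to the brim", "portrait"))
```

Selecting ‘Thinking’ mode is a setting in the Gemini interface, so it isn’t something this helper can cover.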
Right, magnifiers at the ready, here’s the full comparison matrix.