Move over Midjourney?
There’s a new AI image generator in town and it’s faster and better at rendering text
AI image generators tend to be poor at rendering text. Ideogram is a notable exception to this, although it can’t consistently match Midjourney for image quality or coherence (see my earlier comparison of AI image generators).
Last week, Black Forest Labs emerged from stealth and released a new suite of models called FLUX.1, which look like they might just offer the best of both worlds: high image quality and coherence, plus decent (but by no means perfect) text rendering.
They’re also much quicker than Midjourney (~5 seconds vs ~30 seconds in my tests).
FLUX.1 doesn’t yet have a nice friendly product interface, although you can have a play with the openly released models on fal.ai (their weights are publicly available, which means developers will be able to build on them).
Below is its response to the prompt ‘Photo of a man in a black hoodie holding a spray can next to an old stone wall with graffitied text which reads "This is not a photo". The paint is running slightly’.
Not bad, although the eagle-eyed will notice a rogue fifth finger - a perennial issue for AI image generators (although Midjourney has got much better at this).
Black Forest Labs haven’t disclosed what content FLUX.1 was trained on, but judging by the images I’ve been able to generate (e.g. Super Mario and Darth Vader in combat), it’s safe to assume the training set included copyrighted material.
A text-to-video model is in the works and the company has just secured $31 million in seed funding. However, it’s not clear to me what their moat is. They’ve combined some innovative approaches, but it seems inevitable that other models will follow suit.
Still, it’s just got easier to create images like this:
You’re welcome.