I’ve spent the better part of a year testing AI image generation models, trying to figure out which one can produce beautiful illustrated pages for personalised books for grandparents. The answer is that none of them do it perfectly. But one gets closest.
That’s Seedream 4.5 from ByteDance. Here’s how I got there.
Midjourney: built for artists, not products
The first model I used seriously was Midjourney, version 7. It produces genuinely beautiful images. But a lot of those images had almost nothing to do with the prompts that generated them. That’s not an accident. Midjourney is still largely built around a human with taste plucking the good outputs from a batch of generations, discarding what doesn’t work, and trying again. What the community calls “prompt adherence” is less the model following your instructions and more you getting lucky with a beautiful accident.
For personal artistic work that’s completely fine. For a product that needs to generate a specific scene from a specific person’s life and get it right, it doesn’t work. You can’t build a pipeline around hoping.
There’s also the API problem. As of 2026, Midjourney still doesn’t have an official public API. Third-party wrappers exist, but they use browser automation and violate the terms of service. Your account can be banned at any point. Midjourney has made a deliberate choice to optimise for artistic merit and manual creation rather than enabling products. I respect that. But it ruled Midjourney out on day one.
Nano Banana (Google’s Gemini image models)
Google’s Nano Banana models (Nano Banana 2 and Nano Banana Pro) are built on top of Gemini’s image generation capabilities.
The prompt adherence is good. When you describe something specific, they execute it. For a use case where the prompt is doing heavy lifting (describing hair colour, clothing, era, setting, multiple characters), that matters.
Two things stopped me from using them.
First, the aesthetic. Look at the autumn campus image below. The scene is well composed. The colours are nice. But there’s an evenness to everything: every element rendered to the same level of finish, no paper showing through, no soft edges where pigment might bleed. It looks like what most people picture when they think “AI image.” For a book people are going to keep on a shelf, that’s a problem.
Second, the cost. Nano Banana 2 is $0.08 per image, Nano Banana Pro is $0.15. Seedream 4.5 is $0.04. When you’re generating 24+ illustrations per book and offering up to 50 free regenerations during the review phase, that difference compounds fast. I tried Nano Banana Pro briefly for rendering page titles into the illustrations, but at $0.15 per attempt for something that inconsistent, you’re better off writing CSS and using a font you can actually control.
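To make the compounding concrete, here’s a rough worst-case calculation using the per-image prices quoted above. It assumes exactly 24 illustrations per book and that the 50 free regenerations are a per-book allowance that gets used in full; real usage will usually be lower, and the function name is just for illustration.

```python
# Per-image prices in USD, as quoted in the paragraph above.
PRICE_PER_IMAGE = {
    "Seedream 4.5": 0.04,
    "Nano Banana 2": 0.08,
    "Nano Banana Pro": 0.15,
}


def worst_case_cost(price_per_image, pages=24, regenerations=50):
    """Cost of one book if every free regeneration is used (assumed upper bound)."""
    return round(price_per_image * (pages + regenerations), 2)


for model, price in PRICE_PER_IMAGE.items():
    print(f"{model}: ${worst_case_cost(price):.2f} per book")
```

On those assumptions the ceiling is $2.96 per book on Seedream 4.5, against $5.92 on Nano Banana 2 and $11.10 on Nano Banana Pro.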
Flux
Flux Pro has a pleasant illustration style and reasonable prompt adherence. But look at the figure below: well composed, nice rendering of fabric, completely generic face. It took inspiration from the reference photos rather than learning from them. For a product where the book is supposed to look like the specific person it’s about, that’s a fundamental failure.
Flux 2 is worse in a different way. The outputs look like AI slop: too smooth, too clean, every line in the right place and none of the imperfect details that make a watercolour feel handmade. Compare the three-life-stage panel below to the Seedream 4 Alpine image further down. One of them looks painted. The other looks generated.
Grok / Aurora
The output is technically decent: reasonable prompt adherence and an aesthetic that works better than most. But I have a personal aversion to funding Elon Musk’s ventures. His platforms have spent years amplifying hateful content, he’s shown active enthusiasm for political destabilisation in Europe, and his general direction of travel is one I’d rather not contribute to financially. Other people will weigh that differently, and it’s a legitimate model to test. It’s off my list for reasons that have nothing to do with image quality.
Landing on Seedream 4
Seedream 4 was where I found something I hadn’t seen in any other model: it actually looked like watercolour.
A note on the images that follow: they’re all of me, across different ages. The boy is me too.
There’s a warmth and softness to the rendering that gives pages a handmade quality I haven’t found elsewhere. Look at the Alpine image below: the mountain background dissolves rather than terminates, the grass at his feet bleeds into the slope. For a book a grandparent is going to hold and pass to her grandchildren, that matters more than technical fidelity.
The problem was reliability. Seedream 4 would produce a beautiful page and then, on the next generation, give you something anatomically wrong. The water gun fight below is a good example: lovely composition, great summer energy, the warmth of late afternoon light. The boy in the foreground also has three arms.
Any time the prompt described a complex arrangement (multiple characters, unusual angles, active scenes), it could fall apart. And when reference photos are 50 years old, scanned from an actual photobook, the model has less to work with and the results get less reliable. That’s not an edge case for this product. Most of the people we’re making books about lived before the era of digital photography.
Moving to 4.5
I held off on the upgrade longer than I should have. My first outputs from 4.5 didn’t grab me the way version 4 had, and I initially read that as a quality drop. It wasn’t. The aesthetic is slightly different, and I needed to adjust to it. What actually changed is prompt adherence: the anatomical failures and composition collapses that plagued version 4 are materially reduced in 4.5.
The campus image below shows what that looks like in practice. He’s actually moving, the likeness is closer, and the watercolour quality is still there: the splashes in the sky, the negative space on the path, the way the background figures are suggested rather than fully rendered.
Not that the problems are gone entirely, which is why we built a full review pipeline where a human checks every page before anything goes to print. People are trusting us with their family stories. A grandparent receiving a book where she’s depicted with three hands is not acceptable.
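The rule behind that pipeline is simple enough to sketch. This is a minimal illustration rather than our actual code: the `Review` states and both function names are hypothetical, and the only thing it encodes is that nothing prints until a human has approved every page.

```python
from enum import Enum


class Review(Enum):
    """Hypothetical review states for a single page."""
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


def ready_for_print(pages: dict[int, Review]) -> bool:
    """True only if the book has pages and every one of them is approved."""
    return bool(pages) and all(s is Review.APPROVED for s in pages.values())


def pages_to_regenerate(pages: dict[int, Review]) -> list[int]:
    """Page numbers a reviewer rejected, in page order."""
    return sorted(n for n, s in pages.items() if s is Review.REJECTED)


book = {1: Review.APPROVED, 2: Review.REJECTED, 3: Review.PENDING}
print(ready_for_print(book), pages_to_regenerate(book))
```

A pending page blocks printing just as a rejected one does, so an unreviewed page can never slip through by default.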
One complication I didn’t anticipate: ethnicity drift. Seedream is trained predominantly on Asian data, and without explicit guidance, the model tends to render faces with East Asian features by default. For a book about someone’s grandmother who grew up in rural Bavaria or suburban London, that creates an obvious mismatch: not because there’s anything wrong with those features, but because they don’t represent who these people actually were.
The fix was an optional intake step where customers can describe their family background, alongside a vision model that analyses reference photos and extracts physical descriptors to pass into the image prompt. It works reasonably well. But collecting ethnic background data means collecting a special category under GDPR, which requires explicit consent and more careful handling than a standard data field. A small technical fix that opened a non-trivial compliance question.
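Here’s a minimal sketch of how those two inputs might be combined, with the consent check made explicit. Everything in it is an assumption for illustration: the `Intake` fields, the `build_prompt` function, and the prompt wording are hypothetical, and the vision model’s output is represented as a plain list of descriptors.

```python
from dataclasses import dataclass, field


@dataclass
class Intake:
    """Hypothetical intake record; field names are illustrative."""
    background: str = ""                # optional self-described family background
    background_consent: bool = False    # explicit consent to use that field
    photo_descriptors: list[str] = field(default_factory=list)  # from the vision model


def build_prompt(scene: str, intake: Intake) -> str:
    """Append physical descriptors to the scene description."""
    parts = [scene]
    if intake.photo_descriptors:
        parts.append("Subject: " + ", ".join(intake.photo_descriptors))
    # The self-described background is only used when consent was given.
    if intake.background and intake.background_consent:
        parts.append("Family background: " + intake.background)
    return ". ".join(parts) + "."


intake = Intake(
    background="Bavarian",
    background_consent=False,
    photo_descriptors=["fair skin", "grey curly hair"],
)
print(build_prompt("A woman hanging washing in a garden, 1960s", intake))
```

Keeping the consent flag next to the field it governs means the prompt builder can’t use the background by accident; without consent, the model only sees what the reference photos supplied.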
Why Seedream 5 didn’t work
I tested it. The outputs were disappointing.
ByteDance made a deliberate trade-off with 5.0 Lite: it’s designed for commercial product photography, advertising layouts, poster design. The photorealism is better. But look at that image and compare it to the Seedream 4 Alpine scene above: every inch of the canvas is filled. The jacket has individually rendered buttons. The town below has fully detailed roof tiles. There’s no negative space, no paper showing through, nowhere for the eye to rest. It feels like a photo with paint brushed over the top of it rather than something actually painted.
ByteDance have said the full 5.0 release will revisit this. As of the Lite version, it’s not the right tool for illustrated family books.
Where things stand
Seedream 4.5 is my current answer. Not a perfect one: no model is, which is why the editing feature exists and why every page needs a human in the loop before it gets anywhere near a printer.
The model comparison that actually matters in this space isn’t benchmark scores. It’s how the output looks to a 70-year-old holding a book about her life. Most of these models have never been optimised for that audience. Seedream 4.5 gets closer than anything else I’ve tried.
I’m building Memolio — personalised illustrated books for grandparents, made from real photos and real memories. Every other personalised grandparent book puts her name in a made-up story. Memolio puts her actual story in a book. If that sounds like something you’d want for your family, join the waitlist.







