Krea 2: SOTA open-weights 12B image model (krea.ai)

187 points by mattnewton a day ago

mattnewton a day ago

Hi HN, we're releasing weights for our latest text-to-image model and publishing a fairly in-depth writeup on how we trained it.

I hope there is something in the report for everyone. We included a fair bit on the actual training and data infrastructure, which usually isn't written about much and which I think will be interesting to people here. There's more that didn't fit, so I'm happy to answer questions!

vunderba 21 minutes ago

Neat! Between Ideogram4, Flux2, Qwen-Image, ZiT, and Krea - there's been a lot of positive movement in the open-weights space.

The original Flux.1 Krea is actually in my GenAI Showdown benchmark site from all the way back in July of last year (which feels like a lifetime in this space), so I’m looking forward to putting this new one through its paces.

ttul 5 hours ago

This is a massive technical report for an open-weights image gen model. As someone who has followed this space closely, it’s really cool to read about the behind-the-scenes experimentation and effort that went into the final product. I hope you will release some of the fine-tuning tools so the community can experiment with them as well and really push what the model’s capable of.

dvrp an hour ago

Thanks! You should definitely check out the r/stablediffusion subreddit; people are going crazy over it!

We also had day-0 support from people in the open source community like Ostris and ComfyUI.

mattnewton 2 hours ago

You can find some links and details in the GitHub readme for finetuning / LoRA support. Ostris, musubi tuner, fal and Hugging Face Diffusers are all day-0 supported :) https://github.com/krea-ai/krea-2

We recommend training off the undistilled, Raw checkpoint, and then applying the LoRA to the Turbo model for inference.
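For anyone wondering why a LoRA trained on one checkpoint can be reused on another, here is a minimal NumPy sketch of the usual LoRA merge, `W + (alpha / rank) * B @ A`. It is only an illustration: the matrices, sizes, and the `apply_lora` helper are hypothetical, and it assumes (not confirmed by the thread) that the Raw and Turbo checkpoints share layer shapes and that distillation leaves the weights close enough for the same low-rank update to transfer.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 2

# Hypothetical stand-ins for the same linear layer in two checkpoints.
w_raw = rng.normal(size=(d_out, d_in))
# Assumption for illustration: the distilled weights are a small perturbation of the raw ones.
w_turbo = w_raw + 0.01 * rng.normal(size=(d_out, d_in))

# LoRA factors, imagined here as having been trained against w_raw.
lora_a = rng.normal(size=(rank, d_in))
lora_b = rng.normal(size=(d_out, rank))
alpha = 4.0


def apply_lora(weight, lora_a, lora_b, alpha):
    """Return weight + (alpha / rank) * B @ A, the standard LoRA merge."""
    rank = lora_a.shape[0]
    return weight + (alpha / rank) * (lora_b @ lora_a)


# The update depends only on the LoRA factors, so it can be added
# to any weight matrix with the same shape.
w_raw_lora = apply_lora(w_raw, lora_a, lora_b, alpha)
w_turbo_lora = apply_lora(w_turbo, lora_a, lora_b, alpha)

delta_raw = w_raw_lora - w_raw
delta_turbo = w_turbo_lora - w_turbo
```

Both deltas are the same low-rank update; whether it behaves well on the distilled weights is an empirical question that this sketch does not answer.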

dvrp an hour ago

Hello HN,

I am Diego Rodriguez, Co-founder & CTO at Krea.

We are releasing the weights and a _juicy_ technical report---at least given current industry standards. In it we describe data curation/captioning, model architecture, post-training, RL pipelines, prompt expansion, style references, and our infrastructure in great detail.

When it comes to the weights themselves, there are actually 2 releases:

* Krea 2 Turbo. This model is both guidance- and timestep-distilled for faster inference.

* Krea 2 RAW. This model is actually meant to be hackable/fine-tunable.
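As background on the "guidance-distilled" term, here is a minimal sketch of standard classifier-free guidance, which needs two forward passes per sampling step. A guidance-distilled model is generally trained to produce the guided prediction in a single pass. The `toy_denoiser` function is a hypothetical stand-in for a real network, and nothing here is taken from the Krea 2 report.

```python
import numpy as np


def toy_denoiser(x, cond):
    """Hypothetical stand-in for a network forward pass; cond=None means unconditional."""
    shift = 0.0 if cond is None else float(cond)
    return 0.5 * x + shift


def cfg_prediction(x, cond, guidance_scale):
    """Standard classifier-free guidance: combine two forward passes per step."""
    uncond = toy_denoiser(x, None)      # pass 1: no prompt
    conditional = toy_denoiser(x, cond)  # pass 2: with prompt
    return uncond + guidance_scale * (conditional - uncond)


x = np.array([1.0, -2.0, 0.5])
guided = cfg_prediction(x, cond=1.0, guidance_scale=3.0)
unscaled = cfg_prediction(x, cond=1.0, guidance_scale=1.0)
```

With a guidance scale of 1 the result reduces to the plain conditional prediction; larger scales push further along the conditional direction. Timestep distillation is a separate technique that reduces the number of sampling steps.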

One of the things we think the (open) LLM community does well is release models in different sizes and also at different stages of the training pipeline; we are releasing two checkpoints, at the mid-training and post-training stages. This is rare in the image & multimedia community, so we can't help but feel proud of this release.

We are on par with Nano Banana in terms of image quality as per Artificial Analysis text-to-image benchmarks (https://artificialanalysis.ai/image/leaderboard/text-to-imag...).

We also attached a permissive license for individuals and small businesses.

Useful links:

- Marketing page around the OSS release: https://www.krea.ai/krea-2-open-source

- Hugging Face model: https://www.krea.ai/krea-2/huggingface

- GitHub repository: https://www.krea.ai/krea-2/github

- Reddit AMA: https://www.reddit.com/r/StableDiffusion/comments/1udnm0a/we...

- Technical report: https://www.krea.ai/blog/krea-2-technical-report

Thank you and I hope you enjoy this release!

Some of our team members will be answering questions since we are on the front page for now (thank you HN!).

Happy hacking!

ACCount37 44 minutes ago

Good to have more open weight models, and I really appreciate the in-depth write-up.

I also like the "keep the manifold wide" approach of trying to make a model capable of many styles as opposed to getting it "dialed in" for a dozen style presets.

But it does feel very much like "fighting the past war" - now that advanced "image-to-image"/"agentic composition" models like Nano Banana 2 or Images 2.0 are out there in force.

I seriously doubt that the basic Qwen 3 VL in cross can get anywhere near that level of I2I. And robust I2I is very desirable - editing, adjustment, character consistency, the generalization of whatever you're doing with style transfer now (underexplained BTW).

Trying to hit that level of I2I is not by any means easy, but it's pretty clear to me that this is where the next frontier for image models lies. Feels like Ideogram might be building up to it, but I'm yet to see it anywhere else in open weight space.

dvrp 27 minutes ago

I appreciate the skepticism but we find internally that this model is used more than Nano Banana for many cases like moodboarding (also, 4x cheaper than NBP never hurts). Agentic workflows are compatible with Krea 2 so I’m not sure I follow there. If you are talking about an edit model, that’s coming too.

Also, we are on par with them in t2i benchmarks, check the artificial analysis link I posted in my top comment.

And you cannot re-train Nano Banana or ChatGPT to understand your brand, which is what our customers complain about constantly.

Plus open-source! It’s hard to do an apples-to-apples comparison.

refulgentis 30 minutes ago

This model does image-to-image. What's the issue with Qwen 3 VL? And is style transfer unexplained? "reference" is mentioned 11 times on the page (more specifically, I read it and it seemed to discuss it a lot).

justinclift 5 hours ago

Interesting item on the careers page btw. For anyone that knows what older school Mellanox was about, it might be your kind of thing: https://jobs.ashbyhq.com/krea/ebe94024-eef6-4306-a019-10072a... :D

pwython 2 hours ago

Looking forward to playing with Krea 2. I use Z-Image Turbo daily -- it has replaced my stock photo subscriptions, for realism and illustrations.

May I ask how much the training cost you?

sangwulee 2 hours ago

A lot of coffee for sure. Regarding the training cost, it's hard to give a good estimate because we used a shared Kubernetes cluster with inference + research workloads.

BoredPositron 3 hours ago

It's a good model; sadly, the use of the Qwen VAE is a bit of a downer.

mattnewton 2 hours ago

Krea 2 Large (on the website and API) was trained with the FLUX 2 VAE, if you want to test it out and push realism. After working with both, I think the FLUX VAE has a slight edge in learning realistic textures, but the gap is smaller than you might think. The Qwen VAE was very good overall in ablations and good at learning to produce a diverse set of styles.

mobiuscog 3 hours ago

Some have mentioned that using the Wan2.1 VAE instead solves this. I haven't personally had time to try it yet.

dvrp an hour ago

There is a lot of discourse about it on Reddit. Check the AMA link I put in the comment above to learn more. The gist is that it wasn’t released when we started; we use it for internal models and hope to do further open-source releases.