Hello Hacker News. My name is Sangwu Lee. I work for Krea, and I led the research efforts around post-training for this model. I'll try to answer any questions you may have, but I recommend reading the technical report I wrote on our site (https://www.krea.ai/blog/flux-krea-open-source-release).
I also see that my colleagues have already commented here.
The model looks incredible!
Regarding this part:
> Since flux-dev-raw is a guidance distilled model, we devise a custom loss to finetune the model directly on a classifier-free guided distribution.
Could you go into more detail on the loss used for this, and share any other tips for finetuning guidance-distilled models? I remember the open source AI art community had a hard time finetuning the original distilled flux-dev, so I'm very curious about that.
It's best to comment on the bigger discussion (106 points, 41 comments): https://news.ycombinator.com/item?id=44745555
Relevant links:
- Model Technical Report: https://www.krea.ai/blog/flux-krea-open-source-release
- Hugging Face model card: https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev
- Black Forest Labs announcement: https://bfl.ai/announcements/flux-1-krea-dev
- Reddit discussion: https://www.reddit.com/r/StableDiffusion/comments/1me2l80/ne...
hey hn! I'm one of the founders at Krea.
we prepared a blogpost about how we trained FLUX Krea if you're interested in learning more: https://www.krea.ai/blog/flux-krea-open-source-release
can you explain the TPO technique or perhaps point me to a reference please?