
But we don’t know how much larger the models will have to be, how large the data sets will have to be, or how much training is needed, do we? They could have to be inconceivably large.

If you want to correct for this particular problem you might be better off training a face detector, an eye detector, and a model that takes two eyes as input and corrects the reflections. The process would then be:

- generate image

- detect faces

- detect eyes in each face

- correct reflections in eyes

That is convoluted, though, and would get even more so if you wanted to correct for multiple such issues. It also might have trouble with faces that have glass eyes, but you could try to ‘detect’ those with a model that is trained on the prompt.
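As a rough sketch of how the generate / detect faces / detect eyes / correct reflections steps could be wired together: every name below (`generate_and_fix`, the four callables, the `Box` and `Image` aliases) is hypothetical, and the actual models are left as plug-in functions rather than tied to any real library.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in types: a box is (x, y, width, height) in pixels,
# and an image is just a nested list here so the sketch has no dependencies.
Box = Tuple[int, int, int, int]
Image = List[List[int]]


def generate_and_fix(
    prompt: str,
    generate: Callable[[str], Image],
    detect_faces: Callable[[Image], List[Box]],
    detect_eyes: Callable[[Image, Box], List[Box]],
    correct_reflections: Callable[[Image, Box, Box], Image],
) -> Image:
    """Run the four steps in order; each model is passed in as a callable."""
    image = generate(prompt)                      # 1. generate image
    for face in detect_faces(image):              # 2. detect faces
        eyes = detect_eyes(image, face)           # 3. detect eyes in each face
        if len(eyes) != 2:
            # The correction model is assumed to need exactly two eyes,
            # so profiles, occluded eyes, etc. are skipped.
            continue
        # Order the pair left-to-right by x coordinate.
        left, right = sorted(eyes, key=lambda box: box[0])
        image = correct_reflections(image, left, right)  # 4. correct reflections
    return image
```

Handling a second kind of artifact would mean another detector and another correction model per issue, which is where the convolution comes from.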



> They could have to be inconceivably large.

The opposite might also be true. Just having better, well-curated data goes a long way. LAION worked for a long time because it's huge, but what if all the garbage images were filtered out and the annotations were better?

The early generations of image and video models used middling data because it was the only data. Since then, literally everyone with data has been working their butts off to get it cleaned up to make the next generation better.

Better data, more intricate models, and improvements to the underlying infrastructure could mean these sorts of "improvements" come mostly "for free".


ADetailer does exactly that. Feels like this large thread above is non-practicing for the most part.

There’s no eyes module in it by default, but it’s trivial-ish to add, and a high-res eyes dataset isn’t hard to collect either.

Just found an eyes model on https://civitai.com/models/150925/eyes-detection-adetailer (seems anime only)


I feel like a GAN method might work better: build a detector, then train the model to defeat the detector.



