"a picture of a girl wearing glasses" / "a picture of a girl with red hair"
It might make sense to optimize those separately and then pick from the joint distribution
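A rough sketch of one way "pick from the joint distribution" could work, assuming each candidate image already has a CLIP similarity score for each prompt. The scores below are made up, `pick_joint` is a hypothetical helper, and treating the two prompts as independent (so the joint is just the product of the per-prompt softmaxes) is an assumption, not necessarily what the notebook does:

```python
import math

def softmax(scores):
    """Turn raw similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_joint(scores_a, scores_b):
    """Hypothetical helper: index of the candidate with the highest
    product of per-prompt probabilities (independence assumed)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("need one score per candidate for each prompt")
    p_a = softmax(scores_a)
    p_b = softmax(scores_b)
    # Unnormalized joint under the independence assumption.
    joint = [a * b for a, b in zip(p_a, p_b)]
    best = max(range(len(joint)), key=joint.__getitem__)
    return best, joint

# Made-up scores for three candidates, for illustration only.
glasses_scores = [2.0, 0.5, 1.8]   # "a picture of a girl wearing glasses"
red_hair_scores = [0.1, 2.2, 1.9]  # "a picture of a girl with red hair"

best, joint = pick_joint(glasses_scores, red_hair_scores)
print(best)
```

With these numbers the third candidate wins: it is not the top scorer for either prompt, but it is the only one that does reasonably well on both.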
https://colab.research.google.com/drive/1lUfuFPDuK1qQInTKj4tGF5gmd7G9pLCA?usp=sharing notebook, based on nagolinc's initial CLIP one
Works decently when going for one attribute that is reasonably common, not so well for more complex prompts or rare attributes
Lots of avenues for improvement, still #nc
@halcy inexplicably, fucking Junior Xenosaga showing up in the results