yeah I naïvely thought that nobody would care about using a vision model, since image classifiers seem non-controversial to me? I wouldn’t dare touch a diffusion model since I agree those suck in every conceivable way. anyway clearly people don’t see it that way
Confession time: Those alt-texts started as ChatGPT image recognition results. I occasionally use it to create alt-text when I want to capture all the details I would otherwise miss. Usually it requires some minimal editing, but otherwise it's very good. The only good AI use case in my opinion.