cam 🌊🌙🕯️: …how did people think the classification would work if it didn’t use ML lol


yeah I naïvely thought that nobody would care about using a vision model, since image classifiers seem non-controversial to me? I wouldn’t dare touch a diffusion model since I agree those suck in every conceivable way. anyway clearly people don’t see it that way

…how did people think the classification would work if it didn’t use ML lol

I guess that’s a silly question, there’s definitely a lot of required knowledge for understanding the solution spaces for these problems and not everyone can be expected to have that knowledge

I don’t think people realized it was an image classifier. I naively assumed (never assume!) it was a random number generator or maybe a deterministic hash of a display name or something. Samuel updated the description though, so people can make an informed decision if they prefer not to opt in.

yeah I apologise that it wasn't there from the start, lesson learnt

Hey, you had a great idea, people clearly love the Kiki vs Bouba (faux?) rivalry. OpenAI and GenAI in general are just very polarizing topics and full-stop deal breakers for many people. 🫣

that I certainly have discovered!

I’m sorry you had to discover that the hard way 😅 I hope your week goes better from here!

thanks :)

I'm no programmer or anything, but I did program a dice roller in QBasic in high school. Wouldn't it take way less time, bandwidth, and resources to just have a number generator?

For sure! RNG is much less resource intensive. If the goal is to get the same answer for the same user every time, another low-resource option would be something like a hashing algorithm. But idk what the goal was in the implementation, if some analysis of an image was the whole point, etc 🤔
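To make the two low-resource options concrete, here is a minimal Python sketch. It is not what the labeler does; the function names and the choice of SHA-256 are assumptions for illustration only. `random_label` can change on every call, while `hashed_label` always gives the same answer for the same user ID.

```python
import hashlib
import random

LABELS = ("kiki", "bouba")


def random_label() -> str:
    """Pick a label at random; the same user may get a different one each call."""
    return random.choice(LABELS)


def hashed_label(user_id: str) -> str:
    """Pick a label from a hash of the user ID; same ID always gives the same label."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the parity of the first digest byte to choose between the two labels.
    return LABELS[digest[0] % 2]
```

Either version runs in microseconds on one CPU core, but neither looks at the avatar at all.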

luckily pinging something like GPT4o (which is designed to be really efficient relative to prev models) consumes very few resources to the point where it's negligible, and libraries out there make it very easy for the developer to build. doing the actual classification part prob took 1hr or less

afaict, the process is: 1. send a very low res version of your avatar along with a message that's like "based off these definitions of kiki and bouba, what do you think this image is more like?" 2. API responds with "looks kiki" or "looks bouba" 3. labeler applies label
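The three steps above could be sketched roughly like this in Python. Everything here is hypothetical: the prompt wording, the function names, and the `ask_model` callable, which stands in for whatever vision API the labeler really calls. The sketch only shows the shape of the flow (send thumbnail and prompt, read the reply, map it to a label).

```python
from typing import Callable, Optional

# Hypothetical prompt; the real wording is not known.
PROMPT = (
    "Based on these definitions of kiki and bouba, "
    "what do you think this image is more like?"
)


def parse_label(reply: str) -> Optional[str]:
    """Map a free-text reply such as 'looks kiki' to a label, or None if unclear."""
    text = reply.lower()
    has_kiki = "kiki" in text
    has_bouba = "bouba" in text
    if has_kiki == has_bouba:
        # Reply mentions both labels or neither, so do not guess.
        return None
    return "kiki" if has_kiki else "bouba"


def label_avatar(
    thumbnail: bytes, ask_model: Callable[[str, bytes], str]
) -> Optional[str]:
    """Steps 1-3: send the low-res avatar with the prompt, then parse the reply."""
    reply = ask_model(PROMPT, thumbnail)
    return parse_label(reply)
```

Passing the model call in as a function keeps the sketch runnable without any network access, e.g. `label_avatar(b"", lambda prompt, image: "looks bouba")`.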

you could definitely do it randomly! and that would absolutely be more efficient. but then you lose out on it having some slight level of meaning (rounder/glorpier profile pics would no longer be more likely to be bouba, for example) so depends on what you're optimizing for

Hmm, I’d be curious how you’re defining “negligible.” Surely the CPU, GPU, and memory consumption of a GPT4o generation is significantly higher than generating a random number or hash (some hashes will vary). GPT4o may be fast but resource consumption is still very high.

my guess is "several OOMs above generating a random number" and "at least one OOM below your smartphone cloudsyncing 2 photos"

It’s not a traditional image classifier (like an MLP network or whatever), though, it’s still using a large generative multimodal model. There’s nothing wrong with that, ofc, but the model architectures and techniques used are extremely different.

yeah, just from people talking about it I assumed it would be a consistent hash of the DID with a pepper
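For reference, a "consistent hash of the DID with a pepper" might look something like the Python sketch below. This is the guess described above, not the labeler's real method; `peppered_label` and the example values are made up, and HMAC-SHA256 is just one reasonable way to mix in a secret pepper.

```python
import hashlib
import hmac

LABELS = ("kiki", "bouba")


def peppered_label(did: str, pepper: bytes) -> str:
    """Derive a stable label from a DID, keyed by a secret pepper."""
    mac = hmac.new(pepper, did.encode("utf-8"), hashlib.sha256).digest()
    # The lowest bit of the first MAC byte selects one of the two labels.
    return LABELS[mac[0] & 1]
```

The same DID and pepper always produce the same label, and without the pepper nobody outside can work out in advance which label a given DID will get.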

Yeah same here, never even thought it would be an image classifier. I thought it was completely random and that you could re-roll if you unliked and liked again.
