if your problem with lack of data is at such a scale that the entire corpus of public domain writing isn’t enough, how the fuck are you going to pay people to generate enough to ever make up for that?
see, everyone talks about how much data is on the internet, but if we’re already “running out” of “quality” data to train AI on, maybe we should leave AI alone for a bit and go back to just, ya know, letting humans write shit
You know what they say: "Sell a magazine an article, it'll publish for a day. Teach an AI to write, and that magazine will crank out unreadable horseshit for *years*."
I've had multiple people approach me to train the AI they think will destroy my livelihood. So I can prove this is a thing.
The pay is about where you would imagine.
I worked on a project to see if AI could tutor math, using the question, the answer, and distractor responses with reasons. It did such a horrible job that my company dropped the project altogether.
The "too complicated" grift has been commonplace in Silicon Valley for 30+ years. It's part of the "engineering can do anything better than you" trope that's popular among folks who, though smart enough, are oddly incapable of effective communication.
This is not going to work unless you have a literal warehouse full of people clacking away and another warehouse full of people checking the work.
Even then it's a drop in the bucket compared to how much content even a small-scale LLM would need.
Facebook gets flooded with these ads for different types of writers. And the comments sometimes have people saying "I sent a resume but I still haven't heard back about the job". So I gotta wonder if they're just training the AI on resume submissions.