Stejormur: So the challenge there is you kind of need to add a lot of low quality text to your training set - to use a human analogy, someone who has only ever read engineering textbooks will be a terrible writer *even of engineering textbooks*. But LLM-type AI can't distinguish between text and knowledge

How can social media organisations possibly be expected to compete with bots when they have become so startlingly realistic?

Facebook is good again

Seriously though, I find it so difficult to understand why the largest website in the world can’t see that a new account whose only post is ‘daddy, I lonely tonight and need a strong man for company, DM me’ might be worthy of a temporary activity pause?

One of the most fascinating things about generative AI is that the businesses and states using it are largely doing “we’ve sped up this lengthy process”, whereas the tech companies largely seem to be going “we made this thing you like worse”.

An eye-opener for me was a call with colleagues in... let's just call it "country with much worse work-life balance rules". I've been quite critical of gen AI tools we've been testing, because the output quality is unacceptable and cleaning it up is as much work as writing from scratch. But...

Difficult (for me at least) to resist the conclusion they kind of need to start again, by which I mean train on much more tightly defined bodies of information. Ofc that implies a more limited product but c’est la vie.

So the challenge there is you kind of need to add a lot of low quality text to your training set - to use a human analogy, someone who has only ever read engineering textbooks will be a terrible writer *even of engineering textbooks*. But LLM-type AI can't distinguish between text and knowledge

Reading Kurt Vonnegut has probably made me a better writer even for boring scientific texts, but that's because I can learn from the style of a great writer without simultaneously learning that soldiers in WWII can become unstuck in time and get abducted by aliens

Presumably this is why we’re seeing tie-ups with e.g. the FT. High quality journalism provides both text and knowledge.

Even there, you need two different layers.* You could build a Stephen Bush simulator that will output "Rishi Sunak is not up to the job" forever, but it would struggle to analyse Starmer as PM in 2025. Its idea of what political writing looks like is tied up with what writing about Sunak looks like

* AIUI something like a grammar/style layer that understands words in terms of groups they belong to and how they fit together, and then a Bayesian knowledge/logic layer that encodes concepts and their relationships. A translator then gives the style layer input based on the logic layer's output
