The Stylite: Difficult (for me at least) to resist the conclusion they kind of need to start again by which I mean train on much more tightly defined bodies of information. Ofc that implies a more limited product but c’est la vie.

How can social media organisations possibly be expected to compete with bots when they have become so startlingly realistic?

Facebook is good again

Seriously though, I find it so difficult to understand why the largest website in the world can’t see that a new account whose only post is ‘daddy, I lonely tonight and need a strong man for company, DM me’ might be worthy of a temporary activity pause?

One of the most fascinating things about generative AI is that the businesses and states that are using it are largely doing “we’ve sped up this lengthy process”, whereas the tech companies largely seem to be going “we made this thing you like worse”.

An eye-opener for me was a call with colleagues in... let's just call it "country with much worse work-life balance rules". I've been quite critical of gen AI tools we've been testing, because the output quality is unacceptable and cleaning it up is as much work as writing from scratch. But...

Difficult (for me at least) to resist the conclusion they kind of need to start again by which I mean train on much more tightly defined bodies of information. Ofc that implies a more limited product but c’est la vie.

Yes the genAI and indeed much of the ML community have been frankly childish in their willy-nilly use of data. The use of focused smaller language models (based on curated data sets relevant to a domain) is in my mind the important next step.

It strikes me as a classic sales phenomenon. Plus my hunch is there’s a deeper problem relating to efforts to continue to expand a very overvalued US tech bubble. Also see the increased politicization of some of the tech folks.

absolutely this. I think it is part of a continuum with the enshittification of financial products since the mid-80s. (currently trying to formulate these vague parallels into a cohesive narrative....)

That doesn’t really play out though does it? Even in scenarios where you’re literally just feeding it meticulously accurate code libraries - it still produces meaningless incorrect statistical hodge podges - garbage code. They’re statistical autocomplete machines. The garbage bullshit is baked in.

So the challenge there is you kind of need to add a lot of low quality text to your training set - to use a human analogy, someone who has only ever read engineering textbooks will be a terrible writer *even of engineering textbooks*. But LLM-type AI can't distinguish between text and knowledge

Reading Kurt Vonnegut has probably made me a better writer even for boring scientific texts, but that's because I can learn from the style of a great writer without simultaneously learning that soldiers in WWII can become unstuck in time and get abducted by aliens

Presumably this is why we’re seeing tie ups with eg the FT. High quality journalism provides both text and knowledge.

Even there, you need two different layers.* You could build a Stephen Bush simulator that will output "Rishi Sunak is not up to the job" forever, but it would struggle to analyse Starmer as PM in 2025. Its idea of what political writing looks like is tied up with what writing about Sunak looks like

* AIUI something like a grammar/style layer that understands words in terms of groups they belong to and how they fit together, and then a Bayesian knowledge/logic layer that encodes concepts and their relationships. A translator then gives the style layer input based on the logic layer's output

This is what my (uninformed, largely ignorant) brain has kind of dwelled on now and again - more controlled, specialist, legally safe models seem like… something? But I guess they’ll never get the trillion dollar cash-grab valuations.

There might be a problem with saying “we only use verified data” because [goose meme] who verified it, motherfucker? If you say “it’ll search all of human writing” you can wash your hands of copyright laws to a certain degree - it just reads the web!

And i suppose leading on from Stephen’s point about Lowe’s - the Telegraph or Times or FT have large, proprietary data sets that could be used to train eg the automatic application of the style guide. But that’s not as sexy to investors.

My hunch is that the best application for it is not answering yes / no ‘truth’ questions but in powering believable virtual reality / gaming environments.

I think the thing it can speed up is turning an address written in text in an email into a database entry, no matter where people have put commas or line breaks. But also: I could have written you a programme to do that in 1998.
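For what it's worth, a rough sketch of the 1998-style version, nothing definitive. Assumptions that are mine and not from the thread: UK-style postcodes, a hypothetical `parse_address` helper, and a flat record with `lines` and `postcode` fields. It only normalises commas and line breaks; it makes no attempt to tell a street from a town when nothing separates them.

```python
import re

# Assumption: UK-style postcodes such as "SW1A 2AA".
# Other countries would need their own pattern.
POSTCODE_RE = re.compile(
    r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE
)


def parse_address(text: str) -> dict:
    """Turn a free-text address into a flat record.

    Commas and line breaks are treated as interchangeable separators,
    so it does not matter where the sender put them.
    """
    postcode = None
    match = POSTCODE_RE.search(text)
    if match:
        # Normalise to "OUTWARD INWARD" form and cut it out of the text.
        postcode = f"{match.group(1).upper()} {match.group(2).upper()}"
        text = text[:match.start()] + "," + text[match.end():]

    # Split on any run of commas or line breaks, then tidy whitespace.
    parts = [p.strip() for p in re.split(r"[,\n\r]+", text)]
    lines = [re.sub(r"\s+", " ", p) for p in parts if p]
    return {"lines": lines, "postcode": postcode}
```

Calling `parse_address("10 Downing Street,\nLondon SW1A 2AA")` gives the address lines and the postcode as separate fields, ready to go into a table. The part this doesn't handle (a street and town run together with no separator at all) is arguably where the newer tools earn their keep.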

And ultimately the problem with AI is that it has no worthwhile use case beyond anything I could have written in a programme in 1998.

It has lots of worthwhile use cases, just they are mostly quite boring.

Eh, my experience is GitHub copilot is decent. The autocompletes can be helpful if you know what you’re doing. Also good for writing throwaway scripts faster.

More DIVERSE products. Training costs still need to drop a bit more, and software suites to generate synthetic data need to improve a bit more still, but once there, we'll be able to build LLMs and build a whole internet of them. Web3.0 or maybe not even "web" anymore. More detail follows.

Large language models will not just yield monolithic systems that try to distill every piece of content they can. Personal models will be a major component of the way we interact with content in the future.

Transforming the Web with Artificial Intelligence (medium.com): The future of the web is a living representation of our content.
