That's not a huge problem when dealing with a limited corpus. The LLM pieces together words, which you can then tokenize, vectorize, and compare against the same info in the limited dataset it is supposed to talk about. If the LLM goes off into la-la land, have it try again.
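A toy sketch of that check, assuming a bag-of-words vectorizer as a stand-in for real embeddings (`vectorize`, `best_corpus_match`, and the tiny corpus here are all made up for illustration):

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words: lowercase word counts (a toy stand-in for real embeddings).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_corpus_match(answer, corpus):
    # Compare the LLM's answer to every document in the limited corpus
    # and return the best similarity score; a low score means "try again".
    av = vectorize(answer)
    return max(cosine(av, vectorize(doc)) for doc in corpus)

corpus = ["tomato cheese pizza with wheat crust",
          "pepperoni pizza with tomato sauce"]
print(best_corpus_match("pepperoni pizza with cheese", corpus))  # scores high
print(best_corpus_match("glue keeps the cheese on", corpus))     # scores low
```

In practice you'd use real sentence embeddings rather than word counts, but the shape of the loop is the same: vectorize the output, compare it to the corpus, retry on a weak match.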
That’s where I get lost: the part where it lies and you have to say “no, you’re lying, stop that and try again,” which requires that you already know when it is lying.
If I already know when it’s lying, why did I need to ask it in the first place?
Yeah, if I had been able to just ask JSTOR to find me evidence supporting my hypothesis that the earth is flat, copy the garbage it spat out, check the references it generated to make sure the URLs load, and call it good, I might have graduated having learned literally nothing but how to use a chatbot.
The computer checks its own output. For example, say you have a corpus of recipes: documents will be strong in dimensions related to edible things. Tomatoes? Yes. Cheese? Yes. Wheat? Yes. Pepperoni? Yes.
Glue? That's an office supply, not part of a recipe. Computer tells bot to try again.
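The simplest version of that gate is just vocabulary membership: flag any word the recipe corpus has never seen. This is a hypothetical sketch (`corpus_vocab`, `check_answer`, and the one-line corpus are invented for illustration):

```python
def corpus_vocab(corpus):
    # Every word that appears anywhere in the recipe corpus.
    return {w for doc in corpus for w in doc.lower().split()}

def check_answer(answer, vocab):
    # Collect words the corpus has never seen -- "glue" fails this test
    # in a recipe corpus, so the computer tells the bot to try again.
    return [w for w in answer.lower().split() if w not in vocab]

recipes = ["tomato cheese wheat pepperoni pizza"]
vocab = corpus_vocab(recipes)
print(check_answer("pepperoni cheese pizza", vocab))  # [] -> accept
print(check_answer("glue cheese pizza", vocab))       # ['glue'] -> retry
```

A real system would work on embedding dimensions rather than literal word sets, but the accept/retry decision is the same idea.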
Of course, but for a wrong response to get through, it has to align closely enough with a vector that exists in the corpus for the system to consider it true. What "angle" off ground truth is allowed is subjective.
Ah, that's part of the computer's job, not the user's. Figuring out how close you are to "ground truth" is hard in a generic context but easier when dealing with a limited corpus.
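One way to make that allowed "angle" concrete is a similarity threshold the operator tunes. A sketch using word-set (Jaccard) overlap; the function names and the cutoff values are assumptions for illustration, not recommendations:

```python
def jaccard(a, b):
    # Word-set overlap: 1.0 means identical vocabulary, 0.0 means disjoint.
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def is_grounded(answer, corpus, threshold):
    # "Close enough to ground truth" is whatever the threshold says it is --
    # that knob is exactly the subjective part.
    return max(jaccard(answer, doc) for doc in corpus) >= threshold

corpus = ["tomato cheese pepperoni pizza"]
answer = "cheese pizza with extra pepperoni"
for t in (0.2, 0.5, 0.8):
    print(t, is_grounded(answer, corpus, t))
```

Loosen the threshold and more hallucinations slip through; tighten it and correct-but-rephrased answers get rejected. Picking it is a judgment call either way.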