Avatar
Nothing is sacred anymore.
Avatar
can’t wait for it to make up shit and students to fail their papers for not reading their citations
Avatar
That's not a huge problem when dealing with a limited corpus. The LLM pieces together words, which you can then tokenize, vectorize, and compare to the same info in the limited dataset it is supposed to talk about. If the LLM goes off into la-la land, have it try again.
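A minimal sketch of the check described above, using a plain bag-of-words vectorizer and cosine similarity as stand-ins for whatever tokenizer and embedding a real system would use; `check_answer` and the 0.3 threshold are illustrative, not a real API:

```python
import math
from collections import Counter

def vectorize(text, vocab):
    """Bag-of-words vector of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def check_answer(answer, corpus_docs, threshold=0.3):
    """True if the answer lands close enough to some document in the
    limited corpus; if False, the caller tells the LLM to try again."""
    vocab = sorted({w for doc in corpus_docs for w in doc.lower().split()})
    ans_vec = vectorize(answer, vocab)
    return any(cosine(ans_vec, vectorize(doc, vocab)) >= threshold
               for doc in corpus_docs)
```

An answer built from corpus vocabulary scores high; one made of words the corpus has never seen scores near zero and triggers a retry.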
Avatar
So you’re saying the trick is to use the lie machine but not trust it any farther than you can throw it?
Avatar
YES! That is a nice way of putting it. Checking the output isn't easy but we're coming up with ways of doing it. Kind of exciting!
Avatar
That’s where I get lost: the part where it lies and you have to say “no, you’re lying, stop that and try again,” which requires that you know when it is lying. If I already know when it’s lying, why did I need to ask in the first place?
Avatar
so it can lie and make stuff up? which means students can just believe it and not check their sources? soooo…
Avatar
Yeah, if I'd been able to just ask JSTOR to find me evidence supporting my hypothesis that the earth is flat, copy the garbage it spat out, check the references it generated to make sure the URLs load, and call it good, I might have graduated having learned literally nothing but how to use a chatbot.
Avatar
The computer checks its own output. Example being if you had a corpus of recipes. Documents will be strong in dimensions related to edible things. Tomatoes? Yes. Cheese? Yes. Wheat? Yes. Pepperoni? Yes. Glue? That's an office supply, not part of a recipe. Computer tells bot to try again.
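The recipe example above can be sketched as a vocabulary check plus a retry loop. The corpus vocabulary stands in for the "dimensions" a real embedding would use; the corpus, the `generate` callable, and all function names here are hypothetical:

```python
# Toy recipe corpus; in practice these would be real documents.
RECIPE_CORPUS = [
    "tomato sauce mozzarella cheese wheat dough pepperoni basil",
    "tomato cheese pasta wheat garlic olive oil",
]

def corpus_vocab(docs):
    """Every word the limited corpus has ever seen."""
    return {word for doc in docs for word in doc.split()}

def validate_ingredients(ingredients, vocab):
    """Return the generated ingredients the corpus has never seen
    (e.g. 'glue' -- an office supply, not part of a recipe)."""
    return [ing for ing in ingredients if ing not in vocab]

def ask_with_retry(generate, vocab, max_tries=3):
    """Call a hypothetical `generate()` until its output passes the check;
    the computer tells the bot to try again on failure."""
    for _ in range(max_tries):
        answer = generate()
        if not validate_ingredients(answer, vocab):
            return answer
    return None  # give up rather than serve glue pizza
```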
Avatar
This also is how police stay incorruptible
Avatar
It's always cute seeing an enthusiastic idiot
Avatar
Of course, but for it to spit out a wrong response, that response has to align with a vector in the corpus closely enough for the system to consider it true. What "angle" is allowed off ground truth is subjective.
Avatar
Ah, that's part of the computer's job, not the user's. Figuring out how close you are to "ground truth" is hard in a generic context but easier when dealing with a limited corpus.
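One way to see how subjective that "angle" off ground truth is: the same answer passes or fails depending on where you set the cosine-similarity cutoff. The vectors and both thresholds below are made up purely for illustration:

```python
import math

def cosine(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Two toy document vectors: close in direction, but not identical.
truth = [3.0, 1.0, 0.0]
answer = [2.0, 1.5, 0.5]

sim = cosine(truth, answer)      # roughly 0.93
# The cutoff is a tuning knob, not a fact about the data:
strict_ok = sim >= 0.99          # a strict threshold rejects this answer
loose_ok = sim >= 0.80           # a lenient one accepts it
```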