
Had a chat the other day about using AI to extract documents with certain characteristics from a large set of documents, and how this differs from question answering and text generation. If the corpus is very large, false positives and negatives might not matter much, so there's less at risk than with the factual errors that occur in QA or text generation.
The human labor shifts to a different part of the operation, though: training the system to recognize the diverse documents you hope to extract.