Hallucinations are a predictable byproduct of how models are tested and rewarded during training. Models guess rather than admit uncertainty because the scoring rewards accuracy, not honesty.
This report is a milestone because it shifts hallucinations from a mystical flaw, something inevitable and poorly understood, to a solvable engineering problem. It means LLMs can be trained to say "I don't know" if, and only if, we reward them for it.
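A minimal sketch of that incentive argument, assuming a toy setting where the model assigns probability p to its best guess being correct. The scoring values (1 for a correct answer, a penalty for a wrong guess, 0 for "I don't know") and the helper `best_policy` are illustrative, not taken from any particular benchmark or from the report itself.

```python
# Toy illustration: under accuracy-only grading, guessing always beats
# abstaining; once wrong answers are penalized, abstaining wins at low confidence.

def best_policy(p: float, wrong_penalty: float) -> str:
    """Return the score-maximizing behaviour under a given grading scheme."""
    expected_guess = p * 1.0 - (1.0 - p) * wrong_penalty  # +1 if right, -penalty if wrong
    expected_abstain = 0.0                                # "I don't know" scores 0
    return "guess" if expected_guess > expected_abstain else "say 'I don't know'"

for p in (0.9, 0.5, 0.1):
    print(f"confidence {p:.0%}: "
          f"accuracy-only grading -> {best_policy(p, wrong_penalty=0.0)}, "
          f"penalized grading -> {best_policy(p, wrong_penalty=1.0)}")
```

With no penalty for wrong answers the expected score of guessing is positive for any nonzero confidence, so the model is always better off guessing; adding a penalty (or credit for abstaining) makes "I don't know" the optimal answer whenever confidence is low.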
Now that we know why, we can actually do something about it.