Hacker News
Language Models (Mostly) Know What They Know (arxiv.org)
43 points by PaulHoule on July 13, 2022 | hide | past | favorite | 6 comments


Authors from Anthropic, a splinter group from OpenAI

> With this fundraise, we’re going to explore the predictable scaling properties of machine learning systems, while closely examining the unpredictable ways in which capabilities and safety issues can emerge at scale

https://venturebeat.com/2022/06/27/10-new-ai-unicorns-flying...

Estimating confidence in predictions is an important safety issue. It's usually very hard to do correctly for small models. Fortunately, large models have read everything, so they have fewer unknown unknowns.


What I’ve found for simpler models is that probability calibration is the difference between a model you can use and ‘nothing more to see folks, please move on…’

I’ve found that calibration frequently isn’t that hard to do, but mostly people don’t do it (e.g. they announce that their model is 99% accurate for a disease that occurs 1 in 10,000, to which the answer is ‘you can beat that accuracy by saying nobody has the disease’), or maybe the model sucks (e.g. a calibrated full-text retrieval system will never claim its results are better than 70% likely to be relevant).
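
The base-rate point in the parenthetical can be checked in a few lines; the numbers here are the commenter's hypothetical, not figures from the paper:

```python
# Hypothetical base-rate check: for a disease with 1-in-10,000 prevalence,
# the trivial classifier that predicts "no disease" for everyone is already
# 99.99% accurate, beating a model advertised as "99% accurate".
prevalence = 1 / 10_000            # assumed prevalence from the comment
model_accuracy = 0.99              # the advertised figure
trivial_accuracy = 1 - prevalence  # always predict the majority class

print(f"trivial: {trivial_accuracy:.4%}, model: {model_accuracy:.2%}")
print(trivial_accuracy > model_accuracy)  # True
```

This is why accuracy alone is meaningless on heavily imbalanced problems, and why a calibrated probability estimate is more informative than a headline accuracy number.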


I was expecting the proposed approach to be a prompt hack that gets the model to output a well-calibrated confidence value as text.

"How confident are you in your answer? Let's calibrate carefully."



While I appreciate findings like this, it feels more like alchemy than science: "Oh, we found this prompt X and it works better than prompt Y, but pretty likely there is at least one prompt Z, which we don't know yet, that works even better." In addition, benchmark evaluation datasets are cool and everything, but they represent real-world LM application environments only to a very limited extent.


> it feels more like alchemy than science.

To capture the same idea, I’d say it feels more like science than engineering. The whole point is that we don’t know how it all works yet; hence the science.



