
How did you run the benchmark: zero-shot or few-shot? I think a fair comparison would be Llama-7B, which averaged ~35% with 5-shot prompting.


5-shot prompting.



