
mnkv on April 19, 2023 \| on: StableLM: A new open-source language model

How did you run the benchmarking, zero-shot or few-shot? I think a fair comparison would be Llama-7B, which got an average of ~35% with 5-shot prompting.

5-shot prompting.