Sphinx: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting.
CoRR(2024)
摘要
Despite the remarkable success of LLMs in English, there is a significant gap in performance in non-English languages. In order to address this, we introduce a novel recipe for creating a multilingual synthetic instruction tuning dataset, sPhinX, which is created by selectively translating instruction response pairs from English into 50 languages. We test the effectiveness of sPhinx by using it to fine-tune two state-of-the-art models, Mistral-7B and Phi-Small and then evaluating them across a comprehensive suite of multilingual benchmarks that test reasoning, question answering, reading comprehension and machine translation. Our results show that Mistral-7B and Phi-Small fine-tuned with sPhinX perform better on an average by 5 compared to the base variants of these models. We also devise a strategy to incorporate N-shot examples in each fine-tuning sample which further boosts the performance of these models by 9 to vanilla fine-tuning. To show efficacy of our data curation approach, we also directly translate our original dataset to the target languages, and observe an increase of 7 other multilingual instruction tuning datasets in both efficiency and diversity, reducing dataset creation costs. It also maintains strong performance on standard English LLM benchmarks, with minimal regression.
更多查看译文
AI 理解论文
溯源树
样例
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
生成溯源树,研究论文发展脉络
数据免责声明
页面数据均来自互联网公开来源、合作出版商和通过AI技术自动分析结果,我们不对页面数据的有效性、准确性、正确性、可靠性、完整性和及时性做出任何承诺和保证。若有疑问,可以通过电子邮件方式联系我们:report@aminer.cn