Towards Collecting Royalties for Copyrighted Data for Generative Models
IEEE International Conference on Web Services (2024)
Abstract
Addressing copyrighted data in the context of generative models has become an important issue for content creators, publishers, organizations training generative models, and those who deploy generative models for particular applications. Copyright holders want to ensure that they are fairly compensated for their work, and users of training data and models do not want to expose themselves to litigation. However, traditional models of bulk-licensing data fit the context of model training only poorly. In this paper, we discuss why a traditional data license is not always a good fit, how data is used in the life-cycle of generative models, and what impact data has on model output. This can be used as a foundation for a pay-per-(model)use compensation scheme based on how data contributes to a model’s output. Compensating copyright holders in this way reduces risk for model trainers, avoids large upfront investments, and fosters a lively data ecosystem in which the creation and distribution of original work is encouraged and fairly compensated.
Keywords
Generative AI, large language models, copyright, royalty, licensing, data