Privacy-Aware Visual Language Models
CoRR(2024)
摘要
This paper aims to advance our understanding of how Visual Language Models
(VLMs) handle privacy-sensitive information, a crucial concern as these
technologies become integral to everyday life. To this end, we introduce a new
benchmark PrivBench, which contains images from 8 sensitive categories such as
passports, or fingerprints. We evaluate 10 state-of-the-art VLMs on this
benchmark and observe a generally limited understanding of privacy,
highlighting a significant area for model improvement. Based on this we
introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs
with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa
and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability
to recognize sensitive content, outperforming even GPT4-V. At the same time, we
show that privacy-tuning only minimally affects the VLMs performance on
standard benchmarks such as VQA. Overall, this paper lays out a crucial
challenge for making VLMs effective in handling real-world data safely and
provides a simple recipe that takes the first step towards building
privacy-aware VLMs.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
数据免责声明
页面数据均来自互联网公开来源、合作出版商和通过AI技术自动分析结果,我们不对页面数据的有效性、准确性、正确性、可靠性、完整性和及时性做出任何承诺和保证。若有疑问,可以通过电子邮件方式联系我们:report@aminer.cn