PEM: Prototype-based Efficient MaskFormer for Image Segmentation

Computer Vision and Pattern Recognition (2024)

Abstract

Recent transformer-based architectures have shown impressive results in the field of image segmentation. Thanks to their flexibility, they obtain outstanding performance in multiple segmentation tasks, such as semantic and panoptic, under a single unified framework. To achieve such impressive performance, these architectures employ intensive operations and require substantial computational resources, which are often not available, especially on edge devices. To fill this gap, we propose Prototype-based Efficient MaskFormer (PEM), an efficient transformer-based architecture that can operate in multiple segmentation tasks. PEM proposes a novel prototype-based cross-attention which leverages the redundancy of visual features to restrict the computation and improve the efficiency without harming the performance. In addition, PEM introduces an efficient multi-scale feature pyramid network, capable of extracting features that have high semantic content in an efficient way, thanks to the combination of deformable convolutions and context-based self-modulation. We benchmark the proposed PEM architecture on two tasks, semantic and panoptic segmentation, evaluated on two different datasets, Cityscapes and ADE20K. PEM demonstrates outstanding performance on every task and dataset, outperforming task-specific architectures while being comparable to, and even better than, computationally expensive baselines.

Keywords

Computer Vision, Efficient Neural Network, Image Segmentation
