CoNVOI: Context-aware Navigation Using Vision Language Models in Outdoor and Indoor Environments

2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2024)

Abstract

We present CoNVOI, a novel method for autonomous robot navigation in real-world indoor and outdoor environments using Vision Language Models (VLMs). We employ VLMs in two ways: first, we leverage their zero-shot image classification capability to identify the context or scenario (e.g., indoor corridor, outdoor terrain, crosswalk, etc.) of the robot's surroundings, and formulate context-based navigation behaviors as simple text prompts (e.g., “stay on the pavement”). Second, we utilize their state-of-the-art semantic understanding and logical reasoning capabilities to compute a suitable trajectory given the identified context. To this end, we propose a novel multi-modal visual marking approach to annotate the obstacle-free regions in the RGB image used as input to the VLM with numbers, by correlating it with a local occupancy map of the environment. The marked numbers ground image locations in the real world, direct the VLM's attention solely to navigable locations, and elucidate the spatial relationships between them and terrains depicted in the image to the VLM. Next, we query the VLM to select numbers on the marked image that satisfy the context-based behavior text prompt, and construct a reference path using the selected numbers. Finally, we propose a method to extrapolate the reference trajectory when the robot's environmental context has not changed, to prevent unnecessary VLM queries. We use the reference trajectory to guide a motion planner, and demonstrate that it leads to human-like behaviors (e.g., not cutting through a group of people, using crosswalks, etc.) in various real-world indoor and outdoor scenarios.
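
The marking and path-construction steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`mark_free_cells`, `build_reference_path`), the grid encoding (0 = free, 1 = occupied), the `stride` and `resolution` parameters, and the hard-coded list of selected numbers are all assumptions made for illustration. The projection of markers onto the RGB image and the VLM query itself are omitted.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, col) index into the occupancy grid


def mark_free_cells(grid: List[List[int]], stride: int = 2) -> Dict[int, Cell]:
    """Assign consecutive numbers to free cells, sampled every `stride` cells.

    Assumption: grid values are 0 for free space and 1 for obstacles.
    """
    markers: Dict[int, Cell] = {}
    next_id = 1
    for r in range(0, len(grid), stride):
        for c in range(0, len(grid[r]), stride):
            if grid[r][c] == 0:  # only obstacle-free cells receive a number
                markers[next_id] = (r, c)
                next_id += 1
    return markers


def build_reference_path(
    markers: Dict[int, Cell],
    selected: List[int],
    resolution: float = 0.5,
) -> List[Tuple[float, float]]:
    """Convert selected marker numbers into metric waypoints.

    Numbers that do not correspond to a marker are skipped, since a
    language model may return labels that were never drawn.
    `resolution` is a hypothetical cell size in metres.
    """
    return [
        (markers[n][0] * resolution, markers[n][1] * resolution)
        for n in selected
        if n in markers
    ]


if __name__ == "__main__":
    occupancy = [
        [0, 0, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 0],
    ]
    markers = mark_free_cells(occupancy, stride=2)
    # Stand-in for the VLM's answer; 9 is deliberately not a valid marker.
    selected_numbers = [1, 2, 3, 9]
    print(build_reference_path(markers, selected_numbers))
```

In this sketch the numbers act as the shared vocabulary between the map and the model: the model only ever names a number, and the lookup table turns that number back into a location a planner can use.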

Keywords

Visual Question Answering, Object Recognition, Geovisualization, Image Captioning, Language Understanding
