Enhancing V-SLAM Keyframe Selection with an Efficient ConvNet for Semantic Analysis

Abstract

Selecting relevant visual information from a video is a challenging task on its own, and even more so in robotics, where computational resources are tightly restricted. This work proposes a novel keyframe selection strategy based on image quality and semantic information, which improves on the strategies currently used in Visual-SLAM (V-SLAM). Commonly used V-SLAM methods select keyframes based only on relative displacement and the number of tracked feature points. Selecting these keyframes more carefully allows robotic systems to make better use of them. We show that, at minimal computational cost, our selection includes more relevant keyframes, which are useful for additional subsequent recognition tasks, without penalizing the existing ones, mainly place recognition. A key ingredient is our novel CNN architecture, which runs a quick semantic image analysis on the robot's onboard CPU. It provides sufficient accuracy while running significantly faster than related works. We validate our hypothesis on several public datasets containing challenging robotic data.
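
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the thresholds, and the rule for combining the scores are all assumptions made for illustration. It only shows how a quality and semantic criterion might add keyframes on top of a displacement and tracked-points baseline without discarding any keyframe the baseline would already select.

```python
from dataclasses import dataclass


@dataclass
class FrameInfo:
    """Per-frame measurements (all fields are hypothetical placeholders)."""
    displacement: float   # relative displacement since the last keyframe
    tracked_ratio: float  # fraction of feature points still tracked, in [0, 1]
    quality: float        # image-quality score in [0, 1], e.g. a sharpness measure
    semantic: float       # semantic relevance score in [0, 1], e.g. from a CNN


def geometric_keyframe(f: FrameInfo, max_disp: float = 0.5,
                       min_tracked: float = 0.6) -> bool:
    """Baseline rule: large displacement or too few tracked points.

    The threshold values are assumed, not taken from the paper.
    """
    return f.displacement > max_disp or f.tracked_ratio < min_tracked


def select_keyframe(f: FrameInfo, min_quality: float = 0.4,
                    min_semantic: float = 0.7) -> bool:
    """Keep every baseline keyframe; add frames that are sharp and relevant."""
    if geometric_keyframe(f):
        return True
    return f.quality >= min_quality and f.semantic >= min_semantic


frames = [
    FrameInfo(0.8, 0.9, 0.1, 0.0),  # selected by the baseline rule
    FrameInfo(0.1, 0.9, 0.8, 0.9),  # added: good quality, relevant content
    FrameInfo(0.1, 0.9, 0.2, 0.9),  # rejected: relevant but low quality
]
selected = [select_keyframe(f) for f in frames]
```

Because the baseline rule is checked first, the set of selected keyframes in this sketch is always a superset of the baseline set, which mirrors the abstract's claim that existing keyframes are not penalized.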

Publication
International Conference on Robotics and Automation (ICRA 2019).
Luis Riazuelo
Assistant professor