Dorian Gálvez López
Software: DBoW2: Enhanced hierarchical bag-of-word library for C++

Download

DBoW2-1.0.2.tar.gz (2013-02-07, 633 KB), includes demo application

Documentation

Updates

Version 1.0.2:

  • Compatible with OpenCV 2.4.3.
  • Fixed a bug with the score flag when copying vocabularies.

Version 1.0.1:

  • Bug when loading a database fixed.
  • Minor problems when installing lib files solved.

Overview

DBoW2 is an improved version of the DBow library, an open source C++ library for indexing images and converting them into a bag-of-words representation [3]. It implements the hierarchical tree described in [4] to approximate nearest neighbours in the image feature space and to create a visual vocabulary. DBoW2 also implements an image database with inverted and direct files to index images and enable quick queries and feature comparisons. The main differences from the previous DBow library are:

  • DBoW2 classes are templated, so it can work with any type of descriptor.
  • DBoW2 is shipped with classes to directly work with SURF64 or BRIEF descriptors.
  • DBoW2 adds a direct file to the image database to do fast feature comparison. This is used by DLoopDetector.
  • DBoW2 no longer uses a binary format. Instead, it uses the OpenCV storage system to save vocabularies and databases. This means that these files can be stored as plain text in YAML format, which makes compatibility easier, or compressed in gzip format (.gz) to reduce disk usage.
  • Some pieces of code have been rewritten to optimize speed. The interface of DBoW2 has been simplified.
  • For performance reasons, DBoW2 does not support stop words.
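
The hierarchical tree mentioned above can be illustrated with a small sketch. It is not DBoW2 code: the 8-bit Descriptor type, the Node layout and the toy tree built by makeToyTree are assumptions made for this example. It only shows the general idea of converting a descriptor into a word by descending the tree and picking the closest child (here by Hamming distance) at each level.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 8-bit binary descriptor; real BRIEF descriptors are much longer.
using Descriptor = std::bitset<8>;

// One node of a toy vocabulary tree. Only leaves carry a word id.
struct Node {
  Descriptor centre;                  // cluster centre represented by this node
  std::vector<std::size_t> children;  // indices into the node array (empty for leaves)
  int word_id = -1;                   // valid only for leaves
};

// Hamming distance between two binary descriptors.
inline std::size_t hamming(const Descriptor& a, const Descriptor& b) {
  return (a ^ b).count();
}

// Descends from the root (index 0), choosing the closest child at each level,
// and returns the word id of the leaf that is reached.
int lookupWord(const std::vector<Node>& tree, const Descriptor& d) {
  assert(!tree.empty());
  std::size_t current = 0;
  while (!tree[current].children.empty()) {
    std::size_t best = tree[current].children.front();
    std::size_t best_dist = hamming(tree[best].centre, d);
    for (std::size_t child : tree[current].children) {
      const std::size_t dist = hamming(tree[child].centre, d);
      if (dist < best_dist) {
        best = child;
        best_dist = dist;
      }
    }
    current = best;
  }
  return tree[current].word_id;
}

// Builds a two-level tree with branching factor 2 (four words) for illustration.
std::vector<Node> makeToyTree() {
  std::vector<Node> t(7);
  t[0].children = {1, 2};
  t[1].centre = Descriptor("00001111"); t[1].children = {3, 4};
  t[2].centre = Descriptor("11110000"); t[2].children = {5, 6};
  t[3].centre = Descriptor("00000011"); t[3].word_id = 0;
  t[4].centre = Descriptor("00001100"); t[4].word_id = 1;
  t[5].centre = Descriptor("00110000"); t[5].word_id = 2;
  t[6].centre = Descriptor("11000000"); t[6].word_id = 3;
  return t;
}
```

With this toy tree, a lookup compares the descriptor against two centres per level instead of against all four words, which is where the approximation and the speed-up come from.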

DBoW2 requires OpenCV. The BRIEF version also requires the boost::dynamic_bitset class.

DBoW2, along with DLoopDetector, has been tested on several real datasets, yielding an execution time of 3 ms to convert the BRIEF features of an image into a bag-of-words vector and 5 ms to look for image matches in a database with more than 19000 images. See [1] for more information.
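
To make the roles of the inverted and direct files more concrete, here is a minimal sketch of both structures. It does not reproduce DBoW2's classes: ToyDatabase, its member names and the dot-product accumulation are assumptions chosen for brevity, and the direct file is keyed by word id as a simplification. The inverted file answers "which images contain this word?", and the direct file answers "which features of this image fell into this word?".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using WordId = unsigned;
using EntryId = unsigned;
using BowVector = std::map<WordId, double>;  // word id -> weight

// Hypothetical toy database, not the DBoW2 TemplatedDatabase class.
class ToyDatabase {
 public:
  // Stores one image. bow is its bag-of-words vector and feature_words[i]
  // is the word assigned to its i-th local feature.
  EntryId add(const BowVector& bow, const std::vector<WordId>& feature_words) {
    const EntryId id = static_cast<EntryId>(direct_.size());
    for (const auto& item : bow)
      inverted_[item.first].emplace_back(id, item.second);
    std::map<WordId, std::vector<std::size_t>> by_word;
    for (std::size_t i = 0; i < feature_words.size(); ++i)
      by_word[feature_words[i]].push_back(i);
    direct_.push_back(by_word);
    return id;
  }

  // Inverted file query: only images sharing at least one word with the
  // query are visited. The accumulated value is a plain dot product.
  std::map<EntryId, double> query(const BowVector& q) const {
    std::map<EntryId, double> scores;
    for (const auto& item : q) {
      const auto row = inverted_.find(item.first);
      if (row == inverted_.end()) continue;
      for (const auto& hit : row->second)
        scores[hit.first] += item.second * hit.second;
    }
    return scores;
  }

  // Direct file lookup: indices of the features of image id assigned to word.
  std::vector<std::size_t> featuresOf(EntryId id, WordId word) const {
    assert(id < direct_.size());
    const auto it = direct_[id].find(word);
    if (it == direct_[id].end()) return std::vector<std::size_t>();
    return it->second;
  }

 private:
  std::map<WordId, std::vector<std::pair<EntryId, double>>> inverted_;
  std::vector<std::map<WordId, std::vector<std::size_t>>> direct_;
};
```

Once a query returns a candidate image, the direct file restricts feature comparison to features that share a word, instead of comparing every feature of one image against every feature of the other.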

License

DBoW2 is published under a CC-BY-NC-SA license. To obtain a commercial license, contact me.

Citing

If you use this software in an academic work, please cite:

@ARTICLE{GalvezTRO12,
    author={Galvez-Lopez, Dorian and Tardos, J. D.}, 
    journal={IEEE Transactions on Robotics},
    title={Bags of Binary Words for Fast Place Recognition in Image Sequences},
    year={2012},
    month={October},
    volume={28},
    number={5},
    pages={1188--1197},
    doi={10.1109/TRO.2012.2197158},
    ISSN={1552-3098}
}

Installation notes

DBoW2 requires OpenCV. The BRIEF version also requires the boost::dynamic_bitset class. You can install Boost by typing:

  $ sudo apt-get install libboost-dev
  

To build the library and try the demo, just type:

  $ make
  $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./lib
  $ ./demo
  
To install the library in your system (/usr/local by default), type:

  $ make install
  

Usage notes

Weighting and Scoring

DBoW2 implements the same weighting and scoring mechanisms as DBow; see the DBow documentation for details. The only difference is that DBoW2 scales all scores to the range [0..1], so the scaling flag is no longer used.
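
As an illustration of a score that lies in [0..1], the sketch below computes the L1 similarity described in [1], s(a, b) = 1 - 0.5 * | a/|a| - b/|b| |, on sparse bag-of-words vectors. The BowVector alias and the function names are assumptions for this example and are not taken from the library.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Sparse bag-of-words vector: word id -> weight (hypothetical alias).
using BowVector = std::map<unsigned, double>;

// Returns a copy of v scaled so that its absolute weights sum to 1.
BowVector l1Normalize(const BowVector& v) {
  double norm = 0.0;
  for (const auto& item : v) norm += std::fabs(item.second);
  assert(norm > 0.0);  // an empty or all-zero vector cannot be normalized
  BowVector out;
  for (const auto& item : v) out[item.first] = item.second / norm;
  return out;
}

// L1 similarity between two vectors: 1 for identical normalized vectors,
// 0 for vectors with no word in common.
double l1Score(const BowVector& a, const BowVector& b) {
  const BowVector na = l1Normalize(a);
  const BowVector nb = l1Normalize(b);
  double diff = 0.0;
  auto ia = na.begin();
  auto ib = nb.begin();
  // Merge-style walk over both sorted maps.
  while (ia != na.end() && ib != nb.end()) {
    if (ia->first == ib->first) {
      diff += std::fabs(ia->second - ib->second);
      ++ia;
      ++ib;
    } else if (ia->first < ib->first) {
      diff += std::fabs(ia->second);
      ++ia;
    } else {
      diff += std::fabs(ib->second);
      ++ib;
    }
  }
  for (; ia != na.end(); ++ia) diff += std::fabs(ia->second);
  for (; ib != nb.end(); ++ib) diff += std::fabs(ib->second);
  return 1.0 - 0.5 * diff;
}
```

Because both vectors are normalized to unit L1 norm, the difference term is at most 2, so the result needs no further scaling.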

Save & Load

All vocabularies and databases can be saved to and loaded from disk with the save and load member functions. When a database is saved, the vocabulary it is associated with is also embedded in the file, so that vocabulary and database files are completely independent.

You can also add the vocabulary or database data to any file opened with a cv::FileStorage structure.

You can save the vocabulary or the database with any file extension. If you use .gz, the file is automatically compressed (OpenCV behaviour).

Implementation notes

Template parameters

DBoW2 has two main classes: TemplatedVocabulary and TemplatedDatabase. These implement the visual vocabulary to convert images into bag-of-words vectors and the database to index images. These classes are templated:

  template<class TDescriptor, class F>
  class TemplatedVocabulary
  {
    ...
  };
  
  template<class TDescriptor, class F>
  class TemplatedDatabase
  {
    ...
  };
  
Two classes must be provided: TDescriptor, the data type of a single descriptor vector, and F, a class derived from FClass that provides the functions to manipulate descriptors.

For example, to work with SURF descriptors, TDescriptor is defined as std::vector<float>, where each vector contains 64 or 128 float values. When features are extracted from an image, a std::vector<TDescriptor> must be obtained. In the case of BRIEF, TDescriptor is defined as boost::dynamic_bitset<>.

The F parameter is the name of a class that implements the functions defined in FClass. These functions take TDescriptor data and compute some result. Classes to deal with SURF and BRIEF descriptors (FSurf64 and FBrief) are already included in DBoW2.
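
The following sketch shows the general pattern of pairing a descriptor type with a class of static functions. It is not the FClass interface itself: FToyBinary, its 8-bit descriptor, the function names and the nearestIndex helper are hypothetical and may not match the library. Check FClass and the FBrief source for the exact functions that DBoW2 expects.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical functions class for 8-bit binary descriptors (not FBrief).
struct FToyBinary {
  typedef std::bitset<8> TDescriptor;

  // Hamming distance between two descriptors.
  static double distance(const TDescriptor& a, const TDescriptor& b) {
    return static_cast<double>((a ^ b).count());
  }

  // "Mean" of binary descriptors, taken as a bitwise majority vote.
  static TDescriptor meanValue(const std::vector<TDescriptor>& ds) {
    TDescriptor mean;
    for (std::size_t bit = 0; bit < mean.size(); ++bit) {
      std::size_t ones = 0;
      for (const TDescriptor& d : ds) ones += d[bit] ? 1 : 0;
      mean[bit] = 2 * ones > ds.size();
    }
    return mean;
  }

  // Text form of a descriptor, e.g. for writing it to a plain-text file.
  static std::string toString(const TDescriptor& d) { return d.to_string(); }
};

// Generic code written only against F: returns the index of the centre
// closest to query. Works for any descriptor type F knows how to compare.
template <class TDescriptor, class F>
std::size_t nearestIndex(const std::vector<TDescriptor>& centres,
                         const TDescriptor& query) {
  assert(!centres.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < centres.size(); ++i) {
    if (F::distance(centres[i], query) < F::distance(centres[best], query))
      best = i;
  }
  return best;
}
```

Supporting a new descriptor then amounts to writing one such class, while templated code like nearestIndex stays unchanged.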

Predefined Vocabularies and Databases

For convenience, DBoW2 defines vocabularies and databases for two descriptor types: Surf64Vocabulary and Surf64Database for SURF64, and BriefVocabulary and BriefDatabase for BRIEF. Please check the demo application to see how they are created and used.

Related publications

[1] Dorian Gálvez-López and Juan D. Tardós
Bags of Binary Words for Fast Place Recognition in Image Sequences
IEEE Transactions on Robotics, Volume 28, Number 5, Pages 1188-1197, October 2012

[2] Dorian Gálvez-López and Juan D. Tardós
Real-Time Loop Detection with Bags of Binary Words
International Conference on Intelligent Robots and Systems, September 2011

@INPROCEEDINGS{GalvezIROS11,
    author={Galvez-Lopez, Dorian and Tardos, Juan D.},
    booktitle={Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on},
    title={Real-time loop detection with bags of binary words},
    year={2011},
    month={September},
    pages={51--58},
    doi={10.1109/IROS.2011.6094885},
    ISSN={2153-0858}
}

[3] Josef Sivic and Andrew Zisserman
Efficient Visual Search of Videos Cast as Text Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009

[4] D. Nistér and H. Stewénius
Scalable Recognition with a Vocabulary Tree
IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006

Dorian Gálvez López - Robotics, Perception and Real Time Group - Universidad de Zaragoza. Last update: 2013-02-07 19:55 CET