José M. Fácil
Short bio:
I am currently a PhD student in Computer Science at the University of Zaragoza, advised by Dr. Javier Civera and Dr. Luis Montesano. My current research interests are Structure from Motion, SLAM, Convolutional Neural Networks and Recurrent Neural Networks.
I finished my BSc in Computer Science in Zaragoza in April 2015. In April 2014 I was awarded a Research Initiation Scholarship funded by the Spanish Ministry of Education and Science; during this scholarship I worked on Monocular SLAM, advised by Dr. Javier Civera and Dr. Luis Montesano.
During my MSc in Computer Science I was offered a position as a Research Fellow at the University of Zaragoza to start researching Visual SLAM and Deep Learning. I defended my Master's thesis in September 2016.
For more detailed information, you can download my CV.
|
News:
- Our paper CAM-Convs has been accepted at CVPR 2019! Check it out! (Project Page)
- Our first steps towards Condition-Invariant Multi-View Place Recognition! Check it out! (Project Page)
- Partitioned Nordland Dataset available! (Project Page)
- PanoRoom! Extended abstract at the ECCV 2018 Workshop: 3D Reconstruction meets Semantics (LINK)
- Paper accepted at IROS 2018 Workshop -- PPNIV18! (arXiv)
- Paper accepted at RA-L 2018 and IROS presentation! (arXiv)
- Paper accepted at IROS 2017 Workshop -- Learning for Localization and Mapping!
- Paper accepted at RA-L 2017!
- Our first steps towards fusing deep learning and multi-view geometry for monocular mapping (arXiv).
|
|
CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth
José M. Fácil, Benjamin Ummenhofer, Huizhong Zhou, Luis Montesano, Thomas Brox* and Javier Civera*
* Equal Contribution
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019
abstract |
bibtex |
arXiv |
project website
Single-view depth estimation suffers from the problem that a network trained on images from one camera does not generalize to images taken with a different camera model. Thus, changing the camera model requires collecting an entirely new training dataset. In this work, we propose a new type of convolution that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns. Experiments confirm that this improves the generalization capabilities of depth prediction networks considerably, and clearly outperforms the state of the art when the train and test images are acquired with different cameras.
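A minimal PyTorch sketch of the general idea, assuming a toy choice of calibration maps (centered coordinates and field-of-view angles); it is not the paper's exact implementation, and all names and sizes are placeholders:
import torch
import torch.nn as nn

def calibration_maps(h, w, fx, fy, cx, cy):
    # Per-pixel maps derived from the intrinsics; at lower feature
    # resolutions the intrinsics would need rescaling (omitted here).
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32),
                            indexing="ij")
    ccx = (xs - cx) / fx          # centered, focal-normalized x
    ccy = (ys - cy) / fy          # centered, focal-normalized y
    return torch.stack([ccx, ccy, torch.atan(ccx), torch.atan(ccy)]).unsqueeze(0)

class CamAwareConv(nn.Module):
    # Standard convolution whose input is augmented with the maps above,
    # so the learned filters can condition on the camera calibration.
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + 4, out_ch, k, padding=k // 2)

    def forward(self, feat, intrinsics):          # intrinsics: (fx, fy, cx, cy)
        b, _, h, w = feat.shape
        maps = calibration_maps(h, w, *intrinsics).to(feat)
        return self.conv(torch.cat([feat, maps.expand(b, -1, -1, -1)], dim=1))

feat = torch.randn(2, 64, 240, 320)               # hypothetical feature maps
out = CamAwareConv(64, 128)(feat, (300.0, 300.0, 160.0, 120.0))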
@inproceedings{facil2019camconvs,
author = {Facil, Jose M. and Ummenhofer,
Benjamin and Zhou, Huizhong and
Montesano, Luis and Brox, Thomas and Civera, Javier},
title = {{CAM-Convs: Camera-Aware
Multi-Scale Convolutions for Single-View Depth}},
booktitle = {IEEE Conference on
Computer Vision and Pattern Recognition (CVPR)},
year = {2019},
url = "https://webdiis.unizar.es/~jmfacil/camconvs/"
}
|
|
Corners for Layout: End-to-End Layout Recovery from 360 Images
Clara Fernandez-Labrador*, José M. Fácil*, Alejandro Perez-Yus, Cédric Demonceaux, Javier Civera and José J. Guerrero
* Equal Contribution
Preprint, in submission, February 2019
abstract |
bibtex |
arXiv |
project website
The problem of 3D layout recovery in indoor scenes has been a core research topic for over a decade. However, several major challenges remain unsolved. Among the most relevant ones, a major part of the state-of-the-art methods make implicit or explicit assumptions about the scenes -- e.g. box-shaped or Manhattan layouts. Also, current methods are computationally expensive and not suitable for real-time applications like robot navigation and AR/VR. In this work we present CFL (Corners for Layout), the first end-to-end model for 3D layout recovery from 360 images. Our experimental results show that we outperform the state of the art while relaxing assumptions about the scene and at a lower cost. We also show that our model generalizes better to camera position variations than conventional approaches by using EquiConvs, a type of convolution applied directly on the sphere projection and hence invariant to equirectangular distortions.
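As a toy illustration of the intuition only (the real EquiConvs define the kernel on the sphere's tangent plane and are more involved), the horizontal sampling step of a 3x3 kernel can grow with 1/cos(latitude), matching the horizontal stretch of the equirectangular projection:
import math
import torch
import torch.nn.functional as F

def equi_conv3x3(img, weight):
    # img: (1, C, H, W) equirectangular image, weight: (C_out, C, 3, 3).
    # Row-adaptive dilation approximating the equirectangular stretch.
    # Inefficient (one full conv per row), but fine as an illustration.
    _, _, h, w = img.shape
    rows = []
    for v in range(h):
        lat = (0.5 - (v + 0.5) / h) * math.pi        # latitude of row v
        dil = min(max(1, round(1.0 / max(math.cos(lat), 1e-3))), (w - 1) // 2)
        full = F.conv2d(F.pad(img, (dil, dil, 1, 1), mode="replicate"),
                        weight, dilation=(1, dil))
        rows.append(full[:, :, v:v + 1, :])
    return torch.cat(rows, dim=2)

out = equi_conv3x3(torch.randn(1, 3, 64, 128), torch.randn(8, 3, 3, 3))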
@article{fernandez2019CFL,
title={Corners for Layout: End-to-End Layout Recovery
from 360 Images},
author={Fernandez-Labrador, Clara and Fácil, José M
and Perez-Yus, Alejandro and Demonceaux, Cédric and
Civera, Javier and Guerrero, José J},
journal={arXiv preprint arXiv:1903.08094},
year={2019}
}
|
|
Condition-Invariant Multi-View Place Recognition
José M. Fácil, Daniel Olid, Luis Montesano and Javier Civera
Preprint, in submission, February 2019
abstract |
bibtex |
arXiv |
project website
Visual place recognition is particularly challenging when places suffer changes in their appearance. Such changes are indeed common, e.g., due to weather, night/day cycles or seasons. In this paper we build on recent research using deep networks, and explore how they can be improved by exploiting temporal sequence information. Specifically, we propose 3 different alternatives (Descriptor Grouping, Fusion and Recurrent Descriptors) for deep networks to use several frames of a sequence. We show that our approaches produce more compact and better-performing descriptors than single- and multi-view baselines from the literature on two public databases.
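A minimal sketch of one of the three alternatives (Recurrent Descriptors), assuming a placeholder backbone and descriptor sizes rather than the architecture evaluated in the paper:
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentPlaceDescriptor(nn.Module):
    # A per-frame embedding is aggregated over the sequence by a GRU;
    # the final hidden state is the (unit-norm) place descriptor.
    def __init__(self, frame_dim=512, desc_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(              # stand-in encoder
            nn.Conv2d(3, 32, 5, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, frame_dim))
        self.gru = nn.GRU(frame_dim, desc_dim, batch_first=True)

    def forward(self, seq):                         # seq: (B, T, 3, H, W)
        b, t = seq.shape[:2]
        f = self.backbone(seq.flatten(0, 1)).view(b, t, -1)
        _, h = self.gru(f)                          # h: (1, B, desc_dim)
        return F.normalize(h.squeeze(0), dim=1)

desc = RecurrentPlaceDescriptor()(torch.randn(2, 5, 3, 96, 96))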
@article{facil2019condition,
title={{Condition-Invariant Multi-View Place Recognition}},
author={Fácil, José M. and Olid, Daniel and Montesano, Luis
and Civera, Javier},
journal={arXiv preprint arXiv:1902.09516},
year={2019}
}
|
|
PanoRoom: From the Sphere to the 3D Layout
Clara Fernandez-Labrador, José M. Fácil, Alejandro Perez-Yus, Cédric Demonceaux and José J. Guerrero
3D Reconstruction meets Semantics Workshop, ECCV 2018 , July 2018
●Spotlight Presentation●
abstract |
bibtex |
arXiv |
poster
We propose a novel FCN, able to work with omnidirectional images, that outputs accurate probability maps representing the main structure of indoor scenes and generalizes well across different data. Our approach handles occlusions and recovers complex-shaped rooms more faithfully to the actual shape of the real scenes. We outperform the state of the art not only in the accuracy of the 3D models but also in speed.
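For illustration, corner candidates can be decoded from such a probability map with simple non-maximum suppression; this is a common post-processing step, not necessarily the paper's exact decoding:
import torch
import torch.nn.functional as F

def corner_peaks(prob_map, thresh=0.5, k=5):
    # prob_map: (H, W) per-pixel corner probabilities. Local maxima
    # above the threshold are returned as (row, col) candidates.
    p = prob_map[None, None]
    pooled = F.max_pool2d(p, k, stride=1, padding=k // 2)
    peaks = (p == pooled) & (p > thresh)
    ys, xs = torch.nonzero(peaks[0, 0], as_tuple=True)
    return list(zip(ys.tolist(), xs.tolist()))

corners = corner_peaks(torch.rand(256, 512), thresh=0.99)  # toy input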
@article{fernandez2018panoroom,
title={{PanoRoom: From the Sphere to the 3D Layout}},
author={Fernandez-Labrador, Clara and Facil,
Jose M and Perez-Yus, Alejandro and
Demonceaux, Cedric and Guerrero, Jose J},
journal={arXiv preprint arXiv:1808.09879},
year={2018}
}
|
|
Single-View Place Recognition under Seasonal Changes
Daniel Olid, José M. Fácil, Javier Civera
PPNIV Workshop at IROS 2018, July 2018
abstract |
bibtex |
arXiv |
poster |
video |
project web
Single-view place recognition, which we can define as finding an image that corresponds to the same place as a given query image, is a key capability for autonomous navigation and mapping. Although there has been a considerable amount of research on the topic, the high degree of image variability (with viewpoint, illumination or occlusions, for example) makes it a research challenge.
One of the particular challenges, which we address in this work, is weather variation. Seasonal changes can produce drastic appearance changes that classic low-level features do not model properly. Our contributions in this paper are twofold. First, we pre-process and propose a partition of the Nordland dataset, which is frequently used for place recognition research without consensus on the partitions. Second, we evaluate several neural network architectures, such as pre-trained, siamese and triplet networks, for this problem. Our best results outperform the state of the art in the field. A video showing our results is linked above.
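A minimal sketch of the triplet setup, with a placeholder encoder (not the networks evaluated in the paper): the anchor and positive are the same place under different seasons, the negative a different place:
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(                  # stand-in shared encoder
    nn.Conv2d(3, 32, 5, stride=4), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 128))

def embed(x):                             # unit-norm place descriptor
    return F.normalize(encoder(x), dim=1)

loss_fn = nn.TripletMarginLoss(margin=0.5)
# anchor: a place in summer; positive: same place in winter;
# negative: a different place. Shapes are illustrative.
anchor, pos, neg = (torch.randn(8, 3, 96, 96) for _ in range(3))
loss = loss_fn(embed(anchor), embed(pos), embed(neg))
loss.backward()                           # trains the shared encoder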
@inproceedings{olid2018single,
Author = {Olid, Daniel and Fácil, José M.
and Civera, Javier},
Title = {Single-View Place Recognition
under Seasonal Changes},
Booktitle = {PPNIV Workshop at IROS 2018},
Year = {2018}
}
|
|
DynaSLAM: Tracking, Mapping and Inpainting in Dynamic Scenes
Berta Bescós, José M. Fácil, Javier Civera, José Neira
IEEE Robotics and Automation Letters, July 2018
Also presented at the RCW Workshop at ICRA 2018
abstract |
bibtex |
arXiv |
video |
github |
journal
The assumption of scene rigidity is typical in SLAM algorithms. Such a strong assumption limits the use of most visual SLAM systems in populated real-world environments, which are the target of several relevant applications like service robotics or autonomous vehicles. In this paper we present DynaSLAM, a visual SLAM system that, building on ORB-SLAM2, adds the capabilities of dynamic object detection and background inpainting. DynaSLAM is robust in dynamic scenarios for monocular, stereo and RGB-D configurations. We are capable of detecting the moving objects either by multi-view geometry, deep learning or both. Having a static map of the scene allows inpainting the frame background that has been occluded by such dynamic objects. We evaluate our system in public monocular, stereo and RGB-D datasets. We study the impact of several accuracy/speed trade-offs to assess the limits of the proposed methodology. DynaSLAM outperforms the accuracy of standard visual SLAM baselines in highly dynamic scenarios. It also estimates a map of the static parts of the scene, which is a must for long-term applications in real-world environments.
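A toy version of the multi-view geometry test for one map point (the full system also checks the parallax angle and combines this with learned segmentation; the threshold here is an assumption):
import numpy as np

def is_dynamic(p_kf, d_kf, d_cur, K, T_cur_kf, tau=0.07):
    # p_kf: (u, v) pixel in a keyframe, d_kf: its measured depth there,
    # d_cur: depth image of the current frame, K: 3x3 intrinsics,
    # T_cur_kf: 4x4 pose of the keyframe in the current frame.
    x = np.linalg.inv(K) @ np.array([p_kf[0], p_kf[1], 1.0]) * d_kf
    x_cur = T_cur_kf @ np.append(x, 1.0)   # back-projected map point
    z_proj = x_cur[2]                      # depth predicted by geometry
    q = K @ (x_cur[:3] / z_proj)
    ui, vi = int(round(q[0])), int(round(q[1]))
    if not (0 <= vi < d_cur.shape[0] and 0 <= ui < d_cur.shape[1]):
        return False                       # out of view: no evidence
    z_meas = d_cur[vi, ui]                 # depth measured by the sensor
    # A large disagreement means something moved at that pixel.
    return z_meas > 0 and abs(z_proj - z_meas) > tau

K = np.array([[300., 0, 160.], [0, 300., 120.], [0, 0, 1.]])
print(is_dynamic((160, 120), 2.0, np.full((240, 320), 2.0), K, np.eye(4)))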
@article{bescos2018dynslam,
title={{DynaSLAM}: Tracking, Mapping
and Inpainting in Dynamic Scenes},
author={Besc{\'o}s, Berta and F{\'a}cil,
Jos{\'e} M and Civera, Javier and Neira,
Jos{\'e}},
journal={arXiv preprint arXiv:1806.05620},
year={2018}
}
|
|
Removing Dynamic Objects from 3D Maps using Geometry and Learning
Berta Bescós, José M. Fácil, Javier Civera, José Neira
Learning for Localization and Mapping Workshop at IROS 2017, September 2017
abstract |
bibtex |
draft |
video
In this work we present a new approach for the 3D reconstruction of a scene from RGB-D sequences containing dynamic objects. This challenging problem includes the detection of such objects, as well as the reconstruction of those parts of the scene occluded by them. We use a combination of computer vision geometry (detection and tracking of dynamic keypoints and associated image regions) and machine learning techniques (Fully Convolutional Neural Networks and Generative Adversarial Networks), which allows us to detect not only objects that are known to be dynamic (e.g. people) but also other elements that change place in the scene (e.g. books carried by people).
Our system detects these objects and also reconstructs the hidden parts of the scene in some images, using information from alternative images.
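A minimal sketch of the geometric part of the reconstruction: pixels masked as dynamic are filled by reprojecting them into another frame, assuming a background depth estimate is available for them (e.g. from the static map). The learned (GAN) completion for never-observed background is not shown:
import numpy as np

def fill_from_other_view(img_cur, mask, img_src, depth_bg, K, T_src_cur):
    # img_cur/img_src: (H, W, 3) images, mask: (H, W) bool dynamic mask,
    # depth_bg: background depth in the current view, K: 3x3 intrinsics,
    # T_src_cur: 4x4 pose of the current frame in the source frame.
    h, w = mask.shape
    Kinv = np.linalg.inv(K)
    out = img_cur.copy()
    for v, u in zip(*np.nonzero(mask)):
        z = depth_bg[v, u]
        if z <= 0:
            continue
        p = T_src_cur @ np.append(Kinv @ np.array([u, v, 1.0]) * z, 1.0)
        q = K @ (p[:3] / p[2])             # pixel in the source frame
        ui, vi = int(round(q[0])), int(round(q[1]))
        if 0 <= vi < h and 0 <= ui < w:
            out[v, u] = img_src[vi, ui]    # copy the background color
    return out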
@article{bescos2016dealing,
title={{Dealing with Dynamic Objects
by Combining Geometry and Learning}},
author={Besc{\'o}s, Berta and F{\'a}cil, Jos{\'e} M
and Civera, Javier and Neira,
Jos{\'e}},
journal={Learning for Localization and
Mapping Workshop at IROS 2017},
year={2017}
}
|
|
Single-View and Multi-View Depth Fusion
José M. Fácil, Alejo Concha, Luis Montesano, Javier Civera
IEEE Robotics and Automation Letters, vol. 2(4), pp. 1-8, October 2017
abstract |
bibtex |
arXiv |
journal |
video
Dense 3D mapping from a monocular sequence is a key technology for several applications and still a research problem. This paper leverages recent results on single-view CNN-based depth estimation and fuses them with direct multi-view depth estimation. Both approaches present complementary strengths. Multi-view depth estimation is highly accurate but only in high-texture and high-parallax cases. Single-view depth captures the local structure of mid-level regions, including textureless areas, but the estimated depth lacks global coherence.
The single- and multi-view fusion we propose presents several challenges. First, both depths are related by a non-rigid deformation that depends on the image content. Second, the selection of multi-view points of high accuracy might be difficult for low-parallax configurations. We present contributions for both problems. Our results on the public NYU and TUM datasets show that our algorithm outperforms the individual single- and multi-view approaches.
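An illustrative stand-in for the fusion (the paper estimates a non-rigid, image-dependent deformation rather than the global alignment sketched here): the CNN depth is aligned to the reliable multi-view pixels by least squares and used where the multi-view depth is unreliable:
import numpy as np

def fuse_depths(d_single, d_multi, conf, conf_th=0.8):
    # d_single: CNN depth, d_multi: multi-view depth, conf: in [0, 1],
    # high where texture and parallax are high. conf_th is an assumption.
    reliable = conf > conf_th
    a = np.stack([d_single[reliable], np.ones(reliable.sum())], axis=1)
    scale, offset = np.linalg.lstsq(a, d_multi[reliable], rcond=None)[0]
    corrected = scale * d_single + offset   # globally aligned CNN depth
    return np.where(reliable, d_multi, corrected)

fused = fuse_depths(np.random.rand(48, 64) + 1.0,
                    np.random.rand(48, 64) + 1.0,
                    np.random.rand(48, 64))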
@article{facil2016single,
title={{Single-View
and Multi-View
Depth Fusion}},
author={F{\'a}cil, Jos{\'e} M
and Concha, Alejo and Montesano,
Luis and Civera, Javier},
journal={IEEE Robotics and Automation Letters},
year={2017}
}
|
|