On semantic solutions for efficient approximate similarity search on large-scale datasets

Alexander Ocsa, Jose Luis Huillca, Cristian José Lopez Del Alamo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Approximate similarity search algorithms based on hashing were proposed to query high-dimensional datasets because of their fast retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Networks (CNNs) with hashing techniques to improve search accuracy. However, finding a practical and efficient solution for indexing CNN features remains challenging: existing approaches need a heavy training process to achieve accurate query results and depend critically on data parameters. To overcome these issues, we propose a new method for scalable similarity search, Deep frActal based Hashing (DAsH), which computes the best data-parameter values for optimal subspace projection by exploring the correlations among CNN feature attributes using fractal theory. Moreover, inspired by recent advances in CNNs, we use not only the activations of lower layers, which are more general-purpose, but also prior knowledge of the semantic data in the last CNN layer to improve search accuracy. Thus, our method produces a better representation of the data space at a lower computational cost and with better accuracy. This significant gain in speed and accuracy allows us to evaluate the framework on a large, realistic, and challenging set of datasets.

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 22nd Iberoamerican Congress, CIARP 2017, Proceedings
Editors: Sergio Velastin, Marcelo Mendoza
Publisher: Springer Verlag
Pages: 450-457
Number of pages: 8
ISBN (Print): 9783319751924
DOIs
State: Published - 2018
Externally published: Yes
Event: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017 - Valparaiso, Chile
Duration: 7 Nov 2017 – 10 Nov 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10657 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017
Country/Territory: Chile
City: Valparaiso
Period: 7/11/17 – 10/11/17

Bibliographical note

Funding Information:
Acknowledgements. This project has been partially funded by CIENCIA-ACTIVA (Perú) through the Doctoral Scholarship at UNSA University, and FONDECYT (Perú) Project 148-2015.

Publisher Copyright:
© Springer International Publishing AG, part of Springer Nature 2018.

Keywords

  • Approximate similarity search
  • Deep learning
  • Fractal theory
  • Multidimensional index
