COMPUTER VISION

ML APPLICATIONS

Attention-Based Query Expansion Learning

August 21, 2020

Abstract

Query expansion is a technique widely used in image search consisting in combining highly ranked images from an original query into an expanded query that is then reissued, generally leading to increased recall and precision. An important aspect of query expansion is choosing an appropriate way to combine the images into a new query. Interestingly, despite the undeniable empirical success of query expansion, ad-hoc methods with different caveats have dominated the landscape, and not a lot of research has been done on learning how to do query expansion. In this paper we propose a more principled framework to query expansion, where one trains, in a discriminative manner, a model that learns how images should be aggregated to form the expanded query. Within this framework, we propose a model that leverages a self-attention mechanism to effectively learn how to transfer information between the different images before aggregating them. Our approach obtains higher accuracy than existing approaches on standard benchmarks. More importantly, our approach is the only one that consistently shows high accuracy under different regimes, overcoming caveats of existing methods.

Download the Paper

AUTHORS

Written by

Albert Gordo

Filip Radenovic

Tamara Berg

Publisher

ECCV

Research Topics

Computer Vision

Related Publications

November 20, 2024

CONVERSATIONAL AI

COMPUTER VISION

Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations

Jianfeng Chi, Ujjwal Karn, Hongyuan Zhan, Eric Smith, Javier Rando, Yiming Zhang, Kate Plawiak, Zacharie Delpierre Coudert, Kartikeya Upasani, Mahesh Pasupuleti

November 20, 2024

November 11, 2024

COMPUTER VISION

HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness

Sherry Xue, Romy Luo, Changan Chen, Kristen Grauman

November 11, 2024

October 31, 2024

HUMAN & MACHINE INTELLIGENCE

ROBOTICS

Digitizing Touch with an Artificial Multimodal Fingertip

Mike Lambeta, Tingfan Wu, Ali Sengül, Victoria Rose Most, Nolan Black, Kevin Sawyer, Romeo Mercado, Haozhi Qi, Alexander Sohn, Byron Taylor, Norb Tydingco, Gregg Kammerer, Dave Stroud, Jake Khatha, Kurt Jenkins, Kyle Most, Neal Stein, Ricardo Chavira, Thomas Craven-Bartle, Eric Sanchez, Yitian Ding, Jitendra Malik, Roberto Calandra

October 31, 2024

October 16, 2024

SPEECH & AUDIO

COMPUTER VISION

Movie Gen: A Cast of Media Foundation Models

Movie Gen Team

October 16, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.