June 18, 2018
The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models. For instance, hashtags have the potential to significantly reduce the need for manual supervision and annotation when learning vision models for a large number of concepts. However, a key challenge when learning from hashtags is that they are inherently subjective, because users provide them as a form of self-expression. As a consequence, hashtags may have synonyms (different hashtags referring to the same visual content) and may be polysemous (the same hashtag referring to different visual content). These challenges limit the effectiveness of approaches that simply treat hashtags as image-label pairs. This paper presents an approach that goes beyond modeling simple image-label pairs by using a joint model of images, hashtags, and users. We demonstrate the efficacy of this approach in image tagging and retrieval experiments, and show how the joint model can be used to perform user-conditional retrieval and tagging.