TEG: image theme recognition using text-embedding-guided few-shot adaptation

Published in ArXiv, 2024

Jikai Wang, Wanglong Lu, Yu Wang, Kaijie Shi, Xianta Jiang, Hanli Zhao

Brief description:

Grouping images into different themes is a challenging task in photo book curation. Unlike image object recognition, image theme recognition focuses on understanding the main subject or overall meaning conveyed by an image. However, it is challenging to achieve satisfactory performance using existing general image recognition methods. In this paper, we aim to solve the image theme recognition task with few-shot training samples using pre-trained contrastive language-image models. A text-prompt-guided few-shot image adaptation framework is proposed, which incorporates a text-embedding-guided classifier and an auxiliary classification loss to exploit embedded visual and text features, stabilize the network training, and enhance recognition performance. We also present a new annotated dataset, Theme25, for studying image theme recognition. We conducted experiments on our Theme25 dataset as well as the publicly available CIFAR100 and ImageNet datasets to demonstrate the superiority of our method over the compared state-of-the-art methods.

Recommended citation:

Jikai Wang, Wanglong Lu, Yu Wang, Kaijie Shi, Xianta Jiang, and Hanli Zhao "TEG: image theme recognition using text-embedding-guided few-shot adaptation," Journal of Electronic Imaging 33(1), 013028 (24 January 2024). https://doi.org/10.1117/1.JEI.33.1.013028