
Introducing Concept And Syntax Transition Networks for Image Captioning

P. Blandfort, T. Karayil, D. Borth, A. Dengel

International Conference on Multimedia Retrieval (ICMR) | DOI: 10.1145/2911996.2930060
Image Captioning, Deep Learning, NLP

Abstract

We introduce Concept and Syntax Transition Networks, a novel approach for image captioning that separates visual concept detection from syntactic arrangement, enabling more flexible and controllable caption generation.

Overview

Image captioning requires bridging the gap between visual understanding and natural language generation. Concept and Syntax Transition Networks (CSTN) address this by treating the detection of visual concepts and the syntactic arrangement of words as separate steps.

Method

The CSTN framework allows for more flexible and controllable caption generation by decoupling concept detection and language generation into distinct but connected stages.
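
To make the idea of two distinct but connected stages concrete, here is a minimal toy sketch in Python. It is not the implementation from the paper: the hard-coded detections, the category labels, the `TRANSITIONS` table, and the function names `detect_concepts`, `arrange`, and `caption` are all assumptions made for illustration. Stage 1 returns scored concepts, and stage 2 walks a small hand-written transition network over syntactic categories to order them.

```python
from typing import Dict, List, Tuple

# Hypothetical stage 1 output: (word, syntactic category, confidence score).
Concept = Tuple[str, str, float]


def detect_concepts(image_id: str) -> List[Concept]:
    """Stand-in for a visual concept detector (assumption: hard-coded output)."""
    return [
        ("dog", "NOUN", 0.92),
        ("brown", "ADJ", 0.81),
        ("runs", "VERB", 0.77),
        ("cat", "NOUN", 0.30),
    ]


# Toy syntax transition network (assumption): for each state, the
# categories that may follow it, in order of preference.
TRANSITIONS: Dict[str, List[str]] = {
    "START": ["DET"],
    "DET": ["ADJ", "NOUN"],
    "ADJ": ["NOUN"],
    "NOUN": ["VERB"],
    "VERB": ["END"],
}


def arrange(concepts: List[Concept], threshold: float = 0.5) -> str:
    """Stage 2: order the detected concepts by walking the transition network."""
    # Keep the highest-scoring concept per category above the threshold.
    best: Dict[str, Tuple[str, float]] = {}
    for word, category, score in concepts:
        if score >= threshold and (category not in best or score > best[category][1]):
            best[category] = (word, score)

    words: List[str] = []
    state = "START"
    while True:
        for nxt in TRANSITIONS[state]:
            if nxt == "END":
                return " ".join(words)
            if nxt == "DET":
                # The determiner is a function word, not a detected concept.
                words.append("a")
                state = nxt
                break
            if nxt in best:
                words.append(best[nxt][0])
                state = nxt
                break
        else:
            # No admissible transition is left: stop with what we have.
            return " ".join(words)


def caption(image_id: str, threshold: float = 0.5) -> str:
    """Run both stages; only the concept list passes between them."""
    return arrange(detect_concepts(image_id), threshold)


print(caption("example.jpg"))                  # a brown dog runs
print(caption("example.jpg", threshold=0.85))  # a dog
```

Because the only interface between the stages is the list of scored concepts, each can be changed independently in this sketch: raising the threshold alters which concepts reach the caption, while editing `TRANSITIONS` alters the sentence structure without touching detection.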