Introducing Concept And Syntax Transition Networks for Image Captioning
P. Blandfort, T. Karayil, D. Borth, A. Dengel
Image Captioning, Deep Learning, NLP
Abstract
We introduce Concept and Syntax Transition Networks, a novel approach for image captioning that separates visual concept detection from syntactic arrangement, enabling more flexible and controllable caption generation.
Overview
Image captioning requires bridging the gap between visual understanding and natural language generation. Concept and Syntax Transition Networks (CSTN) address this by separating the detection of visual concepts from the syntactic arrangement of words.
Method
The CSTN framework allows for more flexible and controllable caption generation by decoupling concept detection and language generation into distinct but connected stages.
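To make the idea of decoupling concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the summary above gives no architectural details, so everything in it is an assumption made for illustration. The hypothetical `detect_concepts` function stands in for a visual concept detector by returning hard-coded scores, and the hypothetical `arrange` function stands in for the syntactic stage by filling a fixed sequence of slots. The names `Concept`, `TEMPLATE`, and `caption`, the slot labels, and the threshold value are likewise invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    word: str
    category: str  # assumed slot label, e.g. "NOUN" or "VERB"
    score: float   # assumed detector confidence in [0, 1]


def detect_concepts(image_id):
    """Stage 1 stand-in: returns hard-coded detections, not real model output."""
    toy_detections = {
        "img_001": [
            Concept("dog", "NOUN", 0.92),
            Concept("ball", "NOUN", 0.81),
            Concept("chases", "VERB", 0.77),
            Concept("cat", "NOUN", 0.10),
        ]
    }
    return toy_detections.get(image_id, [])


# Stage 2 stand-in: a fixed sequence of syntactic slots (an assumption,
# far simpler than any learned syntax model).
TEMPLATE = ["DET", "NOUN", "VERB", "DET", "NOUN"]


def arrange(concepts, template=TEMPLATE, threshold=0.5):
    """Fill each slot with the highest-scoring unused concept of that category."""
    pool = sorted(
        (c for c in concepts if c.score >= threshold),
        key=lambda c: c.score,
        reverse=True,
    )
    words, used = [], set()
    for slot in template:
        if slot == "DET":
            words.append("a")
            continue
        match = next(
            (c for c in pool if c.category == slot and c.word not in used),
            None,
        )
        if match is None:
            return None  # the template cannot be satisfied
        used.add(match.word)
        words.append(match.word)
    return " ".join(words)


def caption(image_id, template=TEMPLATE):
    """Run both stages; the syntactic template can be swapped independently."""
    return arrange(detect_concepts(image_id), template)


print(caption("img_001"))                   # a dog chases a ball
print(caption("img_001", ["DET", "NOUN"]))  # a dog
```

Because the two stages only share a list of scored concepts, changing the template changes the caption's form without touching detection, which is the kind of controllability the decoupling is meant to provide.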