ISCG: An Intelligent Sensing and Caption Generation System for Object Detection and Captioning Using Deep Learning

Publication Type:

Journal Articles

Source:
International Journal of Intelligent Information Technologies, Volume 16, Issue 4, p.51-67 (2020)

Keywords:

Agent, Artificial Intelligence, Image Captioning, Intelligence Test, Visual Perception


<p>Artificial intelligence has paved the way for advances in areas of computing such as speech recognition, object detection, and machine translation. One of the goals of artificial general intelligence is to simulate human thinking and rationality within machines so that they can perceive their environment and then perform reasonable actions based on that perception. Creating a single model that performs every task from visual perception to actuation is currently impossible. The system must instead be divided into several models, each of which functions independently while also contributing to the operation of the whole intelligent machine. In this paper, an intelligent sensing and caption generation (ISCG) system is proposed that is capable of detecting living and non-living objects and their states of motion in images. The system consists of two separate modules, a caption generator and an intelligence engine, together with a convolutional neural network (CNN) that identifies the different objects in the images. Our model yields state-of-the-art performance on a benchmark dataset.</p>