Training convolutional neural network models for object, scene and context recognition in images
Roman Orlov
Private Higher Educational Institution Kharkiv University of Technology “STEP”
Serhii Taboranskiy
Private Higher Educational Institution Kharkiv University of Technology “STEP”
Purpose. The study focuses on developing an optimized convolutional neural network (CNN) for detecting objects, scenes, and contexts in diverse images. It emphasizes improving the architecture, training methods, and performance of CNNs in computer vision tasks, which are essential for various industries. Design / Method / Approach. The study uses Python, TensorFlow, and Keras to create and train a CNN on dataset CIFAR-10. Hyperparameter tuning and data augmentation techniques were applied to enhance model performance. Findings. The CNN model trained on CIFAR-10 demonstrated strong performance with an accuracy of approximately 85% on the test set, highlighting its ability to classify diverse objects. Data augmentation techniques significantly improved the overall performance by making the model more robust to image variations. Theoretical Implications. The study emphasizes the importance of proper data preparation, including image normalization and augmentation, to achieve high accuracy in CNN models. Practical Implications. The application of convolutional neural networks in real-world scenarios such as security, medicine, and autonomous systems is transformative. These models can accurately detect objects and understand contexts, opening new possibilities for innovation and automation in various industries. Originality / Value. This study makes a valuable contribution to research on convolutional neural networks by showcasing the successful training and optimization of a CNN for object detection. The combination of data augmentation, architecture design, and hyperparameter tuning highlights an effective approach to achieving high accuracy in computer vision tasks. Research Limitations / Future Research. Future research could explore alternative CNN architectures and larger datasets to further enhance object detection accuracy. Additionally, integrating new learning strategies could improve the model’s performance in more complex and varied environments.