International Conference on Computer Vision and Graphics ICCVG 2024

Online meeting - MS Teams (Warsaw University of Life Sciences - SGGW)

Warsaw University of Life Sciences - SGGW

Nowoursynowska 159 Warszawa, Poland (See the section "Programme & Venue" for details)
Leszek Chmielewski (Warsaw University of Life Sciences - SGGW), João Manuel R. S. Tavares (Universidade do Porto), Arkadiusz Orłowski (Warsaw University of Life Sciences - SGGW), Ryszard Kozera (Warsaw University of Life Sciences - SGGW)
Description

The International Conference on Computer Vision and Graphics ICCVG is a biennial conference held in Poland since 2002. It continues the tradition of the conference series on Computer Graphics and Image Processing – GKPO (Grafika Komputerowa i Przetwarzanie Obrazów) held biennially since 1990.

ICCVG is usually held in September and lasts three days. The programme runs in parallel sessions, except for the keynote lectures, which all participants can attend.

The main organizer of ICCVG is the Association for Image Processing, Poland, which is the Polish chapter of the International Association for Pattern Recognition (IAPR). The principal supporting organizer is the Institute of Information Technology of the Warsaw University of Life Sciences – SGGW. The principal co-organizer is Faculdade de Engenharia, Universidade do Porto (FEUP), Porto, Portugal.

Each paper is reviewed by two to three reviewers (see the section "How to Submit"). The conference proceedings are indexed, among others, by Web of Science and Scopus.

The proceedings of the conference will be published in Lecture Notes in Networks and Systems (LNNS), Springer. Extended versions of selected outstanding papers will be published in Machine Graphics and Vision.

    • 1
      Opening of the Conference
    • 2
      Method for Fine Registration of Point Sets Based on the Curvature of the Surface

      Efficient and accurate point set registration is an important task in 3D scene reconstruction in computer vision.
      This paper presents a method called Curvature Surface Iterative Closest Point (CS-ICP) for precise point set registration. By leveraging the curvature characteristics of the point set input, CS-ICP resolves local minima challenges encountered by standard ICP algorithms, demonstrating superior precision across various datasets.
      Additionally, CS-ICP significantly reduces computation time by working with fewer points per iteration, cutting the runtime by around 83% compared to the reference methods.

      This paper also introduces evaluation criteria based on Euclidean and Chebyshev measures, offering a better assessment of point set registration quality without needing additional parameters such as an ICP evaluation threshold.

      Speakers: J. Glaser (Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic), M. Jiřina (Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic)
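For context, the abstract above builds on the classic iterative-closest-point loop. Below is a minimal point-to-point ICP sketch in Python/NumPy; the paper's curvature-based point selection is not reproduced, and `best_rigid_transform` and the brute-force nearest-neighbour search are generic textbook components, not the authors' code:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch/SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Basic point-to-point ICP: alternate correspondence search and alignment."""
    cur = src.copy()
    for _ in range(iters):
        # nearest-neighbour correspondences (brute force, for clarity only)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return cur
```

This is the standard baseline against which curvature-aware variants such as CS-ICP are compared; the local-minima problem mentioned in the abstract arises in the nearest-neighbour step when initial misalignment is large.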
    • 3
      Contextual Information-Based Registration Method for Point Set Registration

      This paper introduces a method called Contextual Information-Based Registration (CIBR), used to accurately register large and dense point sets, which represent a 3D scene.

      Distinguished from existing techniques, CIBR partitions the input point sets into discrete logical parts that represent objects in the scene. Registration is then performed on the part of the point set that carries the richest contextual information about the 3D objects in the point clouds, leading to a precise final alignment.

      Through experimentation, CIBR demonstrates superior precision across various datasets, with the best improvement of 267% in fitness and correspondence set size and 52.2% in inlier RMSE, although there are cases in which the registration was suboptimal. In most cases, CIBR achieves more robust and precise registration outcomes than traditional coarse and fine ICP registration methods.

      Speakers: J. Glaser (Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic), M. Jiřina (Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic), T. Laurin (Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic)
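The partitioning step described above can be illustrated with a small sketch; here a coarse voxel grid stands in for the paper's object-level segmentation, which is an assumption for illustration only (`partition_by_grid` and the cell size are hypothetical names/parameters):

```python
import numpy as np

def partition_by_grid(points, cell=1.0):
    """Split a point set into spatial parts using a coarse voxel grid.

    Each part groups the points that fall into the same grid cell; the
    part with the most points can then serve as the context-rich region
    on which a fine registration step is run.
    """
    keys = np.floor(points / cell).astype(int)
    parts = {}
    for key, p in zip(map(tuple, keys), points):
        parts.setdefault(key, []).append(p)
    return [np.array(v) for v in parts.values()]
```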
    • 4
      An example of the use of a small dataset for the classification of simple actions based on manually extracted shape descriptors, a single-layer neural network and leave-one-actor-out cross-validation procedure

      In this study, a method for recognizing human actions is analysed. It uses manually created shape features and is tested in combination with a neural network-based classifier. We assume an application scenario involving the recognition of physical exercises as one of the preventive measures to reduce the risk of non-communicable diseases in the elderly. A popular action recognition dataset is used, as it contains activities corresponding to selected exercises. In addition to the application, the paper focuses on the study of combining a neural network classifier with manually created shape features extracted from a small dataset. The main steps of the approach include calculating shape descriptors for all moving foreground objects extracted from video frames, then using these descriptors to construct feature vectors, and ultimately applying the Fourier transform to create representations of action sequences. A coarse classification step is included, which distinguishes between actions performed in place and actions involving a change in the object's location. The final classification is carried out using a neural network and a leave-one-actor-out cross-validation procedure. The paper presents experimental results obtained with the proposed approach, based on simple shape descriptors and a feed-forward neural network with a single hidden layer; the averaged accuracy exceeds 97%.

      Speakers: Dariusz Frejlichowski (West Pomeranian University of Technology, Szczecin, Poland), Katarzyna Gościewska (West Pomeranian University of Technology, Szczecin, Poland)
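The pipeline sketched in the abstract (per-frame shape descriptors, feature vectors, Fourier transform over time) can be illustrated as follows; the four descriptors and both function names are hypothetical stand-ins, not the authors' exact features:

```python
import numpy as np

def shape_descriptors(mask):
    """Toy shape features of a binary silhouette: area, centroid, box aspect ratio."""
    ys, xs = np.nonzero(mask)
    area = float(len(xs))
    w = np.ptp(xs) + 1            # bounding-box width
    h = np.ptp(ys) + 1            # bounding-box height
    return np.array([area, xs.mean(), ys.mean(), w / h])

def sequence_representation(masks):
    """Stack per-frame descriptors over time; FFT magnitudes along the time
    axis then give a fixed-length representation of the action sequence."""
    feats = np.stack([shape_descriptors(m) for m in masks])   # shape (T, d)
    spec = np.abs(np.fft.rfft(feats, axis=0))                 # (T//2 + 1, d)
    return spec.flatten()
```

The resulting fixed-length vector is what a small feed-forward classifier, as in the paper, could consume, regardless of how long the action sequence is.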
    • 5
      Break
    • 6
      Improving the efficiency of "Show and Tell" encoder-decoder image captioning model

      The paper investigates the influence of the hyperparameters of the "Show and Tell" image captioning model on the overall efficiency of the method. The method is based on an encoder-decoder approach, where the encoder, a backbone feature extractor based on convolutional neural networks (CNNs), is responsible for extracting image features, and the decoder, a recurrent neural network (RNN), produces a caption, i.e. a phrase describing the image content. In our research, we tested the encoder part by comparing the DenseNet, ResNet, and RegNet image feature extractors, and the decoder part by varying the size of the RNN. Furthermore, we also investigated the sentence generation stage. The investigation aims to find the optimal combination of feature extractor and decoder size. Our research shows that an optimal choice of the model's hyperparameters increases caption generation efficiency.

      Speakers: Albert Ziółkiewicz, Karol Zieliński, Marcin Iwanowski (Institute of Control and Industrial Electronics, Warsaw University of Technology), Mateusz Bartosiewicz (Institute of Control and Industrial Electronics, Warsaw University of Technology), Piotr Szczepański
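The hyperparameter study described above amounts to an exhaustive search over backbone/decoder-size combinations. A minimal, generic sketch follows; the `evaluate` callback (e.g. a BLEU score on a validation split) is an assumed placeholder, not the authors' pipeline:

```python
from itertools import product

def grid_search(backbones, rnn_sizes, evaluate):
    """Score every (backbone, rnn_size) pair and return the best combination."""
    best = None
    for backbone, size in product(backbones, rnn_sizes):
        score = evaluate(backbone, size)      # e.g. BLEU on a validation split
        if best is None or score > best[0]:
            best = (score, backbone, size)
    return best
```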
    • 7
      Utilisation of Vision Systems and Digital Twin for Maintaining Cleanliness in Public Spaces

      Nowadays, the increasing demand for maintaining high cleanliness standards in public spaces results in the search for innovative solutions. The deployment of CCTV systems equipped with modern cameras and software enables not only real-time monitoring of the cleanliness status but also automatic detection of impurities and optimisation of cleaning schedules. The Digital Twin technology allows for the creation of a virtual model of the space, facilitating the simulation, training, and testing of cleanliness management strategies before implementation in the real world.
      In this paper, we present the utilisation of advanced vision surveillance systems and the Digital Twin technology in cleanliness management, using a railway station as an example. The Digital Twin was created based on an actual 3D model in the Nvidia Omniverse Isaac Sim simulator. A litter detector, bin occupancy level detector, stain segmentation, and a human detector (including the cleaning crew) along with their movement analysis were implemented. A preliminary assessment was conducted, and potential modifications for further enhancement and future development of the system were identified.

      Speakers: Mateusz Wąsala (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland), Krzysztof Błachut (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland), Hubert Szolc (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland), Marcin Kowalczyk (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland), Michał Daniłowicz (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland), Tomasz Kryjak (Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Krakow, Poland)
    • 8
      Closing of the Conference