Maier J, Schottenhamml J, Madhu P, da Costa CA, Maier A (2021)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2021
Conference Proceedings Title: 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS)
Background: Interventional workflow analysis can help to increase the quality and efficiency of performed procedures, two important factors in the medical domain. A useful tool for analyzing medical workflow is video-based phase annotation of procedures, since the duration of certain phases can reveal problems in the process and point to possible improvements. We present an approach for classifying frames of intervention videos in order to label the phases of these procedures.
Methods: The videos in our dataset show interventional procedures recorded at different locations during the LINC meeting congresses. In each video, an opening title is followed by a slide-based case presentation and then the actual procedure. The procedure itself is a live catheter examination, a presentation of fluoroscopic images of a previously performed procedure, or a mixture of both. Occasionally, full-screen images of the treating physicians are shown. Our dataset consists of ten videos recorded at five locations A to E, with the number of videos per location being A=3, B=3, C=2, D=1, and E=1. This small amount of data and the fact that some videos are compiled from recordings made at different points in time currently prevent a time sequence analysis for phase annotation. Instead, we extract one image per second, apply image classification, and define the phases of the videos based on the predicted classes. All extracted video frames are manually assigned to one of five classes (frame counts in parentheses): opening (95), fluoroscopy (9376), fluoroscopy overview (196), humans (315), and presentation (1546). For automatic classification of the images, we use a ResNet18 pre-trained on ImageNet data, since residual networks perform well on real-world images and have shown positive results for transfer learning in the medical domain. The network is adapted to our task by training on a subset of the available videos. The dataset is split into training, validation, and test sets by location in order to avoid overfitting to particular treating physicians. This results in 7044 frames for training (A, D, and E), 1860 frames for validation (B), and 2624 frames for testing (C). The classified images are used to assign phase labels to the test videos.
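The location-based split described in the Methods can be illustrated with a minimal Python sketch. The location-to-subset mapping and the frame counts are taken from the abstract. The record layout `(path, label, location)`, the function name `split_by_location`, and the file names are hypothetical and serve only as an illustration. The sketch also checks that the per-class counts and the split sizes add up to the same total.

```python
from collections import defaultdict

# Location-to-subset assignment as stated in the abstract.
SPLIT_BY_LOCATION = {
    "A": "train", "D": "train", "E": "train",
    "B": "val",
    "C": "test",
}

# Frame counts reported in the abstract.
CLASS_COUNTS = {
    "opening": 95,
    "fluoroscopy": 9376,
    "fluoroscopy overview": 196,
    "humans": 315,
    "presentation": 1546,
}
SPLIT_SIZES = {"train": 7044, "val": 1860, "test": 2624}

# Consistency check: both breakdowns cover the same set of frames.
assert sum(CLASS_COUNTS.values()) == sum(SPLIT_SIZES.values())


def split_by_location(frames):
    """Group (path, label, location) records into train/val/test lists.

    The record layout is an assumption made for this sketch.
    """
    subsets = defaultdict(list)
    for path, label, location in frames:
        subsets[SPLIT_BY_LOCATION[location]].append((path, label))
    return dict(subsets)


# Toy example with hypothetical file names.
toy_frames = [
    ("a1_0001.png", "opening", "A"),
    ("b1_0001.png", "presentation", "B"),
    ("c1_0001.png", "fluoroscopy", "C"),
    ("d1_0001.png", "humans", "D"),
]
toy_split = split_by_location(toy_frames)
```

Because every frame of a location lands in exactly one subset, no location (and therefore no treating physician recorded only there) appears in both training and test data.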
Results: The trained network predicted the correct class for 99% of the test images. Because of this high accuracy, video phases were assigned reliably with average time deviations of 1.10 ± 0.94 seconds at phase changes. Incorrectly classified images led to short jumps in the phase annotations, which could in most cases be removed by requiring a minimum phase length of 3 seconds.
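The minimum phase length mentioned in the Results can be sketched as a post-processing step on the per-second frame labels. The abstract does not state how short phases are removed. The sketch below assumes that a run shorter than the minimum is absorbed into the preceding phase (or into the following phase if it occurs at the very start). The function name `frames_to_phases` is hypothetical.

```python
from itertools import groupby


def frames_to_phases(labels, min_len=3):
    """Turn per-second frame labels into (label, start, end) phases.

    Runs shorter than `min_len` seconds are merged into the preceding
    phase. This merging rule is an assumption, not taken from the abstract.
    """
    # Run-length encode the label sequence: [label, duration_in_seconds].
    runs = [[label, len(list(group))] for label, group in groupby(labels)]

    merged = []
    for label, length in runs:
        # Absorb short runs, and rejoin runs that now share a label.
        if merged and (length < min_len or merged[-1][0] == label):
            merged[-1][1] += length
        else:
            merged.append([label, length])

    # A short run at the very start has no predecessor; merge it forward.
    if len(merged) > 1 and merged[0][1] < min_len:
        merged[1][1] += merged[0][1]
        merged.pop(0)

    # Convert durations into half-open [start, end) intervals in seconds.
    phases, start = [], 0
    for label, length in merged:
        phases.append((label, start, start + length))
        start += length
    return phases


# One misclassified "humans" frame inside a fluoroscopy phase.
predicted = (["opening"] * 4 + ["presentation"] * 5
             + ["fluoroscopy"] * 3 + ["humans"] + ["fluoroscopy"] * 6)
print(frames_to_phases(predicted))
# [('opening', 0, 4), ('presentation', 4, 9), ('fluoroscopy', 9, 19)]
```

With one frame per second, the interval boundaries are directly the phase change times in seconds, so they can be compared against manually annotated phase changes.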
Conclusion: Our proposed method classifies frames of procedure videos with high accuracy and enables the annotation of video phases. These phases are a first step towards interventional workflow analysis and could be improved by incorporating temporal information from more videos. Live procedures and previously recorded treatments are currently not differentiated, since the images in both cases are similar. Here, the fluoroscopy overview frames and the audio signal could give hints and will be included in a future analysis.
Maier, J., Schottenhamml, J., Madhu, P., da Costa, C.A., & Maier, A. (2021). Analysis of Interventional Workflow Phases based on Image Classification. In 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS). Berlin (online conference), DE.
Maier, Jennifer, et al. "Analysis of Interventional Workflow Phases based on Image Classification." Proceedings of the 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Berlin (online conference) 2021.