Hüttner L, Mayr M, Gorges T, Wu F, Seuret M, Maier A, Christlein V (2025): Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition
Publication Language: English
Publication Type: Conference contribution
Publication year: 2025
Publisher: IEEE
Pages Range: 1233-1242
Conference Proceedings Title: IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)
Event location: Tucson, Arizona, USA
ISBN: 979-8-3315-3662-6
URI: https://ieeexplore.ieee.org/document/10972546
DOI: 10.1109/WACVW65960.2025.00146
The continuous expansion of neural network sizes is a notable trend in machine learning, with transformer models exceeding 20 billion parameters in computer vision. This growth comes with rising demands for computational resources and large-scale datasets. Efficient transfer learning techniques thus become an attractive option in settings with limited data, as in handwriting recognition. Recently, parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and weight-decomposed low-rank adaptation (DoRA), have gained widespread interest. In this paper, we explore the trade-offs of parameter-efficient transfer learning using the synthetically pretrained Transformer-Based Optical Character Recognition (TrOCR) model for handwritten text recognition with LoRA and DoRA. Additionally, we analyze the performance of full fine-tuning with a limited number of samples, scaling from a few-shot learning scenario up to the whole dataset. We conduct experiments on the popular IAM Handwriting Database as well as the historical READ 2016 dataset. We find that (a) LoRA/DoRA does not outperform full fine-tuning, contrary to the findings of a recent paper, and (b) LoRA/DoRA is not substantially faster than full fine-tuning of TrOCR.
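For context, the sketch below shows how a LoRA or DoRA adapter can be attached to a pretrained TrOCR checkpoint using Hugging Face's peft library. It is a minimal illustration, not the paper's exact setup: the checkpoint name, rank, scaling factor, dropout, and target modules are all illustrative assumptions.

```python
# Minimal sketch: LoRA/DoRA adaptation of TrOCR via Hugging Face peft.
# Hyperparameters and target modules are assumptions for illustration,
# not the configuration reported in the paper.
from transformers import VisionEncoderDecoderModel
from peft import LoraConfig, get_peft_model

model = VisionEncoderDecoderModel.from_pretrained(
    "microsoft/trocr-base-handwritten"
)

config = LoraConfig(
    r=8,                # low-rank dimension (assumed)
    lora_alpha=16,      # scaling factor (assumed)
    lora_dropout=0.1,
    target_modules=[
        "query", "value",      # ViT encoder self-attention projections
        "q_proj", "v_proj",    # TrOCR decoder attention projections
    ],
    use_dora=False,     # set True to train a DoRA adapter instead of LoRA
)

peft_model = get_peft_model(model, config)
# Only the low-rank adapter weights are trainable; the base model is frozen.
peft_model.print_trainable_parameters()
```

With a configuration like this, only a small fraction of the model's parameters are updated during training, which is the memory saving PEFT promises; the paper's finding is that, for TrOCR on IAM and READ 2016, this does not translate into better accuracy or substantially faster training than full fine-tuning.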
APA:
Hüttner, L., Mayr, M., Gorges, T., Wu, F., Seuret, M., Maier, A., & Christlein, V. (2025). Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition. In IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (pp. 1233-1242). Tucson, Arizona, USA: IEEE.
MLA:
Hüttner, Lukas, et al. "Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition." Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), Tucson, Arizona, USA, IEEE, 2025, pp. 1233-1242.