The Modular Light Transformer (MLT) is software for text recognition. It combines a lightweight Transformer-based neural network that operates on text-line images with a system for generating synthetic text-line images, for both modern and historical documents.
The software is written in Python and relies in particular on the PyTorch library to train and evaluate models for handwriting recognition tasks. The code includes: neural network architectures based on the Transformer model; pre-trained models for direct use; a system for generating synthetic handwriting; and the associated code for training and/or prediction.
To use the MLT, you must agree to comply with the terms of the CLIC license (in particular, use for research purposes only), contact the authors to obtain access to the code, and cite the article below:
Barrere, K., Soullard, Y., Lemaitre, A., & Coüasnon, B. (2024). Training transformer architectures on few annotated data: an application to historical handwritten text recognition. International Journal on Document Analysis and Recognition (IJDAR), 1-14.