Art. 05 – Vol. 26 – No. 4 – 2016


Mihnea Horia Vrejoiu
National Institute for Research & Development in Informatics, ICI Bucharest

Abstract: Manually entering data from paper documents is, on the one hand, a necessary and important activity given the widespread computerization of most fields; on the other hand, it consumes considerable human and time resources and is a potential source of typing errors, especially when large volumes of data are entered. Techniques and tools that automate this activity are therefore extremely useful. However, because of critical requirements on the accuracy of the entered data, the data must often at least be checked and validated by a human operator when no other criteria or automated validation options are available. In this context, a semi-automatic software solution based on OCR/ICR techniques was proposed and experimentally implemented. It is intended to assist the entry of data filled in on standardized, fixed-format forms, in order to improve efficiency, productivity and accuracy. Testing experiments were conducted, observations were made on the system's operation and results, and conclusions and possible further improvements and optimizations were summarized.

Keywords: data inputting, standardized forms, machine learning, supervised learning, OCR/ICR, regular expressions.




This work is licensed under a Creative Commons Attribution 4.0 International License.