Evaluation of document image skew estimation techniques / Bagdanov, Andrew D.; Kanai, Junichi. - Print. - 2660:(1996), pp. 343-353. (Paper presented at the conference Proceedings of SPIE, The International Society for Optical Engineering) [10.1117/12.234715].

Evaluation of document image skew estimation techniques

Bagdanov, Andrew David
1996

Abstract

Interest in document image skew detection algorithms has grown recently. Most papers on this problem report some experimental results, but there is no universally accepted methodology for evaluating the performance of such algorithms. We implemented four types of skew detection algorithms in order to investigate possible testing methodologies, and tested each algorithm on a sample of 460 page images randomly selected from a collection of approximately 100,000 pages. This collection contains a wide variety of typographical features and styles. In our evaluation we examine several issues relevant to establishing a uniform testing methodology. First, we give a clear definition of the problem and of the ground truth collection process. Then we examine the need for pre-processing and parameter optimization specific to each technique. Finally, we investigate how to establish meaningful statistical measurements of the performance of these algorithms, and how to use non-parametric methods to perform pairwise comparisons between them.
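The abstract does not say which non-parametric method is used for the pairwise comparisons. As an illustration only, the sketch below applies an exact two-sided sign test to the per-page absolute skew errors of two algorithms; the choice of test, the function name `sign_test`, and the sample error values are all assumptions and are not taken from the paper.

```python
from math import comb


def sign_test(errors_a, errors_b):
    """Exact two-sided sign test on paired absolute errors.

    errors_a[i] and errors_b[i] are the absolute skew errors (in degrees)
    of two algorithms on the same page. Tied pages are dropped.
    Returns (wins_a, wins_b, p_value).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("paired samples must have equal length")

    # A "win" means the algorithm had the smaller error on that page.
    wins_a = sum(1 for a, b in zip(errors_a, errors_b) if a < b)
    wins_b = sum(1 for a, b in zip(errors_a, errors_b) if a > b)
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0  # every page tied: no evidence either way

    # Under the null hypothesis each non-tied page is a fair coin flip,
    # so the number of wins follows Binomial(n, 0.5).
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)


# Hypothetical per-page absolute errors, for illustration only.
errors_a = [0.10, 0.05, 0.20, 0.15, 0.08, 0.12, 0.30, 0.07]
errors_b = [0.25, 0.10, 0.22, 0.40, 0.09, 0.50, 0.35, 0.11]
wins_a, wins_b, p_value = sign_test(errors_a, errors_b)
print(f"A better on {wins_a} pages, B better on {wins_b}, p = {p_value:.4f}")
```

A sign test uses only the direction of each paired difference, so it makes no assumption about how the errors are distributed. A rank-based test such as the Wilcoxon signed-rank test would also take the magnitude of the differences into account.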
Proceedings of SPIE, The International Society for Optical Engineering
Bagdanov, Andrew D.; Kanai, Junichi

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1020607