Optical chemical structure recognition is the problem of converting a bitmap image containing a chemical structure formula into a standard structured representation of the molecule. We introduce a novel approach to this problem based on the pipelined integration of pattern recognition techniques with probabilistic knowledge representation and reasoning. Basic entities and relations (such as textual elements, points, lines, etc.) are first extracted by a low-level processing module. A probabilistic reasoning engine based on Markov logic, embodying chemical and graphical knowledge, is subsequently used to refine these pieces of information. An annotated connection table of atoms and bonds is finally assembled and converted into a standard chemical exchange format. We report a successful evaluation on two large image data sets, showing that the method compares favorably with the current state-of-the-art, especially on degraded low-resolution images. The system is available as a web server at http://mlocsr.dinfo.unifi.it .
Markov Logic Networks for Optical Chemical Structure Recognition / Paolo Frasconi;Francesco Gabbrielli;Marco Lippi;Simone Marinai. - In: JOURNAL OF CHEMICAL INFORMATION AND MODELING. - ISSN 1549-9596. - STAMPA. - 54:(2014), pp. 2380-2390. [10.1021/ci5002197]
Markov Logic Networks for Optical Chemical Structure Recognition
FRASCONI, PAOLO;Marco Lippi;MARINAI, SIMONE
2014
Abstract
Optical chemical structure recognition is the problem of converting a bitmap image containing a chemical structure formula into a standard structured representation of the molecule. We introduce a novel approach to this problem based on the pipelined integration of pattern recognition techniques with probabilistic knowledge representation and reasoning. Basic entities and relations (such as textual elements, points, lines, etc.) are first extracted by a low-level processing module. A probabilistic reasoning engine based on Markov logic, embodying chemical and graphical knowledge, is subsequently used to refine these pieces of information. An annotated connection table of atoms and bonds is finally assembled and converted into a standard chemical exchange format. We report a successful evaluation on two large image data sets, showing that the method compares favorably with the current state-of-the-art, especially on degraded low-resolution images. The system is available as a web server at http://mlocsr.dinfo.unifi.it .File | Dimensione | Formato | |
---|---|---|---|
ci5002197.pdf
Accesso chiuso
Tipologia:
Versione finale referata (Postprint, Accepted manuscript)
Licenza:
Tutti i diritti riservati
Dimensione
1.27 MB
Formato
Adobe PDF
|
1.27 MB | Adobe PDF | Richiedi una copia |
I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.