Data Augmentation on Graphs for Table Type Classification / Del Bimbo, Davide; Gemelli, Andrea; Marinai, Simone. - Electronic. - 13813:(2022), pp. 242-252. (Paper presented at the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition) [10.1007/978-3-031-23028-8_25].
Data Augmentation on Graphs for Table Type Classification
Gemelli, Andrea; Marinai, Simone
2022
Abstract
Tables are widely used in documents because they represent information in a compact and structured way. In scientific papers in particular, tables can present novel discoveries and summarize experimental results, making the research comparable and easily understandable by scholars. Since the layout of tables is highly variable, it would be useful to interpret their content and classify them into categories. This could help to extract information directly from scientific papers, for instance to compare the performance of models based on the result tables in their papers. In this work, we address the classification of tables using a Graph Neural Network, exploiting the table structure in the message-passing algorithm. We evaluate our model on a subset of the Tab2Know dataset. Since it contains few manually annotated examples, we propose data augmentation techniques applied directly to the table graph structures. We achieve promising preliminary results, proposing a data augmentation method suitable for graph-based table representations.
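The abstract does not specify how a table is turned into a graph or which augmentations are applied, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a table is a rectangular grid of cell strings, that each cell becomes a node linked to its right and lower neighbours, and that one plausible structure-level augmentation is to drop a random body row before rebuilding the graph. The names `table_to_graph` and `drop_random_row` are hypothetical.

```python
import random


def table_to_graph(table):
    """Build a grid graph: one node per cell, edges between adjacent cells.

    Assumption: the table is a non-empty rectangular list of rows.
    """
    n_rows, n_cols = len(table), len(table[0])
    nodes = {(r, c): table[r][c] for r in range(n_rows) for c in range(n_cols)}
    edges = set()
    for r in range(n_rows):
        for c in range(n_cols):
            if c + 1 < n_cols:  # horizontal neighbour
                edges.add(((r, c), (r, c + 1)))
            if r + 1 < n_rows:  # vertical neighbour
                edges.add(((r, c), (r + 1, c)))
    return nodes, edges


def drop_random_row(table, rng, keep_header=True):
    """Return a copy of the table with one randomly chosen body row removed.

    Hypothetical augmentation: the header row (index 0) is preserved when
    keep_header is True, and the table is returned unchanged if removing a
    row would leave no body rows.
    """
    first = 1 if keep_header else 0
    if len(table) - first <= 1:
        return [row[:] for row in table]
    idx = rng.randrange(first, len(table))
    return [row[:] for i, row in enumerate(table) if i != idx]


# Toy table, invented for illustration only.
table = [
    ["Model", "Precision", "Recall"],
    ["A", "0.81", "0.77"],
    ["B", "0.85", "0.80"],
]
rng = random.Random(0)
augmented = drop_random_row(table, rng)
nodes, edges = table_to_graph(augmented)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each augmented table yields a new graph that keeps the original class label, which is one way a small annotated set could be enlarged. In a real pipeline the cell strings would be replaced by feature vectors and the edge set passed to a Graph Neural Network library.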