From Recursion to Incrementality: Return to Recurrent Neural Networks / Cristiano Chesi; Matilde Barbini; Veronica Bressan; Achille Fusco; Sofia Neri; Maria Letizia Piccini Bianchessi; Sarah Rossi; Tommaso Sgrizzi. - In: LINGUISTICS VANGUARD. - ISSN 2199-174X. - Electronic. - -:(2026), pp. 0-0. [DOI: 10.1515/lingvan-2024-0233]

From Recursion to Incrementality: Return to Recurrent Neural Networks

Achille Fusco
2026

Abstract

Large language models based on attention architectures have outperformed recurrent neural networks (RNNs), such as those built on long short-term memory (LSTM) gating systems, in various natural language processing tasks. However, despite optimization, these models (i) require unreasonably large amounts of training data compared to what children need, (ii) exhibit an inverse relationship between their performance and their linguistic explanatory value, and, crucially, (iii) eschew plausible (word-by-word) incremental sentence processing, raising doubts about their cognitive plausibility. In this paper, we address these issues starting from the intuition that RNNs directly model incrementality, a key factor in human language processing. Specifically, we discuss the performance of an RNN architecture, eMG-RNN, both during training and on minimal-pair forced-choice tasks in English and Italian. We observe that: (i) ecological training regimens lead to a decrease in cross-entropy loss, although performance on linguistic minimal pairs does not improve; (ii) the specific gating system adopted induces relevant structural biases; and (iii) while these networks outperform standard LSTM networks, gated recurrent unit (GRU) networks, and transformer models under the same ecological training regimens, their performance on linguistic tasks remains low compared to adults (in English) and 7-year-old children (in Italian).
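The minimal-pair forced-choice evaluation mentioned in the abstract is typically implemented by comparing the total log-probability a language model assigns to the grammatical and ungrammatical members of each pair, and counting the pair as correct when the grammatical member scores higher. The sketch below illustrates this standard scheme under stated assumptions: it uses a generic Hugging Face causal language model ("gpt2") as a stand-in scorer, since the paper's own eMG-RNN implementation is not available here, and it is not presented as the authors' exact procedure.

```python
# Minimal sketch of a minimal-pair forced-choice evaluation:
# the model "chooses" the sentence it assigns the higher total
# log-probability. Assumes a Hugging Face causal LM as a stand-in;
# the paper's eMG-RNN model is NOT reproduced here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # hypothetical stand-in model, not eMG-RNN
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence, summed word-by-word
    (left-to-right), i.e. the negative of its total cross-entropy loss."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids the model returns the MEAN cross-entropy
        # over the predicted tokens (all tokens except the first).
        loss = model(input_ids=ids, labels=ids).loss
    n_predicted = ids.size(1) - 1  # first token has no preceding context
    return -loss.item() * n_predicted  # mean NLL * count = total log-prob

def forced_choice(grammatical: str, ungrammatical: str) -> bool:
    """True if the model prefers the grammatical member of the pair."""
    return sentence_logprob(grammatical) > sentence_logprob(ungrammatical)

# Example minimal pair (subject-verb agreement):
print(forced_choice("The keys to the cabinet are here.",
                    "The keys to the cabinet is here."))
```

Accuracy on a test suite is then the fraction of pairs for which the forced choice falls on the grammatical member; comparing architectures (LSTM, GRU, transformer, eMG-RNN) under identical training regimens, as the paper does, requires running the same scoring loop over each trained model.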
Year: 2026
Volume: -
Pages: 0-0
Authors: Cristiano Chesi; Matilde Barbini; Veronica Bressan; Achille Fusco; Sofia Neri; Maria Letizia Piccini Bianchessi; Sarah Rossi; Tommaso Sgrizzi
Files in this product:
No files are associated with this product.

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1460792
Citations
  • PubMed Central: not available
  • Scopus: not available
  • Web of Science: 0