Joint assessment of the latent trait dimensionality and observed differential item functioning of students’ national tests / Gnaldi, Michela*; Bacci, Silvia. - In: QUALITY & QUANTITY. - ISSN 0033-5177. - PRINT. - 50:(2016), pp. 1429-1447. [10.1007/s11135-015-0214-0]
Joint assessment of the latent trait dimensionality and observed differential item functioning of students’ national tests
Bacci, Silvia
2016
Abstract
Students’ assessment tests are routinely validated through item response theory (IRT) models that assume unidimensionality and the absence of observable differential item functioning (DIF). In this paper, we investigate whether these assumptions hold for two national tests administered in Italy to lower secondary school students: the Language Test and the Mathematics Test. To this end, we rely on an extended class of multidimensional latent class IRT models characterised by: (i) a two-parameter logistic parameterisation for the conditional probability of a correct response, (ii) latent traits represented through a random vector with a discrete distribution, and (iii) the inclusion of (uniform) DIF to account for students’ gender and geographical area. A classification of the items into unidimensional groups is also proposed and represented by a dendrogram obtained from a hierarchical clustering algorithm. The results provide evidence of observable DIF effects for both tests. Moreover, the assumption of unidimensionality is rejected for the Language Test, whereas it is reasonable for the Mathematics Test.
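
As an illustration of ingredients (i) to (iii), the following minimal Python sketch computes a two-parameter logistic response probability with a uniform DIF shift and averages it over a discrete latent distribution. It is not the authors' implementation: the function names, the single-trait simplification, the choice to model uniform DIF as a group-specific constant added to the item difficulty, and all numerical values are assumptions made here for illustration only.

```python
import math


def p_correct(theta, discrimination, difficulty, dif_shift=0.0):
    """2PL probability of a correct response for ability level theta.

    Uniform DIF is sketched (as an assumption) by a constant dif_shift
    added to the item difficulty for a given group of students, so the
    shift does not depend on the ability level.
    """
    logit = discrimination * (theta - (difficulty + dif_shift))
    return 1.0 / (1.0 + math.exp(-logit))


def marginal_p_correct(support, weights, discrimination, difficulty, dif_shift=0.0):
    """Average the 2PL probability over a discrete latent distribution.

    support: ability value of each latent class (hypothetical values)
    weights: class probabilities, assumed to sum to 1
    """
    return sum(
        w * p_correct(t, discrimination, difficulty, dif_shift)
        for t, w in zip(support, weights)
    )


# Hypothetical three-class latent distribution and one hypothetical item.
support = [-1.0, 0.0, 1.5]
weights = [0.3, 0.5, 0.2]

reference = marginal_p_correct(support, weights, discrimination=1.2, difficulty=0.0)
focal = marginal_p_correct(support, weights, discrimination=1.2, difficulty=0.0, dif_shift=0.4)
print(reference > focal)  # the item is harder for the group with a positive shift
```

Under these assumptions, a positive shift lowers the success probability of the affected group at every latent class by the same amount on the logit scale, which is what distinguishes uniform from non-uniform DIF.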