Measurement Equivalence in Group Comparisons: Differential Item and Test Functioning

Oya Somer
2004; 19(53):69-86

In both within-culture and cross-cultural settings, measurement equivalence is one of the most important methodological problems in comparing groups. In recent years, models based on Item Response Theory have been widely used to test measurement equivalence. These models, generally grouped under the heading of Differential Item Functioning (DIF) and Differential Test Functioning (DTF), analyze the relation between observed scores and the latent attribute measured by the test across comparison groups. DIF/DTF is present when this relation differs across the comparison groups. DIF is defined as a difference in the probability of endorsing an item between members of the reference and focal groups who have the same latent trait level. In this study, DIF analyses of the 16 items of a personality scale (agreeableness) were performed on a student sample (1807 subjects). All 16 items fit the two-parameter logistic model, but 5 items of the agreeableness scale showed differential item functioning between girls and boys. The properties of these DIF items and how to handle them are discussed.

Keywords: Measurement equivalence, Differential Item Functioning (DIF), group comparisons
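
The definition of DIF given in the abstract can be illustrated with a minimal sketch of the two-parameter logistic model. This is not the analysis procedure used in the study. The function names `p_endorse` and `dif_gap` and all parameter values are hypothetical, chosen only to show how the same latent trait level can yield different endorsement probabilities in the reference and focal groups.

```python
import math

def p_endorse(theta, a, b):
    """Two-parameter logistic model: probability of endorsing an item
    at latent trait level theta, given discrimination a and location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def dif_gap(theta, ref_params, focal_params):
    """Difference in endorsement probability between the reference and
    focal groups at the same latent trait level theta."""
    return p_endorse(theta, *ref_params) - p_endorse(theta, *focal_params)

# Hypothetical item parameters (a, b); not taken from the study.
ref = (1.2, 0.0)    # reference group
focal = (1.2, 0.5)  # focal group: same discrimination, higher location

# Evaluate the gap at several trait levels. A nonzero gap at a matched
# trait level is what the abstract defines as DIF.
thetas = (-2.0, -1.0, 0.0, 1.0, 2.0)
gaps = [dif_gap(t, ref, focal) for t in thetas]

for t, g in zip(thetas, gaps):
    print(f"theta={t:+.1f}  gap={g:.3f}")
```

With these assumed values the two groups differ only in the location parameter, so the gap has the same sign at every trait level. If the discrimination parameters also differed, the sign of the gap could change along the trait scale.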