Results for 'Item response theory (IRT)'

19 found
  1.  24
    US History Content Knowledge and Associated Effects of Race, Gender, Wealth, and Urbanity: Item Response Theory (IRT) Modeling of NAEP-USH Achievement. Tina L. Heafner & Paul G. Fitchett - 2018 - Journal of Social Studies Research 42 (1):11-25.
    Using an item response theory (IRT) analysis, this study examined ethnic and gender group differences in exposure to content material (i.e., access to curriculum) assessed on the 12th grade NAEP US History 2010 exam. Employing multi-step data analysis procedures, the authors examined race and gender using the NAEP Item Mapping Tool available through NCES. Results revealed item-level patterns, which suggest that females and Black students are more likely to answer questions related to social history, particularly the (...)
  2.  17
    Measuring Spatial Perspective Taking: Analysis of Four Measures Using Item Response Theory. Maria Brucato, Andrea Frick, Stefan Pichelmann, Alina Nazareth & Nora S. Newcombe - 2023 - Topics in Cognitive Science 15 (1):46-74.
    Research on spatial thinking requires reliable and valid measures of individual differences in various component skills. Spatial perspective taking (PT), the ability to represent viewpoints different from one's own, is one kind of spatial skill that is especially relevant to navigation. This study had two goals. First, the psychometric properties of four PT tests were examined: Four Mountains Task (FMT), Spatial Orientation Task (SOT), Perspective-Taking Task for Adults (PTT-A), and Photographic Perspective-Taking Task (PPTT). Using item response theory (IRT), difficulty, discriminability, and efficiency of item information functions were evaluated. Second, the relation of PT scores to general intelligence, working memory, and mental rotation (MR) was assessed. All tasks showed good construct validity except for FMT. PPTT tapped a wide range of PT ability, with maximum measurement precision at average ability. PTT-A captured a lower range of ability. Although SOT contributed less measurement information than other tasks, it did well across a wide range of PT ability. After controlling for general intelligence and working memory, original and IRT-refined versions of PT tasks were each related to MR. PTT-A and PPTT showed relatively more divergent validity from MR than SOT. Tests of dimensionality indicated that PT tasks share one common PT dimension, with secondary task-specific factors also impacting the measurement of individual differences in performance. Advantages and disadvantages of a hybrid PT test that includes a combination of items across tasks are discussed.
  4.  15
    Development of the short Creative Expression Interest Scale based on item response theory. Peng Juan Zhao, Xu Liang Gao, Nan Zhao & Zhao Sheng Luo - 2022 - Frontiers in Psychology 13.
    This study develops a short Creative Expression Interest Scale (CEIS) for Chinese freshmen based on item response theory. Valid responses to the Creative Expression Interest survey were obtained from 959 Chinese freshmen. Researchers analyzed the initial data for unidimensionality, item fit, discrimination parameters, and differential item functioning to obtain a short CEIS. The results show that the short CEIS meets the psychometric requirements of IRT. Pearson correlation coefficient of theta between the short and (...)
  5.  8
    Modeling Response Time and Responses in Multidimensional Health Measurement. Chun Wang, David J. Weiss & Shiyang Su - 2019 - Frontiers in Psychology 10.
    This study explored calibrating a large item bank for use in multidimensional health measurement with computerized adaptive testing, using both item responses and response time (RT) information. The Activity Measure for Post-Acute Care is a patient-reported outcomes measure comprised of three correlated scales (Applied Cognition, Daily Activities, and Mobility). All items from each scale are Likert type, so that a respondent chooses a response from an ordered set of four response options. The most appropriate item response theory model for analyzing and scoring these items is the multidimensional graded response model (MGRM). During the field testing of the items, an interviewer read each item to a patient and recorded the patient’s responses on a tablet computer, and the software recorded RTs. Because the item bank is large, with over 300 items, data collection was conducted in four batches with a common set of anchor items to link the scale. Van der Linden’s (2007) hierarchical modeling framework was adopted. Several models, with or without interviewer as a covariate and with or without interaction between interviewer and items, were compared for each batch of data. It was found that the model with the interaction between interviewer and item, when the interaction effect was constrained to be proportional, fit the data best. Therefore, the final hierarchical model with lognormal model for RT and the MGRM for response data was fitted to all batches of data via a concurrent calibration. Evaluation of parameter estimates revealed that (1) adding response time information did not affect the item parameter estimates and their standard errors significantly; (2) adding response time information helped reduce the standard error of patients’ multidimensional latent trait estimates, but adding interviewer as a covariate did not result in further improvement. Implications of the findings for follow-up adaptive test delivery design are discussed.
    3 citations
  6.  7
    Evaluating Different Equating Setups in the Continuous Item Pool Calibration for Computerized Adaptive Testing. Sebastian Born, Aron Fink, Christian Spoden & Andreas Frey - 2019 - Frontiers in Psychology 10.
    The increasing digitalization in the field of psychological and educational testing opens up new opportunities to innovate assessments in many respects (e.g., new item formats, flexible test assembly, efficient data handling). In particular, computerized adaptive testing provides the opportunity to make tests more individualized and more efficient. The newly developed continuous calibration strategy (CCS) from Fink, Born, Spoden, and Frey (2018) makes it possible to construct computerized adaptive tests in application areas where separate calibration studies are not feasible. Due (...)
  7.
    Multivariate Higher-Order IRT Model and MCMC Algorithm for Linking Individual Participant Data From Multiple Studies. Eun-Young Mun, Yan Huo, Helene R. White, Sumihiro Suzuki & Jimmy de la Torre - 2019 - Frontiers in Psychology 10.
    Many clinical and psychological constructs are conceptualized to have multivariate higher-order constructs that give rise to multidimensional lower-order traits. Although recent measurement models and computing algorithms can accommodate item response data with a higher-order structure, there are few measurement models and computing techniques that can be employed in the context of complex research synthesis, such as meta-analysis of individual participant data or integrative data analysis. The current study was aimed at modeling complex item responses that can arise (...)
  8.  9
    Linking of Rasch-Scaled Tests: Consequences of Limited Item Pools and Model Misfit. Luise Fischer, Theresa Rohm, Claus H. Carstensen & Timo Gnambs - 2021 - Frontiers in Psychology 12.
    In the context of item response theory, linking the scales of two measurement points is a prerequisite to examine a change in competence over time. In educational large-scale assessments, non-identical test forms sharing a number of anchor items are frequently scaled and linked using two- or three-parametric item response models. However, if item pools are limited and/or sample sizes are small to medium, the sparser Rasch model is a suitable alternative regarding the precision of parameter (...)
  9.  7
    A Rasch Model and Rating System for Continuous Responses Collected in Large-Scale Learning Systems. Benjamin Deonovic, Maria Bolsinova, Timo Bechger & Gunter Maris - 2020 - Frontiers in Psychology 11:500039.
    An extension to a rating system for tracking the evolution of parameters over time using continuous variables is introduced. The proposed rating system assumes a distribution for the continuous responses, which is agnostic to the origin of the continuous scores and thus can be used for applications as varied as continuous scores obtained from language testing to scores derived from accuracy and response time from elementary arithmetic learning systems. Large-scale, high-stakes, online, anywhere anytime learning and testing inherently comes with (...)
  10.  8
    Development and validation of the first adaptive test of emotion perception in music. Chloe MacGregor, Nicolas Ruth & Daniel Müllensiefen - 2023 - Cognition and Emotion 37 (2):284-302.
    The Musical Emotion Discrimination Task (MEDT) is a short, non-adaptive test of the ability to discriminate emotions in music. Test-takers hear two performances of the same melody, both played by the same performer but each trying to communicate a different basic emotion, and are asked to determine which one is “happier”, for example. The goal of the current study was to construct a new version of the MEDT using a larger set of shorter, more diverse music clips and an adaptive (...)
  11.  10
    The Q-Matrix Anchored Mixture Rasch Model. Ming-Chi Tseng & Wen-Chung Wang - 2021 - Frontiers in Psychology 12.
    Mixture item response theory models include a mixture of latent subpopulations such that there are qualitative differences between subgroups, but within each subpopulation the measurement model based on a continuous latent variable holds. Under this modeling framework, students can be characterized both by their location on a continuous latent variable and by their latent class membership according to their responses. It is important to identify anchor items for constructing a common scale between latent classes beforehand under the (...)
  12.  55
    Psychometric Properties of the Reidenbach–Robin Multidimensional Ethics Scale. Joan Marie McMahon & Robert J. Harvey - 2007 - Journal of Business Ethics 72 (1):27-39.
    The factor structure of the Multidimensional Ethics Scale (MES; Reidenbach and Robin: 1988, Journal of Business Ethics 7, 871–879; 1990, Journal of Business Ethics 9, 639–653) was examined for the 8-item short form (N = 328) and the original 30-item pool (N = 260). The objectives of the study were: to verify the dimensionality of the MES; to increase the amount of true cross-scenario variance through the use of 18 scenarios varying in moral intensity (Jones: 1991, Academy of (...)
    22 citations
  13.  12
    Validation of Embedded Experience Sampling (EES) for Measuring Non-cognitive Facets of Problem-Solving Competence in Scenario-Based Assessments. Andreas Rausch, Kristina Kögler & Jürgen Seifried - 2019 - Frontiers in Psychology 10:441622.
    To measure non-cognitive facets of competence, we developed and tested a new method that we refer to as Embedded Experience Sampling (EES). Domain-specific problem-solving competence is a multi-faceted construct that is not limited to cognitive facets such as domain knowledge or problem-solving strategies but also comprises non-cognitive facets in the sense of domain-specific emotional and motivational dispositions such as interest and self-concept. However, in empirical studies non-cognitive facets are usually either neglected or measured by generalized self-report questionnaires that are detached (...)
    1 citation
  14.  5
    ANT on the PISA Trail: Following the Statistical Pursuit of Certainty. Radhika Gorur - 2012 - In Michael A. Peters, Tara Fenwick & Richard Edwards (eds.), Researching Education Through Actor‐Network Theory. Chichester, UK: Wiley. pp. 60–77.
    This chapter contains sections titled: ANT and the ‘PISA Laboratory’; PISA: An Overview; Background to the Study; Making PISA Knowledge; From ‘World’ to ‘Word’; Engaging in a ‘Politics of Fact’; Notes; References.
  15.  25
    Validation of the Weight Bias Internalization Scale for Mainland Chinese Children and Adolescents. Hao Chen & Yi-duo Ye - 2021 - Frontiers in Psychology 11.
    Weight stigma internalization among adolescents across weight categories leads to adverse psychological consequences. This study aims to adapt and validate a Chinese version of the Weight Bias Internalization Scale for Mainland Chinese children and adolescents. A total of 464 individuals aged 9 to 15 years participated in the present study. Based on item response theory and classical test theory, we selected the items for the C-WBIS and evaluated its reliability and validity. The item response (...)
  16.  8
    Properties of a Transport Instrument for Measuring Psychological Impacts of Delay on Commuters, Mokken Scale Analysis. Mahdi Rezapour, Cristopher Veenstra, Kelly Cuccolo & F. Richard Ferraro - 2021 - Frontiers in Psychology 12.
    This study assessed the validity of an instrument covering various negative psychological and physical behaviors of commuters caused by public transport delays. Instruments have mostly been evaluated with the parametric methods of item response theory. However, IRT is characterized by some restrictive assumptions about the data and focuses on detailed model fit evaluation. Mokken scale analysis is a non-parametric scaling procedure that does not require adherence to any distribution. The results of the study (...)
  17.  8
    The multiple indicator multiple cause model for cognitive neuroscience: An analytic tool which emphasizes the behavior in brain–behavior relationships. Adon F. G. Rosen, Emma Auger, Nicholas Woodruff, Alice Mado Proverbio, Hairong Song, Lauren E. Ethridge & David Bard - 2022 - Frontiers in Psychology 13.
    Cognitive neuroscience has inspired a number of methodological advances to extract the highest signal-to-noise ratio from neuroimaging data. Popular techniques used to summarize behavioral data include sum-scores and item response theory. While these techniques can be useful when applied appropriately, item dimensionality and the quality of information are often left unexplored, allowing poorly performing items to be included in an item set. The purpose of this study is to highlight how the application of two-stage approaches introduces parameter (...)
  18.  14
    A Brief Online and Offline (Paper-and-Pencil) Screening Tool for Generalized Anxiety Disorder: The Final Phase in the Development and Validation of the Mental Health Screening Tool for Anxiety Disorders. Shin-Hyang Kim, Kiho Park, Seowon Yoon, Younyoung Choi, Seung-Hwan Lee & Kee-Hong Choi - 2021 - Frontiers in Psychology 12.
    Generalized anxiety disorder can cause significant socioeconomic burden and daily life dysfunction; hence, therapeutic intervention through early detection is important. This study was the final stage of a 3-year anxiety screening tool development project that evaluated the psychometric properties and diagnostic screening utility of the Mental Health Screening Tool for Anxiety Disorders, which measures GAD. A total of 527 Koreans completed online and offline versions of the MHS: A, Beck Anxiety Inventory, Generalized Anxiety Disorder-7, and Penn State Worry Questionnaire. The (...)
  19.  15
    Development of Computerized Adaptive Testing for Emotion Regulation. Lingling Xu, Ruyi Jin, Feifei Huang, Yanhui Zhou, Zonglong Li & Minqiang Zhang - 2020 - Frontiers in Psychology 11.
    Emotion regulation (ER) plays a vital role in individuals’ well-being and successful functioning. In this study, we attempted to develop a computerized adaptive test to efficiently evaluate ER, namely the CAT-ER. The initial CAT-ER item bank comprised 154 items from six commonly used ER scales, which were completed by 887 participants recruited in China. We conducted unidimensionality testing, item response theory model comparison and selection, and IRT item analysis including local independence, item fit, differential item functioning, and item discrimination. Sixty-three items with good psychometric properties were retained in the final CAT-ER. Then, two CAT simulation studies were implemented to assess the CAT-ER, which revealed that the CAT-ER developed in this study performed reasonably well, considering that it greatly reduced the number of test items and the testing time without losing measurement accuracy.