Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment
Abstract (Summary)
Computer-based testing is becoming popular with credentialing agencies because new test designs are possible, and the evidence is clear that these new designs can increase the reliability and validity of candidate scores and pass/fail decisions. Research on multistage testing (MST) to date suggests that the measurement quality of MST results is comparable to that of full-fledged computer-adaptive tests and improved over computerized fixed-form tests. MST's promise lies in this potential for improved measurement, with greater control over test-form construction than other adaptive approaches. Recommending use of the MST design and advising how best to set up the design, however, are two different things. The purpose of the current simulation study was to advance an established line of research on MST methodology by enhancing understanding of how several important design variables affect outcomes for high-stakes credentialing. Modeling of the item bank, the candidate population, and the statistical characteristics of test items reflected the conditions of an operational credentialing exam. The studied variables were module arrangement (4 designs), amount of overall test information (4 levels), distribution of information over stages (2 variations), strategies for between-stage routing (4 levels), and pass rates (3 levels), for a total of 384 conditions. Results showed that high levels of decision accuracy (DA) and decision consistency (DC) were consistently observed, even when test information was reduced by as much as 25%. No differences due to the choice of module arrangement were found. With high overall test information, results were optimal when test information was divided equally among stages; with reduced test information, gathering more test information at Stage 1 provided the best results. Generalizing simulation study findings is always problematic.
In practice, psychometric models never completely explain candidate performance, and with MST there is always the potential for a psychological impact on candidates if shifts in test difficulty are noticed. At the same time, two findings stand out in this research: (1) with limited amounts of overall test information, it may be best to capitalize on available information by making accurate branching decisions early, and (2) there may be little statistical advantage in raising test information much above 10, as gains in reliability and validity appear minimal.
School Location: USA - Massachusetts
Source Type: Master's Thesis
Date of Publication: 01/01/2004