2026, Number 1
Implementation and validation of Computerized Adaptive Testing for general surgery certification by the Mexican Board of General Surgery: from proof of concept to definitive validation
Prado E, Pérez-Soto RH, Sánchez-Reyes K, López-Gavito E, Hernández-Cendejas A, Kobeh J, Velázquez-Fernández D
Language: Spanish
References: 15
Page: 7-17
PDF size: 1627.90 Kb.
ABSTRACT
Introduction: board certification in general surgery requires full completion of three phases (pilot test, implementation, validation), designed to guarantee the quality standards of professional competence in Mexico. Computerized adaptive testing (CAT) selects items dynamically according to each examinee's performance in real time, optimizing testing time, number of items, computer equipment and human resources, and reducing the fatigue associated with longer examinations.
Objective: to implement CAT and analyze its efficiency in determining the academic proficiency of general surgeons applying for certification by the CMCG (Mexican Board of General Surgery).
Material and methods: we used a three-phase methodology: a pilot test, test implementation, and a final validation in independent cohorts. The pilot phase included 322 examinees. The implementation phase included 569 independent applicants, in whom the results of our conventional test were compared with those of the CAT. In the validation phase, 1,194 applicants were analyzed with two different CATs composed of items drawn at random from a pool of 1,200 items of varying difficulty; these were compared with each examinee's conventional test in paired and global analyses. Statistical analysis was performed with IBM SPSS® Statistics® v26, considering any p value < 0.05 statistically significant for a two-tailed hypothesis test.
Results: RACs were similar between the conventional test and the CAT (30.32 ± 3.89 vs 30.67 ± 6.83, respectively), with a delta of 0.35 ± 7.26; p = 0.38 and SEM = 0.40. In the implementation phase, a shorter time was documented (117.2 ± 12.6 vs 72.05 ± 18.2 minutes, respectively), with a lower mean number of answered items (100.56 ± 15.64 vs 193.69 ± 15.78) but a similar mean score (50.84 ± 7.9 vs 54.53 ± 7.2), while retaining the ability to discriminate between AAA and BAA applicants (51.34 ± 7.5 vs 36.42 ± 5.5).
In the validation phase, there was no difference in time, number of items answered, or degree of difficulty, but the ability to discriminate between AAA and BAA applicants was maintained (p < 0.0001).
Conclusions: CAT proved to be not only feasible but also valid and more efficient than conventional tests, achieving comparable results with a significant reduction in time and number of items used, while maintaining its ability to quickly discriminate applicants with greater academic assertiveness.
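As an illustration of how an adaptive test can shorten an examination, the following is a minimal sketch of a CAT loop. The abstract does not describe the psychometric model or stopping rule used by the authors, so everything here is an assumption made for illustration: a one-parameter (Rasch) model, an expected a posteriori (EAP) ability estimate with a standard normal prior, selection of the unused item whose difficulty is closest to the current estimate, and the hypothetical parameters max_items and se_target. Only the pool size of 1,200 items comes from the text; the evenly spaced difficulties and the simulated examinee are invented.

```python
import math
import random


def p_correct(theta, b):
    """Rasch probability of a correct answer for ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_estimate(responses, grid):
    """Posterior mean and SD of ability on a grid, using a standard normal prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(pool, answer, max_items=50, se_target=0.30):
    """Administer items adaptively; return (ability estimate, posterior SD, items used).

    pool      : list of item difficulties (hypothetical calibration values)
    answer    : callable taking a difficulty and returning True if answered correctly
    max_items : hypothetical cap on test length
    se_target : hypothetical precision at which the test stops
    """
    grid = [i / 10.0 for i in range(-40, 41)]  # ability grid from -4.0 to 4.0
    remaining = list(pool)
    responses = []
    theta, se = 0.0, float("inf")
    while remaining and len(responses) < max_items and se > se_target:
        # Under the Rasch model, the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        responses.append((b, answer(b)))
        theta, se = eap_estimate(responses, grid)
    return theta, se, len(responses)


# Hypothetical pool: 1,200 difficulties evenly spaced between -3 and 3 logits.
pool = [-3.0 + 6.0 * i / 1199 for i in range(1200)]

# Simulated examinee with an invented true ability of 1.0.
rng = random.Random(0)
theta_hat, se, n_used = run_cat(pool, lambda b: rng.random() < p_correct(1.0, b))
print(f"estimate={theta_hat:.2f}, SD={se:.2f}, items used={n_used}")
```

In this sketch the test stops as soon as the posterior SD falls below se_target, which is how an adaptive test can reach a decision with fewer items than a fixed-length examination. An operational CAT would normally add content balancing and item exposure control, neither of which is modeled here.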