In 2005, Weill Cornell Medicine revised the Hippocratic Oath. Since then, this revised oath has been administered to graduating medical students and has become part of the curriculum during first-year orientation. In a flipped classroom exercise, one of the students questioned the Hippocratic injunction, "That I seek the counsel of others when they are more expert." The student wondered whether a physician should step aside in favor of artificial intelligence (AI). Should physicians defer to its expertise? In this Commentary, the author responds to this query, balancing the ethical mandate to use this emerging technology to provide the best evidence-based medicine to those entrusted to one's care against the risk of delegating responsibility to an entity that can neither worry about the provision of care nor provide the therapeutic balm of the patient-physician relationship. The author questions how AI might affect professional formation and medical education, alter the development of a caring ethos, and foster a misplaced confidence in the extent of our medical knowledge. Delegation of writing and thinking to AI could similarly undermine opportunities for learning and human discernment, which can be the font for discovery. The author notes the irony that AI, as an all-knowing and all-seeing muse, might replace Apollo, to whom the oath pledges fealty. The author asks if medicine wants to replace one mythology with another and cede the most humanistic of disciplines to a machine. If machines are allowed to think for us, physicians will lose an essential element of practice, forsaking the requisite self-knowledge to have agency and responsibility for their actions, thus ending medicine as it has been known.
Adoption of competency-based medical education would be facilitated by alignment between medical school (undergraduate medical education [UME]) and residency (graduate medical education [GME]) assessments. This study explores the strength of an a priori alignment of medical school assessments with residency competency domains to guide programmatic assessment and research. In fall 2023, the authors (US faculty with UME and GME expertise) used a 2-phase Delphi consensus approach (9 participants in phase 1 and 16 in phase 2) to independently map medical school assessments to the Accreditation Council for Graduate Medical Education competency domains (0, no association; 1, some or moderate association; and 2, strong association). The mean responses were used to create a heatmap of the strength of associations, with a score of 1.5 or higher indicating a meaningful association. Consensus discussion was used to develop a final heatmap of associations. Strong a priori associations emerged for the patient care domain and metrics of clerkship grading, professionalism concerns, clinical skills, and failure to match. Medical knowledge was closely associated with the United States Medical Licensing Examination Step scores and preclinical metrics. Communication and professionalism were associated with professionalism concerns, Gold Humanism Honor Society membership, clerkship grades and rank, clinical skills, Step 2 Clinical Skills failure, and failure to match or Supplemental Offer and Acceptance Program participation. Practice-based learning and improvement and systems-based practice had weaker associations, suggesting these competencies may be less reflected in traditional medical school metrics. However, subcomponents of practice-based learning related to reflection revealed associations with professionalism concerns and match outcomes. 
Experts perceived associations between UME and GME assessments in multiple domains, with weaker associations in practice-based learning and systems-based practice. These results offer perspective on how UME outcomes could be associated with GME performance and might be used in programmatic assessment.
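The thresholding step described above can be sketched in a few lines. This is a hypothetical illustration only: the assessment and domain names are invented, not the study's data, and the sketch assumes mean expert ratings on the 0-2 scale are compared against the 1.5 cutoff.

```python
# Hypothetical sketch of the heatmap step: mean expert ratings (0/1/2)
# per assessment-domain pair, flagged at the 1.5 threshold.
# Assessment and domain names below are illustrative, not the study's data.

THRESHOLD = 1.5  # mean rating at or above this marks a meaningful association

def mean_ratings(ratings):
    """ratings: {(assessment, domain): [expert scores 0-2]} -> mean per cell."""
    return {cell: sum(scores) / len(scores) for cell, scores in ratings.items()}

def meaningful_cells(ratings):
    """Cells whose mean rating meets or exceeds the threshold."""
    return {cell for cell, m in mean_ratings(ratings).items() if m >= THRESHOLD}

ratings = {
    ("clerkship_grades", "patient_care"): [2, 2, 1, 2],
    ("step_scores", "medical_knowledge"): [2, 2, 2, 1],
    ("clerkship_grades", "systems_based_practice"): [0, 1, 0, 1],
}
flagged = meaningful_cells(ratings)
```

In the final study the flagged cells were then refined by consensus discussion rather than taken directly from the threshold.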
Despite the movement toward competency-based medical education (CBME), most training programs in the United States still advance learners based on time-in-training rather than achievement of competency standards. The Education in Pediatrics Across the Continuum (EPAC) project sought to answer whether it is possible to make time-variable, competency-based advancement decisions by creating a program that spans undergraduate (UME) to graduate medical education (GME) to fellowship/practice. This article reports the outcomes of the EPAC project. Each participating site (Universities of California-San Francisco, Colorado, Minnesota, and Utah) selected 4 consecutive cohorts and prospectively followed them from July 2013 to June 2023. Implementation allowed for site variability but adhered to common design principles. The Core Entrustable Professional Activities (EPAs) for Entering Residency and the General Pediatrics EPAs were the competency frameworks. Each site convened a clinical competency committee every three to six months to determine readiness for learner transition from UME to GME and GME to practice/fellowship based on demonstrated competence. Fifty-six learners enrolled in EPAC and forty-four (79%) completed the program as designed. The time required for each learner to demonstrate readiness for transition from UME to GME (mean 20.2 four-week blocks [SD 2.76], range 14.5-26) and GME to practice/fellowship (mean 5.8 six-month blocks, [SD 0.34], range 4.9-6.1) varied. EPAC participants' educational outcomes were comparable to non-EPAC graduates at the same site, including Milestones ratings (B = 0.30, P = .21), first time pass rates for the American Board of Pediatrics initial certifying exam (93%, 95% CI 81%-98%), and attainment of fellowship or job placements after graduation. 
The EPAC project demonstrated that competency-based readiness-to-transition decisions can be made for individual learners, and it is the first CBME program to span the UME/GME medical education continuum in the US. Lessons learned from the EPAC project have the potential to contribute to efforts to advance CBME.
Theory enables medical education scholars to move beyond local problems to understand broader phenomena and build cumulative knowledge. Yet many scholars struggle to use theory in their educational endeavors. While the medical education literature offers resources and mentors provide guidance, the field lacks insight into specific mentoring processes for theory use. This study aimed to illuminate processes by which experienced mentors help mentees use theory in medical education scholarship. Using constructivist grounded theory, the authors conducted semi-structured interviews in 2021, 2024, and 2025 with faculty from diverse disciplinary and geographic backgrounds with experience mentoring others in medical education scholarship. Through iterative data collection, coding, and discussion, the authors developed The Abstracting-Contextualizing Model (ACM) of Mentoring for Theory Use in Medical Education, which was supported by multiple sources of evidence and remained stable with additional data. Based on interviews with 20 mentors, the authors defined three core elements: problems (local issues), phenomena (generalizable abstractions), and theories (tools for understanding). Mentors identified understanding phenomena as the central goal of scholarship, with these core elements serving as entry points. Mentors employed general strategies including preparatory approaches (metaphors, role modeling, sharing exemplar articles, facilitating application exercises) and helping mentees avoid common errors. Based on mentees' starting points, mentors strategically selected between two complementary approaches: Abstracting (guiding from problems or phenomena toward theories) for mentees beginning with local issues, and Contextualizing (grounding theoretical knowledge in educational practice) for mentees starting with particular theories or disciplinary expertise. The ACM addresses a gap at the intersection of theory, research skills, faculty development, and mentoring literatures. 
It demonstrates how mentors employ sophisticated, adaptive, and task-specific strategies to guide scholars toward understanding phenomena based on their individual starting points. This work offers practical implications for mentees, mentors, institutions, and the field.
When faculty supervisors modify a trainee's approach to a clinical encounter, trainees have an opportunity to learn what about their conceptualization of the case merits change. In addition to identifying errors or gaps in knowledge on the trainee's part, these changes can highlight meaningful variation in clinical approaches. However, little is known about whether and how trainees arrive at this understanding. This study explores how trainees make sense of supervisor changes to their proposed clinical approaches. Semistructured interviews with senior emergency medicine trainees from 3 academic institutions were performed between May 2024 and January 2025. Content was analyzed using a constructivist grounded theory approach with meaningful variation as a sensitizing framework to generate a model of how trainees understand supervisor changes to their clinical approaches. Sixteen trainees in their final year of emergency medicine residency participated. Trainees saw changes as learning opportunities. However, trainee descriptions often revolved around preemptively modifying their approach to match the anticipated supervisor approach, thus avoiding changes altogether because trainees experienced discussing changes with supervisors as antagonistic and burdensome. When trainees did experience a change, they distinguished changes as clinical or stylistic (practitioner or context based). Perceiving the former to signal what they should do and the latter what they could do, trainees demonstrated greater investment in understanding clinical vs stylistic changes. Trainees frequently described attributing the reason for the change to 1 of these 2 domains on their own without supervisor input and only occasionally asking their supervisor. Trainees perceived value in understanding changes to their approaches but frequently did not engage supervisors in their meaning-making and sometimes preemptively avoided changes altogether. 
Such missed opportunities for exposure to meaningful variation suggest opportunities for educators and training systems to increase the frequency and decrease the perceived hostility of these conversations.
With evolving accreditation requirements, competing demands of medical students, and the national trend toward shortening the preclerkship curriculum, management of workload hours for preclerkship medical students is essential. Although many medical schools have established policies outlining student workload expectations in the preclerkship curriculum, the mechanisms to ensure adherence to these policies are often less well defined. In 2019, the University of Cincinnati College of Medicine (UCCOM) established a scheduling workgroup, the Student Workload Advisory Group (SWAG), which evolved into a collaborative effort to monitor and manage workload. SWAG created a workload calculator that could be applied to all preclerkship courses to monitor workload hours among medical students from academic year 2021-2022 through academic year 2024-2025. Within the first year of formal monitoring, there were 27 instances in which student workload exceeded that defined in the workload policy. SWAG identified explicit and hidden workload exceedances. Explicit workload exceedances included requirements that can be captured on a course calendar, whereas hidden workload exceedances emerged after implementation of an assignment audit self-report tool. Multiple revisions to the workload policy were incorporated to ensure scheduled activities reflected the time needed to complete the activities, which led to a reduction in workload policy exceedances. More success was seen with a proactive approach that facilitates collaboration between SWAG and course directors to anticipate and address potential workload concerns before they occur. The creation of a formal body consisting of faculty, staff, and students to monitor workload combined with a proactive monitoring plan to enforce workload policies represents a reproducible framework for other institutions seeking to create and/or enforce workload policies. 
The next step at UCCOM is to incorporate formal workload monitoring into clinical phases of the curriculum to monitor workload hours outside enforced duty hours to improve student wellness.
This Scholarly Perspective frames workplace-based training in medical education as a succession concept and explores how the benefits of the traditional apprenticeship model can integrate with modern competency-based medical education. In 2024, the author broadened the construct of medical competence into 3 semihierarchical layers: canonical, contextual, and personalized. Canonical competence (the canon that all professionals should know) relates to knowledge and skills that meet generalized, context-independent standards, assessed with methods of high psychometric quality. Personalized competence has a focus on the individual pursuit of excellence. The focus of this article is on contextual competence (the ability to work in a patient care context). Contextual competence requires workplace-based assessment and judgments by professionals with relevant expertise and experience. Central is the notion of entrustment. The aim of workplace-based training and assessment is bringing learners to a level of readiness to be entrusted with health care tasks, beyond the standards for canonical competence. Entrustment with units of practice (entrustable professional activities [EPAs]) implies providing autonomy to learners, which requires a prospective focus (Is the learner ready for patient care?) rather than a retrospective focus (Has the learner completed all tasks as assigned?). Autonomy, relevant for identity formation, can be modulated by providing a deliberate decrease of supervision and by starting with small part-tasks (nested EPAs), later subsumed in broader EPAs. An apprenticeship model of workplace-based training and assessment with a prospective aim requires, along with supervisory effort, a rotational structure that allows for sufficient acquaintance with the trainees to enable entrustment with patient care. When entrustment is taken seriously, pivotal points arise when learners start to contribute to care and thus take over the work of their supervisors. 
Not only is this a return on investment, but it also can enhance learner motivation and supervisor satisfaction; however, it requires a curricular transformation.
Consensus recommendations and harmonized guidelines for reporting quantitative research in health professions education (HPE) research are lacking. This study synthesizes available quantitative reporting recommendations and guidelines to present a harmonized framework and derive quantitative standards for reporting HPE research. The authors identified existing standards, recommendations, and guidelines from peer-reviewed scientific journals in medicine, education, and social sciences by searching PubMed, Web of Science, and Google from August 2006 through April 2025. The authors also requested input from experts in HPE (scholars and journal editors) via electronic communication over 6 months (November 2024 to April 2025). The articles identified were then used to synthesize and generate an initial list of standards for reporting quantitative research in HPE. Two authors reviewed each article to rate its alignment with HPE research. All authors independently evaluated the full list of standards and discussed them as a group for their relevance and alignment for reporting in HPE; all authors also made recommendations for each standard (to report as is, to report but modify, or to drop). The group subsequently reviewed and collectively reached consensus on the final set of quantitative reporting standards. The authors reviewed a total of 34 articles, with 19 identified as having alignment with HPE quantitative research standards. Mean interrater agreement among author pairs was 98% (kappa = .96). Authors initially generated 40 reporting standards, of which 18 (45%) were modified and 1 (3%) was identified to be dropped. The final set of quantitative guidelines consisted of 39 standards, with each standard nested in one of 28 topic areas. The quantitative reporting standards identified in this study provide guidance to ensure rigor in reporting expectations for HPE research. 
These standards may also facilitate critical appraisal of articles and enhance the quality and impact of quantitative HPE research.
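The interrater statistics reported above (percent agreement and kappa) follow from standard formulas. As a minimal, self-contained sketch, with invented ratings for illustration, Cohen's kappa for two raters can be computed as:

```python
def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(r1)
    labels = set(r1) | set(r2)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n  # observed agreement
    # expected (chance) agreement from each rater's marginal label frequencies
    p_e = sum((r1.count(l) / n) * (r2.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Two raters judging whether 8 articles align with HPE standards (toy data).
rater1 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
rater2 = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
```

Kappa corrects raw percent agreement for agreement expected by chance, which is why the study reports both figures.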
Graduate medical education requires learners to acquire broad clinical exposures to meet core competencies for unsupervised practice, but variability in clinical learning environments and reliance on resource-intensive assessments hinder precise assessment of trainees' clinical experiences. Electronic health records hold promise for precision medical education, yet manual mapping of International Classification of Diseases, Tenth Revision (ICD-10) codes to specialty-specific clinical practice domains limits scalability. The authors leveraged electronic health record data and artificial intelligence (AI) to map residents' encounter diagnoses to the American Board of Emergency Medicine's Model of the Clinical Practice of Emergency Medicine (MCPEM). Resident encounters across 3 sites at a single academic system (January 1 to October 31, 2023) were analyzed with an AI model, which represented ICD-10 descriptors as vectors and mapped each to the closest MCPEM category. Faculty raters validated the most common mappings iteratively, which were subsequently integrated into interactive learner dashboards. Among 119,320 encounters, 5,960 unique ICD-10 descriptors (1,126 stem codes) were identified. For the 650 most common diagnoses, 507 (78.0%) of emergency department diagnosis text descriptors were determined as valid mappings to an MCPEM subcategory. In mappings where faculty were discordant with the lowest distance mapping, 171 of 305 alternative subcategory mappings (56.0%) achieved agreement, increasing the concordance between reviewers to 515 of 650 (79.2%) overall. Interactive dashboards displayed resident-level case mix mapped to MCPEM categories, with anonymized peer comparisons and program-level aggregates, enabling identification of patterns and gaps by domain. 
Planned work includes iterating AI-automated mappings by expanding inputs beyond diagnoses, engaging wider stakeholder review of mapping validations, and assessing generalizability to other specialties' content outlines to produce a scalable and reproducible model to increase the precision of feedback loops to inform graduate medical education, the clinical learning environment, and training design.
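The vector-matching step described above can be illustrated with a toy nearest-neighbor sketch. The embeddings and category names here are invented stand-ins; the study's actual AI model and vector representations are not specified in this abstract.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1 - dot / (norm_u * norm_v)

def closest_category(dx_vec, category_vecs):
    """Return the category whose embedding is nearest the diagnosis embedding."""
    return min(category_vecs, key=lambda c: cosine_distance(dx_vec, category_vecs[c]))

# Toy 2-dimensional "embeddings" for two hypothetical MCPEM categories.
categories = {"cardiovascular": (1.0, 0.1), "trauma": (0.1, 1.0)}
dx = (0.9, 0.2)  # invented embedding of an ICD-10 text descriptor
```

In the study, mappings with the lowest distance were then reviewed by faculty raters, which is the validation loop the concordance figures describe.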
Gossip has been found to influence feedback, but the interplay between gossip and feedback is unclear. Given the importance of feedback for resident development, the education community needs to better understand how gossip and feedback are connected, and the implications of that relationship. For purposes of this study, the authors used an established definition of gossip as "evaluative talk about a person who is not present." The authors used constructivist grounded theory to iteratively conduct and analyze semi-structured interviews from November 2024 to April 2025 with 16 resident participants from 7 programs, including pediatric, obstetrics-gynecology, internal medicine, and psychiatry programs, in the United States and the Netherlands. Interview questions focused on the experience of gossip in residency and the interplay between gossip and feedback. The analysis revealed that gossip and feedback relate to each other in 4 distinct ways, particularly when there are barriers to feedback. "Gossip catalyzes feedback," encouraging its delivery, whether explicitly or subconsciously. "Gossip replaces feedback" when giving critical feedback is challenging or when there is no forum for feedback. Sometimes, "gossip is feedback" given that feedback systems share elements with gossip conversations, such as when feedback conversations are indirect or the feedback provider maintains anonymity. Finally, "gossip follows feedback" when it helps trainees process the emotional experience of receiving challenging feedback and guards against peers receiving similar feedback. The authors theorize gossip as a mechanism for navigating formal feedback systems in practice. Gossip, as much as it is commentary about individuals, is metacommentary on our systems of feedback. The authors call on educators to understand when gossip is used in feedback as a means to further understand barriers to formal feedback. 
Educators grappling with their feedback systems without considering gossip are missing the looming third party.
Faculty development (FD) is a strategic priority in health professions education, yet its impact in low- and middle-income countries (LMICs) is often limited by contextual and cultural factors. This study explored how Vietnamese health professions faculty engaged with and applied learning from a cross-cultural, interprofessional FD certificate program. Using a qualitative approach informed by a constructivist paradigm, the authors conducted semistructured interviews with 10 faculty from a single academic institution in Vietnam who completed a nine-month international FD program. Interviews, conducted from December 2024 to February 2025, were audio-recorded, transcribed verbatim, and thematically analyzed following Braun and Clarke's six phases; coding was conducted in pairs with regular reconciliation meetings. Cultural Historical Activity Theory (CHAT) served as an interpretive lens to analyze systemic tensions. Three interrelated themes characterized participant experiences. First, cultural norms of hierarchy and conflict avoidance, compounded by language barriers and time zone differences, limited open dialogue and learner engagement. Second, contextual barriers, including heavy clinical workloads, structural constraints, and a lack of locally relevant case examples, impeded the translation of educational theory to practice. Third, while interprofessional education (IPE) broadened participants' perspectives, its implementation was limited by role ambiguity between health professions, insufficient institutional support, and a lack of local IPE expertise. Viewed through CHAT, these findings highlight three key tensions: between cultural rules and the learning community; between program instruments (curriculum and pedagogical approaches) and the object of improving teaching capacity; and between the interprofessional learners and the object of applying new teaching practices in local contexts. 
Cross-cultural and interprofessional FD can foster significant professional growth, but its sustainability in LMIC contexts requires system-level alignment. To enhance relevance and impact, programs should embed cultural humility and psychological safety, co-design locally relevant content, and use layered learning designs that balance interprofessional breadth with profession-specific depth.
Licensure and certification examinations have traditionally served to ensure physicians possess the knowledge and skills required for independent practice. However, with the growing emphasis on workplace-based assessment (WBA), the role and relevance of these point-in-time examinations are increasingly debated. This systematic review aimed to explore the association between physician performance on national licensing and certification examinations and subsequent measures of quality of care in practice. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Six databases were searched through November 2025. Original research studies were included if they examined the relationship between performance on medical licensing or certification examinations and physician performance in independent practice, including patient outcomes and fitness to practice. Data were extracted and analyzed using content analysis to identify patterns and trends across studies. A total of 44 studies involving 1,021,187 physicians were included. The findings demonstrated that examination performance was consistently associated with quality of patient care and fitness-to-practice concerns. Better examination performance was linked to improved adherence to mammography screening recommendations, appropriate prescribing practices, improved care of patients with diabetes, lower patient morbidity and mortality, fewer complaints to regulatory bodies, and lower malpractice payments. The association was observed across examination formats and medical specialties. The results suggest that national licensing and certification examinations are valuable predictors of future clinical performance. They provide a uniform benchmark to ensure physicians meet rigorously defined criteria. The findings support continued use of these examinations as part of a comprehensive assessment program to promote high-quality patient care and physician competence. 
This is critical in an era of increased physician mobility, where many physicians have trained in jurisdictions with widely varying curricula and assessment practices. National examinations help ensure that all licensed physicians meet a consistent standard for clinical competence.
Accurate and complete order entry is an essential skill for all medical graduates. However, medical students report limited confidence in performing this task. This study examined stakeholder perspectives on order entry to identify educational gaps and inform curriculum development. This phenomenological, qualitative study was performed from November 2024 to May 2025 with semistructured interviews of interprofessional stakeholders at the University of Nebraska Medical Center regarding their perceptions of common and high-consequence errors in order entry by new physicians and suggestions for needed education. Stakeholders included inpatient nurses, inpatient pharmacists, interns, and supervising physicians. Analysis was performed using open inductive coding. Eight themes emerged from qualitative analysis of 17 stakeholder interviews. Education and training highlighted the need for both formal and informal instruction on order entry, medication management, and institutional protocols. Contextual awareness emphasized the importance of critically evaluating order appropriateness and identifying inconsistencies. Collaboration in care underscored the role of interprofessional teamwork and communication in reducing errors. Oversight and accountability revealed that early trainees often rely on pharmacists and nurses to catch mistakes, highlighting the need for structured supervision and personal responsibility. Optimizing order systems addressed how electronic health record design, including order sets, prepopulated fields, and decision support, affects accuracy and efficiency. Special populations and vulnerable moments identified increased risk of errors in pediatric, renal, and transitional care scenarios. Informed decision-making reflected gaps in clinical knowledge related to medication selection, dosing, and interactions. 
Precision in practice stressed the importance of paying attention to detail, verifying accuracy, and balancing speed with thoroughness in order entry. New physicians face multifaceted challenges in mastering order entry, underscoring the need for integrated, longitudinal, and interprofessional educational approaches. Although targeted interventions can improve confidence and competence, sustained practice, reflective learning, and supportive system design are essential for developing durable proficiency.
Documentation of a clinical encounter is a foundational skill required of all physicians; however, there is little standardization in documentation skills teaching. In this Innovation Report, the authors describe a novel patient-centered documentation curriculum and assess the efficacy of the curriculum after implementation. Design of the new patient-centered documentation curriculum included creation of a novel rubric to assess patient-centered language (PCL) in student notes built from the IDEA Assessment Tool as well as the development of new faculty and student workshops. The curriculum was introduced at Harvard Medical School between September 2022 and July 2023. History of present illness (HPI), physical examination (PE), and PCL were analyzed at early (September-October), mid (February-March), and late (May-June) time points. At the early time point, the mean (SD) student composite score in the PCL domain was 2.53 (0.04), suggesting students were using PCL in their notes some or all of the time. This score was higher than the scores for the HPI and PE domains (2.16 [0.56] and 2.20 [0.52], respectively). Comparing the early and late time points, students had statistically significant improvements in the PCL domain, with a mean (SD) composite value of 2.87 (0.03) at the late time point (P < .001). Statistically significant mean (SD) improvements from the early to late time points were also observed in the HPI (2.16 [0.56] to 2.76 [0.03], P < .001) and PE (2.20 [0.52] to 2.65 [0.51], P = .02) domains. Students demonstrated improvements in the PCL domain when comparing the early and mid (2.53 to 2.76; P < .001) and mid and late (2.76 to 2.87, P = .047) time points. Next steps include adapting the curriculum to the needs of other medical schools and assessing whether the curriculum has a positive impact when adopted more broadly.
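The early-versus-late comparisons above are paired score comparisons. As a minimal sketch of the statistic underlying such a test, with invented scores for illustration, a paired t statistic can be computed as:

```python
import math

def paired_t(pre, post):
    """Paired t statistic for early-vs-late rubric scores (equal-length lists)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    # sample variance of the paired differences
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

# Invented early and late rubric scores for 4 student notes.
early = [2, 2, 3, 2]
late = [3, 3, 3, 3]
```

The resulting t statistic is compared against a t distribution with n - 1 degrees of freedom to obtain P values like those reported above.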
The concept of "distance traveled" (DT), which contextualizes applicants' achievements within their lived experiences, has emerged as a key component of holistic medical school admissions. However, little is known about how applicants interpret and navigate disclosing DT within application materials. This study explores medical students' experiences with sharing DT and offers actionable recommendations for incorporating DT into equitable and holistic medical school selection processes. The authors conducted semistructured interviews with medical students from US medical schools. Participants were recruited through purposive sampling to ensure demographic and geographic diversity. The Social Ecological Model guided interviews to explore students' understanding of DT, disclosure decisions, and admissions process reflections. Interviews were conducted between October and December 2021. Transcribed interviews were qualitatively analyzed using interpretive description. In total, 31 medical students from 7 US medical schools were included in the study. Three major themes emerged: (1) navigating unclear expectations: students experienced confusion about what aspects of their DT to share, how much detail to include, and how their narratives would be received; (2) balancing vulnerability and perception: students feared being judged negatively or perceived as seeking pity, leading some to omit significant experiences; and (3) the burden of personal disclosure: students described discomfort with recounting personal hardships in a professional context, especially when unsure who would read or interpret their stories. Although DT disclosures can enrich holistic review, applicants face emotional, strategic, and informational barriers to sharing their lived experiences. 
Medical school admissions processes are encouraged to provide more precise guidance on DT, promote culturally responsive review practices, and support applicants through thoughtful design and implementation of narrative components. These changes can help ensure DT is equitably and meaningfully incorporated into applicant evaluations.
The Association of American Medical Colleges (AAMC) has urged actions to promote inclusivity in medical education for historically underrepresented groups. Studies suggest that underrepresentation may be addressed through structured mentorship and workshops at the pre-medical level and generate interest in less familiar specialties including physical medicine and rehabilitation (PM&R). The authors implemented a novel half-day multi-institutional workshop in spring 2024 featuring physicians and medical students from New York City medical centers. The workshop included structured mentorship and pertinent PM&R-related topics, including disability health and adaptive sports, to which pre-medical students often have limited exposure. Participants were invited to complete pre- and post-workshop surveys to evaluate the workshop's impact. The authors collaborated with pre-health chapters of organizations such as the American Medical Women's Association (AMWA) and the Minority Association of Pre-Medical Students (MAPS) to distribute surveys. The pre-workshop survey was completed by 147 pre-medical students, and 81 registered for the workshop. Ultimately, 55 attended the workshop, with 71% completing the post-workshop survey. Most participants had no previous exposure to PM&R (n = 77, 53.5%) and to adaptive sports (n = 116, 80.6%). Furthermore, although 76 (52.8%) had shadowed a physician, only 6 (4.2%) had experience shadowing a PM&R physician. The authors gathered a diverse representation of responses, including those identifying as female (n = 119, 81.0%) and underrepresented in medicine (n = 84, 57.1%). Pre-post comparative analysis showed a statistically significant rise in students' motivation to apply to medical school (P = .018), along with a statistically significant increase in interest in and understanding of PM&R and adaptive sports (P < .001). 
Future work will focus on developing longitudinal mentorship, improving accessibility through hybrid formats, and expanding the curriculum through interspecialty collaborations to support broader implementation and evaluation.
This study evaluated differences between expected and observed proportions of activities, average Step 2 scores, and numbers of abstracts, publications, and presentations since the adoption of a 10-experience maximum for residency applications and pass/fail Step 1 grading. The authors queried National Resident Matching Program data for MD senior applicants from 2016 to 2024 across 22 specialties. Data included average Step 2 scores; research, work, and volunteer experiences; and number of research products for matched and unmatched students by specialty. Repeated-measures multilevel models were used to estimate the difference between the observed outcomes in 2024 and the outcomes expected based on trends from 2016 to 2022. The difference between the 2024 predicted and observed values represented the estimated effects of pass/fail Step 1 grading and the 10-experience maximum. A total of 91,992 applicants were included in this analysis. In 2024, residency applicants reported higher average Step 2 scores (average observed, 249.62; average predicted, 248.48; b = 1.21; 95% CI, 0.39-2.03; P = .003) and a higher average number of research outputs (average observed, 10.30; average predicted, 9.21; b = 1.10; 95% CI, 0.61-1.59; P < .001) than predicted. The proportion of reported research experiences was significantly higher than expected (observed, 0.360; model predicted, 0.254; odds ratio [OR], 1.67; 95% CI, 1.58-1.75), whereas reported work (observed, 0.445; model predicted, 0.506; OR, 0.85; 95% CI, 0.80-0.89) and volunteer (observed, 0.194; model predicted, 0.221; OR, 0.73; 95% CI, 0.69-0.76) experiences were significantly lower than expected (P < .001 for all). These results suggest potential downstream effects of the Step 1 shift to pass/fail and the imposition of a 10-experience limit on residency applications, with an increase in scholarly output but a decrease in volunteer and work experiences. 
These findings challenge medical education leadership to evaluate whether these potential changes align with desired applicant qualities.
Learning analytics involve collecting, analyzing, and visualizing the digital 'footprints' that learners leave behind as they interact with digital learning environments. Learning analytics inform ongoing refinements to improve educational design and teaching practice. To date, research suggests that educators have not fully capitalized on learning analytics, despite the growing availability of data generated through digital learning platforms, programmatic assessment, and competency-based training. In this article, the authors adapted an established "learning analytics lifecycle" framework to evaluate a representative example of digital learning in health professions education (HPE): eduCAST, an online, video-based website for orthopaedic surgery training. The framework provided a stepwise approach for exploring the potential of learning analytics in eduCAST, identifying implementation gaps, and developing a transferable list of recommendations for educators in HPE. Demographic and engagement data from 141 registered eduCAST users, collected via website analytics, were analyzed. The findings revealed limitations and opportunities related to: (1) planning for the learning environment and its users, (2) the scope and specificity of available learning analytics data, (3) the purposeful use of data analysis techniques, and (4) the relative absence of educators' data-informed actions. Generic website analytics showed significant limitations: high website traffic did not correspond to meaningful learner engagement, nor did it provide useful data to inform ongoing website refinement. Indeed, these analytics produced more data mysteries than meaningful data stories. Consistent with prior reviews in the HPE literature, this worked example demonstrated that educators appear to make diminishing investments of effort and resources across the learning analytics lifecycle. 
The authors argue that HPE educators can benefit from using learning analytics frameworks to guide the design, implementation, evaluation, and long-term sustainability of digital learning environments. When thoughtfully collected, analyzed, and interpreted, learning analytics can enhance learners' experiences and outcomes, support educators' professional development, and inform continuous program refinement.
Medical knowledge is a critical competency for physicians. Assessment of medical knowledge distinguishes between acquisition and application. Studies have demonstrated the association in acquisition throughout training; however, few have assessed the association between acquisition and application to practice. The purpose of this study was to explore the relationship between medical knowledge acquisition and application across a nationally representative data set. This was a multi-institutional, multispecialty retrospective study of data from medical school graduates at 7 institutions entering residency training from 2016 to 2018. Medical knowledge acquisition was assessed using scores on United States Medical Licensing Examination (USMLE) Step 1 and Step 2 Clinical Knowledge. Medical knowledge application was assessed using Accreditation Council for Graduate Medical Education (ACGME) milestone ratings during internship. Mixed-effects regression was used to estimate the predictive association between USMLE performance and ACGME milestones, clustering specialty and program effects, using coefficients reflecting beta estimates. Data were analyzed from 3,430 medical school graduates. Failing USMLE Step 1 was associated with a higher risk of being rated as "not yet level 1" in patient care milestones at mid-year (OR = 4.19, P = .007), whereas failure of USMLE Step 2 was associated with the same in medical knowledge (OR = 7.56, P = .15). Higher USMLE scores were associated with higher milestone ratings, though the effect sizes were very small (standardized coefficient range, .01-.19). There were no significant associations between USMLE scores and end-of-year milestone ratings in any analysis. Previous data suggest USMLE scores provide value in identifying future standardized examination (ie, medical knowledge acquisition) performance. These findings provide less compelling evidence for using USMLE scores to predict the application of knowledge in practice. 
USMLE pass/fail status is the most predictive metric of medical knowledge application and may be useful in creating targeted remediation plans.
The United States is facing a primary care workforce crisis that is impacting the overall health of the population. The University of Washington School of Medicine's Rural Underserved Opportunities Program (RUOP) is a 4-week primary care preclerkship rotation. This study examined whether early and immersive exposure to primary care is independently associated with a career choice in primary care overall and in family medicine as a distinct specialty. This retrospective cohort study included 2,234 students who matriculated at the University of Washington School of Medicine between 2003 and 2015. Program records were linked to American Medical College Application Service data, a matriculation career-preference survey, and the American Medical Association Physician Masterfile. The outcomes were practice in primary care (family medicine, internal medicine, and pediatrics) and in family medicine; the exposure was RUOP participation. Covariates included age; sex; parental educational level; rural upbringing; Washington, Wyoming, Alaska, Montana, or Idaho origin; and 17 baseline specialty or practice setting interests. Multivariable logistic regression generated adjusted odds ratios (ORs) and 95% confidence intervals (CIs). Of the 2,234 graduates, 1,104 (49.4%) completed RUOP. Among RUOP participants, 533 (48.2%) were practicing primary care compared with 328 of the 1,130 non-RUOP graduates (29.3%). In multivariable regression, RUOP participation was associated with higher odds of practicing primary care (OR, 1.39; 95% CI, 1.10-1.76; P = .01) and of practicing family medicine (OR, 1.54; 95% CI, 1.11-2.14; P = .01). RUOP was associated with higher odds of practicing both primary care and family medicine. Although confounding factors cannot be entirely excluded, these findings suggest that short, preclerkship primary care immersions may support students' trajectories toward primary care careers. 
Medical schools aiming to strengthen the primary care pathway should consider incorporating similar experiences into their curricula.