Extracting key concepts and their relationships from course information and materials facilitates the provision of visualizations and recommendations for learners who need to select the right courses to take from a large number of courses. However, identifying and extracting themes manually is labor-intensive and time-consuming. Previous machine learning-based methods to extract relevant concepts from courses heavily rely on detailed course materials, which necessitates labor-intensive preparation of course materials. This paper investigates the potential of LLMs such as GPT in automatically generating course concepts and their relations. Specifically, we design a suite of prompts and provide GPT with the course information with different levels of detail, thereby generating high-quality course concepts and identifying their relations. Furthermore, we comprehensively evaluate the quality of the generated concepts and relationships through extensive experiments. Our results demonstrate the viability of LLMs as a tool for supporting educational content selection and delivery.
Research suggests that interacting with more peers about physics course material is correlated with higher student performance. Some studies, however, have demonstrated that different topics of peer interactions may correlate with their performance in different ways, or possibly not at all. In this study, we probe both the peers with whom students interact about their physics course and the particular aspects of the course material about which they interacted in six different introductory physics courses: four lecture courses and two lab courses. Drawing on methods in social network analysis, we replicate prior work demonstrating that, on average, students who interact with more peers in their physics courses have higher final course grades. Expanding on this result, we find that students discuss a wide range of aspects of course material with their peers: concepts, small-group work, assessments, lecture, and homework. We observe that in the lecture courses, interacting with peers about concepts is most strongly correlated with final course grade, with smaller correlations also arising for small-group work and homework. In the lab courses, on the other hand, small-group work is the
Large Language Models (LLMs) have become part of how students solve programming tasks, offering immediate explanations and even full solutions. Previous work has highlighted that novice programmers often heavily rely on LLMs, thereby neglecting their own problem-solving skills. To address this challenge, we designed a course-specific online Python tutor that provides retrieval-augmented, course-aligned guidance without generating complete solutions. The tutor integrates a web-based programming environment with a conversational agent that offers hints, Socratic questions, and explanations grounded in course materials. Students used the system during self-study to work on homework assignments, and the tutor also supported questions about the broader course material. We collected structured student feedback and analyzed interaction logs to investigate how they engaged with the tutor's guidance. We observed that students used the tutor primarily for conceptual understanding, implementation guidance, and debugging, and perceived it as a course-aligned, context-aware learning support that encourages engagement rather than direct solution copying.
Current course recommender systems primarily leverage learner-course interactions, course content, learner preferences, and supplementary course details like instructor, institution, ratings, and reviews, to make their recommendation. However, these systems often overlook a critical aspect: the evolving skill demand of the job market. This paper focuses on the perspective of academic researchers, working in collaboration with the industry, aiming to develop a course recommender system that incorporates job market skill demands. In light of the job market's rapid changes and the current state of research in course recommender systems, we outline essential properties for course recommender systems to address these demands effectively, including explainable, sequential, unsupervised, and aligned with the job market and user's goals. Our discussion extends to the challenges and research questions this objective entails, including unsupervised skill extraction from job listings, course descriptions, and resumes, as well as predicting recommendations that align with learner objectives and the job market and designing metrics to evaluate this alignment. Furthermore, we introduce an initia
Increasingly, courses on Empirical Software Engineering research methods are being offered in higher education institutes across the world, mostly at the M.Sc. and Ph.D. levels. While the need for such courses is evident and in line with modern software engineering curricula, educators designing and implementing such courses have so far been reinventing the wheel; every course is designed from scratch with little to no reuse of ideas or content across the community. Due to the nature of the topic, it is rather difficult to get it right the first time when defining the learning objectives, selecting the material, compiling a reader, and, more importantly, designing relevant and appropriate practical work. This leads to substantial effort (through numerous iterations) and poses risks to the course quality. This chapter attempts to support educators in the first and most crucial step in their course design: creating the syllabus. It does so by consolidating the collective experience of the authors as well as of members of the Empirical Software Engineering community; the latter was mined through two working sessions and an online survey. Specifically, it offers a list of the fundament
Academic choice is crucial in U.S. undergraduate education, allowing students significant freedom in course selection. However, navigating the complex academic environment is challenging due to limited information, guidance, and an overwhelming number of choices, compounded by time restrictions and the high demand for popular courses. Although career counselors exist, their numbers are insufficient, and course recommendation systems, though personalized, often lack insight into student perceptions and explanations to assess course relevance. In this paper, a deep learning-based concept extraction model is developed to efficiently extract relevant concepts from course descriptions to improve the recommendation process. Using this model, the study examines the effects of skill-based explanations within a serendipitous recommendation framework, tested through the AskOski system at the University of California, Berkeley. The findings indicate that these explanations not only increase user interest, particularly in courses with high unexpectedness, but also bolster decision-making confidence. This underscores the importance of integrating skill-related data and explanations into educati
Visual Question Generation (VQG) research focuses predominantly on natural images while neglecting the diagram, which is a critical component in educational materials. To meet the needs of pedagogical assessment, we propose the Diagram-Driven Course Questions Generation (DDCQG) task and construct DiagramQG, a comprehensive dataset with 15,720 diagrams and 25,798 questions across 37 subjects and 371 courses. Our approach employs course and input text constraints to generate course-relevant questions about specific diagram elements. We reveal three challenges of DDCQG: domain-specific knowledge requirements across courses, long-tail distribution in course coverage, and high information density in diagrams. To address these, we propose the Hierarchical Knowledge Integration framework (HKI-DDCQG), which utilizes trainable CLIP for identifying relevant diagram patches, leverages frozen vision-language models for knowledge extraction, and generates questions with trainable T5. Experiments demonstrate that HKI-DDCQG outperforms existing models on DiagramQG while maintaining strong generalizability across natural image datasets, establishing a strong baseline for DDCQG.
The integration of AI tools into programming education has become increasingly prevalent in recent years, transforming the way programming is taught and learned. This paper provides a review of the state-of-the-art AI tools available for teaching and learning programming, particularly in the context of introductory courses. It highlights the challenges on course design, learning objectives, course delivery and formative and summative assessment, as well as the misuse of such tools by the students. We discuss ways of re-designing an existing course, re-shaping assignments and pedagogy to address the current AI technologies challenges. This example can serve as a guideline for policies for institutions and teachers involved in teaching programming, aiming to maximize the benefits of AI tools while addressing the associated challenges and concerns.
Curriculum Analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. One desirable property of courses within curricula is that they are not unexpectedly more difficult for students of different backgrounds. While prior work points to likely variations in course difficulty across student groups, robust methodologies for capturing such variations are scarce, and existing approaches do not adequately decouple course-specific difficulty from students' general performance levels. The present study introduces Differential Course Functioning (DCF) as an Item Response Theory (IRT)-based CA methodology. DCF controls for student performance levels and examines whether significant differences exist in how distinct student groups succeed in a given course. Leveraging data from over 20,000 students at a large public university, we demonstrate DCF's ability to detect inequities in undergraduate course difficulty across student groups described by grade achievement. We compare major pairs with high co-enrollment and transfer students to their non-transfer peers. For the former, our findings suggest a link between DCF effect sizes and the alignmen
In 2020, I designed the course CMSC 20630/30630 Human-Robot Interaction: Research and Practice as a hands-on introduction to human-robot interaction (HRI) research for both undergraduate and graduate students at the University of Chicago. Since 2020, I have taught and refined this course each academic year. Human-Robot Interaction: Research and Practice focuses on the core concepts and cutting-edge research in the field of human-robot interaction (HRI), covering topics that include: nonverbal robot behavior, verbal robot behavior, social dynamics, norms & ethics, collaboration & learning, group interactions, applications, and future challenges of HRI. Course meetings involve students in the class leading discussions about cutting-edge peer-reviewed research HRI publications. Students also participate in a quarter-long collaborative research project, where they pursue an HRI research question that often involves conducing their own human-subjects research study where they recruit human subjects to interact with a robot. In this paper, I detail the structure of the course and its learning goals as well as my reflections and student feedback on the course.
General-education college astronomy courses offer instructors both a unique audience and a unique challenge. For many students, such a course may be their first time encountering a standalone astronomy class, and it is also likely one of the last science courses they will take. Thus, in a single semester, primary course goals often include both imparting knowledge about the Universe and giving students some familiarity with the processes of science. In traditional course environments, students often compartmentalize information into separate "life files" and "course files" rather than integrating information into a coherent framework. The astronomy course created through this project, taught at the University of Arizona in Spring 2019, was designed around inclusivity-driven guiding principles that help students engage with course content in ways that are meaningful, relevant, and accessible. Our course bridges the gap between students' "life" and "course files", encourages and respects diverse points of view, and empowers students to connect course content with their personal lives and identities. In this paper, we provide insight into the guiding principles that informed our cours
The study in interaction patterns between students in on-campus and MOOC-style online courses has been broadly studied for the last 11 years. Yet there remains a gap in the literature comparing the habits of students completing the same course offered in both on-campus and MOOC-style online formats. This study will look at browser-based usage patterns for students in the Georgia Tech CS1301 edx course for both the online course offered to on-campus students and the MOOCstyle course offered to anyone to determine what, if any, patterns exist between the two cohorts.
The accurate estimation of students' grades in future courses is important as it can inform the selection of next term's courses and create personalized degree pathways to facilitate successful and timely graduation. This paper presents future-course grade predictions methods based on sparse linear models and low-rank matrix factorizations that are specific to each course or student-course tuple. These methods identify the predictive subsets of prior courses on a course-by-course basis and better address problems associated with the not-missing-at-random nature of the student-course historical grade data. The methods were evaluated on a dataset obtained from the University of Minnesota. This evaluation showed that the course-specific models outperformed various competing schemes with the best performing scheme achieving an RMSE across the different courses of 0.632 vs 0.661 for the best competing method.
Background: Researchers in developing countries often do not have access to training on research writing. The purpose of this study was to test whether researchers in Rwanda might complete and benefit from a pilot online course in research writing. Methods: The pilot course was set up on Moodle, an open-source online learning environment, and facilitated by the author. The lessons and assignment were spread over six weeks, followed by a two-week extension period. Twenty-eight faculty members of the National University of Rwanda enrolled themselves in the course. Results: Twenty-five of the 28 learners completed the course. After the course, these learners expressed high satisfaction, e.g, 24 of them felt that they were ready to write a research paper for publication. Conclusion: The high completion rate (89%) is noteworthy for two reasons: e-learning courses tend to have lower completion rates than classroom courses, and 76% of the learners in the pilot course had not taken an e-learning course before. This result and the positive feedback indicate that online courses can benefit researchers in developing countries who may not have access to classroom courses on research writing.
Curriculum analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. An essential aspect is studying course properties, which involves assigning each course a representative difficulty value. This is critical for several aspects of CA, such as quality control (e.g., monitoring variations over time), course comparisons (e.g., articulation), and course recommendation (e.g., advising). Measuring course difficulty requires careful consideration of multiple factors: First, when difficulty measures are sensitive to the performance level of enrolled students, it can bias interpretations by overlooking student diversity. By assessing difficulty independently of enrolled students' performances, we can reduce the risk of bias and enable fair, representative assessments of difficulty. Second, from a measurement theoretic perspective, the measurement must be reliable and valid to provide a robust basis for subsequent analyses. Third, difficulty measures should account for covariates, such as the characteristics of individual students within a diverse populations (e.g., transfer status). In recent years, various notions of difficulty have been p
In this paper I describe the design of an introductory course in Human-Robot Interaction. This project-driven course is designed to introduce undergraduate and graduate engineering students, especially those enrolled in Computer Science, Mechanical Engineering, and Robotics degree programs, to key theories and methods used in the field of Human-Robot Interaction that they would otherwise be unlikely to see in those degree programs. To achieve this aim, the course takes students all the way from stakeholder analysis to empirical evaluation, covering and integrating key Qualitative, Design, Computational, and Quantitative methods along the way. I detail the goals, audience, and format of the course, and provide a detailed walkthrough of the course syllabus.
The risk of harmful content generated by large language models (LLMs) becomes a critical concern. This paper presents a systematic study on assessing and improving LLMs' capability to perform the task of \textbf{course-correction}, \ie, the model can steer away from generating harmful content autonomously. To start with, we introduce the \textsc{C$^2$-Eval} benchmark for quantitative assessment and analyze 10 popular LLMs, revealing varying proficiency of current safety-tuned LLMs in course-correction. To improve, we propose fine-tuning LLMs with preference learning, emphasizing the preference for timely course-correction. Using an automated pipeline, we create \textsc{C$^2$-Syn}, a synthetic dataset with 750K pairwise preferences, to teach models the concept of timely course-correction through data-driven preference learning. Experiments on 2 LLMs, \textsc{Llama2-Chat 7B} and \textsc{Qwen2 7B}, show that our method effectively enhances course-correction skills without affecting general performance. Additionally, it effectively improves LLMs' safety, particularly in resisting jailbreak attacks.
Course load analytics (CLA) inferred from LMS and enrollment features can offer a more accurate representation of course workload to students than credit hours and potentially aid in their course selection decisions. In this study, we produce and evaluate the first machine-learned predictions of student course load ratings and generalize our model to the full 10,000 course catalog of a large public university. We then retrospectively analyze longitudinal differences in the semester load of student course selections throughout their degree. CLA by semester shows that a student's first semester at the university is among their highest load semesters, as opposed to a credit hour-based analysis, which would indicate it is among their lowest. Investigating what role predicted course load may play in program retention, we find that students who maintain a semester load that is low as measured by credit hours but high as measured by CLA are more likely to leave their program of study. This discrepancy in course load is particularly pertinent in STEM and associated with high prerequisite courses. Our findings have implications for academic advising, institutional handling of the freshman e
Students' decisions on whether to take a class are strongly affected by whether their friends plan to take the class with them. A student may prefer to be assigned to a course they likes less, just to be with their friends, rather than taking a more preferred class alone. It has been shown that taking classes with friends positively affects academic performance. Thus, academic institutes should prioritize friendship relations when assigning course seats. The introduction of friendship relations results in several non-trivial changes to current course allocation methods. This paper explores how course allocation mechanisms can account for friendships between students and provide a unique, distributed solution. In particular, we model the problem as an asymmetric distributed constraint optimization problem and develop a new dedicated algorithm. Our extensive evaluation includes both simulated data and data derived from a user study on 177 students' preferences over courses and friends. The results show that our algorithm obtains high utility for the students while keeping the solution fair and observing courses' seat capacity limitations.
University students routinely use the tools provided by online course ranking forums to share and discuss their satisfaction with the quality of instruction and content in a wide variety of courses. Student perception of the efficacy of pedagogies employed in a course is a reflection of a multitude of decisions by professors, instructional designers and university administrators. This complexity has motivated a large body of research on the utility, reliability, and behavioral correlates of course rankings. There is, however, little investigation of the (potential) implicit student bias on these forums towards desirable course outcomes at the institution level. To that end, we examine the connection between course outcomes (student-reported GPA) and the overall ranking of the primary course instructor, as well as rating disparity by nature of course outcomes, based on data from two popular academic rating forums. Our experiments with ranking data about over ten thousand courses taught at Virginia Tech and its 25 SCHEV-approved peer institutions indicate that there is a discernible albeit complex bias towards course outcomes in the professor ratings registered by students.