Needles in a Haystack: Student Struggles with Working on Large Code Bases
This program is tentative and subject to change.
Background and Context. Prior work has explicitly called on undergraduate computer science (CS) programs to better prepare students for the demands of professional software development. While students in university courses primarily work on programming projects from scratch (greenfield development) and create small coding projects, professional developers in industry are expected to comprehend and modify a large, existing code base (brownfield development). As a result, CS graduates enter the workforce with little to no instruction on how to comprehend and modify a large code base.
Objectives. We aim to identify the variety of struggles that final-year undergraduate students experience when comprehending and modifying a large code base so that software engineering instruction can adapt to address students’ needs.
Methods. We conducted a think-aloud protocol with 13 undergraduates in their final year of a CS degree at a public, four-year university in North America. In the protocol, students modified an existing feature in an open-source code base with roughly 60,000 lines of code. Using Information Foraging Theory to analyze students’ code navigation and program comprehension processes, we identified ineffective student behaviors related to their thought processes and comprehension strategies.
Findings. We found a variety of ineffective behaviors among students, which we categorized into four concrete struggles. Students were unable to 1) effectively use documentation to get started on the task, 2) use a methodical, structurally-guided comprehension process, 3) find all parts of the relevant code, and 4) abandon irrelevant lines of reasoning.
Implications. Our study not only shows that novice students experience some of the same struggles as professional developers, but also identifies struggles that are unique to students with limited experience working on large code bases, such as relying on opportunistic search strategies and over-investigating irrelevant code. We suggest pedagogical recommendations to address these struggles, such as explicitly teaching students about code comprehension techniques for large code bases and designing tasks in which students find, comprehend, and modify code across multiple files in a code base.
Mon 4 Aug
Displayed time zone: Eastern Time (US & Canada)
09:15 - 10:30
09:15 | 25m Talk | Understanding and Improving Student Note-Taking in Live Coding Lectures | Research Papers | Daniel Manesh Virginia Tech, Tong Wu Virginia Tech, Yan Chen Virginia Tech, USA, Sang Won Lee Virginia Polytechnic Institute and State University
09:40 | 25m Talk | Do CS Undergraduates Show Evidence of a Security Mindset without Formal Coursework? An Exploratory Qualitative Study | Research Papers | Michelle Jensen University of Wisconsin - Madison, Matthew Berland University of Wisconsin - Madison, Rahul Chatterjee University of Wisconsin-Madison
10:05 | 25m Talk | Needles in a Haystack: Student Struggles with Working on Large Code Bases | Research Papers | Anshul Shah University of California, San Diego, Thomas Rexin University of California, San Diego, Anya Chernova University of California, San Diego, Gonzalo Allen-Perez University of California, San Diego, William G. Griswold University of California San Diego, Gerald Soosairaj University of California, San Diego