What is programming plagiarism? Simply put, it is using another person’s source code and claiming it as your own. Programming plagiarism has been around since at least the 1990s, when MOSS (Measure of Software Similarity) was developed to check for plagiarism in programming assignments. In this post, we investigate not only the increase in plagiarism in academic programming coursework but also the reasons for that increase and what educators can do to mitigate it.
According to a 1998 Wired article entitled "Catching Computer Science Cheaters," “Computer science instructors have guessed that on any given assignment, between 5 and 20 percent of the students have collaborated ‘beyond what is reasonable.’”
A more recent article in The New York Times, "As Computer Coding Classes Swell, So Does Cheating," published in 2017, confirms that programming plagiarism remains an ongoing academic misconduct issue on campuses. Bidgood and Merrill state, “At Brown University, more than half the 49 allegations of academic code violations last year involved cheating in computer science. At Stanford, the alma mater of the founders of Google, Snapchat, and countless other internet wonders, as many as 20 percent of the students in one 2015 computer science course were flagged for possible cheating.”
The Ambiguity of Collaboration
Look at any piece of software available for download or purchase, and more likely than not it was built by more than one person. The code within may also include open-source components: freely available source code that anyone can use, with or without credit, depending on the attached license. In the world of software development, collaboration is common and encouraged in ways that are likely unacceptable in a typical academic class.
In academic coursework, programming projects are often meant to be completed by individuals for individual assessment. Any borrowed code must be attributed, and in many cases the work must be entirely original to the student being assessed. The reasons for this difference may be obvious: in assignments, students are demonstrating their knowledge of concepts to instructors and doing so with original ideas, not building something to sell.
Solution Availability
In the world of open-source software, code is open for reuse. Even though licenses and permissions for reuse differ, students who are struggling with coursework may be tempted to use open-source code in their assignments. Open-source software resides in online code repositories like GitHub, which may also host solutions to assignments posted by students who previously took the course, according to The New York Times. Another resource that students sometimes misuse is Stack Overflow, a question-and-answer site used by new programmers and experienced professionals alike.
For these reasons and more, programming coursework doesn’t match the more collaborative nature of industry approaches to programming. This muddies the waters around student collusion and programming plagiarism, not least because students have access to a large number of resources meant for industry software development. According to the research article "Eliminating Plagiarism in Programming Courses through Assessment Design," “Students tend to plagiarize if solutions to assignment [sic] can be easily obtained from Internet [sic] or similar sources” (Ngo, 2016, p. 873).
What can educators do to clarify what constitutes plagiarism in programming and to prevent it?
- Make clear the rules on academic integrity, including defining collaboration versus collusion. Define upfront the lines that students must not cross. Research, as expressed in "Collaboration Versus Cheating," has shown that “Explaining and reinforcing lessons in academic honesty results in statistically significant (p < 0.05) decreases in plagiarism rates across all of the distinctive programming submissions in a large online graduate computer science course” (Mason, Gavrilovska, & Joyner, 2019).
- Emphasize policies on using outside code. If you allow the use of outside code and/or have a restricted source list, ask students to add comments in their code marking any areas borrowed from outside sources--in other words, ask students to cite their sources.
- Brainstorm and come up with original and unique code. Model original thinking and design assessments that support the process for developing original code.
- Provide scaffolding for original code by setting intermediate due dates with feedback to encourage student learning and to help those who are struggling. Some instructors may use Git or another version control tool so students can commit work in segments over a longer period. Intermediate due dates also increase transparency into student work and help students feel seen and thus be less prone to academic misconduct.
- Use software that reviews for similarity, which can act as a deterrent for misconduct. Additionally, it can be used in conjunction with explicit instruction and feedback to reinforce academic integrity in programming.
- Invest in assessment design using item analysis and assessment with integrity so that assignments contribute to student learning and so students understand the value of assessment and formative feedback. According to "Eliminating Plagiarism in Programming Courses through Assessment Design," “In the domain of programming, learning from worked examples is especially useful where students learn to interpret existing source code and modify it to their needs. The key to learning from examples is that we need to make sure that students understand the examples, not just copy the source code for the sake of completing assignments. As such, in our design strategy, code skeletons are developed for each assignment to prevent students from copying without understanding. In-class assessments are then designed based on the assignment content to test students’ understanding and ability to modify their source code to meet new program requirements. The initial results and feedback from students show potential benefit of our design method in improving students’ understanding and performance. More importantly, it eliminates instructors’ time and effort in detecting plagiarism” (Ngo, 2016, p. 878).
- Nurture student-instructor feedback loops and respect. Research has shown that strong student-instructor relationships mitigate instances of plagiarism. Computer science is no exception. "Plagiarism and Programming: A Survey of Student Attitudes" summarized research focused on student perceptions of programming plagiarism and the prevention of programming plagiarism. The paper concludes, “This study did find that education about what constitutes academic dishonest behavior for graded programming assignments does make a difference in student perceptions, educators need to be diligent about clearly outlining what is acceptable and what is not acceptable as well as constantly reminding students about course policies as they relate to academic dishonesty. This is in line with the findings of Simkin and McLeod (2010) that the presence of an ethical faculty member with opinions that students respected was one reason students chose not to cheat” (Aasheim, Rutner, Li, & Williams, 2012, p. 307). Explicit instruction around academic integrity in programming is a necessary step.
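To make the similarity-review suggestion in the list above a little more concrete, here is a minimal Python sketch of one way such a check could work. It is an illustration only and does not describe how MOSS or any particular product works. The function names (normalized_tokens, shingles, similarity) and the window size k=4 are assumptions chosen for this example. The idea is to strip comments and layout, mask identifiers and literals so that simple renaming does not hide a copy, and then compare overlapping runs of tokens.

```python
import io
import keyword
import tokenize

# Token types that carry layout or commentary rather than program logic.
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source: str) -> list[str]:
    """Tokenize Python source, dropping comments and layout and masking
    identifiers and literals (hypothetical helper for this sketch)."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")    # any variable or function name
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")   # any numeric literal
        elif tok.type == tokenize.STRING:
            tokens.append("STR")   # any string literal
        else:
            tokens.append(tok.string)  # keywords and operators as written
    return tokens


def shingles(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of every run of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(source_a: str, source_b: str, k: int = 4) -> float:
    """Jaccard similarity (0.0 to 1.0) between the shingle sets of two files."""
    a = shingles(normalized_tokens(source_a), k)
    b = shingles(normalized_tokens(source_b), k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = (
        "def total(values):\n"
        "    result = 0\n"
        "    for v in values:\n"
        "        result += v\n"
        "    return result\n"
    )
    renamed = (
        "def add_all(numbers):\n"
        "    # sum the list\n"
        "    acc = 0\n"
        "    for n in numbers:\n"
        "        acc += n\n"
        "    return acc\n"
    )
    print(f"similarity: {similarity(original, renamed):.2f}")
```

In this sketch, a submission that only renames variables and adds comments scores the same as the original. A high score is best treated as a prompt for human review and a conversation with the student, not as proof of misconduct, since short assignments built from a shared code skeleton will naturally look alike.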
Unfettered digital access to online source code makes programming coursework particularly vulnerable to academic misconduct. Source code is plentiful and easily accessible to overwhelmed and stressed students. And the collaborative camaraderie of the software industry--ideal for building software--can be misleading for students being assessed for individual performance.
What we as educators can do is help students understand the purpose of learning and the guiding principles of academic integrity so that they can be prepared for careers with original ideas, original code, and a clear understanding of true collaboration.