Detection of Copy-Pasting in Online Examinations through Machine Intelligence


In the wake of the Covid-19 pandemic, educational institutions worldwide have been forced to adapt, with many shifting to home schooling and alternative examination methods. One resulting challenge is ensuring integrity during online exams, a task that traditionally relies on continuous supervision by educational staff. A machine learning-based analytical pipeline offers a solution that combines several techniques to detect collusion and plagiarism.

The data-driven approach of this plagiarism detection system aims to prevent collusive behaviour without resorting to permanent, privacy-infringing surveillance. It does so by estimating an exam vector for each exam, which depends on the underlying structure of the questions, and then applying various methods for automated plagiarism detection. Setting the number of components (dimensions) to 25 retains 98.76% of the original variance, which yields a more compact vector for each exam.
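The trade-off between the number of components and the retained variance can be sketched as follows. This is a minimal illustration, not the pipeline itself: it assumes the exam vectors were decomposed by a method that yields one variance value per component, and the function names and sample numbers are hypothetical.

```python
def cumulative_explained_variance(component_variances):
    """Cumulative share of total variance kept by the first k components."""
    ordered = sorted(component_variances, reverse=True)
    total = sum(ordered)
    shares, running = [], 0.0
    for variance in ordered:
        running += variance
        shares.append(running / total)
    return shares


def components_needed(component_variances, target):
    """Smallest number of components whose cumulative share reaches `target` (0 to 1)."""
    shares = cumulative_explained_variance(component_variances)
    for k, share in enumerate(shares, start=1):
        if share >= target:
            return k
    return len(shares)


# Hypothetical per-component variances, not taken from the article.
variances = [50.0, 30.0, 15.0, 4.0, 1.0]
shares = cumulative_explained_variance(variances)
k = components_needed(variances, 0.9)
print(k, shares)
```

In the same way, keeping 25 components in the article's setting corresponds to a cumulative share of 0.9876.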

This compact vector is then used to calculate the pairwise cosine similarity of all exams, and the results are stored in a data frame that acts as a similarity matrix. The algorithm identifies exams with a similarity score of 1 (completely identical) or close to 1 (very similar). The final step in identifying collusive behaviour is sorting the pairwise exam similarities in descending order.
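The pairwise comparison and ranking step could look roughly like this. It is a sketch in plain Python rather than the data frame implementation mentioned above; the exam identifiers and vectors are made up for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def ranked_pairs(exam_vectors):
    """All exam pairs with their similarity, most similar first."""
    pairs = [
        (id_a, id_b, cosine_similarity(exam_vectors[id_a], exam_vectors[id_b]))
        for id_a, id_b in combinations(sorted(exam_vectors), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical compact exam vectors.
exam_vectors = {
    "exam_a": [1.0, 2.0, 0.0],
    "exam_b": [2.0, 4.0, 0.0],  # same direction as exam_a
    "exam_c": [0.0, 0.0, 3.0],  # orthogonal to both
}
ranking = ranked_pairs(exam_vectors)
for id_a, id_b, similarity in ranking:
    print(id_a, id_b, round(similarity, 3))
```

Pairs at the top of the ranking are the ones a reviewer would inspect first.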

The system also employs several other strategies to prevent collusion among students during exams. These include generating individual assignments for each student and using rubric-based tools to evaluate student solutions. When identifying plagiarism, a pair of poor or average exams with a similarity score of almost +1 is considered highly suspicious, because two students are unlikely to make the same mistakes independently.
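One way to express the "two poor or average exams with a similarity near +1" rule is sketched below. The thresholds, the score scale (0 to 1), and the function name are assumptions chosen for illustration; the article does not state the cut-offs it uses.

```python
def flag_suspicious(pair_similarities, scores, sim_threshold=0.95, score_threshold=0.6):
    """Return pairs that are nearly identical AND where both exams scored
    at or below `score_threshold` (hypothetical cut-offs)."""
    flagged = []
    for id_a, id_b, similarity in pair_similarities:
        both_weak = (
            scores[id_a] <= score_threshold and scores[id_b] <= score_threshold
        )
        if similarity >= sim_threshold and both_weak:
            flagged.append((id_a, id_b, similarity))
    return sorted(flagged, key=lambda pair: pair[2], reverse=True)


# Hypothetical inputs: (exam, exam, cosine similarity) and normalised exam scores.
pair_similarities = [
    ("s1", "s2", 0.99),  # near-identical, both weak: suspicious
    ("s3", "s4", 0.98),  # near-identical, but both strong: not flagged
    ("s1", "s3", 0.40),  # dissimilar
]
scores = {"s1": 0.45, "s2": 0.50, "s3": 0.95, "s4": 0.97}
flagged = flag_suspicious(pair_similarities, scores)
print(flagged)
```

High-scoring pairs are left out here because correct answers naturally resemble each other; a flagged pair is a prompt for manual review, not proof of collusion.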

In addition to plagiarism detection, the pipeline incorporates behavioural and facial recognition monitoring. Machine learning models analyse exam-taker behaviour and facial cues in real time to identify suspicious activity, such as the presence of unauthorised people or devices, which helps detect collusion.

Secure browsers like Proctortrack's PEBble are another tool used to prevent cheating or collaborative answer-sharing. These browsers prevent students from opening new tabs, running external applications, or accessing AI tools during the exam.

The system also includes continuous data monitoring and anomaly detection. Algorithms monitor vast amounts of data from exam sessions, learning normal patterns and identifying anomalies indicative of fraud. This includes irregular response timings, unusual answer similarities across candidates, or inconsistent typing and navigation behaviours associated with plagiarism or collusion.
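As an illustration of the response-timing check, the sketch below flags candidates whose answer time deviates strongly from the group. A simple z-score rule is swapped in here because the article does not say which anomaly detector is used; the threshold and the sample timings are hypothetical.

```python
import statistics


def timing_outliers(response_seconds, z_threshold=2.0):
    """Return candidate ids whose response time lies more than `z_threshold`
    population standard deviations away from the group mean."""
    values = list(response_seconds.values())
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # everyone took exactly the same time; nothing stands out
    return [
        candidate
        for candidate, seconds in response_seconds.items()
        if abs(seconds - mean) / spread > z_threshold
    ]


# Hypothetical seconds spent on one question per candidate.
response_seconds = {
    "s1": 60, "s2": 62, "s3": 58, "s4": 61, "s5": 59,
    "s6": 5,  # answered implausibly fast
}
outliers = timing_outliers(response_seconds)
print(outliers)
```

A production system would likely combine several such signals (timing, answer similarity, navigation behaviour) rather than rely on one threshold.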

Finally, the machine learning systems adapt over time, refining their detection logic from new data to stay ahead of emerging cheating tactics, such as AI-assisted answer generation or deepfake identity fraud.

In summary, the machine learning-based pipeline combats online exam fraud in four ways: it locks down test environments to limit cheating opportunities, uses AI-powered behavioural analytics and facial recognition to detect suspicious conduct, continuously analyses data patterns to identify collusion and plagiarism, and adapts its detection methods to emerging fraud trends and AI-assisted cheating tactics. This multifaceted approach is designed to protect exam integrity while scaling to large online testing environments.

The machine learning pipeline uses technology to combat collusion and plagiarism during online exams, a challenge exacerbated by the shift in learning environments due to the Covid-19 pandemic. Its core is a plagiarism detection system that does not require intrusive surveillance and relies instead on data-driven methods and machine learning algorithms for analysis.
