1) Should we switch to alternative examination formats so AI can only help (not short-circuit learning)?
Yes – and many assessments should. Authentic, applied, and oral formats can reduce opportunities for low-effort outsourcing and invite productive AI use (e.g., iterative drafts, data-informed reflection, viva/orals, in-class builds). But three practical tests matter:
AI-resilience: Does the task make low-effort AI use unhelpful? (e.g., staged deliverables, oral defenses, traceable process) [1]
Feasibility: Can teachers run it efficiently at scale without adding substantial workload compared to a writing task?
Learning-goals parity: Does it still build the same outcomes as academic writing – literature navigation, evidence selection, argumentation, synthesis, citation, and voice?
In practice, few single formats deliver all three. That’s why a balanced portfolio – and transparent use of AI – often beats a wholesale swap. Meanwhile, we must recognize a stubborn fact: post-hoc AI detection, whether by humans or by AI tools, is unreliable in isolation, i.e., without evidence from the writing process and its version history. In a rigorous PLOS ONE field study, 94% of injected AI submissions went undetected and often scored well. Assessment design and process evidence matter more than ever, but on their own they cannot resolve the tension between generative AI and academic integrity. [2]
2) Why not drop take-home writing entirely and let AI do it?
Because we would lose a near-unique learning engine. Decades of research show that writing to learn cultivates critical thinking more effectively than non-writing tasks; students who write routinely demonstrate larger gains in analysis and reasoning. Writing externalizes thought, forces evidence selection, connects sources, and strengthens conceptual “wiring” through revision. [3]
If we remove writing, we remove the slow, effortful cognition where much of the learning happens. The answer isn’t to ban AI or abandon essays – it’s to make the process visible and to reward the thinking that writing requires: staged outlines, annotated sources, version history, and/or short oral defenses tied to the submitted text.
3) Does Mentafy’s process-forensics approach really work – and can’t students just circumvent it?
Digital forensics is not new; it’s a mature research area we apply to authorship, contract cheating, and plagiarism. Process-level traces – draft evolution, timing patterns, and file metadata – offer evidence that text was composed rather than pasted or transcribed. Clare Johnson’s doctoral work synthesizes this field and shows how process signals can indicate genuine authorship and learning effort. [4]
Mentafy builds on that foundation. Our system analyzes version history and other minimally invasive artifacts (no keylogging) to differentiate self-produced writing (many small revisions, natural pauses, local edits) from shortcut behaviors (large block insertions, implausible timing, flat revision trails). We’ve tuned our models on 10,000+ validated documents to maximize sensitivity to human writing patterns. And because detectors of final text alone are brittle (false positives/negatives, bias concerns), we focus on how a text comes to be, not just what it says – complementing teacher judgment and alternative assessments. [1]
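Our actual feature set and models are proprietary and learned from data, so the following is only a minimal, hypothetical Python sketch of the kind of process signals described above: revision sizes, inter-revision timing, and the share of small local edits. The `Revision` record, the thresholds, and the flag wording are illustrative assumptions for this example, not our production logic.

```python
from dataclasses import dataclass

@dataclass
class Revision:
    timestamp: float   # seconds since the writing session started
    chars_added: int   # net characters inserted in this revision
    edit_span: int     # size of the text region touched by the edit

def flag_shortcut_behaviors(revisions: list[Revision],
                            block_threshold: int = 800,
                            min_gap_seconds: float = 5.0) -> list[str]:
    """Return human-readable flags for revision patterns that warrant review.

    Illustrative heuristics only: large block insertions appearing almost
    instantly (paste-like behavior) and a 'flat' trail with few small,
    local edits.
    """
    flags = []
    small_edits = sum(1 for r in revisions if r.chars_added <= 50)

    for prev, curr in zip(revisions, revisions[1:]):
        gap = curr.timestamp - prev.timestamp
        # A large block appearing within seconds suggests a paste, not typing.
        if curr.chars_added >= block_threshold and gap < min_gap_seconds:
            flags.append(
                f"block insertion of {curr.chars_added} chars after {gap:.1f}s"
            )

    # Self-produced writing usually shows many small, local revisions.
    if revisions and small_edits / len(revisions) < 0.2:
        flags.append("flat revision trail: few small local edits")

    return flags
```

In production, hand-tuned thresholds like these would be replaced by models trained on validated documents; the point is simply that the evidence lives in how the text evolved, not only in the final wording.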
Independent testing is essential, of course. A third-party evaluation under the European Network for Academic Integrity (ENAI) is planned, with results expected in the second half of 2026. Generally, we advocate a portfolio approach: process evidence + authentic tasks + explicit AI-use policies – so students learn and teachers can grade fairly. For further details, please have a look at our whitepaper.
Literature
[1] “Short Guide 9: Assessment in the Age of AI”, Centre for the Integration of Research, Teaching and Learning (CIRTL), University College Cork, January 16, 2025
[2] P. Scarfe, K. Watcham, A. Clarke, E. Roesch, “A real-world test of artificial intelligence infiltration of a university examinations system: A ‘Turing Test’ case study”, PLOS ONE, 2024