How Do You Assess Coding Skill When Every Student Has an AI?

For two years, much of what we call coding assessment has measured the AI, not the student. That is not a cheating problem; it is an instrument-validity problem, and the fix is not a better detector. It is a different kind of exam.

Examination Center is in early access — we're onboarding institutions through our Early Access Program. The information here describes our current platform and direction and may evolve; it is not a contractual commitment.

By the Examination Center team · Last updated: 2026-06-18

We do not have a cheating problem. We have a validity problem.

Programming education has always leaned on a convenient assumption: if the code runs and passes the tests, the student probably understands it. Generative AI severed that link. A student can now submit correct, well-structured code they could not have written and cannot explain.

The artifact looks identical to mastery. The signal and the noise are now visually indistinguishable in the submitted file, so no amount of staring at the code tells you which one you are holding. The instrument we inherited, the take-home and the online code box, was built for a world where producing working code was itself evidence of understanding. That world is gone.
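To make that concrete, here is a minimal sketch of a test-based autograder. Everything in it is hypothetical (the `median` assignment, the `TESTS` list, the `grade` function); it is not a description of any real grading platform. The point is only that the grader's input is the submitted text, and nothing in that text records who or what wrote it.

```python
from typing import Callable

# Hypothetical test suite for an assignment: implement `median(values)`.
TESTS: list[tuple[list[float], float]] = [
    ([1, 3, 2], 2),
    ([4, 1, 3, 2], 2.5),
    ([7], 7),
]


def grade(source: str) -> float:
    """Run a submission against TESTS and return the fraction passed.

    The grader sees only the submitted text. Authorship is not an input.
    """
    namespace: dict = {}
    exec(source, namespace)  # illustration only; a real grader would sandbox this
    median: Callable = namespace["median"]
    passed = sum(1 for values, expected in TESTS if median(values) == expected)
    return passed / len(TESTS)


SUBMISSION = '''
def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2
'''

# The same file, imagined once as written unaided and once as pasted from an
# AI assistant. The grader cannot tell these two cases apart.
score_unaided = grade(SUBMISSION)
score_ai_generated = grade(SUBMISSION)
print(score_unaided == score_ai_generated)
```

Both calls return the same score because they receive the same bytes. Any instrument that reads only the artifact inherits this blind spot.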

Why the popular fixes do not hold

Four responses dominate faculty meetings. Each fails for a structural reason, not a tooling reason:

The reframe: separate learning from assessment

The claim that AI should be banned from the classroom and the claim that AI makes assessment impossible are the same mistake. Both assume learning and assessment must happen under the same conditions. They should not.

Learning is generative and open: students should use every tool, including AI, to build understanding. Assessment is a controlled measurement: its job is to isolate one variable, what can this student do on their own, and read it cleanly. A measurement you cannot control is not a measurement.

So the answer is not to take AI away from students. It is to give them AI for everything except the moment of measurement, and make that moment controlled, authentic, and observable. Use AI all term; turn it off, and prove it is off, for the exam.

What a defensible coding assessment requires

Once the assessment moment is a controlled measurement, the requirements fall out almost mechanically. A coding exam you could defend has to be:

From detecting cheating to designing for integrity

Detection is a losing arms race: every detector invites a workaround, and you are always one model release behind. Design is durable. If AI is genuinely unavailable during a controlled, observed exam, there is nothing to detect, because there was no AI use to catch.

It is also the honest position to take with students. "We expect you to use AI to learn, and here is the one bounded window where we measure you without it, so your transcript means something" is a policy students can respect. Integrity by design is integrity by respect.

The uncomfortable conclusion

If you teach programming, your assessment instrument is probably broken right now, and patching it with detectors and pledges will not fix it. The fix is a different kind of exam moment, authentic and controlled and observable and humane, designed on purpose and separate from the open, AI-rich way students should learn the rest of the time.

The faculties that get this right will not be the ones with the best detector. They will be the ones who stopped trying to detect and started designing the room where the measurement happens.

Related reading

How to prevent AI cheating in coding exams · Examination Center vs Google Colab · Examination Center vs JupyterHub

Run fair coding exams

Early Access scope: up to 40 students and 1 exam. Indicative pricing only — see pricing or apply for Early Access.

FAQ

Should I ban AI from my course?

No. Separate learning from assessment: let students use AI to learn all term, then run a controlled, AI-off exam to measure what they can do unaided. Banning AI from learning handicaps students; an exam taken with AI present measures the AI, not the student.

Do AI detectors work for code?

Not reliably. Source code has low entropy and high convergence: correct solutions to the same problem tend to look alike, whoever or whatever wrote them. Detector false-positive and false-negative rates are therefore high and hard to defend in an integrity hearing. Designing an AI-free exam environment is more durable than detecting after the fact.
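A rough back-of-the-envelope sketch shows why error rates matter so much in a hearing. Every number below is invented for illustration (the cohort size, the count of AI-assisted submissions, and both error rates); none is a measured property of any real detector, and `flagged_breakdown` is a hypothetical helper.

```python
def flagged_breakdown(
    cohort: int,
    ai_users: int,
    false_positive_rate: float,
    false_negative_rate: float,
) -> tuple[float, float, float]:
    """Return (false flags, true flags, share of flags that are wrong)."""
    honest = cohort - ai_users
    false_flags = honest * false_positive_rate        # honest students flagged
    true_flags = ai_users * (1 - false_negative_rate)  # AI use correctly flagged
    total = false_flags + true_flags
    share_wrong = false_flags / total if total else 0.0
    return false_flags, true_flags, share_wrong


# Hypothetical scenario: 200 students, 40 of whom submitted AI-written code,
# and a detector that wrongly flags 10% of honest work and misses 30% of AI work.
false_flags, true_flags, share_wrong = flagged_breakdown(
    cohort=200,
    ai_users=40,
    false_positive_rate=0.10,
    false_negative_rate=0.30,
)
print(f"{false_flags:.0f} false flags, {true_flags:.0f} true flags, "
      f"{share_wrong:.0%} of flags are wrong")
```

Under these assumed numbers, about 16 honest students are flagged alongside about 28 genuine cases, so more than a third of the accusations would be wrong. Change the assumptions and the share changes, but the hearing still has to tell the two groups apart with no further evidence.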
