How to Design AI-Resistant Coding Exam Questions

AI coding assistants can solve most textbook programming prompts in seconds. This guide shows CS instructors how to write coding exam questions that measure real understanding, and how exam conditions back them up.

Examination Center is in early access — we're onboarding institutions through our Early Access Program. The information here describes our current platform and direction and may evolve; it is not a contractual commitment.

By the Examination Center team · Last updated: 2026-06-18

Why standard coding questions stopped working

Most classic programming prompts ask a student to produce a function from a clear, self-contained spec: "write a function that reverses a linked list" or "implement binary search." These are exactly the prompts that AI assistants handle best, because the problem is common, the spec is complete, and the answer is short. A student can paste the prompt into a chat tool and get working code with comments in one shot.

The harder truth is that you usually can't tell from the final code alone whether the student wrote it. A correct submission looks the same whether the student reasoned through it or copied it. So the fix isn't a cleverer phrasing of the same kind of question. It's a shift toward questions where the thinking is the deliverable, paired with exam conditions that make outside help difficult and visible.

Two levers matter, and you need both. Question design raises the cost of using AI to the point where it's slower than just knowing the material. Exam conditions remove the easy paths and capture evidence when something looks off. One without the other leaves a gap.

Question patterns that resist one-shot AI answers

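AI-resistant coding exam questions tend to share a few traits: they depend on context the model doesn't have, they reward explanation over output, or they ask the student to engage with code rather than generate it from scratch. In practice that means questions built on your specific course context, questions that ask students to read and debug unfamiliar code, questions that ask them to explain trade-offs, and questions with constraints that break the obvious solution.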
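
As one illustration of asking students to engage with existing code rather than generate it, a question might hand them a short function with a planted defect and ask them to name a failing input, explain the cause, and fix it. The sketch below is hypothetical: the function names, the choice of a moving average, and the planted off-by-one are assumptions made for illustration, not taken from any particular course.

```python
def moving_average_buggy(values, window):
    """Version handed to students: mean of each full window of `values`.

    Planted defect (hypothetical): the loop stops one step early,
    so the final window is silently dropped.
    """
    result = []
    for start in range(len(values) - window):  # off-by-one: missing "+ 1"
        chunk = values[start:start + window]
        result.append(sum(chunk) / window)
    return result


def moving_average_fixed(values, window):
    """Reference fix an instructor might keep alongside the question."""
    if window <= 0:
        raise ValueError("window must be positive")
    result = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        result.append(sum(chunk) / window)
    return result
```

Called on `[2, 4, 6, 8]` with a window of 2, the buggy version returns `[3.0, 5.0]` and the fixed version returns `[3.0, 5.0, 7.0]`. The part worth grading is the student's written explanation of why the last window goes missing, since the one-character fix on its own says little about how they found it.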

Make the questions practical, not just AI-proof

It's easy to overcorrect. Questions so obscure that no tool can help often also confuse students who do know the material, and they're harder to grade fairly. The goal is to measure understanding, not to win an arms race.

A useful test: would a student who attended your lectures and did the labs find this fair, while a student relying only on an AI tool would struggle to finish in time? If yes, you're in the right zone. Keep specs clear and keep the scope honest for the time allowed. Difficulty should come from depth of reasoning, not from ambiguity or trivia.

Calibrate with your own materials. Run a draft question through a current AI assistant yourself before the exam. If it produces a complete, correct answer instantly, revise: add course-specific context, a constraint, or an explanation requirement. If it produces something plausible but wrong, or needs information only your class has, you're likely on solid ground.

Question design alone isn't enough: exam conditions matter

Even well-designed questions leak value if a student can quietly run them through an assistant during the exam. This is where the testing environment does work that wording can't. The aim is a setting that's identical for everyone, removes the easiest shortcuts, and records signals a human can review afterward.

Examination Center is built for exactly this part of the problem. It gives every student the same plain, AI-free editor: no built-in AI assistant and no autocomplete, so the environment itself doesn't hand out answers. Code runs right there, with Python in the browser (NumPy, pandas, Matplotlib) and C, C++, Fortran, and Java compiled and run in a secure server sandbox, so your questions can use real tools without students installing anything.

Just as important, it captures integrity evidence for human review: paste events, large or sudden edits, and cross-student code similarity. These are surfaced as evidence for an instructor to interpret, never as automated accusations or verdicts. A burst of pasted code right after a hard sub-question is a prompt to look closer, not a guilty finding. And because the platform does not grade or score, academic judgment stays with you.
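
To make the idea of evidence rather than verdicts concrete, here is a minimal sketch of how paste events and cross-student similarity could be summarized for a reviewer. It is not Examination Center's implementation. The event format, both threshold values, and every function name are assumptions for illustration, and the similarity measure is simply the ratio from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical thresholds; an instructor would tune these by hand.
PASTE_CHAR_THRESHOLD = 200
SIMILARITY_THRESHOLD = 0.9


def large_pastes(events):
    """Return paste events whose inserted text is unusually long.

    `events` uses an assumed format: dicts with "student", "kind", "text".
    """
    return [
        e for e in events
        if e["kind"] == "paste" and len(e["text"]) >= PASTE_CHAR_THRESHOLD
    ]


def similar_pairs(submissions):
    """Return (student_a, student_b, ratio) for highly similar submissions.

    `submissions` maps a student id to that student's source code string.
    """
    pairs = []
    for (a, code_a), (b, code_b) in combinations(sorted(submissions.items()), 2):
        ratio = SequenceMatcher(None, code_a, code_b).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


def review_queue(events, submissions):
    """Collect items for a human to look at; nothing here assigns blame."""
    return {
        "large_pastes": large_pastes(events),
        "similar_pairs": similar_pairs(submissions),
    }
```

The output is deliberately just a list of things to read. A character-level ratio like this will also flag students who legitimately share starter code, which is one reason the interpretation belongs to a person.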

One more practical benefit during high-stakes exams: autosave and session recovery mean a frozen or closed browser doesn't wipe out a student's work, so a technical glitch is far less likely to turn into an integrity dispute.

A workflow you can use this term

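Pulling it together, a repeatable process follows from the sections above: draft questions where the thinking is the deliverable, run each draft through a current AI assistant and revise any it answers instantly, deliver the exam in an identical AI-free environment for every student, and review the captured integrity evidence yourself afterward.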

The takeaway

No single question is truly AI-proof, and you shouldn't aim for that. What you can do is make outside help slow, awkward, and visible, while measuring the understanding you actually care about.

Good question design and a fair, monitored exam environment reinforce each other. Together they give you results you can trust and defend, without turning your exam into an arms race or your platform into an accuser.

FAQ

What makes a coding exam question "AI-resistant"?

It depends on something an AI assistant doesn't have or doesn't do well: your specific course context, reading and debugging unfamiliar code, explaining trade-offs, or working under constraints that break the obvious solution. Generic, self-contained prompts like "implement binary search" are the easiest for AI to answer in one shot, so they offer the least signal about a student's own ability.

Can I make questions completely impossible for AI to solve?

No, and chasing that usually backfires. Questions obscure enough to defeat every tool tend to confuse students who do know the material and are harder to grade fairly. A better goal is to make AI help slower than simply knowing the content, then back the questions with exam conditions that remove easy shortcuts and capture evidence for review.

How does Examination Center help with AI-resistant exams?

It provides the exam conditions that question design alone can't. Every student gets the same AI-free editor with no built-in assistant or autocomplete, code runs in the browser (Python) or a secure server sandbox (C, C++, Fortran, Java), and the platform captures integrity evidence such as paste events, sudden edits, and code similarity for human review. It does not grade or accuse; judgment stays with you.

Does the platform detect or accuse students of cheating with AI?

No. Examination Center surfaces integrity evidence for a human to interpret, never automated verdicts or accusations. A signal like a large paste after a difficult question is a reason to look more closely, not proof of misconduct. The instructor decides what it means, and grading stays entirely in your own workflow.
