Tip 1: Rule of thumb - include forty questions in an exam
Is there a guideline for how many questions to include in a test made up of four-option multiple-choice questions? Yes: the short answer is a minimum of forty. The reasoning comes from reliability, validity and feasibility: (1) the outcome for an individual student should not depend too much on chance, (2) the test should actually measure what you want to measure, and (3) administering the test should be feasible.
To exclude chance from the measurement as much as possible, you aim for high reliability, meaning that the reliability coefficient calculated afterwards (Cronbach's alpha) is higher than 0.75. You also want the validity of the test to be good, meaning that the test covers the topics of the course well. Finally, the test should not take so long that students have to concentrate for an excessive amount of time, and it should still fit the timetable. Forty four-option questions has been found to be a good optimum for these goals, provided that three factors apply together. If one factor falls away, the reliability, validity and feasibility of your test are no longer acceptable, just like a stool that needs all three legs.
The three factors are:
- You come up with reasonably discriminating test questions. These are questions that students with better overall knowledge or skill are more likely to answer correctly, and that weaker students are more likely to answer incorrectly.
- You administer the test to a student population with a reasonable spread in knowledge or ability.
- The amount of material in your course is neither too limited nor too extensive (say, between 8 and 12 different topics).
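The first factor can be checked once the exam has been taken. The following is a minimal sketch in Python, assuming answers are scored 1 (correct) or 0 (incorrect) and using the item-rest correlation as the discrimination measure. That choice of measure, the function names and the example scores are illustrative assumptions, not something the rule of thumb prescribes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_rest_correlations(scores):
    """scores[s][q] is 1 if student s answered question q correctly, else 0."""
    n_questions = len(scores[0])
    result = []
    for q in range(n_questions):
        item = [row[q] for row in scores]
        # Total score on all other questions, so the item is not
        # correlated with itself.
        rest = [sum(row) - row[q] for row in scores]
        result.append(pearson(item, rest))
    return result


# Made-up scores for 5 students on 4 questions (illustration only).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
discrimination = item_rest_correlations(scores)
for q, r in enumerate(discrimination, start=1):
    print(f"question {q}: item-rest correlation {r:.2f}")
```

In this toy data, question 1 correlates positively with the score on the other questions, while question 4 correlates negatively, which would flag it for review.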
If you are unable to design properly discriminating questions, or if the material is much more extensive, consider including more questions in your test if the test time allows. If you know that the differences in ability between students are small (which is not easy to determine, by the way), there are in principle no consequences, except that the calculated reliability of your test will be low.
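The calculated reliability referred to in this tip is Cronbach's alpha. As a minimal sketch, it can be computed from a table of scored answers as follows. The function name and the example scores are illustrative assumptions, and a real exam would of course have far more students and questions.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for scores[s][q] (1 = correct, 0 = incorrect).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])  # number of questions
    item_variances = [pvariance([row[q] for row in scores]) for q in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores for 5 students on 4 questions (illustration only).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))  # 0.8 for this toy data
```

If all students score about the same, the variance of the total scores becomes small relative to the item variances and alpha drops, which matches the remark above about small differences in ability.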
Tip 2: use variations on the rule of thumb
There are several possible variations on the rule of thumb above, depending on the situation. Below you will find two of them:
- More questions in a test. The recommended number of questions differs per source. For example, CITO recommends a minimum of sixty questions. That makes sense, because CITO produces tests with even higher stakes than course exams, which therefore need to be more reliable.
- Fewer questions in a test. If the final grade of a course is made up of two subtests, you can use fewer questions per test. The individual test will then be less reliable, but the reliability of the result for the course as a whole (i.e., the two tests together) will probably still be acceptable.
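Both variations can be explored with the Spearman-Brown formula, which predicts how reliability changes when a test is lengthened or shortened. The sketch below assumes that added or removed questions are of comparable quality to the existing ones. The starting reliability of 0.75 for a forty-question test is a hypothetical value borrowed from the target in Tip 1, not a measured result.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


base = 0.75  # hypothetical reliability of a 40-question test

sixty_questions = spearman_brown(base, 60 / 40)  # longer test
one_subtest = spearman_brown(base, 20 / 40)      # one 20-question subtest
two_subtests = spearman_brown(one_subtest, 2)    # both subtests combined

print(f"60 questions:          {sixty_questions:.2f}")  # 0.82
print(f"one 20-question test:  {one_subtest:.2f}")      # 0.60
print(f"two subtests together: {two_subtests:.2f}")     # 0.75
```

Under these assumptions, a single twenty-question subtest falls below the 0.75 target, but the two subtests together return to the reliability of the full forty-question test.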
Tip 3: three-option questions are actually better
While the number of questions in a test is important, it is also worth considering the number of answer options per question. So first ask yourself as a teacher: is a multiple-choice test better with three or with four options per question? You are not alone if you think four is best, but this is unfortunately a misconception. Four-option tests became the norm in the Netherlands through A.D. de Groot (1946), who brought multiple-choice tests to the Netherlands and personally promoted the four-option format. This has since become part of the culture. Partly because of this, we often think that four options are best, since three options would supposedly make the test too easy for the student. But the difficulty of a test question actually stems from the content of the question and the quality of the distractors, not from the number of options.
In fact, review studies (Rodriguez, M. C., 2005) show that the third distractor of a four-option multiple-choice question (the last incorrect answer the teacher thinks up) generally works poorly: it attracts neither proficient nor less proficient students. Meanwhile, as a teacher, you often spend a lot of time coming up with that final distractor. It is therefore advisable to compose tests with three-option multiple-choice questions. This leaves the teacher more time to write high-quality questions, which increases the reliability of the test and allows for a better spread of questions across the material.
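The worry that three options make a test too easy can also be put into numbers for the extreme case of a student who guesses blindly on every question. The sketch below uses the binomial distribution for a forty-question test. The raw pass mark of 22 correct answers (55%) is a hypothetical value chosen for illustration, and the function name is likewise an assumption.

```python
from math import comb


def prob_at_least(n_questions, p_correct, min_correct):
    """Binomial probability of at least min_correct right answers by guessing."""
    return sum(
        comb(n_questions, k) * p_correct**k * (1 - p_correct) ** (n_questions - k)
        for k in range(min_correct, n_questions + 1)
    )


N_QUESTIONS = 40
PASS_MARK = 22  # hypothetical raw pass mark (55% correct)

results = {}
for n_options in (3, 4):
    p_guess = 1 / n_options
    expected_score = N_QUESTIONS * p_guess
    p_pass = prob_at_least(N_QUESTIONS, p_guess, PASS_MARK)
    results[n_options] = (expected_score, p_pass)
    print(
        f"{n_options} options: expected guess score {expected_score:.1f}, "
        f"chance of passing by guessing {p_pass:.5f}"
    )
```

Going from four to three options raises the expected guessing score from 10 to about 13.3 out of 40, yet under this assumed pass mark the chance of passing by blind guessing stays small in both cases. This only addresses pure guessing; it says nothing about the quality of the distractors.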