Many teachers use multiple-choice questions to assess students' knowledge of a subject. This is especially true when the class is large and marking essays would prove impractical.
Even when best practices are followed in writing multiple-choice exams, it can still be difficult to know whether the questions are doing the work they are supposed to. Fortunately, there are several quantitative measures that can be used to assess the quality of a multiple-choice question.
This post will look at three ways you can determine the quality of your multiple-choice questions using quantitative means. These three measures are
- Item facility
- Item discrimination
- Distractor efficiency
Item facility measures the difficulty of a particular question. It is determined by the following formula:

Item facility = (number of students who answered the item correctly) / (total number of students who answered the item)
This formula simply calculates the percentage of students who answered the question correctly. There is no fixed boundary for a good or bad item facility score. Your goal should be to separate the high-ability from the low-ability students in your class with challenging items that have a low item facility score. In addition, there should be several easier items with a high item facility score to support the weaker students and to serve as warmups for the stronger students.
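The calculation above can be sketched in a few lines of Python. The student responses and the answer key below are hypothetical example data, not from any real exam.

```python
# Minimal sketch of the item facility calculation.

def item_facility(answers, key):
    """Fraction of students who answered the item correctly."""
    return sum(1 for a in answers if a == key) / len(answers)

# Hypothetical data: 30 students answered this item; 24 chose the key "B".
answers = ["B"] * 24 + ["A"] * 3 + ["C"] * 2 + ["D"]
print(item_facility(answers, "B"))  # 0.8
```

A value of 0.8 would mark this as a fairly easy item, suitable as one of the warmup questions mentioned above.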
Item discrimination measures a question's ability to separate the strong students from the weak ones. It is determined by the following formula:
Item discrimination = (# correct in strong group − # correct in weak group) / (1/2 × total of the two groups)
To calculate item discrimination, first divide the class into three groups by rank: the top third is the strong group, the middle third is the average group, and the bottom third is the weak group. The middle group is set aside, and the data from the strong and weak groups is used to determine the item discrimination.
Item discrimination values range from 0 (no discrimination) to 1 (perfect discrimination). There are no hard cutoff points; however, items with values near zero are generally removed, while a range of higher values is expected across an exam.
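The procedure above can be sketched as follows. This assumes the denominator is half the number of students in the two groups, which is the reading that keeps the result between 0 and 1; the scores are hypothetical example data.

```python
# Minimal sketch of the item discrimination calculation for one item.

def item_discrimination(strong, weak):
    """strong/weak are lists of 1 (correct) or 0 (incorrect) per student."""
    total_students = len(strong) + len(weak)
    return (sum(strong) - sum(weak)) / (0.5 * total_students)

# Hypothetical data: top third (10 students) got 9 correct,
# bottom third (10 students) got 3 correct.
strong = [1] * 9 + [0]
weak = [1] * 3 + [0] * 7
print(item_discrimination(strong, weak))  # 0.6
```

An item scoring 0.6 here clearly favors the strong group, which is exactly what a discriminating item should do.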
Distractor efficiency looks at the individual responses students select on a multiple-choice question. For example, if a question has four possible answers, there should be a reasonable distribution of students across the various options.
Distractor efficiency is tabulated by simply counting which answer students selected for each question. Again, there are no hard rules for removal; however, if nobody selected a distractor, it is probably not a good one.
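The tally is straightforward to compute. In this sketch the responses and option labels are hypothetical example data; the point is to surface options that drew no responses.

```python
from collections import Counter

# Minimal sketch of a distractor efficiency tally for one item.

def distractor_efficiency(answers, options):
    """Count how many students chose each option (0 if never chosen)."""
    counts = Counter(answers)
    return {opt: counts.get(opt, 0) for opt in options}

# Hypothetical data: nobody picked option "D".
answers = ["B"] * 24 + ["A"] * 3 + ["C"] * 3
print(distractor_efficiency(answers, ["A", "B", "C", "D"]))
# {'A': 3, 'B': 24, 'C': 3, 'D': 0}
```

Here the zero count for "D" flags it as a distractor worth rewriting or replacing.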
Assessing multiple-choice questions becomes much more important as the class grows larger or when a test needs to be reused in various contexts. The information covered here is only an introduction to the much broader subject of item response theory.
Hi! Thank you for an interesting blog!
I have a couple of questions
I do not fully understand this formula: Item discrimination = # items correct of strong group – # items correct of weak group
1/2(total of two groups).
Why the .5 * total…? Could you explain that a bit? Or have I gotten caught up in semantics or layout?
Secondly, what is a “reasonable” distribution of students picking the answers? Don’t you mean an even distribution between the three distractors? We do expect the students to pick the right answer, and hopefully students know the right answer. Therefore I would say the interesting thing is the plausibility of the distractors. What is the scientific research that you use for your opinion in this matter?
Thirdly, which references do you rely on in this article?
Looking forward to your responses, have a good day
Cita Nørgård, teaching and learning consultant at University of Southern Denmark.
The underline in the formula represents the fraction bar. What I was trying to say is that you subtract the number of items correct in the weak group from the number correct in the strong group. You then divide this difference by one-half multiplied by the total number of students in the two groups.
Multiplying by 1/2 keeps the final number between 0 and 1. However, this is not the only way to calculate item discrimination; another approach uses the percentage correct rather than the raw number of correct items.
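A quick worked check, with made-up numbers, shows why the 1/2 factor bounds the score: with 10 students per group, the denominator is 0.5 × 20 = 10, and the numerator can be at most 10.

```python
# Hypothetical extremes for two groups of 10 students each.
total = 20  # students across the strong and weak groups

best = (10 - 0) / (0.5 * total)  # all strong correct, no weak correct
none = (5 - 5) / (0.5 * total)   # both groups perform identically

print(best, none)  # 1.0 0.0
```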
In terms of the distribution, this is more art than science. It is at the discretion of the teacher to determine what is most appropriate given the content and the particular context.
For a reference, see Brown (2003), Language Assessment: Principles and Classroom Practices.