Category Archives: assessment

Journal Writing

A journal is a log that a student uses to record their thoughts about something. This post will provide examples of journals as well as guidelines for using journals in the classroom.

Types of Journals

There are many different types of journals. Most journals involve some sort of dialog between the student and the teacher. This allows both parties to get to know each other better.

Normally, journals will have a theme or focus. Examples in TESOL would include journals that focus on grammar, learning strategies, language-learning, or recording feelings. Most journals will focus on one of these to the exclusion of the others.

Guidelines for Using Journals

Journals can be useful if they are properly planned. As such, a teacher should consider the following when using journals.

  1. Provide purpose-Students need to know why they are writing journals. Most students seem to despise reflection and will initially reject this learning experience.
  2. Forget grammar-Journals are for writing. Students need to set aside the obsession they have acquired for perfect grammar and focus on developing their thoughts about something. There is a time and place for grammar and that is for summative assessments such as final drafts of research papers.
  3. Explain the grading process-Students need to know what they must demonstrate in order to receive adequate credit.
  4. Provide feedback-Journals are a dialog. As such, the feedback should encourage and/or instruct the students. The feedback should also be provided consistently at scheduled intervals.

Journals take a lot of time to read and provide feedback on. In addition, the handwriting quality of students can vary radically, which means that some students' journals are unreadable.

Conclusion

Journaling is an experience that allows students to focus on the process of learning rather than the product. This is often neglected in the school experience. Through journals, students are able to focus on the development of ideas without wasting working memory capacity on grammar and syntax. As such, journals can be a powerful tool in developing critical thinking skills.

Cradle Approach to Portfolio Development

Portfolio development is one of many forms of alternative assessment available to teachers. When this approach is used, the students generally collect their work and try to make sense of it through reflection.

It is surprisingly easy for portfolio development to amount to nothing more than archiving work. To alleviate potential confusion over this process, Gottlieb developed the CRADLE approach. CRADLE stands for the following:

Collecting
Reflecting
Assessing
Documenting
Linking
Evaluating

Collecting

Collecting is the process in which the students gather materials to include in their portfolio. It is left to the students to decide what to include. However, it is still necessary for the teacher to provide clear guidelines in terms of what can be potentially selected.

Clear guidelines include stating the objectives as well as explaining how the portfolio will be assessed. It is also important to set aside class time for portfolio development.

Some examples of work that can be included in a portfolio include the following.

  • tests, quizzes
  • compositions
  • electronic documents (PowerPoints, PDFs, etc.)

Reflecting

Reflecting happens through the student thinking about the work they have placed in the portfolio. This can be demonstrated in many different ways. Common ways to reflect include the use of journals in which students comment on their work.

Another way for young students is the use of a checklist. Students simply check the characteristics that are present in their work. As such, the teacher’s role is to provide class time so that students are able to reflect on their work.

Assessing

Assessing involves checking and maintaining the quality of the portfolio over time. Normally, there should be a gradual improvement in work quality in a portfolio. This is a subjective matter that is negotiated by the student and teacher, often in the form of conferences.

Documenting

Documenting serves more as a reminder than an action. Simply, documenting means that the teacher and student maintain the importance of the portfolio over the course of its usefulness. This is critical as it is easy to forget about portfolios through the pressure of the daily teaching experience.

Linking

Linking is the use of a portfolio to serve as a mode of communication between students, peers, teachers, and even parents. Students can look at each other's portfolios and provide feedback. Parents can also examine the work of their child through the use of portfolios.

Evaluating

Evaluating is the process of receiving a grade for this experience. For the teacher, the goal is to provide positive washback when assessing the portfolios. The focus is normally less on grades and more qualitative in nature.

Conclusions

Portfolios provide rich opportunities for developing intrinsic motivation, individualized learning, and critical thinking. However, trying to affix a grade to such a learning experience is often impractical. As such, portfolios are useful, but it can be hard to prove that any learning took place.

Types of Rubrics for Writing

Grading essays, papers and other forms of writing is subjective and frustrating for teachers at times. One tool that helps in improving the consistency of the marking, as well as the speed, is the use of rubrics. In this post, we will look at three commonly used rubrics which are…

  • Holistic
  • Analytical
  • Primary trait

Holistic Rubric

A holistic rubric looks at the overall quality of the writing. Normally, there are several levels on the rubric and each level has several descriptors on it. Below is an example template.

[Image: holistic rubric template]

The descriptors must be systematic, which means that they are addressed in each level and in the same order. Below is an actual holistic rubric for writing.

[Image: example holistic rubric for writing]

In the example above, there are four levels of marking. The descriptors are

  • idea explanation
  • coherency
  • grammar

Different adverbs and adjectives are used to distinguish between the levels. For example, “ideas are thoroughly explained” in level one becomes “ideas are explained” in level two. The use of adverbs is one of the easiest ways to distinguish between levels in a holistic rubric.

Holistic rubrics offer the convenience of fast marking that is easy to interpret and comes with high reliability. The downside is that there is a lack of strong feedback for improvement.

Analytical Rubrics

Analytical rubrics assign a score to each individual attribute the teacher is looking for in the writing. In other words, instead of lumping all the descriptors together as is done in a holistic rubric, each trait is given its own score. Below is a template of an analytical rubric.

[Image: analytical rubric template]

You can see that the levels are across the top and the descriptors down the side. Performance moves from best on the left to worst on the right. Each level is assigned a range of potential point values.

Below is an actual analytical rubric for writing.

[Image: example analytical rubric for writing]

Analytical rubrics provide much more washback and learning than holistic rubrics. Of course, they also take a lot more time for the teacher to complete.
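To make the idea of scoring each trait separately concrete, here is a minimal sketch in Python. The descriptor names are borrowed from the holistic example earlier in this post, while the point ceilings and the function name `score_analytical` are assumptions made up for illustration, not part of any published rubric.

```python
def score_analytical(ratings, max_points):
    """Sum per-trait scores after checking each one against its allowed range."""
    missing = set(max_points) - set(ratings)
    if missing:
        raise ValueError(f"No score given for: {sorted(missing)}")
    for trait, points in ratings.items():
        if trait not in max_points:
            raise ValueError(f"Unknown trait: {trait}")
        if not 0 <= points <= max_points[trait]:
            raise ValueError(f"{trait} must be between 0 and {max_points[trait]}")
    return sum(ratings.values()), sum(max_points.values())


# Hypothetical point ceilings for each descriptor on the rubric.
MAX_POINTS = {"idea explanation": 10, "coherency": 10, "grammar": 5}

# One student's paper, marked trait by trait.
ratings = {"idea explanation": 8, "coherency": 6, "grammar": 4}
earned, possible = score_analytical(ratings, MAX_POINTS)
print(f"{earned}/{possible}")  # prints 18/25
```

Because each trait keeps its own score, the student can see that coherency, not grammar, is where most of the points were lost. That is the extra washback an analytical rubric offers over a single holistic mark.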

Primary Trait

A lesser-known way of marking papers is the use of a primary trait rubric. With primary trait, the student is only assessed on one specific function of writing. Examples include persuasion if they are writing an essay, or vocabulary use for an ESL student writing paragraphs.

The template would be similar to a holistic rubric except that there would only be one descriptor instead of several. The advantage of this is that it allows the teacher and the student to focus on one aspect of writing. Naturally, this can be a disadvantage as writing involves more than one specific skill.

Conclusion

Rubrics are useful for a variety of purposes. For writing, it is critical that you understand what the levels and descriptors are when deciding on what kind of rubric you want to use. In addition, the context affects which type of rubric to use.

Types of Writing

This post will look at several types of writing that are done for assessment purposes. In particular, we will look at this through the four levels of writing, which are

  • Imitative
  • Intensive
  • Responsive
  • Extensive

Imitative 

Imitative writing is focused strictly on the grammatical aspects of writing. The student simply reproduces what they see. This is a common way to teach children how to write. Additional examples of activities at this level include cloze tasks in which the student has to write the word in the blank from a list, spelling tests, matching, and even converting numbers to their word equivalent.

Intensive

Intensive writing is more concerned with selecting the appropriate word for a given context. Example activities include grammatical transformation (such as changing all verbs to past tense), sequencing pictures, describing pictures, completing short sentences, and ordering tasks.

Responsive 

Responsive writing involves the development of sentences into paragraphs. The focus is almost exclusively on the context or function of writing. Form concerns are primarily at the discourse level, which means how the sentences work together to make paragraphs and how the paragraphs work to support a thesis statement. Normally, writing at this level is no more than 2-3 paragraphs.

Example activities at the responsive level include short reports, interpreting visual aids, and summary.

Extensive

Extensive writing is responsive writing over the course of an entire essay or research paper. The student is able to shape a purpose, objectives, main ideas, conclusions, etc. into a coherent paper.

For many students, this is exceedingly challenging in their mother tongue and is further exacerbated in a second language. There is also the experience of multiple drafts of a single paper.

Marking Responsive & Extensive Papers

Marking higher level papers requires a high degree of subjectivity. This is because of the authentic nature of this type of assessment. As such, it is critical that the teacher communicate expectations clearly through the use of rubrics or some other form of communication.

Another challenge is the issue of time. Higher level papers take much more time to develop. This means that they normally cannot be used as a form of in-class assessment. If they are used as in-class assessment, the authenticity of the assessment decreases.

Conclusion

Writing is a critical component of the academic experience. Students need to learn how to shape and develop their ideas in print. For teachers, it is important to know at what level the student is capable of writing in order to support them for further growth.

Reading Assessment at the Interactive and Extensive Level

In reading assessment, the interactive and extensive levels are the highest levels of reading. This post will provide examples of assessments at each of these two levels.

Interactive Level

Reading at this level is focused on both form and meaning of the text with an emphasis on top-down processing. Below are some assessment examples.

Cloze

Cloze assessment involves removing certain words from a paragraph and expecting the student to supply them. Words are removed either at every nth word (fixed-ratio deletion) or by deliberately selecting words for their meaning (rational deletion).

In terms of marking, you have the choice of marking based on the student providing the exact wording or an appropriate wording. Exact wording is strict but consistent, while appropriate wording can be subjective.
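For teachers who build their own cloze passages, the fixed-ratio approach and the two marking choices can be sketched in a few lines of Python. This is only an illustration under simple assumptions: words are split on spaces, and the function names `make_fixed_ratio_cloze` and `score_cloze`, the sample sentence, and the list of acceptable alternatives are all invented for this example.

```python
def make_fixed_ratio_cloze(text, n=5):
    """Blank out every nth word and return the gapped text with its answer key."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        # Keep trailing punctuation outside the blank so sentences stay readable.
        core = words[i].rstrip(".,;:!?")
        trailing = words[i][len(core):]
        answers.append(core)
        words[i] = "_____" + trailing
    return " ".join(words), answers


def score_cloze(responses, answers, acceptable=None):
    """Count correct blanks.

    With acceptable=None only the exact word counts. Otherwise, acceptable maps
    a blank's position to other words the teacher has decided to allow.
    """
    acceptable = acceptable or {}
    score = 0
    for i, (response, key) in enumerate(zip(responses, answers)):
        allowed = {key.lower()} | {w.lower() for w in acceptable.get(i, [])}
        if response.strip().lower() in allowed:
            score += 1
    return score


passage = "The cat sat on the mat and looked at the small bird outside."
gapped, key = make_fixed_ratio_cloze(passage, n=4)
print(gapped)  # The cat sat _____ the mat and _____ at the small _____ outside.
print(key)     # ['on', 'looked', 'bird']

student = ["on", "stared", "bird"]
print(score_cloze(student, key))                            # 2 (exact wording)
print(score_cloze(student, key, {1: ["stared", "gazed"]}))  # 3 (appropriate wording)
```

The subjectivity of appropriate-word marking shows up in the code as the list of allowed alternatives, which the teacher has to decide on for every blank.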

Read and Answer the Question

This is perhaps the most common form of assessment of reading. The student simply reads a passage and then answers questions such as T/F, multiple choice, or some other format.

Information Transfer

Information transfer involves the students interpreting something. For example, they may be asked to interpret a graph and answer some questions. They may also be asked to elaborate on the graph, make predictions, or explain. Explaining a visual is a common requirement for the IELTS.

Extensive Level

This level involves the highest level of reading. It is strictly top-down and requires the ability to see the “big picture” within a text. Marking at this level is almost always subjective.

Summarize and React

Summarizing and reacting requires the student to be able to read a large amount of information, share the main ideas, and then provide their own opinion on the topic. This is difficult as the student must understand the text to a certain extent and then form an opinion about what they understand.

I also like to have my students write several questions they have about the text. This teaches them to identify what they do not know. These questions are then shared in class so that they can be discussed.

For marking purposes, you can provide directions about the number of words, paragraphs, etc. to provide guidance. However, marking at this level of reading is still subjective. The primary purpose of marking should probably be to find evidence that the student read the text.

Conclusion

The interactive and extensive levels of reading are where teaching can become enjoyable. Students have moved beyond just learning to read to reading to learn. This opens up many possibilities in terms of learning experiences.

Reading Assessment at the Perceptual and Selective Level

This post will provide examples of assessments that can be used for reading at the perceptual and selective level.

Perceptual Level

The perceptual level is focused on bottom-up processing of text. Comprehension ability is not critical at this point. Rather, you are just determining if the student can accomplish the mechanical process of reading.

Examples

Reading Aloud-How this works is probably obvious to most teachers. The students read a text out loud in the presence of an assessor.

Picture-Cued-Students are shown a picture. At the bottom of the picture are words. The students read the word and point to a visual example of it in the picture. For example, if the picture has a cat in it, the word cat would appear at the bottom of the picture. The student would read the word cat and point to the actual cat in the picture.

This can be extended by using sentences instead of words. For example, if the actual picture shows a man driving a car, there may be a sentence at the bottom of the picture that says “a man is driving a car”. The student would then point to the man in the actual picture who is driving.

Another option is T/F statements. Using our cat example from above, we might write “There is one cat in the picture”, and the student would then select T/F.

Other Examples-These include multiple-choice and written short answer.

Selective Level

The selective level is the next level above perceptual. At this level, the student should be able to recognize various aspects of grammar.

Examples

Editing Task-Students are given a reading passage and are asked to fix the grammar. This can happen in many different ways. They could be asked to pick the incorrect word in a sentence or to add or remove punctuation.

Picture-Cued Task-This task appeared at the perceptual level. Now it is more complicated. For example, the students might be required to read statements and label a diagram appropriately, such as the human body or aspects of geography.

Gap-Filling Task-Students read a sentence and complete it appropriately.

Other Examples-These include multiple-choice and matching. The multiple-choice may focus on grammar, vocabulary, etc. Matching attempts to assess a student's ability to pair similar items.

Conclusion

Reading assessment can take many forms. The examples here provide ways to deal with this for students who are still highly immature in their reading abilities. As fluency develops, more complex measures can be used to determine a student's reading capability.

Assessing Speaking in ESL

In this post, we will look at different activities that can be used to assess a language learner’s speaking ability. Unfortunately, we will not go over how to mark or give a grade for the activities; we will only provide examples.

Directed Response

In this activity, the teacher tries to have the student use a particular grammatical form by having the student modify something the teacher says. Below is an example.

Teacher: Tell me he went home
Student: He went home

This is obviously not deep. However, the student had to know to remove the words “tell me” from the sentence and they also had to know that they needed to repeat what the teacher said. As such, this is an appropriate form of assessment for beginning students.

Read Aloud

Read aloud is simply having the student read a passage verbatim out loud. Normally, the teacher will assess such things as pronunciation and fluency. There are several problems with this approach. First, reading aloud is not authentic as this is not an in-demand skill in today’s workplace. Second, it blends reading with speaking, which can be a problem if you do not want to assess both at the same time.

Oral Questionnaires 

Students are expected to respond to and/or complete sentences. Normally, there is some sort of setting such as a mall, school, or bank that provides the context or pragmatics. Below is an example in which a student has to respond to a bank teller. The blank lines indicate where the student would speak.

Teacher (as bank teller): Would you like to open an account?
Student:_______________________
Teacher (as bank teller): How much would you like to deposit?
Student:___________________________

Visual Cues

Visual cues are highly open-ended. For example, you can give the students a map and ask them to give you directions to a location on the map. In addition, students can describe things in the picture or point to things as you ask them to. You can also ask the students to make inferences about what is happening in a picture. Of course, all of these choices are highly difficult to provide a grade for and may be best suited for formative assessment.

Translation

Translating can be a highly appropriate skill to develop in many contexts. In order to assess this, the teacher provides a word, phrase, or perhaps something more complicated such as directly translating their speech. The student then takes the input and reproduces it in the second language.

This is tricky to do. For one, it is required to be done on the spot, which is challenging for anybody. In addition, this also requires the teacher to have some mastery of the student’s mother tongue, which for many is not possible.

Other Forms

There are many more examples that cannot be covered here. Examples include interviews, role play, and presentations. However, these are much more common forms of speaking assessment, so most teachers are already familiar with them.

Conclusion

Speaking assessment is a major component of the ESL teaching experience. The ideas presented here will hopefully provide some additional ways that this can be done.

Authentic Listening Tasks

There are many different ways in which a teacher can assess the listening skills of their students. Recognition, paraphrasing, cloze tasks, transfer, etc. are all ways to determine a student’s listening proficiency.

One criticism of the tasks above is that they are primarily inauthentic. This means that they do not strongly reflect something that happens in the real world.

In response to this, several authentic listening assessments have been developed over the years. These authentic listening assessments include the following.

  • Editing
  • Note-taking
  • Retelling
  • Interpretation

This post will explain each of the authentic listening assessments listed above.

Editing

In an editing task that involves listening, the student receives reading material. The student reviews the reading material and then listens to a recording of someone reading aloud the same material. The student then marks the hard copy wherever there are differences between the reading and what the recording is saying.

Such an assessment requires the student to listen carefully for discrepancies between the reading material and the recording. This requires strong reading abilities and phonological knowledge.
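When preparing such a task, the teacher needs an answer key that lists every place where the printed text and the recording script differ. One possible way to build that key is sketched below in Python using the standard library's `difflib`. The function name `find_discrepancies` and the sample sentences are made up for illustration, and the comparison is a simple word-by-word one that assumes both texts use the same punctuation and capitalization.

```python
from difflib import SequenceMatcher


def find_discrepancies(printed_text, recording_script):
    """Return (printed, spoken) pairs wherever the two versions differ."""
    printed = printed_text.split()
    spoken = recording_script.split()
    matcher = SequenceMatcher(a=printed, b=spoken, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side means a word was added or dropped.
            differences.append((" ".join(printed[i1:i2]), " ".join(spoken[j1:j2])))
    return differences


printed = "The train leaves at nine from platform two"
script = "The train leaves at five from platform two"
print(find_discrepancies(printed, script))  # [('nine', 'five')]
```

The student's marked-up hard copy can then be compared against this list by hand.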

Note-Taking

For those who are developing language skills for academic reasons, note-taking is a highly authentic form of assessment. In this approach, the students listen to some type of lecture and attempt to write down what they believe is important from the lecture.

The students are then assessed on some sort of rubric/criteria developed by the teacher. As such, marking note-taking can be highly subjective. However, the authenticity of note-taking can make it a valuable learning experience even if providing a grade is difficult.

Retelling

How retelling works should be somewhat obvious. The student listens to some form of talk. After listening, the student needs to retell or summarize what they heard.

Assessing the accuracy of the retelling has the same challenges as the note-taking assessment. However, it may be better to use retelling to encourage learning rather than provide evidence of the mastery of a skill.

Interpretation

Interpretation involves the students listening to some sort of input. After listening, the student then needs to infer the meaning of what they heard. The input can be a song, poem, news report, etc.

For example, if the student listens to a song, they may be asked to explain why the singer was happy or sad depending on the context of the song. Naturally, they cannot hope to answer such a question unless they understood what they were listening to.

Conclusion

Listening does not need to be artificial. There are several ways to make listening tasks authentic. The examples in this post are just some of the potential ways.

Responsive Listening Assessment

Responsive listening involves listening to a small amount of language such as a command, question, or greeting. After listening, the student is expected to develop an appropriate short response. In this post, we will examine two examples of the use of responsive listening. These two examples are…

  • Open-ended response to question
  • Suitable response to a question

Open-Ended Responsive Listening

When an open-ended item is used in responsive listening, it involves the student listening to a question and providing an answer that suits the context of the question. For example,

Listener hears: What country are you from?
Student writes: _______________________________

The answer is assessed by whether the student was able to develop a response that is appropriate. The open-ended nature of the question allows for creativity and expressiveness.

A drawback to the openness is determining whether answers are correct. You have to decide if misspellings, synonyms, etc. are wrong answers. There are strong arguments among ESL teachers for and against penalizing any small mistake. Generally, communicative concerns trump concerns about grammar and orthography.

Suitable Response to a Question

Suitable response items often use multiple choice answers that the student selects from in order to complete the question. Below is an example.

Listener hears: What country is Steven from?
Student picks:
a. Thailand
b. Cambodia
c. Philippines
d. Laos

Based on the recording, the student would need to indicate the correct response. The multiple-choice format limits the number of options the student has in replying. This can in many ways make determining the answer much easier than with short answer items. No matter what, the student has a 25% chance of being correct in our example.

Since multiple-choice is used, it is important to remember that all the strengths and weaknesses of multiple-choice items apply. This can be good or bad depending on where your students are in their listening ability.
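One of those weaknesses is guessing. The 25% figure for a single four-option item can be extended to a whole test with some basic probability. The Python sketch below assumes a student who guesses blindly on every item and that every item has the same number of options. The function names `chance_score` and `prob_at_least` and the 20-item test are hypothetical.

```python
from math import comb


def chance_score(num_items, num_options):
    """Expected number of correct answers from blind guessing alone."""
    return num_items / num_options


def prob_at_least(k, num_items, num_options):
    """Probability of getting k or more items right purely by guessing."""
    p = 1 / num_options
    return sum(
        comb(num_items, j) * p**j * (1 - p) ** (num_items - j)
        for j in range(k, num_items + 1)
    )


# A hypothetical 20-item test with four options per item.
print(chance_score(20, 4))       # 5.0
print(prob_at_least(10, 20, 4))  # chance of reaching 50% by guessing alone
```

A student who scores close to the chance score has not necessarily understood the recording, which is worth keeping in mind when interpreting low results.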

Conclusion

Responsive listening assessment allows a student to supply an answer to a question that is derived from what they were listening to. This is in many ways a practical way to assess an individual’s basic understanding of a conversation.

Intensive Listening and ESL

Intensive listening is listening for the elements (phonemes, intonation, etc.) in words and sentences. This form of listening is often assessed in an ESL setting as a way to measure an individual’s phonological awareness, morphological awareness, and ability to paraphrase. In this post, we will look at these three forms of assessment with examples.

Phonological Elements

Phonological elements include phonemic consonant and phonemic vowel pairs. A phonemic consonant pair item tests whether the listener can tell two consonants apart. Below is an example of what an ESL student would hear, followed by potential choices they may have on a multiple-choice test.

Recording: He’s from Thailand

Choices:
(a) He’s from Thailand
(b) She’s from Thailand

The answer is clearly (a). The confusion comes from the added ‘s’ in choice (b). If someone is not listening carefully, they could make a mistake. Below is an example of phonemic pairs involving vowels.

Recording: The girl is leaving?

Choices:
(a) The girl is leaving?
(b) The girl is living?

Again, if someone is not listening carefully they will miss the small change in the vowel.

Morphological Elements

Morphological elements follow the same approach as phonological elements. You can manipulate endings, stress patterns, or play with words.  Below is an example of ending manipulation.

Recording: I smiled a lot.

Choices:
(a) I smiled a lot.
(b) I smile a lot.

A sharp listener needs to hear the ‘d’ sound at the end of the word ‘smiled’, which can be challenging for an ESL student. Below is an example of a stress pattern.

Recording: My friend doesn’t smoke.

Choices:
(a) My friend doesn’t smoke.
(b) My friend does smoke.

The contraction in the example carries the stress pattern the listener needs to hear. Below is an example of a play on words.

Recording: wine

Choices:
(a) wine
(b) vine

This is especially tricky for languages that do not have both a ‘v’ and ‘w’ sound, such as the Thai language.

Paraphrase Recognition

Paraphrase recognition involves listening to an example and being able to reword it in an appropriate manner. This involves not only listening but also vocabulary selection and summarizing skills. Below is one example of sentence paraphrasing.

Recording: My name is James. I come from California

Choices:
(a) James is Californian
(b) James loves California

This is trickier because both can be true. However, the goal is to try and rephrase what was heard. Another form of paraphrasing is dialogue paraphrasing, as shown below.

Recording: 

Man: My name is Thomas. What is your name?
Woman: My name is Janet. Nice to meet you. Are you from Africa?
Man: No, I am an American.

Choices:
(a) Thomas is from America
(b) Thomas is African

You can see the slight rephrase that is wrong with choice (b). This requires the student to listen to slightly longer audio while still having to rephrase it appropriately.

Conclusion

Intensive listening involves listening for the little details of a piece of audio. This is a skill that provides a foundation for much more complex levels of listening.

Critical Language Testing

Critical language testing (CLT) is a philosophical approach that states that there is widespread bias in language testing. This view is derived from critical pedagogy, which views education as a process manipulated by those in power.

There are many criticisms that CLT has of language testing such as the following.

  • Tests are deeply influenced by the culture of the test makers
  • There is a political dimension to tests
  • Tests should provide various modes of performance because of the diversity in how students learn.

Testing and Culture

CLT claims that tests are influenced by the culture of the test-makers. This puts people from other cultures at a disadvantage when taking the test.

An example of bias would be a reading comprehension test that uses a reading passage that reflects a middle class, white family. For many people, such an experience is unknown. When they try to answer the questions, they lack the contextual knowledge of someone who is familiar with this kind of situation, and this puts outsiders at a disadvantage.

Although the complaint is valid, there is little that can be done to rectify it. There is no single culture that everyone is familiar with. The best that can be done is to try to use diverse examples for a diverse audience.

Politics and Testing

Politics and testing is closely related to the prior topic of culture. CLT claims that testing can be used to support the agenda of those who made the test. For example, those in power can make a test that those who are not in power cannot pass. This allows those in power to maintain their hegemony.

An example of this would be the literacy test that African Americans were required to pass in order to vote. Since most African Americans could not read, they were legally denied the right to vote. This is language testing being used to suppress a minority group.

Various Modes of Assessment

CLT also claims that there should be various modes of assessing. This critique comes from the known fact that not all students do well in traditional testing modes. Furthermore, it is also well-documented that students have multiple intelligences.

It is hard to refute the claim for diverse testing methods. The primary problem is the practicality of such a request. Various assessment methods are not only normally impractical, but they also affect the validity of the assessment. Again, most of the time testing works and it is hard to make exceptions.

Conclusion

CLT provides an important perspective on the use of assessment in language teaching. These concerns should be in the minds of test makers as they try to continue to improve how they develop assessments. This holds true even if the concerns of CLT cannot be addressed.

 

Developing Standardized Tests

For better or worse, standardized testing is a part of the educational experience of most students and teachers. The purpose here is not to attack or defend their use. Instead, in this post, we will look at how standardized tests are developed.

There are five main steps in developing a standardized test. These steps are

  1. Determine the goals
  2. Develop the specifications
  3. Create and evaluate test items
  4. Determine scoring and reporting
  5. Continue further development

Determining Goals

The goals of a standardized test are similar to the purpose statement of a research paper in that they determine the scope of the test. By scope, it is meant what the test will and perhaps will not do. This is important in terms of setting the direction for the rest of the project.

For example, the purpose of the TOEFL is to evaluate English proficiency. This means that the TOEFL does not deal with science, math, or other subjects. This may seem obvious to many, but stating the purpose makes it clear what the TOEFL is about.

Develop the Specifications

Specifications have to do with the structure of the test. For example, a test can have multiple-choice, short answer, essay, fill in the blank, etc. The structure of the test needs to be determined in order to decide what types of items to create.

Most standardized tests are primarily multiple-choice. This is due to the scale on which the tests are given. However, some language tests now include a writing component as well.

Create Test Items

Once the structure is set, it is necessary to develop the actual items for the test. This relies heavily on item response theory (IRT) and the use of statistics. There is also a need to ensure that the items measure the actual constructs of the subject domain.

For example, the TOEFL must be sure that it is really measuring language skills. This is done through consulting experts as well as statistical analysis to confirm that the items are measuring English proficiency. The items come from a bank and are tested and retested.

Determine Scoring and Reporting

The scoring and reporting need to be considered. How many points is each item worth? What is the weight of one section of the test? Is the test norm-referenced or criterion-referenced? How many people will mark each test? These are some of the questions to consider.

The scoring and reporting matter a great deal because the scores can affect a person’s life significantly. Therefore, this aspect of standardized testing is treated with great care.

Further Development

A completed standardized test needs to be continuously reevaluated. Ideas and theories in a body of knowledge change frequently and this needs to be taken into account as the test goes forward.

For example, the SAT has over the years changed the point values of its test as well as added a writing component. This was done in reaction to concerns about the test.

Conclusion

The concepts behind developing standardized tests can be useful even for teachers making their own assessments. There is no need to follow this process as rigorously. However, familiarity with this strict format can help guide assessment development for many different situations.

Item Indices for Multiple Choice Questions

Many teachers use multiple choice questions to assess students’ knowledge in a subject matter. This is especially true if the class is large and marking essays would prove to be impractical.

Even if best practices are used in making multiple choice exams, it can still be difficult to know if the questions are doing the work they are supposed to. Fortunately, there are several quantitative measures that can be used to assess the quality of a multiple choice question.

This post will look at three ways that you can determine the quality of your multiple choice questions using quantitative means. These three items are

  • Item facility
  • Item discrimination
  • Distractor efficiency

Item Facility

Item facility measures the difficulty of a particular question. This is determined by the following formula

Item facility = (Number of students who answered the item correctly) / (Total number of students who answered the item)

This formula simply calculates the percentage of students who answered the question correctly. There is no boundary for a good or bad item facility score. Your goal should be to try and separate the high ability from the low ability students in your class with challenging items with a low item facility score. In addition, there should be several easier items with a high item facility score for the weaker students to support them as well as serve as warmups for the stronger students.
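The item facility formula can be sketched in a few lines of Python. This is only a minimal illustration: the function name `item_facility`, the use of `None` for a skipped item, and the sample responses are all assumptions made for the example and do not come from any particular grading tool.

```python
def item_facility(responses, correct_answer):
    """Proportion of students who answered the item correctly.

    responses: one entry per student; None means the student skipped the item
    (an assumption of this sketch), so it is left out of the denominator.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no student answered this item")
    correct = sum(1 for r in answered if r == correct_answer)
    return correct / len(answered)


# Hypothetical answers from ten students to one question whose key is "B"
responses = ["B", "B", "A", "B", "C", "B", None, "D", "B", "B"]
print(round(item_facility(responses, "B"), 2))
```

A value close to 1 marks an easy item and a value close to 0 marks a hard one.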

Item Discrimination

Item discrimination measures a question’s ability to separate the strong students from the weak ones.

Item discrimination = (# items correct of strong group - # items correct of weak group) / (1/2 x total of two groups)

The first thing that needs to be done in order to calculate the item discrimination is to divide the class into three groups by rank. The top 1/3 is the strong group, the middle third is the average group and the bottom 1/3 is the weak group. The middle group is removed and you use the data on the strong and the weak to determine the item discrimination.

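The results of the item discrimination normally range from zero (no discrimination) to 1 (perfect discrimination). A negative value is also possible when the weak group answers the item correctly more often than the strong group, which usually signals a flawed item. There are no hard cutoff points for item discrimination. However, items with values near zero are generally removed, while a range of values above that is expected on an exam.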
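The grouping and the formula above can be sketched as follows. This is a rough illustration under stated assumptions: each student is represented by a hypothetical pair of (total exam score, whether this item was answered correctly), and the class is split into thirds by total score.

```python
def item_discrimination(students):
    """students: list of (total_score, answered_item_correctly) pairs."""
    # Rank the class from highest to lowest total score
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    group_size = len(ranked) // 3
    if group_size == 0:
        raise ValueError("need at least three students")
    strong = ranked[:group_size]   # top third
    weak = ranked[-group_size:]    # bottom third; the middle group is ignored
    strong_correct = sum(1 for _, ok in strong if ok)
    weak_correct = sum(1 for _, ok in weak if ok)
    return (strong_correct - weak_correct) / (0.5 * (len(strong) + len(weak)))


# Hypothetical class of nine students
students = [
    (95, True), (90, True), (88, True),
    (75, True), (70, False), (68, True),
    (55, False), (50, True), (40, False),
]
print(round(item_discrimination(students), 2))
```

Here all three strong students and one of the three weak students answered correctly, so the calculation is (3 - 1) / 3.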

Distractor Efficiency

Distractor efficiency looks at the individual responses that the students select in a multiple choice question. For example, if a multiple choice has four possible answers, there should be a reasonable distribution of students who picked the various possible answers.

The distractor efficiency is tabulated by simply counting which answer students select for each question. Again, there are no hard rules for removal. However, if nobody selected a distractor, it may not be a good one.
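Counting the responses is easy to do by hand for one question but tedious for a whole exam. The sketch below shows one way to tabulate it in Python. The function names and the sample data are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def distractor_counts(responses, options):
    """Count how many students picked each option, including options nobody chose."""
    counts = Counter(responses)
    return {option: counts.get(option, 0) for option in options}


def unused_distractors(responses, options, correct_answer):
    """List the wrong options that no student selected."""
    counts = distractor_counts(responses, options)
    return [o for o in options if o != correct_answer and counts[o] == 0]


# Hypothetical responses to a four-option question whose key is "B"
responses = ["B", "B", "A", "B", "C", "B", "A", "B"]
print(distractor_counts(responses, "ABCD"))
print(unused_distractors(responses, "ABCD", "B"))
```

Any option returned by `unused_distractors` is a candidate for rewriting, since it did not tempt a single student.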

Conclusion

Assessing multiple choice questions becomes much more important as the size of the class grows or the test needs to be reused multiple times in various contexts. The information covered here is only an introduction to the much broader subject of item response theory.

Tips for Developing Tests

Assessment is a critical component of education. One form of assessment  that is commonly used is testing. In this post, we will look at several practical tips for developing tests.

Consider the Practicality

When developing a test, it is important to consider the time constraints, as well as the time it will take to mark the test. For example, essays are a great form of assessment that really encourages critical thinking. However, if the class has 50 students, the practicality of essay tests quickly disappears.

The point is that the context of teaching shapes what is considered practical. What is practical can change from year to year as the teacher adjusts to new students.

Think about the Reliability

Reliability is the consistency of the score that the student earns. This can be affected by the setting of the test as well as the person who marks the test. It is difficult to maintain consistency when marking subjective answers such as short answer and/or essay questions. However, it is important that this is still done.

Consider Validity

Validity in this context has to do with whether the test covers objectives that were addressed in the actual teaching. Assessing this is subjective but needs to be considered. What is taught is what should be on the test. This is easier said than done, as poor planning can lead to severely poor testing.

The students also need to be somewhat convinced that the testing is appropriate. If not, it can lead to problems and complaints. Furthermore, a test that is invalid from the students’ perspective can lead to cheating, as the students will cheat in order to survive.

Make it Authentic

Tests should mimic real-world behaviors whenever possible. This enhances relevance and validity for students. One of the main problems with authentic assessments is what to do when it is time to mark them. The real-world behaviors cannot always be reduced to a single letter grade. This concern is closely related to practicality.

Washback

Washback is the experience of learning from an assessment. This normally entails some sort of feedback that the teacher provides the student. This personal attention encourages reflection, which aids in comprehension. Often, it will happen after the testing as the answers are reviewed.

Conclusion

Tests can be improved by keeping in mind the concepts addressed in this post. Teachers and students can have better experiences with testing by maintaining practical assessments that are valid and that provide authentic experiences as well as insights into how to improve.

Washback

Washback is the effect that testing has on teaching and learning. This term is commonly used in language assessment but it is not limited to that field. One of the primary concerns of many teachers is developing assessments that provide washback, that is, assessments that enhance students’ learning and understanding of ideas in a class.

This post will discuss three ways in which washback can be improved in a class. The three ways are…

  • Written feedback on exams
  • Go over the results as a class
  • Meetings with students on exam performance

Written Feedback

Exams or assignments that are highly subjective (e.g., essays) require written feedback in order to provide washback. This means specific, personalized feedback for each student. This is a daunting task for most teachers, especially as classes get larger. However, if your goal is to improve washback, providing written comments is one way to achieve this.

The letter grade or numerical score a student receives on a test does not provide insights into how the student can improve. The reasoning behind what is right or wrong can be provided in the written feedback.

Go Over Answers in Class

Perhaps the most common way to enhance feedback is to go over the test in class. This allows the students to learn what the correct answer is, as well as why it is the correct answer. In addition, students are given time to ask questions and seek clarification of the reasoning behind the teacher’s marking.

If there were common points of confusion, going over the answers in this way allows for the teacher to reteach the confusing concepts. In many ways, the test revealed what was unclear and now the teacher is able to provide support to achieve mastery.

One-on-One Meetings

For highly complex and extremely subjective forms of assessment (e.g., a research paper), one-on-one meetings may be the most appropriate. This requires a more personal touch and a greater deal of time.

During the meeting, students can have their questions addressed and learn what they need to do in order to improve. This is a useful method for assignments that require several rounds of feedback in order to be completed.

Conclusion

Washback, if done properly, can help with motivation, autonomy, and self-confidence of students. What this means is that assessment should not only be used for grades but also to develop learning skills.

Understanding Testing

Testing is standard practice in most educational contexts. A teacher needs a way to determine what level of knowledge the students currently have or have gained through the learning experience. However, identifying what testing is and is not has not always been clear.

In this post, we will look at exactly what testing is. In general, testing is a way of measuring a person’s ability and/or knowledge in a given area of study. Specifically, there are five key characteristics of a test, and they are…

  • Systematic
  • Quantifiable
  • Individualistic
  • Competence
  • Domain specific

Systematic

A test must be well organized and structured. For example, the multiple-choice questions are in one section while the short answers are in a different section. If an essay is required, there is a rubric for grading. Directions for all sections are in the test to explain the expectations to the students.

This is not as easy or as obvious as some may believe. Developing a test takes a great deal of planning for the actual creation of the test.

Quantifiable

Tests are intended to measure something. A test can measure general knowledge, such as a proficiency test of English, or a test can be specific, such as a test that only looks at vocabulary memorization. Either way, it is important for both the student and teacher to know what is being measured.

Another obvious but sometimes overlooked issue for test makers is the reporting of results. How many points each section and even each question is worth is important for students to know when taking a test. This information is also critical for the person who is responsible for grading the tests.

Individualistic 

Tests are primarily designed to assess a student’s individual knowledge/performance. This is a Western concept of the responsibility of a person to have an individual expertise in a field of knowledge.

There are examples of groups working together on tests. However, group work is normally left to projects and not formal modes of assessment such as testing.

Competence

As has already been alluded to, tests assess competence either through the knowledge a person has about a subject or their performance doing something. For example, a vocabulary test assesses knowledge of words, while a speaking test would assess a person’s ability to use words, or their performance.

Generally, a test is either knowledge or performance based. It is possible to blend the two. However, mixing styles raises the complexity not only for the student but also for the person who is responsible for marking the results.

Domain Specific

A test needs to be focused on a specific area of knowledge. A language test is specific to language, as an example. A teacher needs to know in what specific area they are trying to assess students’ knowledge/performance. This is not always easy to define, as there are not only domains but also sub-domains and many other ways to divide up the information in a given course.

Therefore, a teacher needs to identify what students need to know as well as what they should know and assess this information when developing a test. This helps to focus the test on relevant content for the students.

Conclusion

There is an art and a science to testing. There is no simple solution for how to set up tests to help students. However, the five concepts here provide a framework that can help a teacher to get started in developing tests.

Discrete-Point and Integrative Language Testing Methods

Within language testing, at least two major viewpoints on assessment have arisen over time. Originally, the view was that assessing language should look at specific elements of a language, or, in other words, that language assessment should look at discrete aspects of the language.

A reaction to these discrete methods came about with the idea that language is holistic, so testing should be integrative, or address many aspects of language simultaneously. In this post, we will take a closer look at discrete and integrative language testing methods by providing examples of each along with a comparison.

Discrete-Point Testing

Discrete-point testing works on the assumption that language can be reduced to several discrete component “points” and that these “points” can be assessed. Examples of discrete-point test items in language testing include multiple choice, true/false, fill in the blank, and spelling.

What all of these example items have in common is that they usually isolate an aspect of the language from the broader context. For example, a simple spelling test is highly focused on the orthographic characteristics of the language. True/false can be used to assess knowledge of various grammar rules etc.

The primary criticism of discrete-point testing was its discreteness. Many believe that language is holistic and that in the real world students will never have to deal with language in such an isolated way. This led to the development of integrative language testing methods.

Integrative Language Testing Methods

Integrative language testing is based on the unitary trait hypothesis, which states that language is indivisible. This is in complete contrast to discrete-point methods, which support dividing language into specific components. Two common integrative language assessments are the cloze test and dictation.

A cloze test involves taking an authentic reading passage and removing words from it. Which words to remove depends on the test creator. Normally, it is every 6th or 7th word, but the interval could be larger or smaller, or only key vocabulary may be removed. In addition, sometimes a list of potential words is given to the student to select from, and sometimes no list is given.

The student’s job is to look at the context of the entire story to determine which words to write into the blank space.  This is an integrative experience as the students have to consider grammar, vocabulary, context, etc. to complete the assessment.
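The fixed-interval version of a cloze test is mechanical enough to sketch in Python. The function name `make_cloze`, the blank marker, and the sample passage are all assumptions for illustration. The sketch also splits on whitespace only, so any punctuation attached to a removed word disappears with it, which a real test creator would want to handle more carefully.

```python
def make_cloze(passage, n=7, blank="_____"):
    """Replace every nth word of the passage with a blank.

    Returns the gapped text and the list of removed words (the answer key).
    """
    words = passage.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


# Hypothetical passage; a real cloze test would use an authentic text
passage = "The quick brown fox jumps over the lazy dog near the quiet river bank"
text, key = make_cloze(passage, n=7)
print(text)
print(key)
```

The answer key can double as the word list when the test creator chooses to give students potential words to select from.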

Dictation is simply writing down what was heard. This also requires the use of several language skills simultaneously in a realistic context.

Integrative language testing also has faced criticism. For example, discrete-point testing has always shown that people score differently in different language skills and this fact has been replicated in many studies. As such, the exclusive use of integrative language approaches is not supported by most TESOL scholars.

Conclusion

As with many other concepts in education, the best choice between discrete-point and integrative testing is a combination of both. The exclusive use of either will not allow the students to demonstrate mastery of the language.

Providing Quiz Feedback in Moodle

Like all of Moodle’s other features, the quiz module has so many options as to make it difficult to use. In this post, we are going to look at providing feedback to students for their participation in a quiz.

In this post, we are going to use a quiz that was already developed in a prior post as the example.

The first step is to click on “edit settings” to display all of the various options available for the quiz. Once there, you want to scroll down to “review options”. After doing this you will see the following

Screenshot from 2016-09-02 08:09:36.png

As you can see, there are four columns and under each column there are 7 choices. The columns are about the timing of the feedback. Feedback can happen immediately after an attempt, it can happen after the student finishes the quiz while the quiz is still available for others to take, or it can happen after everyone has taken the quiz and the quiz is no longer available.

Which type of timing you pick depends on your goals. If the quiz is for learning and not for assessment perhaps “immediately after the attempt” is best. However, if this is a formal summative assessment it might be better to provide feedback after the quiz is closed.

The options under each column are the same. By clicking on the question mark you can get a better explanation of what it is.

Overall Feedback

One important feedback feature is “Overall Feedback”. This tells the student a general idea of their understanding. You can set it up so that different overall feedback is given based on their score. Below is a screen shot of overall feedback

Screenshot from 2016-09-02 08:44:25.png

In the example, the first boundary is for scores of 100 and above and the second boundary is for scores 1-99. Students who get 100 know they are OK, while students with less than 100 will get different feedback. You have to add boundaries manually. Also, remember to add the percent sign after the number.
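The logic behind grade boundaries can be pictured with a short Python sketch. This is not Moodle’s own code. The function `overall_feedback`, the band structure, and the feedback messages are all hypothetical, and the two boundaries simply mirror the example described above.

```python
def overall_feedback(percent, bands):
    """Return the message of the highest band whose lower boundary the score reaches.

    bands: list of (lower_boundary_percent, message) pairs (hypothetical structure).
    """
    for lower_boundary, message in sorted(bands, reverse=True):
        if percent >= lower_boundary:
            return message
    return ""  # the score falls below every boundary, so no feedback is shown


# Two bands mirroring the example: 100 and above, and 1-99
bands = [
    (100, "Great work. You have mastered this material."),
    (1, "Please review the material and try again."),
]
print(overall_feedback(100, bands))
print(overall_feedback(72, bands))
```

Adding a third band, for instance one starting at 70, only requires one more pair in the list.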

General Feedback and Specific Feedback for a Question

General feedback for a question is the feedback a person gets regardless of their answer. To find this option you need to either create a question or edit a question.

Specific feedback depends on the answer they pick. Below is a visual of both general and specific feedback.

Screenshot from 2016-09-02 08:31:17.png

Below is an example of the feedback a student would get taking the example quiz in this post. In the picture below, the student got the question wrong and received the feedback for an incorrect response.

Screenshot from 2016-09-02 08:46:48.png

Conclusion

The quiz module is a great way to achieve many different forms of assessment online. Whether the assessment is formative or summative the quiz module is one option. However, due to the complex nature of Moodle it is important that a teacher knows exactly what they want before attempting to use the quiz module.

Creating a Quiz in Moodle

In this post, we will look at how to setup a quiz through importing questions from the question bank. Quizzes can serve many different functions within Moodle depending on the goals and objectives of the instructor.

After logging into Moodle and selecting a class that you are a teacher in, you need to click on “activity and resources” and click on “quiz”. You should see the following screen.

Screenshot from 2016-08-29 10:24:44.png

Give your quiz a name. Below there are many different options that are very confusing for people new to Moodle. Below are some brief explanations.

  • Timing is how long the quiz lasts as well as when it is available.
  • Grading allows you to determine what category to place the assessment as well as how many times the student can take it.
  • Layout is important as it determines how the quiz is displayed. It is usually best to have one question per page because if the computer freezes the student will only lose the information of the current question as the others were saved.
  • Question behavior refers to the action of the questions. The answers can be shuffled and/or the feedback can be adjusted as well.
  • Review options determines what feedback students see immediately after an attempt, later while the quiz is still open, and after the quiz is closed.
  • Appearance allows you to see the student’s profile picture during the exam if the exam is proctored.
  • Extra restrictions allows you to set a password or limit the IP addresses that can access the quiz.
  • Overall feedback allows you to share with the students a general idea of how well they did based on their score.

Obviously the options are staggeringly confusing. Before trying to make a quiz it is always important to determine exactly what you want the students to do and the role the assignment plays in achieving this. For the example in this post, we want to make a quiz that assesses the students’ understanding of some content. As such, here are the options used in Moodle to achieve this.

  • Timing: 10 minutes, pick date to open and close the quiz
  • Layout: New page every question
  • Question behavior: shuffle questions and deferred feedback
  • Review options: Clear the following
    • All under “immediately after the attempt”
    • All under “later while the quiz is still open”
    • We don’t want students to see the results until the quiz is closed
  • Extra restrictions: None
  • Overall feedback: none

Once the settings are determined, you click “save and display” and you will see the following.

Screenshot from 2016-08-29 10:49:05.png

Now click “edit quiz” and you will see the image below

Screenshot from 2016-08-29 10:50:25.png

We will now add questions. The questions we will add were created in a prior post. To do this click “add” and select “from question bank”. From there, select as many questions as you want and click “add questions to the quiz.” You will see the following

Screenshot from 2016-08-29 10:52:51.png

In a future post, we will learn about providing feedback  for quizzes.

Conclusion

Quizzes provide a way for teachers to determine the progress of their students. This post provided some basic insights into setting up a quiz in Moodle.

 

Making Quiz Questions in Moodle

One of Moodle’s many features is the quiz activity, which allows a teacher to assess a student’s knowledge in a variety of ways. However, before developing a quiz, a teacher needs to have questions developed and ready to be incorporated into the quiz.

The purpose of this post is to explain how to develop questions that are available to be used in a quiz.

Make a Category

When making questions it is important to be organized, and this involves making categories in which to put your questions. To do this you need to click on course administrator|question bank|Categories. After doing this you will see something similar to the image below.

Screenshot from 2016-08-24 09:40:53.png

You want to click add category and type a name for your category. In the picture below we named the category “example”. When you are finished click “add category” and you will see the following.

Screenshot from 2016-08-24 09:43:36.png

Finding the Question Bank

Now that we have a question category we need to go to the question bank. To do so click on  course administrator|question bank. You should see something similar to the following.

Screenshot from 2016-08-24 09:36:40.png

Select the “example” category you made and then click “create new question.” You should see the following.

Screenshot from 2016-08-24 09:50:39.png

As you can see, there are many different forms of questions available. The type of questions you should ask depends on many factors. For now, we will make a true or false example question. Once you select the option for T/F question you will see the following.

Screenshot from 2016-08-24 09:54:15.png

The question name is for identifying the question in the bank and not on the quiz. Therefore, avoid calling your questions “question 1, 2, 3 etc.” because if you have multiple quizzes you will not know which question one to take from your bank. You need to develop some sort of cataloging system for your questions such as the following

1 Q1 TF 2016

This means the following

  • 1 means this is number 1
  • Q1 means this is quiz 1
  • TF means the question is true false
  • 2016 is the year the question was developed

How you do this is your own decision and this is just an example.
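If you keep a list of questions outside Moodle, a naming scheme like this can be generated and read back with a few lines of Python. The helper names `question_name` and `parse_question_name` are hypothetical, and the sketch assumes exactly the four-part pattern shown above.

```python
def question_name(number, quiz, question_type, year):
    """Build a name such as '1 Q1 TF 2016' from its four parts."""
    return f"{number} Q{quiz} {question_type} {year}"


def parse_question_name(name):
    """Split a name built by question_name back into its parts."""
    number, quiz, question_type, year = name.split()
    return {
        "number": int(number),
        "quiz": int(quiz[1:]),   # drop the leading "Q"
        "type": question_type,
        "year": int(year),
    }


name = question_name(1, 1, "TF", 2016)
print(name)
print(parse_question_name(name))
```

Whatever pattern you choose, building the names consistently makes it easier to search and sort the question bank later.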

The other boxes on this page are self-explanatory. General feedback is what the student receives whether they are right or wrong. The other feedback is given depending on the response. After writing the question and selecting whether the correct answer is true or false, you will see the following.

Screenshot from 2016-08-24 10:10:01.png

In a future post, we will learn how to take questions from the question bank and incorporate them into an actual quiz.

Distributed Practice: A Key Learning Technique

A key concept in teaching and learning is the idea of distributed practice. Distributed practice is a process in which the teacher deliberately arranges for their students to practice a skill or use knowledge in many learning sessions that are short in length and distributed over time.

The purpose behind employing distributed practice is to allow for the reinforcement of the material in the student’s mind through experiencing the content several times. In this post, we will look at pros and cons of distributed practice as well as practical applications of this teaching technique.

Pros and Cons

Distributed practice helps to maintain student motivation through requiring short spans of attention and motivation. For most students, it is difficult to study anything for long periods of time. Through constant review and exposure, students become familiar with the content.

Another benefit is the prevention of mental and physical fatigue. This is related to the first point. Fatigue interferes with information processing. Therefore, a strategy that reduces fatigue can help in students’ learning new material.

However, there are times when short intense sessions are not enough to achieve mastery. Project learning may be one example. Completing a project often requires several long stretches of work on tasks that are not conducive to distributed practice.

Application Examples

When using distributed practice it is important to remember to keep the length of the practice short. This maintains motivation. In addition, the time between sessions should initially be short as well and lengthen as mastery develops. If the practice sessions are too far apart, students will forget.

Lastly, the skill should be practiced over and over for a long period of time. How long depends on the circumstances. The point is that distributed practice takes a commitment to returning to a concept the students need to master over a long stretch of time.
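The idea of starting with short gaps and lengthening them over time can be illustrated with a small Python sketch. The doubling rule, the one-day starting gap, and the function name `practice_schedule` are all assumptions chosen for the example, not a recommendation from research. The right spacing depends on the students and the content.

```python
from datetime import date, timedelta


def practice_schedule(start, sessions, first_gap_days=1, growth=2):
    """Return session dates whose gaps start short and lengthen each time."""
    dates = [start]
    gap = first_gap_days
    for _ in range(sessions - 1):
        dates.append(dates[-1] + timedelta(days=gap))
        gap *= growth  # each gap is longer than the one before it
    return dates


# Hypothetical plan: five short review sessions starting on 1 September 2016
for session_date in practice_schedule(date(2016, 9, 1), sessions=5):
    print(session_date.isoformat())
```

With these settings the gaps between sessions are 1, 2, 4, and 8 days.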

One of the most practical examples of distributed practice may be in any curriculum that employs a spiral approach. A spiral curriculum is one in which key ideas are visited over and over through a year or even over several years of curriculum.

For our purposes, distributed practice is perhaps a spiral approach employed within a unit plan or over the course of a semester. This can be done in many ways, such as

  • The use of study guides to prepare for quizzes
  • Class discussion
  • Student presentations of key ideas
  • Collaborative project

The primary goal should be to employ several different activities that require students to return to the same material from different perspectives.

Conclusions

Distributed practice is a key teaching technique that many teachers employ even if they are not familiar with the term. Students cannot master an idea or skill by seeing it once. They must be exposed to it several times in order to develop mastery. As such, understanding how to distribute practice is important for student learning.

Marking an Assignment in Moodle

As with all the features in Moodle, there are many different ways to mark an assignment. In this post we will explain several different approaches that can be taken to marking an assignment in Moodle. For information on setting up an assignment see the post on how to do this.

Below is a screen shot of a demo class for this post. To begin marking an assignment, you need to click on the assignment while in the role of a teacher.

Screenshot from 2016-07-27 13:58:52.png

After clicking on the assignment you will see a summary page that indicates

  • the number of students
  • how many assignments have been submitted
  • the number of assignments that need to be graded
  • the due date
  • how much time remains before the assignment is late

Underneath all this information is a link for viewing submissions and you need to click on this. Below is a visual of this.

Screenshot from 2016-07-27 14:02:55.png

On the next page there is a lot of information. For “grading action” we don’t want to change this option for now. The next section has the names of the students who have submitted the assignment. The “grade” box allows you to submit a numerical grade for the assignment. The “online text” box is only available if you want the students to type a response into Moodle. The “file submission” link allows you to download any attachments the students uploaded. If any comments have been made by the student or someone else you can see those in the “comments” section. The “feedback comments” allows you to inform the student privately how they did on the assignment.

The other options are self-explanatory. Please note that this example uses the quick grading option which is useful if you are the only one marking assignments in the class. Below is a visual of this page.

Screenshot from 2016-07-27 14:05:50.png

Once you put in a score and add feedback (feedback is optional), you must click on “save all quick grading changes”. The student now has a grade with feedback on the assignment. As the teacher, you can view the student’s overall grade by going to the “grading action” drop down menu and clicking on “View gradebook”. You will see the following.

Screenshot from 2016-07-27 14:20:19.png

You can also change grades here by clicking on the assignment. This will take you to the “grading summary page”, which is the second screenshot in this post. If you click on the pencil you can override an existing grade. It will take you to the following screen.

Screenshot from 2016-07-27 14:22:47.png

Click on override and you can change the grade or feedback. Click on exclude and the assignment will not be a part of the final grade.

Conclusion

In this post we explored some of the options for grading assignments in Moodle. This is not an inherently technical task, but you should be aware of the different ways that it can be done to avoid becoming confused when trying to use Moodle.

 

Adding Categories and Graded Items in Moodle

In this post, we are going to take a closer look at setting up the gradebook in Moodle. In particular we are going to learn how to set up categories and graded items in it. For many, the gradebook in Moodle is very confusing and hard to understand. However, with some basic explanation the gradebook can become understandable and actually highly valuable.

Finding the Setup Page

After logging into Moodle and selecting a course in which you are the teacher, you need to do the following.

  1. Go to the administration block and click on “grades”
  2. Next click on the “setup” tab. You should see the following

33.jpg

The folder “ENGL 5000 Experimental course” is the name of the class that I am using. Your folder should have the name of your class in this place. When you create categories and grade items they should all be inside this folder.

Making Categories

It makes sense to create categories first so that we have a place to put various graded items. How you set up the categories is up to you. One thing to keep in mind is that you can create sub-categories, sub-sub categories, etc. This can get really confusing, so it is suggested that you only make main categories for simplicity’s sake unless there is a compelling reason not to do this. In this example, we will create 4 main categories and they are

  • Classwork (35% of grade)
  • Quizzes (20% of grade)
  • Tests (20% of grade)
  • Final (25% of grade)

To make a category click on “Add category” and you will see the following.

12.jpg

  1. Give the category the name “Classwork”
  2. Aggregation is confusing for people who are not familiar with statistics. There are different ways in which grades can be calculated in a category. Below is an explanation of the 2 that are most commonly used.
    • Mean of grades-This aggregation calculates the mean of the graded items. All items have the same weight.
    • Simple weighted mean-For this aggregation, the more points an item is worth, the more influence it has in the calculation of the grade for the category.
  3. Set your aggregation to “mean of grades”
  4. Click on “category total”
  5. The grade type should be set to “value”. This means that it is worth points.
  6. The maximum grade should be set to 35. Remember, our classwork category is worth 35%, so we want the category to be worth 35 points and the entire class to be worth 100 points. Moodle is able to standardize the data so that everything fits accordingly.
  7. Click on “save changes”
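To make the difference between the two aggregations concrete, here is a minimal Python sketch. It is only an illustration of the arithmetic as described above, not Moodle's actual code, and the function names and the sample classwork items are made up for this example.

```python
def mean_of_grades(items):
    """Average of each item's percentage; every item counts equally."""
    percentages = [earned / maximum for earned, maximum in items]
    return sum(percentages) / len(percentages)


def simple_weighted_mean(items):
    """Total points earned over total points possible; bigger items count more."""
    total_earned = sum(earned for earned, _ in items)
    total_possible = sum(maximum for _, maximum in items)
    return total_earned / total_possible


# Hypothetical classwork items as (points earned, points possible)
classwork = [(10, 10), (25, 50)]

print(round(mean_of_grades(classwork) * 100, 1))        # 75.0
print(round(simple_weighted_mean(classwork) * 100, 1))  # 58.3
```

The same two scores give a different category grade depending on the aggregation, because the 50-point item pulls the simple weighted mean toward its lower percentage.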

Repeat what we did for the “classwork” category for each of the other categories in the example. Below are screenshots of the categories

QUIZZES Category

12.jpg

TEST CATEGORY

12.jpg

FINAL CATEGORY

12

If everything went well you should see the following on the setup page.

12.jpg

Notice how the class is now worth 100 points. You can make your categories worth whatever you want. However, it becomes difficult to interpret the scores when you do anything else. As educators, we are already used to a 100 point system, so you may as well use that in Moodle even though you have the flexibility to make it whatever you want.

There is one more step we need to take in order to make sure the gradebook calculates grades correctly. You may have noticed that each of our categories is worth a different number of points. Therefore, we must tell Moodle to weigh these categories differently. Otherwise, the results of each category will have the same weight on the overall grade. To fix this problem, do the following.

  1. Find the folder that has the name of your course (for me this is ENGL 5000 Experimental course)
  2. To the right of the folder there is a link called “edit”. Click on this.
  3. Click “edit settings”
  4. You do not need to give this category a name so leave that blank.
  5. For aggregation, change it to “simple weighted mean”
  6. Click “save changes”

You should see the following

12

Notice in the course total that it now says “simple weighted mean of grades”.
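The effect of this setting can be sketched in a few lines of Python. This is an illustration of the idea only, under the assumption that each category's result is expressed as a fraction between 0 and 1; the function names and the sample student are hypothetical, while the category maximums come from the example in this post.

```python
# Category maximums from the example in this post
CATEGORY_MAX = {"Classwork": 35, "Quizzes": 20, "Tests": 20, "Final": 25}


def course_total_weighted(category_percent):
    """Each category contributes in proportion to its maximum grade."""
    earned = sum(category_percent[name] * CATEGORY_MAX[name] for name in CATEGORY_MAX)
    return 100 * earned / sum(CATEGORY_MAX.values())


def course_total_equal(category_percent):
    """Every category counts the same, regardless of its maximum grade."""
    return 100 * sum(category_percent[name] for name in CATEGORY_MAX) / len(CATEGORY_MAX)


# Hypothetical student: fraction of each category earned
student = {"Classwork": 1.00, "Quizzes": 0.50, "Tests": 0.50, "Final": 0.60}

print(round(course_total_weighted(student), 1))  # 70.0
print(round(course_total_equal(student), 1))     # 65.0
```

With equal weighting, the strong classwork result counts no more than a 20-point category, which is why the two totals differ.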

For adding graded items, you do the following

  1. Click on “add graded item”
  2. Give it a name (I will call mine quiz 1)
  3. Determine how many points it is worth (for me 10 points)
  4. Scroll to the bottom and you will see a drop down tab called “grade category”
  5. Pick the category you want the graded item to be in.

Below is an example of a quiz I put in the quiz category. This is what the setup page should look like if this is done correctly

12.jpg

As you can see, quiz 1 is worth ten points. You may wonder how quiz 1 can be worth 10 points when the entire category is only worth 20. Remember, Moodle uses statistics to condense the score of the quiz to fit within the 20 points of the category.
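The rescaling is simple proportion. A small sketch, assuming quiz 1 is the only item in the quiz category (the function name is made up for illustration):

```python
def scale_to_category(points_earned, points_possible, category_max):
    """Convert an item score into the point range of its category."""
    return points_earned / points_possible * category_max


# A student who earns 8 out of 10 on quiz 1 holds 16 of the category's 20 points
print(scale_to_category(8, 10, 20))  # 16.0
```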

Conclusion

This post exposed you to the basics of setting up categories and graded items in Moodle. The main problem with the gradebook is the flexibility it provides. Without some sort of predefined criteria, it is easy to get confused in using it. However, with the information provided here, you now have a foundation for using the Moodle gradebook.

Direct and Indirect Test Items

In assessment, there are two categories that most test items fall into, which are direct and indirect test items. Direct test items ask the student to complete some sort of authentic action. Indirect test items measure a student’s knowledge about a subject. This post will provide examples of test items that are either direct or indirect items.

Direct Test Items

Direct test items use authentic assessment approaches. Examples in TESOL would include the following…

  • For speaking: Interviews and presentations
  • For writing: Essay questions
  • For reading: Using real reading material and having the student respond to questions verbally and/or in writing
  • For listening: Following oral directions to complete a task

The primary goal of direct test items is to be as much like real-life as possible. Often, direct testing items are integrative, which means that the student has to apply several skills at once. For example, presentations involve more than just speaking but also the writing of the speech, the reading or memorizing of the speech as well as the critical thinking skills to develop the speech.

Indirect Test Items

Indirect test items assess knowledge without authentic application. Below are some common examples of indirect test items.

  • Multiple choice questions
  • Cloze items
  • Paraphrasing
  • Sentence re-ordering

Multiple Choice

Multiple choice questions involve the use of a question followed by several potential answers. It is the job of the student to determine what is the most appropriate answer. One challenge with writing multiple choice questions is the difficulty of writing incorrect choices. For every correct answer, you need several wrong ones. Another problem is that with training, students can learn how to improve their success on multiple choice tests without having a stronger knowledge of the subject matter.

Cloze Items

Cloze items involve giving the student a paragraph or sentence with one or more blanks in it that the student has to complete. One problem with Cloze items is that more than one answer may be acceptable for a blank. This can lead to a great deal of confusion when marking the test.

Paraphrasing

Paraphrasing is strictly for TESOL and involves having the student rewrite a sentence in a slightly different way, as in the example below.

“I’m sorry I did not go to the assembly”

I wish________________________________

In the example above, the student needs to rewrite the sentence in quotes starting with the phrase “I wish.” The challenge is determining if the paraphrase is reasonable, as this is highly subjective.

Sentence Re-Ordering

In this item for TESOL assessment, a student is given a sentence that is out of order and they have to arrange the words so that an understandable sentence is developed. This is one way to assess knowledge of syntax. The challenge is that for complex sentences more than one answer may be possible.

It is important to remember that all indirect items can be integrative or discrete-point. Unlike integrative items, discrete-point items only measure one narrow aspect of knowledge at a time.

Conclusion

A combination of direct and indirect test items would probably best ensure that a teacher is assessing students so that they have success. What mixture of the two to use always depends on the context and needs of the students.

Test Validity

Validity is often seen as a close companion of reliability. Validity is the assessment of the evidence that indicates that an instrument is measuring what it claims to measure. An instrument can be highly reliable (consistent in measuring something) yet lack validity. For example, an instrument may reliably measure motivation but not be valid for measuring income. The problem is that an instrument that measures motivation would not measure income appropriately.

In general, there are several ways to measure validity, which include the following.

  • Content validity
  • Response process validity
  • Criterion-related evidence of validity
  • Consequence testing validity
  • Face validity

Content Validity

Content validity is perhaps the easiest way to assess validity. In this approach, the instrument is given to several experts who assess the appropriateness or validity of the instrument. Based on their feedback, a determination of the validity is made.

Response Process Validity

In this approach, the respondents to an instrument are interviewed to see if they considered the instrument to be valid. Another approach is to compare the responses of different respondents for the same items on the instrument. High validity is determined by the consistency of the responses among the respondents.

Criterion-Related Evidence of Validity

This form of validity involves measuring the same variable with two different instruments. The instruments can be administered over time (predictive validity) or simultaneously (concurrent validity). The results are then analyzed by finding the correlation between the two instruments. The stronger the correlation, the stronger the evidence of validity for both instruments.

Consequence Testing Validity

This form of validity looks at what happened to the environment after an instrument was administered. An example of this would be improved learning due to a test. Since the students are studying harder, it can be inferred that this is due to the test they just experienced.

Face Validity

Face validity is the perception that the students have that a test measures what it is supposed to measure. This form of validity cannot be tested empirically. However, it should not be ignored. Students may dislike assessment but they know if a test is testing what the teacher tried to teach them.

Conclusion 

Validity plays an important role in the development of instruments in quantitative research. Which form of validity to use to assess the instrument depends on the researcher and the context that he or she is facing.

Assessing Reliability

In quantitative research, reliability measures an instrument’s stability and consistency. In simpler terms, reliability is how well an instrument is able to measure something repeatedly. There are several factors that can influence reliability. Some of the factors include unclear questions/statements, poor test administration procedures, and even the participants in the study.

In this post, we will look at different ways that a researcher can assess the reliability of an instrument. In particular, we will look at the following ways of measuring reliability…

  • Test-retest reliability
  • Alternative forms reliability
  • Kuder-Richardson Split Half Test
  • Coefficient Alpha

Test-Retest Reliability

Test-retest reliability assesses the reliability of an instrument by comparing results from several administrations over time. A researcher will administer the instrument at two different times to the same participants. The researcher then analyzes the data and looks for a correlation between the results of the two different administrations of the instrument. In general, a correlation above about 0.6 is considered evidence of reasonable reliability of an instrument.
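As a rough illustration of the calculation, the sketch below computes a Pearson correlation between two administrations of the same instrument. The scores are invented for the example, and the use of Pearson's r in particular is an assumption, since the type of correlation is not specified here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical scores for the same five participants at two points in time
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

r = pearson_r(first_administration, second_administration)
print(round(r, 2))
print("reasonable reliability" if r > 0.6 else "questionable reliability")
```

The same calculation applies whenever two sets of scores from the same participants need to be compared.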

One major drawback of this approach is that giving the same instrument to the same people a second time often influences the results of the second administration. It is important that a researcher is aware of this as it indicates that test-retest reliability is not foolproof.

Alternative Forms Reliability 

Alternative forms reliability involves the use of two different instruments that measure the same thing. The two different instruments are given to the same sample. The data from the two instruments are analyzed by calculating the correlation between them. Again, a correlation around 0.6 or higher is considered as an indication of reliability.

The major problem with this is that it is difficult to find two instruments that really measure the same thing. Often scales may claim to measure the same concept but they may both have different operational definitions of the concept.

Kuder-Richardson Split Half Test

The Kuder-Richardson test involves the reliability of categorical variables. In this approach, an instrument is cut in half and the correlation is found between the two halves of the instrument. This approach looks at internal consistency of the items of an instrument.
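The split-half idea can be sketched as follows. This is a simplified illustration, not the full Kuder-Richardson formula: it assumes items scored 1 (correct) or 0 (incorrect), splits them into odd and even positions (one of several possible ways to cut an instrument in half), and correlates the two half scores. The data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return covariance / spread


def split_half_correlation(responses):
    """Correlate each respondent's score on odd items with their score on even items."""
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


# Hypothetical data: one row per respondent, one column per item
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

print(round(split_half_correlation(responses), 2))
```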

Coefficient Alpha

Another approach that looks at internal consistency is the Coefficient Alpha. This approach involves administering an instrument and analyzing the Cronbach’s alpha. Most statistical programs can calculate this number. Normally, scores above 0.7 indicate adequate reliability. The coefficient alpha can only be used for continuous variables like Likert scales.
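For readers who want to see what a statistical program is doing, here is a minimal sketch of the standard Cronbach's alpha formula using only Python's standard library. The Likert responses are invented, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # regroup the scores by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert responses: 5 respondents, 4 items
likert = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(likert)
print(round(alpha, 2))
print("adequate reliability" if alpha > 0.7 else "inadequate reliability")
```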

Conclusion

Assessing reliability is important when conducting research. The approaches discussed here are among the most common. Which approach is best depends on the circumstances of the study that is being conducted.

Reasons for Testing

Testing is done for many different reasons in various fields such as education, business, and even government. There are many motivations that people have for using evaluation. In this post, we will look at five reasons that testing is done. The five reasons are…

  • For placement
  • For diagnoses
  • For assessing progress
  • For determining proficiency
  • For providing evidence of competency

For Placement

Placement tests serve the purpose of determining at what level a student should be placed. They are often given at the beginning of a student’s learning experience at an institution, often before taking any classes. Normally, the test will consist of specific subject knowledge that a student needs to know in order to have success at a certain level.

For Diagnoses

Diagnostic tests are for identifying weaknesses or learning problems. They are similar to a doctor looking over a patient and trying to diagnose the patient’s health problem. Diagnostic tests help in identifying gaps in knowledge and help a teacher to know what they need to do to help their students.

For Assessing Progress

Progress tests are used to assess how the students are doing in comparison to the goals and objectives of the curriculum. At the university level, these are the mid-terms and final exams that students take. How well the students are able to achieve the objectives of the course is measured by progress tests.

For Determining Proficiency 

Testing for proficiency provides a snapshot of what the student is able to do right now. These tests do not provide a sign of weaknesses like diagnostic tests, nor do they assess progress in comparison to a curriculum like progress tests. Common examples of this type of test are tests that are used to determine admission into a program such as the SAT, MCAT, or GRE.

For Providing Evidence of Proficiency 

Sometimes, people are not satisfied with traditional means of evaluation. They want to see what the student can do by examining the student’s performance over several assignments over the course of a semester. This form of assessment provides a way of having students produce work that demonstrates improvement in the classroom.

One of the most common forms of assessment that provides evidence of proficiency is the portfolio. In this approach, the students collect assignments that they have done over the course of the semester to submit. The teacher is able to see the students’ progress and improvement over time. Such evidence is harder to track through using tests.

Conclusions

How to assess is best left for the teacher to decide. However, teachers need options that they can use when determining how to assess their students. The examples provided here give teachers ideas on what assessments they can use in various situations.

Giving Feedback on Written Work

Marking papers and providing feedback is always a chore. However, nothing seems to be more challenging in teaching than providing feedback for written work. There are so many things that can go wrong when students write. Furthermore, the mistakes made are often totally unique to each student. This makes it challenging to try and solve problems by teaching all the students at once. Feedback for writing must often be tailor-made for each student. Doing this for a small class is doable, but few have the luxury of teaching a handful of students.

Despite the challenge, there are several practical ways to streamline the experience of providing feedback for papers. Some ideas include the following

  • Structuring the response
  • Training the students
  • Understanding your purpose for marking

Structuring the Response

A response to a student should include the following two points

  1. What went well (positive feedback)
  2. What needs improvement (constructive feedback)

The response should be short and sweet, no more than a few sentences. It is not necessary to report every flaw to the student. Rather, point out the major problems and deal with the others later.

If it is too hard to try and explain what went wrong, sometimes providing an example of a rewritten paragraph from the student’s paper is enough to give feedback. The student compares your writing with their own to see what needs to be done.

Training Students

Students need to know what you want. This means that clear communication about expectations saves time on providing feedback. Providing rubrics is one way of lessening a teacher’s workload. Students see the expectations for the grade they want and target those expectations accordingly. The rubric also helps the teacher to be more consistent in marking papers and providing feedback.

Peer-evaluation is another tool for saving time. Students are more likely to think about what they are doing when hearing it from peers. In addition, students can find some of the smaller problems, such as grammar, so that the teacher can focus on shaping the ideas of the paper. Depending on the maturity of the students, it is better to let them look at it before you invest any energy in providing feedback.

What’s Your Purpose

Many teachers will mark papers and try to catch everything every single time. This means that they are looking at the flow of the paragraph and the connection of the main ideas while also trying to catch typos and grammatical mistakes. This approach is often overwhelming and extremely time-consuming. In addition, it is discouraging to students who receive papers that are covered in red.

Another approach is what is called selective marking. Selective marking is when a teacher focuses only on specific issues in a paper. For example, a teacher might only focus on paragraph organization for a first draft and focus on the overall flow of the paper later. With this focus, the teacher and students can handle similar issues at the same time that are much more defined than checking everything at once.

Personally, I believe it is best to focus on macro issues such as paragraph organization and overall consistency first before focusing on grammatical issues. If the ideas are going in the right direction it is easy to spot grammar issues. In addition, if the students know English well, most grammar issues are irritating rather than completely crippling in understanding the thrust of the paper. However, perfect grammar without a thesis is a hopeless paper.

Conclusion 

There is no reason to overwork ourselves in marking papers. Basic adjustments in strategy can lead to students who are provided feedback without a teacher overdoing it.

Dealing with Mistakes and Providing Feedback

Students are in school to learn. We learn most efficiently when we make mistakes. Understanding how students make mistakes and the various types of mistakes that can happen can help teachers to provide feedback.

Julian Edge describes three types of mistakes

  • Slips-miscalculations that students make that they can fix themselves
  • Errors-Mistakes students cannot fix on their own but require assistance
  • Attempts-A student tries but does not yet know how to do it

As teachers, it is the last two with which we are most concerned. Helping students with errors and providing assistance with attempts is critical to the development of student learning.

Assessing Students

Students need to know at least two things whenever they are given feedback

  1. What they did well (positive feedback)
  2. What they need to do in order to improve (constructive feedback)

Positive feedback provides students with an understanding of what they have mastered. Whatever they did correctly are things they do not need to worry about for now. Knowing this helps students to focus on their growth areas.

Constructive feedback indicates to students what they need to work on. It is not enough to tell students what is wrong. A teacher should also provide suggestions on how to deal with the mistakes. The suggestions for improvement become the standard by which the student is judged in the future.

For example, if a student is writing an essay and is struggling with passive voice, the teacher indicates what the problem is. After this, the teacher provides suggestions or even examples of switching from passive to active voice. Whenever the essay is submitted again, the teacher looks for improvement in this particular area of the assignment.

Ways of Giving Feedback

Below are some ways to provide feedback to students

  • Comments-A common method. The teacher writes on the assignment the positive and constructive feedback. This can be used in almost any situation but can be very time-consuming.
  • Grades-This approach is most useful for a summative assessment or when students are submitting something for the final time. The grade indicates the level of mastery that the student has achieved.
  • Self-evaluation-Students judge themselves. This is best done through providing them with a rubric so that they evaluate their performance. Very useful for projects and saves the teacher a great deal of time.
  • Peer-evaluation-Same as above except peers evaluate the student instead of himself or herself.

Mistakes are what students do. It is the teacher’s responsibility to turn mistakes into learning opportunities. This can happen through careful feedback that encourages growth and not discouragement.

Assessing Learning

Assessment is focused on determining a student’s progress as related to academics. In this post, we will examine several types of assessment common in education today. The types we will look at are

  • Direct observation
  • Written responses
  • Oral responses
  • Rating by others

Direct Observation

Direct observation refers to instances in which a teacher watches a student to see if learning has occurred. For example, a parent who has instructed a child in how to tie their shoe will watch the child doing this. When successful, as observed, the parent is assured that learning has occurred. If the child is not successful, the parent knows to provide some form of intervention, such as reteaching, to help the child to have success.

Problems with direct observation include the issue of only being able to focus on what is seen. There is no way of knowing what is going on in the child’s mind. Another challenge is that just because the behavior is not observed does not mean that no learning has happened. Students can understand, at times, without being able to perform.

Written Response

Written response is the assessing of a student’s response in writing. These can take the form of tests, quizzes, homework, and more. The teacher reads the student’s response and determines if there is adequate evidence to indicate that learning has happened. Appropriate answers indicate evidence of learning.

In terms of problems, written responses can be a problem for students who lack writing skills. This is especially true for ESL students. In addition, writing takes substantial thinking skills that some students may not possess.

Oral Responses

Oral responses involve a student responding verbally to a question or sharing their opinion. Again, issues with language can be a barrier along with difficulties with expressing and articulating one’s opinion. Culturally, many parts of the world do not encourage students to express themselves verbally. This puts some students at a disadvantage when this form of assessment is employed.

For teachers leading a discussion, it is often critical that they develop methods for rephrasing student comments as well as strategies for developing thinking skills through the use of questions.

Rating by Others

Rating by others can involve teachers, parents, administrators, peers, etc. These individuals assess the performance of a student and provide feedback. The advantages of this include having multiple perspectives on a student’s progress. Every individual has their own biases, but when several people assess, such threats to validity are reduced.

Problems with rating by others include finding people who have the time to come and watch a particular student. Another issue is training the raters to assess appropriately. As such, though this is an excellent method, it is often difficult to use.

Conclusion

The tools mentioned in this post are intended to help people new to teaching to see different options in assessment. When assessing students, multiple approaches are often the best. They provide a fuller picture of what the student can do. Therefore, when looking to assess students, consider several different approaches to verify that learning has occurred.

Portfolio Assessment

One type of assessment that has been popular for a long time is the portfolio. A portfolio is usually a collection of student work over a period of time. There are five common steps to developing student portfolios. These steps are

  1. Determine the purpose of the portfolio.
  2. Identify evidence of skill mastery to be in the portfolio.
  3. Decide who will develop the portfolio.
  4. Pick evidence to place in portfolio
  5. Create portfolio rubric

1. Determine the Purpose of the Portfolio

The student needs to understand the point of the portfolio experience. This helps in creating relevance for the student as well as enhancing the authenticity of the experience. Common reasons for developing portfolios include the following…

  • assessing progress
  • assigning grade
  • communicating with parents

2. Identify Evidence of Skill Mastery

The teacher and the students need to determine what skills the portfolio will provide evidence for. Common skills that portfolios provide evidence for are the following

  • Complex thinking processes-The use of information such as essays
  • Products-Development of drawings, graphs, and songs
  • Social skills-Evidence of group work

3. Who will Develop the Portfolio

This step has to do with deciding on who will set the course for the overall development of the portfolio. At times, it is the student who has complete authority to determine what to include in a portfolio. At other times, it is the student and the teacher working together. Sometimes, even parents provide input into this process.

4. Pick the Evidence for the Portfolio

The evidence provided must support the skills mentioned in step two. Depending on who has the power to select evidence, they still may need support in determining if the evidence they selected is appropriate. Regardless of the requirement, the student needs a sense of ownership in the portfolio.

5. Develop Portfolio Rubric

The teacher needs to develop a rubric for the purpose of grading the student. The teacher needs to explain what they want to see as well as what the various degrees of quality are.

Conclusion

Portfolios are a useful tool for helping students in assessing their own work. Such a project helps in developing a deeper understanding of what is happening in the classroom. Teachers need to determine for themselves when portfolios are appropriate for their students.

After the Exam: Grading Systems II

In this post, we conclude our discussion on grading systems by looking at less common approaches. There are at least three other approaches to grading. These systems are comparison with aptitude, comparison with effort, and comparison with improvement.

Comparison with Aptitude

In this approach, a student is compared with their own potential. In other words, the teacher grades the student on whether or not the student is reaching their full potential on an assignment as determined by the teacher. For example, if an average student does average work, they get an “A.” However, if an excellent student does average work they get a “C”.  To get an “A”, the excellent student must do excellent work as determined by the teacher.

The advantage of this system is that everyone, regardless of ability, has a chance at earning high grades. However, the disadvantages are serious. The teacher gets to decide what potential a student has. If the teacher is wrong, weak students are pushed too hard, strong students may not be pushed hard enough, and/or vice versa. This grading is also unfair to stronger students as weaker students earn the same grade for inferior work.

Comparison with Effort

This approach does not look at potential as much as it looks at how hard a student works. To receive a higher grade an average student must demonstrate a great deal of effort on a test. For the strong student, if they show little effort on an assessment they will receive a lower grade.

This system has the same advantages and disadvantages of the aptitude system. It is unfair to the stronger students to be held to a different standard in comparison to their peers. Also, it is hard to be objective when determining the amount of effort a student puts forth.

Comparison with Improvement

This system of grading looks at the progress a student makes over time to assign a grade. Students who improve the most will receive the highest grade. Students who show little improvement will not do so well.

This system is more objective than the previous two examples because it relies on data collected over time that is more than a teacher’s impression. However, one significant drawback is the student who does well from the beginning. If a student is strong from the beginning there will be little improvement. Committing to this grading system could hurt high-performing students.
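The drawback for strong students is easy to see with a small calculation. The sketch below assumes improvement is measured as the difference between a first and a last score, which is only one possible way of defining it; the students and scores are invented.

```python
def rank_by_improvement(records):
    """Order students by how many points they gained between first and last score."""
    gains = {name: last - first for name, (first, last) in records.items()}
    return sorted(gains.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical (first score, last score) for a weak starter and a strong starter
records = {"Student A": (50, 80), "Student B": (92, 96)}

print(rank_by_improvement(records))  # [('Student A', 30), ('Student B', 4)]
```

Student B outperforms Student A at every point, yet ranks lower under this system because there was little room left to improve.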

Conclusion

Which system to use depends on the context and needs of your students. The number one rule for grading is to maintain consistency within one assessment, but it is perhaps okay to be flexible from one assignment to the next.

After the Exam: Grading Systems

After the students submit their exams and they have been marked by you, it is time to determine the grades. This can actually be very controversial as there are different grading systems. In this discussion, we will look at two of the most common grading systems and examine their advantages and disadvantages. The grading systems discussed in this blog are comparison with other students and comparison with a standard.

Comparison with Students

Comparison with students is the process of comparing the results of one student with the results of another student. Another term for this is “grading on the curve.” For example, if a test is worth 100 points and the highest score is 85, the total points possible would be reduced to 85. The removal of 15 points raises the grade of all of the students significantly because the standard is the 85 of the highest performing student rather than the absolute value of 100.
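Here is a short sketch of the rescaling described in this example, with the top score treated as the new maximum. The function name and the two lower scores are made up for illustration.

```python
def curve_to_top_score(scores):
    """Re-express every score as a percentage of the highest score in the class."""
    top = max(scores)
    return [round(score / top * 100, 1) for score in scores]


# Hypothetical raw scores on a 100 point test where the best student earned 85
raw_scores = [85, 68, 51]

print(curve_to_top_score(raw_scores))  # [100.0, 80.0, 60.0]
```

A raw 68 becomes an 80 without the student demonstrating anything more.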

Students, particularly the average and low performing ones, love this approach. The reason for this is that they get a boost in their grade without having to demonstrate any further evidence of proficiency in meeting the objectives. Teachers often appreciate this method as well, as it helps students and reduces the pressure of having to fail individuals or give students low grades.

A drawback to this approach is the pressure it places on high-performing students. The good students face pressure to not study as much in order to have a lower grade that benefits the group. Students also have a way of finding out who got the highest score and this can lead to social problems for stronger students.

One way to avoid the pressure on the top student is to specify a percentage of students who will receive a certain grade. For example, the top 10% of students will receive an “A”, the next 10% of students will receive a “B”, and so on. This makes the top performers a group of students rather than an individual. However, student performance becomes categorical rather than continuous, which some may claim is not accurate.
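A rough sketch of this percentage approach is below. Only the “A” and “B” bands come from the example above; lumping everyone else into a single “C or lower” band, the function name, and the class data are assumptions made purely for illustration. Ties are broken by the order in which students are listed.

```python
def grade_by_rank(scores, bands):
    """Assign letters by class rank.

    scores: dict of student name -> score
    bands: list of (cumulative share of class, letter), smallest share first
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades = {}
    for position, name in enumerate(ranked):
        share = (position + 1) / len(ranked)  # fraction of class at or above this rank
        for cutoff, letter in bands:
            if share <= cutoff:
                grades[name] = letter
                break
    return grades


# Top 10% get an A, the next 10% a B; the last band is a placeholder assumption
bands = [(0.10, "A"), (0.20, "B"), (1.00, "C or lower")]

# Hypothetical class of ten students with scores 95, 90, 85, ...
scores = {f"Student {i}": 100 - 5 * i for i in range(1, 11)}

grades = grade_by_rank(scores, bands)
print(grades["Student 1"], grades["Student 2"], grades["Student 3"])  # A B C or lower
```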

A question to ask yourself when determining the appropriateness of “grading on a curve” is the context of the subject. It may be okay for someone with an 85 to get an “A” in philosophy. However, do you want a heart doctor operating on you who earned an “A” by earning an 85 or a heart doctor who earned an “A” by scoring a 100? Sometimes this difference is significant.

Comparison with a Standard

Comparison with a standard is comparing students to specific criteria such as the ABCDF system. Each letter is assigned a percentage out of a hundred and the grade is determined from this. For example, using a traditional grading scale, a student with a “94” would receive an A.

The advantage of this system is the objectivity of the grading (although marking itself can be highly subjective, especially for essay items). Either a student received a 94 or they did not. There is no subjective curve. Those who received a high grade truly earned it while those who received a low grade deserved it.

One problem is that different places can use different scales. For example, an “A” in many US universities is normally 90% and above. However, an “A” in Thai universities is set at only 80%. Both are seen as “excellent” students. This makes comparisons of students difficult. Using the doctor analogy, who do you want to perform heart surgery on you, the 80% “A” doctor or the 90% “A” doctor?
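The scale problem can be illustrated with a small sketch. Only the two “A” cutoffs (90 and 80) come from this post; the remaining cutoffs in each scale, and all of the names in the code, are assumptions added so the example runs.

```python
def letter_grade(percent, cutoffs):
    """Return the first letter whose minimum percentage the score reaches."""
    for minimum, letter in cutoffs:
        if percent >= minimum:
            return letter
    return "F"


# Hypothetical scales; only the "A" cutoffs are taken from the text
SCALE_A_AT_90 = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
SCALE_A_AT_80 = [(80, "A"), (70, "B"), (60, "C"), (50, "D")]

print(letter_grade(94, SCALE_A_AT_90))  # A
print(letter_grade(85, SCALE_A_AT_90))  # B
print(letter_grade(85, SCALE_A_AT_80))  # A
```

The same 85 is a “B” on one scale and an “A” on the other, which is exactly what makes comparing students across institutions difficult.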

Conclusion

In the next post, we will look at lesser known grading systems that will provide alternatives for teachers searching for ways to help their students. If you have any suggestions or ways of dealing with grading, please share this information in the comments.