# Assessment items in need of improvement

I’m being nice with that title. I was helping my daughter prepare for an upcoming state assessment (state to remain anonymous) and I was looking over the released items before giving them to her. I was dismayed at a number of items that were written in an attempt to be “real life” problems. Roll the highlight reel:

My daughter is ten years old and actually asked me what a roll of film was. I had to pause a bit and recall that all she knows is a digital camera. Let’s get with the times and at least pick a context that students can relate to. Camera film is so 20th century!

The context of the problem makes no sense at all when you read the question: “Which of the following do all of these numbers have in common?” Why put the numbers on t-shirts? Why not put the numbers on NASCAR race cars and make some money from the corporate sponsors? Why not put the numbers on those little sweaters they put on dogs? Bottom line, what context do the shirts add to this problem? This question is better off putting them in boxes so as to separate them from each other so the kids don’t think the number is 3012906024. The t-shirts add nothing to the problem and I don’t think anything can.

The question asks, “Which design is composed of a prime number of tiles?” Based on the question, I’m guessing that the question is testing whether or not the student knows what a prime number is. But one thing can get in the way of this item: what if the student counts incorrectly? The student could actually know what prime numbers are but if he/she counts incorrectly, then the question misses the mark as it is written. I could argue that design 3 is one square tile with just four little squares drawn into it. If the question wants to test if the student knows their prime numbers, then just write them. Why the tiles? This is like the t-shirt problem above.

Now to be fair, a good majority of the problems are well written and can accurately assess the standards they set out to assess. So why do I go on this rant in a blog about Common Core State Standards? The standards are one step in this wave of educational reform. A lot of teachers are holding their judgment of the CCSS until they see the actual assessment. I sure hope the items they use to measure a student’s mastery of the standards will be well written and will actually match the standard.

## 13 thoughts on “Assessment items in need of improvement”

1. Those questions look like perfect examples of what Dan Meyer calls “pseudo-context.” There is an attempt to make the questions “fit” the real world, but it falls completely flat in all of these cases and does nothing to improve the question.

I’m going to go out on a limb and suggest that if you cannot find a real world application of a particular piece of mathematics, it should not be on an assessment required to graduate from high school.

2. @David Wees…I was thinking the same thing as I was writing this. I think pseudo-context will be on every item writer’s mind, as I’m sure they would not want that label placed on their items.
I don’t think you are going out on a limb with your suggestion. The challenge for any test/item writer is to create an item that assesses the student’s mastery of the intended learning target. As you saw, some things (bias, poor writing, pseudo-context) get in the way of the student hitting that target.
We need to present students with items that allow them to hit that target. For my daughter, I would hope that the assessment items would be rich problems that allow her to demonstrate how she assembles all she learns and presents it in a clear and coherent manner. Filling in bubbles just doesn’t do it for me.

3. Test design, to be effective, has to be culturally neutral, gender non-specific, race non-specific, etc., if you want to exclude bias. The point is that you’re never going to exclude a certain amount of bias; the challenge is to minimize it (Ok, I’m wearing my statistician’s hat again!). I’d much prefer to see a series of ‘graded’ questions to assess depth on a topic rather than just a series of checkmarks against the list of standards.

Of course, that would require students to provide (some) written answers which had to be assessed by people, instead of marked by machines… Ooh, how about the teacher assesses student work against a State standard on a continuous basis, and the student graduates with a portfolio of work showing the levels they achieved across the whole of their schooling…?

If you are going to slap ‘context’ onto the mathematical content, it should at least be correct too… don’t t-shirts in sports usually have the numbers on the back of the shirt, or am I missing something?

I like that you said “State to remain unanimous”, if you mean single-minded and single-voiced in its blinkered attitude towards testing. Maybe a Freudian slip for anonymous!

4. @Colin, you are correct. Bias will be difficult to eliminate on an assessment and minimizing it is the goal of a test writing team.
There are other ways to assess students and their learning. Your suggestion of Portfolio Assessment is a great assessment tool. I used it when I taught the Interactive Mathematics Program (IMP) and it gave me great insight into what the students could do as well as what they struggled with. I learned a lot about my teaching by reading their portfolios over the course of a year.
Yes, the typo is my bad. I meant anonymous. But I’m sure that I could go to any state DOE site, download their released exams and find some items that can use some improvement.
Thanks for the comment! Keep ’em coming, folks!

5. While I don’t know anyone directly working on the assessments for the Common Core Standards, I know someone who knows someone. There are two separate groups working on it, and they’re both computer-based. At least one of them is based on innovative thinking (not just “answer this multiple choice question” but the full range of things a computer is capable of assessing, including to some extent open responses).

Now, given the hearsay nature, the final product could be all different, but since you’re curious I thought I’d throw it out there.

6. I heard about the two groups working on the assessment but I haven’t heard about it being computer based. It will be interesting to see how technology is leveraged to inform us about the students’ thought process as they work on an assessment item. I’m always more interested in what the kids are thinking as they solve problems than in their final answer.

7. I agree that these problems are silly, and all too typical, but I would like to point out that there is something interesting going on in the tile problem: a prime number of tiles could only be grouped into a 1 x P rectangle. Thus, you can eliminate two of the answers without counting.

Maybe this problem began its life as an interesting insight into decomposing composite numbers, and just ended up as a dog with a handkerchief around its neck.

8. @patrick: you beat me to the punch. That was my thought as well: a potentially good problem gone bad because in all likelihood the writers were too busy to realize they had a chance to test an important geometric fact about prime v. composite numbers. Of course, had the assessment been other than multiple choice, there might have been a way to let students observe that fact (in both senses of “observe”), and show that they understood something.

My repeated beef with multiple choice problems is that there’s really no way to know what kids do or don’t understand strictly based on what answer choice they bubble in: is it a clueless wild guess? A case of multiple errors “canceling one another out” so that the correct answer is selected for reasons that actually indicate misunderstanding rather than comprehension? A simple (or not so simple) literacy issue that reflects little or nothing about the student’s understanding or lack thereof of the pertinent mathematics? A case where the wrong answer doesn’t tell us WHAT error the student actually is making (because there are too many mathematical issues packed into one problem)? The list is endless as to what may be completely masked by the “results” of individual answers unless there are other data of a different sort to triangulate with, particularly ASKING the student why s/he put down a given answer or giving her/him a chance to show/explain the thinking that went into arriving at an answer. And that is ignoring the glaringly obvious fact that teachers virtually never are given a student’s individual score, let alone what is actually needed, which is what that student picked on each problem with a chance to speak with the student about why (and in a timely enough manner that the student might actually recall).

There is little doubt that while many item authors mean well, they operate under constraints and with mandates that have little or nothing to do with investigating students’ mathematical abilities (at any given point, not as a fixed amount of “talent”) or thinking, let alone leading to information that teachers can then use to improve instruction and provide kids with specific, constructive feedback.

Whether individual teachers do or would provide such feedback were they not under such pressure from high-stakes standardized multiple-choice dominated tests is another matter. If you read “Inside the Black Box” from a 1998 issue of the KAPPAN, you’ll start to see just how much the vast majority of our current assessment – high-stakes and externally-driven or not – misses the boat by several million miles.

9. I actually like the tile problem. I agree, kids could miscount the number of tiles and get the problem wrong, which might be a shame. But if kids do that, they’re most likely applying a procedural understanding of prime number to the problem (is the number on the list 2, 3, 5, 7, 11, 13, 17, 19, etc.) rather than a conceptual one. And they’re not checking their work.

Kids can be guaranteed of success on the problem if they try to arrange the tiles in arrays other than 1xn arrays. That hinges on their conceptual knowledge of what prime means, and the context helps inspire that conceptual knowledge. In addition, if kids have that conceptual knowledge, then they can generate pretty fool-proof methods for solving the problem that allow them to check their work against their understanding of prime. The worst possible tactic is to count tiles and check that count against the memorized list of prime numbers, and the context of the problem both encourages kids to use a different tactic and penalizes those who don’t.
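The array idea in the comments above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything taken from the released item: the function names and the sample tile counts are hypothetical, since the actual designs are not reproduced here. It lists every rectangle a given number of tiles can form and calls the count prime when the only rectangle is 1 x n.

```python
import math


def rectangle_arrangements(n):
    """Return every (rows, cols) pair with rows <= cols and rows * cols == n."""
    return [(r, n // r) for r in range(1, math.isqrt(n) + 1) if n % r == 0]


def is_prime_by_arrays(n):
    """A tile count is prime when n > 1 and its only rectangle is 1 x n."""
    return n > 1 and rectangle_arrangements(n) == [(1, n)]


# Hypothetical tile counts, chosen only for illustration.
for tiles in (7, 9, 12):
    print(tiles, rectangle_arrangements(tiles), is_prime_by_arrays(tiles))
```

A student who can rearrange a design into any rectangle with more than one row has shown the count is composite without ever consulting a memorized list.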

10. danny says:

When did people say that School Mathematics must equal Real-World Math?
School Math was never meant to be Real Math.

There IS value in School Math, but who ever got the idea to tell students that School = Real? It’ll be 100 years before that mistake gets undone.

11. I believe that what the students study has to be relevant to their world or the world that they will have to contend with when they graduate from high school, college and beyond. One cannot study mathematics (or any discipline) in a vacuum. The reason we have mathematics is because mankind has been solving problems that affect the quality of life. Mathematics, combined with other disciplines, made the Mayan pyramids, the aqueducts of the Roman Empire, and the moon landing possible.
We would be remiss in our mission as educators not to create connections between what they study in school and the real world.

12. john says:

I think the criticism of the questions is a bit over the top. In fact, if the question designers had followed any of your “suggestions” you would’ve also criticized them for providing mere “memorization” problems. Why not provide some actual improvement suggestions rather than just bash away? That’s the real problem with mathematics assessment today. Everyone is so ready to just complain about the other side without honestly trying to reach consensus. At least the question designers were trying to offer some sort of context to the t-shirt problem and I’m sorry, but counting is a pretty important thing to assess. Maybe the question was trying to test BOTH skills: counting and knowledge of prime numbers.
