12-13-06 Standardized Testing

This Weekly Reader is about standardized testing. Given how controversial a topic testing can be, I thought it might be useful to provide a little information on standardized tests generally and on New York’s tests in particular.

There is no magic to the term “standardized test.” A test is “standardized” when it is designed to be given in the same way every time, usually with the same degree of difficulty. This allows you to fairly compare the results among different groups of test takers. A standardized test is a good way to see how any particular test taker compares to others, or to compare groups of test takers. We use standardized tests in situations where we want to see if the test taker has attained a certain performance standard; in competitive situations, like college admissions; and in situations where we want to see how particular schools, districts or states compare to others.

It is important to differentiate between a test and the purpose for which the test is used. A test is simply a tool, kind of like a scale. I may have a perfectly accurate scale that gives me very useful information about whether I need to lay off the cookies. But it would usually not be considered appropriate to make weight a condition of employment. The same goes for tests. A test may be perfectly valid and informative in measuring the skills it purports to measure, yet the outcome may be used for an irrelevant or inappropriate purpose. Thus, in forming opinions about tests, we should ask two questions: first, is the test a fair tool for measuring what we want to measure, and second, are we using the results in a reasonable way?

Like almost all states, New York requires that students take state proficiency tests. These tests are designed to measure mastery of concepts and material that are embodied in the state’s learning standards. These standards are the official statement of what students in the state should fairly be expected to know and be able to do at particular grade levels. New York’s Board of Regents has adopted a set of 28 learning standards for grades 3-8, for seven subject areas: Mathematics, Science and Technology; English Language Arts; Social Studies; Languages Other than English; Health, Physical Education and Family Consumer Sciences; The Arts; and Career Development and Occupational Studies. You can see the standards here:
http://www.emsc.nysed.gov/3-8/home.html

The standards give rise to a set of core curricula, which add depth to the standards and provide more specificity about exactly what should be taught at each grade level:
http://www.emsc.nysed.gov/ciai/cores.htm

The state’s testing program is “designed to evaluate the implementation of the learning standards at the student, school, district, and state level.” New York’s tests are developed by CTB/McGraw-Hill, one of the largest commercial test developers. (CTB/McGraw-Hill also develops various forms of the TerraNova test, which is one of the most commonly used standardized tests in the country.)

Test developers begin by fashioning a pool of questions that they believe will measure students’ mastery of the learning standards. Typically the tests contain a mix of multiple-choice and what we call “constructed response” (or short answer) questions. Depending on the test, there may also be “extended response” (or long answer) questions. Care is taken to develop multiple-choice questions that measure skills and understanding, rather than memorized answers. Constructed response questions in math often ask students to illustrate the concepts and processes involved in finding the answer.


The questions are reviewed from several perspectives, including readability at grade level; alignment to the learning standards; contexts that are appropriate; and clear and concise language. Based on the recommendations of the review committees, the test questions are accepted, revised, or rejected, and an approved pool of questions is created. These questions are then field tested, in order to determine how they work with real kids. Based on field testing, some items may be discarded. After the questions are field tested, range-finding meetings are held to establish guidelines for scoring each question. Committees of teachers participate in selecting sample papers that exemplify each score point. These anchor papers form the basis of the scoring guide that will be used in scoring the operational tests.

Tests are then “normed” and “scaled.” Norming is simply the process of giving the test to a sufficiently large, representative group of students to see how they do. Scaling converts raw scores into scaled scores that follow a scoring pattern fitting a normal curve. Because scaled scores represent equivalent levels of difficulty, they allow comparisons among different test forms or across years.
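For readers who like to see the arithmetic, here is a minimal sketch of the idea in Python. It is a simplified linear illustration, not the procedure New York or its test developer actually uses. The norming samples, the function name make_scaler, and the target scale (a mean of 650 and a spread of 40) are all made up for the example.

```python
from statistics import mean, pstdev


def make_scaler(norm_raw_scores, scale_mean=650.0, scale_sd=40.0):
    """Build a raw-to-scaled converter from a norming sample.

    Illustrative only: scale_mean and scale_sd are assumed values,
    not the parameters of any real state test.
    """
    norm_mean = mean(norm_raw_scores)
    norm_sd = pstdev(norm_raw_scores)
    if norm_sd == 0:
        raise ValueError("norming sample has no spread")

    def to_scaled(raw):
        # How far is this raw score from the norm group's average,
        # measured in standard deviations?
        z = (raw - norm_mean) / norm_sd
        # Place that distance on the reporting scale.
        return round(scale_mean + scale_sd * z)

    return to_scaled


# Hypothetical norming samples: Form B is a bit harder, so the same
# group of students earns three fewer raw points on it.
form_a_norm = [18, 22, 25, 27, 30, 33, 35, 38]
form_b_norm = [15, 19, 22, 24, 27, 30, 32, 35]

scale_a = make_scaler(form_a_norm)
scale_b = make_scaler(form_b_norm)

# A raw 30 on Form A and a raw 27 on the harder Form B sit at the same
# place in their norm groups, so they receive the same scaled score.
print(scale_a(30) == scale_b(27))
```

The point of the sketch is only that a scaled score describes where a student stands relative to the norm group, which is why it can be compared across forms even when the raw point totals differ.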

New York City Schools administers state tests in English Language Arts and Mathematics in grades 3 through 8, Social Studies in grades 5 and 8, and Science in grades 4 and 8. (The results of the annual English Language Arts and Mathematics tests allow us to see the year-over-year growth in achievement that we talk about when we refer to “value-added” student achievement growth.) There are a number of other tests for specialized situations such as English as a second language or competitive high school admissions.

Another important test taken by New York students is the National Assessment of Educational Progress, often referred to as NAEP or “the Nation’s Report Card.” (A previous Weekly Reader discussed New York City’s outcomes on this test.) NAEP is a “nationally representative and continuing assessment of what America's students know and can do in various subject areas.” Since many states develop their own tests, the NAEP is seen as a way to compare educational outcomes across states and cities. In 2005, ten urban school districts, including New York, participated in the NAEP “Trial Urban District Assessment” in reading, mathematics, and science at grades 4 and 8. New York City compared favorably to other urban districts in many ways. You can see all the results here:
http://nces.ed.gov/nationsreportcard/nrc/tuda_reading_mathematics_2005/t0002.asp?printver=


And now for the fun part: here are a couple of sample questions from the 4th and 8th grade NAEP math tests, along with a link to more, as well as a link to sample questions at the 4th and 8th grade levels for the state mathematics tests. See what you think:


Sample question 4 is a multiple-choice question in the algebra content area. This question asked students to infer a rule and find the next term in a sequence. The terms in this sequence are the squares of consecutive odd numbers.
1, 9, 25, 49, 81,...
4. The same rule is applied to each number in the pattern above. What is the 6th number in the pattern?

A) 40
B) 100
C) 121
D) 144
E) 169
60 percent of eighth-graders answered this question correctly.
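If you would like to check your answer, the rule described above (squares of consecutive odd numbers) can be sketched in a few lines of Python. The function name nth_term is just a label chosen for this illustration.

```python
def nth_term(n):
    """Return the n-th number in the pattern (counting from 1).

    The n-th odd number is 2n - 1, and each term is its square.
    """
    return (2 * n - 1) ** 2


first_five = [nth_term(n) for n in range(1, 6)]
sixth = nth_term(6)

print(first_five)  # [1, 9, 25, 49, 81], matching the pattern in the question
print(sixth)       # 121, which is answer choice C
```

The sixth odd number is 11, so the 6th number in the pattern is 11 × 11 = 121.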
For more NAEP questions, go here:
http://nces.ed.gov/nationsreportcard/nrc/tuda_reading_mathematics_2005/t0026.asp?printver=

For sample state questions, go here:
http://www.emsc.nysed.gov/3-8/math-sample/home.htm