In November, the second installment of our Capitol Hill Education series featured three experts on testing issues. Surprisingly, there is strong opposition to testing students and holding schools accountable for their scores. Some claim that schools are reduced to teaching only what is on the tests. Others assert that the tests are discriminatory because not every student performs the same. Our panel of education experts addressed these claims in a lively discussion.



Do Standardized Tests Discriminate?

DAVID MURRAY
Director of Research, Statistical Assessment Service





The primary focus of my discussion is the use of the Scholastic Assessment Test, the SAT. I am, by and large, a proponent of the SAT, even acknowledging its potential weaknesses.


We have standardized testing as a function of the fundamental problem of supply and demand. The demand is for very desirable positions in elite universities, and those who hope for them greatly outnumber the available spaces. Therefore, we need to have some mechanism to sort and establish the criteria of eligibility.


So why use a standardized test? In setting entrance criteria, you want the same racetrack, the same expectation for all students to perform. As we know, people come not only with different preparation, skills, and abilities, but the settings for their elementary and secondary training are widely diverse. A standardized mechanism allows all students to be measured against each other.


Standardized tests are not the best possible way, but they are the only workable way of sorting people on a national level with regard to the criterion of greatest consequence to the university.


What drives the critics of standardized tests is the uncomfortable realization that when we apply them to every community in America, we get disparate outcomes. The path of least resistance is to file a lawsuit against the test rather than to fix the schools. There has been an enormous amount of social engineering brought to bear on this standardized test for admissions within the last decade. For example, in 1995 and 1996, the SAT was reconstructed with different scores and calibrations in order to enhance the performance of various demographic groups.


Does the SAT discriminate in American society? The answer is transparent: Of course it does. It is supposed to. But let’s be careful about what discrimination means. The point of the test is discernment, making exact and careful distinctions between performances.


What we mean to ask is, does the SAT discriminate unjustly? Is it unfair in its application, or is it measuring something that we feel is unfair to be measured? If the test is for entrance to a university, an institution of intellectual pursuit, then it is not inherently unfair to measure intellectual performance as the SAT does. Can it be abused? Absolutely. Can it be overstressed at the expense of other qualities and features? Absolutely. But, without this instrument, what have we got?


The alternative measures (portfolios, writing assessments, personality evaluations, and leadership skills) are socially approved criteria of groups that are up, or groups that are down. They are based on political fashion. These alternatives are not only less reliable as predictors of academic performance, they are inherently dangerous. We know, historically, they will be more subjective and more susceptible to abuse.


The SAT is a sort of bulwark. It correlates fairly strongly with first-year college grades. It's better than any other measure we have. It can be a path of social mobility and it brings us information we need to have about how we're doing. I think we eliminate it at our peril.



The National Perspective on Tests

CHESTER E. FINN, JR.

President, Thomas B. Fordham Foundation





I don’t want to talk about testing individuals. I want to talk about testing in its audit function, in its accountability function, with respect to the educational system. Because we are on Capitol Hill, I will focus on testing vis-à-vis the federal government. The biggest national test is known as the National Assessment of Educational Progress, or NAEP.


NAEP has been around for roughly 30 years. It started as a federal grant to an educational organization that began to give a test to a sample of kids around the country at ages 9, 13, and 17. It was later converted to test kids in the fourth, eighth, and twelfth grades. The test is given to a sample of kids around the country, not to see how any individual child, or school, or school district is doing, but rather to see how the country is doing.


Originally, the NAEP did not provide state-by-state data because there was resistance in the education community. They did not want people auditing them from the outside and showing how they were really doing.


Initially, the test wasn’t related to anything resembling a standard of how well the kids should be doing. It was simply offered as descriptive data. For example, the test could tell us that 43 percent of 13-year-olds could solve a particular problem and 57 percent could not.


In 1988, NAEP underwent a substantial overhaul and we are now into the 12th year of what I think could fairly be called the second generation of NAEP. In this second generation, NAEP has seen several important developments. The first is permission to report results state-by-state. A state is not required to participate in NAEP, but most do. Roughly 100 schools in a state must participate to make the state sample statistically valid.


A second important change to NAEP was the decision to run the test by an independent governing board. The governing board also sets standards for NAEP results. In other words, instead of just reporting how many kids got something right, the governing board is able to state how many kids should get it right. More to the point, which items should all kids get right if you want to be deemed at the proficient level as a country or as a state? The governing board set three standards on the NAEP curve (basic, proficient, and advanced) and now reports how many kids in each grade are at each level.


NAEP has begun to be used for a wide variety of things. It has become the principal national instrument tracking educational performance over time. Unlike the SAT, for which students select themselves as participants, NAEP offers a statistical sample of all kids in the country, and is generally believed to be a more reliable barometer of how the country’s doing educationally.


NAEP has also become very important for many states in tracking how they’re doing with their own education reforms. Some states have their own achievement tests, but like to know how they’re doing on NAEP because it functions as a kind of external audit.


Life could have gone on with NAEP in its current form, but about three years ago, President Clinton concluded that NAEP ought to become the basis for something new called the Voluntary National Test (VNT). This was meant to actually provide data at the district level, the school level, and right down to the individual child level. Essentially, the proposal was that the federal government would create, fund, and administer a single national test that would work its way down to the child level. It produced an enormous eruption of opposition from both the left and the right.


The Clinton proposal died, but was reignited during the presidential race. Both Gore and Bush proposed using NAEP in ways that would make it more entangled in federal education policy and programs than it has ever been. The intention is to use NAEP as a form of high-stakes test to monitor whether the federal Title I program is working. This would change the character of NAEP from a low-stakes audit to a form of reward and punishment for states, and might alter behavior with respect to it. I am personally cool toward the idea of NAEP turning into a high-stakes test as opposed to a reliable audit instrument.


Applying Testing to the Classroom

MEGHAN FARNSWORTH
Bradley Fellow, The Heritage Foundation





Almost every single state has some sort of testing that they do as part of accountability measures in education reform. Not all folks like this testing and there’s been a lot of moaning back and forth among teachers, parents, and others saying that students are getting tested too much, that this is too much pressure on the kids and that the stakes are too high.


Not all state tests are created equal, and not all states have the same standards. Any state test, in order to be a good test, should really be measuring how well the students are learning the particular standards they are supposed to be learning.


There are some definite reasons why we need testing, and why students need to be tested. The key issue stems from accountability. We need to know, and teachers need to know, what students should be learning. We also need to know what students aren’t learning and then work on those issues, and also look at teacher quality in relation to those tasks.


Many people don’t like tailoring curricula to state tests because they think teachers don’t have choices on what they’re teaching. I disagree. I think it is absolutely essential that teachers know what they are supposed to be teaching, and are held accountable for teaching those particular topics.


I will use Christopher Columbus as an example because I’ve been in countless classrooms as a teacher, as a state auditor, and as a curriculum specialist where this has come up. I can’t tell you how many lesson plans I’ve seen that cover Christopher Columbus. Kids at every grade level learn about Christopher Columbus. When I ask teachers why they are teaching this particular unit, they almost always respond with, “Because it is my favorite unit.” Children can recite in detail all the specifics about Christopher Columbus, but they don’t necessarily know other information they are supposed to be learning at a particular grade level.


If teachers have an interest in a subject and know a lot about it, they like to teach it. But there needs to be additional motivation to teach certain things. I’d love to give teachers credit for teaching things that are on the standards, but until recently, most weren’t being held accountable to do so.


We also need to take a look at what students aren’t learning. At the Heritage Foundation, we’ve published a book called No Excuses, which looks at high-poverty schools with high performance scores. It is typically assumed that poor students won’t do well on tests. All of the students featured in this book are doing very well, and one of the underlying reasons for that success is the fact that the students are tested on a regular basis.


Test results from students need to be used for staff evaluations and for professional development. Far too often, I’ve seen teachers being given professional development activities that have nothing to do with their actual needs. Their actual needs should be based on how well their students are doing. It should be tied together with student performance and test scores. Hopefully, as states continue to change their testing and improve upon it, we’ll have fewer instances of kids knowing a lot about Christopher Columbus, and not enough about all the other things they need to learn.