Frequently Made Objections to Assessment and How to Respond
Alias: http://bit.ly/Assess-FMO


Related links: Diagnosing and Responding to Resistance to Evaluation | Confusors | Evaluation Humor
Return to Flashlight Evaluation Handbook Table of Contents


Usually, when people "resist" evaluation or assessment, it's essential to probe deeply into their reasons, to discuss them, and to use what's learned in designing the study.  This list of "Frequently Made Objections" is designed to help you listen with an informed ear, and to begin such a dialogue.

The objections are written in boldface, while the responses are in bullets below.  For more insights into how to clear the air about assessment, while bringing real disagreements and conflicts into the light, see "'Dangerous Discussions' about Assessment."

 

Studies are a waste of time. Either the findings confirm what people already believe or, if findings contradict beliefs, decision-makers still don't believe the findings. Instead they assume that there was a flaw in the methods. And there is always a flaw in the methods!

  • Unfortunately, this is too often true and it's wise to admit it. That's one reason why, if you have a choice, you should focus your own studies on issues where decision-makers really are uncertain and eager for information to help them make a choice.  For more on this, see "Finding a Great Evaluative Question: The Divining Rod of Emotion."

Evaluation/assessment is something we have to do to satisfy outsiders: administrators, government agencies, accreditors... It does us no good so we ought to do the minimum needed to satisfy them.

  • External demands are sometimes unreasonable and sometimes of no benefit to the program. But these days it's increasingly the case that those external agencies have requirements which, if met proactively, can be of great benefit to the program itself: helping steer risky programs, reduce costs, raise money, etc.  The real question ought to be, "Can we meet these requirements in ways that are well worth the time and money needed to do the assessment/evaluation?"

Evaluation/assessment requirements are all about setting objectives and trying to meet them. That misses some of what's important about what we're trying to do. Worse, if we do that kind of assessment, it would distort what we try to do.

  • There's a lot of justice in this objection, too.  There's a thin line (and sometimes no line) between simplifying learning enough to measure or grade it, and oversimplifying or distorting it. That's as true for grading, or guiding teaching by watching students' faces, as it is for authentic assessment or mass testing.

    Beyond that general risk, there's a specific problem that is rarely articulated.  Many people assume that evaluation and assessment must always begin with some measurable goal(s) that apply to everyone. We call that perspective "uniform impact" because it assumes that each student (or other beneficiary) must be measured in the same, objective way.  But that's not true. There's another perspective (which we call "unique uses") that looks for whether each beneficiary has gained in some important, relevant way. Imagine a teaching center whose goal is to serve two faculty members per day.  On this particular day they've done that. Let's assume that the assessment is based on measuring whether faculty skills have developed in important ways. Let's assume that the center has chosen two such goals: help with lecturing skill and help with skills in moderating student discussion. On this day, one faculty client came wanting to improve her lecturing skill and left with much better lecturing skill, while the other faculty member wanted to improve his skills of moderating a discussion and left much better at that.

    By uniform impact standards, the center failed half the time because each client left with one of the two skills unchanged. By unique uses standards, the center succeeded 100% of the time because both clients got what they each needed.  Unique uses looks for whether important change was achieved, whatever it might be.  This example is over-simplified in one important way: unique uses studies usually don't begin with statements of behavioral goals. Instead they begin by describing a zone of concern (faculty should learn to teach better and feel better about their teaching, somehow). Then each user (or a sample of users) is studied, one at a time, to see whether change has occurred and, if so, what kind. (To some degree, it's analogous to the way that faculty grade essays.) Only after each case has been examined does the assessment effort attempt to identify patterns and make generalizations. (A brief scoring sketch comparing the two perspectives appears below.)

    For more on these two perspectives, and the kinds of rigorous assessment and evaluation for which they each can be used, click here.
    (TLT Group username and password required; to see if your institution is a subscriber and to get the password, look at this list of subscribing institutions and contacts.)
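
    To make the arithmetic of the two perspectives concrete, here is a minimal, hypothetical sketch in Python (the client records, skill names, and scoring rules are invented for illustration and are not part of any Flashlight tool) that scores the same teaching-center day both ways:

        # Hypothetical records for one day at the teaching center: each client
        # names the skill they wanted help with and the skills that actually improved.
        clients = [
            {"wanted": "lecturing", "improved": {"lecturing"}},
            {"wanted": "moderating discussion", "improved": {"moderating discussion"}},
        ]
        center_goals = {"lecturing", "moderating discussion"}

        # Uniform impact: score every client against every stated goal.
        uniform_hits = sum(goal in c["improved"] for c in clients for goal in center_goals)
        uniform_rate = uniform_hits / (len(clients) * len(center_goals))

        # Unique uses: score each client only on the change that mattered to them.
        unique_hits = sum(c["wanted"] in c["improved"] for c in clients)
        unique_rate = unique_hits / len(clients)

        print(f"Uniform impact success rate: {uniform_rate:.0%}")  # 50%
        print(f"Unique uses success rate: {unique_rate:.0%}")      # 100%

    The point of the sketch is not the code but the scoring rule: the same day's work counts as a 50% failure under one denominator and a 100% success under the other.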

I can't really understand what the objection is to assessment (or evaluation). The problem may have something to do with our definitions, but I'm not sure.

  • "Evaluation" and "assessment" have an amazing number of clashing definitions. For some people their definitions are loaded with value - positive or negative. "Assessment is something you do for yourself while evaluation is done to you (often in a bid for power over you)," is how some people define these terms. Meanwhile other people assume that assessment (or evaluation) are inherently virtuous activities, essential for ethical, effective teaching. And so on.   There's nothing wrong with words having more than one definition: the dictionary is full of such words!  But sometimes unnecessary arguments are ignited because people are each using the same word in their discussion without realizing they're defining that word differently. We call such terms "confusors" and here's a list of them, with their clashing definitions. The safest route: define your terms and, if you're in a conversation, ask others to do the same. Then, if the definitions clash, decide whether you're each going to stick with your own definitions (reminding each other from time to time that you're doing so) or whether you're going to all use the same definition during this particular conversation.

Trying to measure X, and then use those findings to make decisions, will just corrupt  the measure and probably corrupt X, too.  One important outcome of college is the friends you make for life.  But if we quizzed and graded students every month on how many friends they'd made, the findings would be increasingly meaningless. And the friendships would be, too.

  • This concern is sometimes called "Campbell's Law," and it's a limit that every evaluator needs to consider. Campbell wrote in 1976, "The more any quantitative social indicator is used for social decisionmaking, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."  Evaluators have been more prone to consider the other side: that continual observation can influence behavior for the better, not just because of the value of the evaluative feedback but also because people sometimes behave better when they know they're being evaluated.  But the opposite, Campbell's concern, is a legitimate worry, and it should be discussed openly.  It may justify revising or even cancelling plans for an evaluation or assessment.

We're just now investing in technology X. We should wait until it's running well before we put time and money into evaluation of its contribution to better educational practice or outcomes.

  • If you're just about to make a new investment in technology, or just did make one, now is an ideal time to begin a series of studies. The value of using technology stems from how it's used, not just whether it's available.  So for example, two classic ways of using technology to improve learning outcomes involve improving student-faculty interaction and active learning. So now is a good time to begin assessing student-faculty interaction and active learning (including the roles played by current technology).  Later, when the new investments are in place, another evaluation can help you discover whether and how much that investment is indeed helping to improve faculty-student interaction and active learning (or whatever other educational activity or outcome is to be aided).

People might say it's virtuous to study educational uses of technology. But no one around here has seen a single study that produced findings useful enough to justify the time and money needed to do the study.

  • That's often true, and a little puzzling because many useful studies have been done. (See for example, the lists of articles and case studies on the Flashlight Web site.) 

It's not possible (or especially important) to do a study of costs.  Everyone will just think it's an excuse to cut jobs or make them work harder.

  • That may be true.  Most institutions would benefit in the long term by doing fewer (but more carefully chosen and designed) studies.  As for cost studies, the chief element of cost in education is the way people use their time.  So a "cost study" is often a study of how people can use their time (and their budgets, and space) in more productive and satisfying ways.  The most successful cost studies are usually done by a group of individuals and units and focus on a process in which they all participate and whose costs no single person understands, e.g., the costs of helping faculty use technology in their courses. Cost studies can be helpful because so many of the costs are ordinarily hidden. If the cost study helps people understand the total activity in which they all have been taking part, including the ways in which that activity may be sapping their energy and budgets, it can be the first step toward redesigning the process so it works better while taking less of a toll on people and budgets.

Our board (or donors, or legislature) don't understand what we're doing with technology, and don't want to. At most they want to know numbers: how many machines do we have, and the like.  No evaluation is necessary.

  • That also may be true. But it's likely that at least some of them would like help in understanding what real benefits the money is producing for students and alumni.  And it can be in the institution's long-term interest to use evaluative findings, including case studies, to help them understand.  Good results can help fuel enthusiasm. Equally important, a track record of using studies to find problems and make improvements can help build confidence that, when you seek the next big infusion of capital, you're not flying blind.

Our faculty will object to evaluation. They know more about good teaching than any course evaluation instrument can tell them. Anyway we already have a course evaluation system.  (Although it can probably be distorted by being a "nice guy" and giving mostly high grades!)

  • Flashlight was designed to help faculty, their departments, and institutions conduct studies of teaching-learning strategies and support services. Flashlight studies of courses are usually done by faculty members in order to improve their own courses and to contribute to the scholarship of teaching. Institutional studies using Flashlight tools aim to help create a more successful academic program.

By the time the data come in, the findings will be obsolete: we'll have changed technology, or the courses will be different, or both.

  • Flashlight studies focus on activities (e.g., student-student collaboration, "library-type" research, time on task). These activities were important twenty years ago and they will be important twenty years from now, no matter what the discipline, level of education, type of technology, or details of teaching-learning strategy.

Flashlight Online surveys can help us gather information from students but that's just anecdotal information and can't be trusted.

  • Flashlight Online (and other Flashlight tools) focus on descriptive information ("How often did the respondent do something?") and expert judgment ("When you tried to use this software to do that task, how well did it work?").  If an institution wants data about how often students interact with one another using technology, students provide the single best source of data. To complement the student data, the Flashlight Faculty Inventory provides some parallel questions to ask faculty members about student-student collaboration and other educationally important activities.

Studies like this are inappropriate. They apply models from fields like physics (controlled studies, quantitative data) to education and that just distorts education while not proving anything.

  • Flashlight tools and items can be used for many kinds of study designs. It's most common to begin with interviews and searching discussions, in order to focus the study.  Investigators then often use designs that derive more from the discipline of history than from physics: they use interviews and discussion to describe a chain of events, showing how the technology is to be used and how that use may lead to hoped-for or feared outcomes. For example, a study of a course may begin with the hope that students will work together, face to face and online, in doing homework and projects, and that this collaborative learning will increase student interest in the course.  This interest and energy are meant to lead to more time spent studying and better test scores. A Flashlight-type study might scan this whole chain of events to see if any of the links are "weak." For example, one of the early links is student use of e-mail to do homework together; if students don't do this, then perhaps the chain breaks. A deeper investigation might focus on why so few students are using e-mail to do homework together: problems with Internet access? Lack of training in using attachments? Have they had bad experiences with "freeloaders" so that they are now reluctant to team up? Do they think that the course is graded on a curve, so that helping others may harm their own grades?  Feedback like this can help fix the problem and make it more likely that the end of the chain (improved learning outcomes) will be reached. (A rough sketch of such a link-by-link scan follows.)
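
    As a rough illustration of that link-by-link scan, here is a hypothetical sketch in Python (the activity names, survey shares, and threshold are invented for illustration, not Flashlight data or tools) that walks a chain of expected activities and flags any weak link worth investigating:

        # Hypothetical chain of expected activities, in order, with the share of
        # surveyed students who report actually doing each one.
        chain = [
            ("use e-mail to coordinate homework", 0.15),
            ("work together, face to face or online, on projects", 0.60),
            ("report increased interest in the course", 0.55),
            ("spend more time studying", 0.50),
        ]

        THRESHOLD = 0.30  # below this share, treat the link as weak and investigate why

        for activity, share in chain:
            status = "WEAK - investigate why" if share < THRESHOLD else "ok"
            print(f"{share:>5.0%}  {activity:50s} {status}")

    The output simply makes the weak first link visible; the follow-up questions above (access, training, freeloaders, grading on a curve) are where the real diagnosis happens.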

We don't have the staff to carry out studies or to help others do so.

  • This is a very common problem, but increasingly institutions are assigning this duty to current staff and/or hiring new staff.  If you don't have people who take leadership responsibility, it is extremely difficult for the institution to carry out studies that affect practice.

We only have a half-time person available to do this.

  • Even a half-timer with a budget can help many other people do studies, so long as the work is really shared, especially if the half-timer can occasionally help the researchers get released time.  A half-time person can organize brown-bag lunches among people doing studies and interested in doing studies, make sure that support services are available and help publicize findings. A half-timer is even more effective if supported by a TLT Roundtable or other leadership unit in the institution.

If you have other "Frequently Made Objections" or responses to add to the list, please share them with us by sending them to Ehrmann@tltgroup.org.  If this page was useful, you might want to see another article on this site on this topic, "Diagnosing and Responding to Resistance to Evaluation."

NEW! Finally, here's a model form for workshops and courses, listing a few such objections. The form is designed to be administered before a workshop or course on assessment and evaluation (e.g., for faculty) in order to begin a process of discussing the reasons for such objections, and how to respond.

-Stephen C. Ehrmann, Director, The Flashlight Program

 

PO Box 5643
Takoma Park, Maryland 20913
Phone: 301.270.8312 / Fax: 301.270.8110

To talk about our work or our organization, contact: Sally Gilbert

