Usually, when people "resist" evaluation or
assessment, it's essential to probe deeply into their
reasons, to discuss them, and to use what's learned in
designing the study. This list of "Frequently Made
Objections" is designed to help you listen with an
informed ear, and to begin such a dialogue. The
objections appear first, each followed by one or more bulleted responses. For more insights into
how to clear the air about assessment, while bringing real
disagreements and conflicts into the light, see "'Dangerous
Discussions' about Assessment."
Studies are a waste of time.
Either the findings confirm what people already believe or,
if findings contradict beliefs, decision-makers still don't
believe the findings. Instead they assume that there was a
flaw in the methods. And there is always a flaw in the
methods!
- Unfortunately, this is
too often true and it's wise to admit it. That's one
reason why, if you have a choice, you should focus your
own studies on issues where decision-makers really are
uncertain and eager for information to help them make a
choice. For more on this, see "Finding a Great
Evaluative Question: The Divining Rod of Emotion."
Evaluation/assessment is
something we have to do to satisfy outsiders:
administrators, government agencies, accreditors... It does
us no good so we ought to do the minimum needed to satisfy
them.
- External demands are
sometimes unreasonable and sometimes of no benefit to
the program. But these days it's increasingly the case
that those external agencies have requirements which, if
met proactively, can be of great benefit to the program
itself: helping steer risky programs, reduce costs,
raise money, etc. The real question ought to be, "Can
we meet these requirements in ways that are well worth
the time and money needed to do the
assessment/evaluation?"
Evaluation/assessment
requirements are all about setting objectives and trying to
meet them. That misses some of what's important about what
we're trying to do. Worse, if we did that kind of assessment, it would distort what we try to do.
- There's a lot of justice
in this objection, too. There's a thin line (and
sometimes no line) between simplifying learning enough
to measure or grade it, and oversimplifying or
distorting it. That's as true for grading, or guiding
teaching by watching students' faces, as it is for
authentic assessment or mass testing.
Beyond that general risk, there's a specific problem
that is rarely articulated. Many people assume that
evaluation and assessment must always begin with some
measurable goal(s) that apply to everyone. We call that
perspective "uniform impact" because it assumes that
each student (or other beneficiary) must be measured in
the same, objective way. But that's not true. There's
another perspective (which we call "unique uses") that
looks for whether each beneficiary has gained in some
important, relevant way. Imagine a teaching center whose
goal is to serve two faculty members per day. On this
particular day they've done that. Let's assume that the
assessment is based on measuring whether faculty skills
have developed in important ways. Let's assume that the
center has chosen two such goals: help with lecturing
skill and help with skills in moderating student
discussion. On this day, one faculty client came wanting
to improve her lecturing skill and left with much better
lecturing skill, while the other faculty member wanted
to improve his skills of moderating a discussion and
left much better at that.
By uniform impact standards, the center failed half the
time because each client left with one of the two skills
unchanged. By unique uses standards, the center
succeeded 100% of the time because both clients got what
they each needed. Unique uses looks for whether
important change was achieved, whatever it might be.
This example is over-simplified in one important way:
unique uses studies usually don't begin with statements
of behavioral goals. Instead they begin by describing a
zone of concern (faculty should learn to teach better
and feel better about their teaching, somehow). Then
each user (or a sample of users) is studied, one at a time, to see whether change has occurred and, if
so, what kind. (To some degree, it's analogous to the
way that faculty grade essays.) Only after each case has
been examined does the assessment effort attempt to
identify patterns and make generalizations.
For more on these two perspectives, and the kinds of
rigorous assessment and evaluation for which each can be used, click here.
(TLT
Group username and password required; to see if your
institution is a subscriber and to get the password,
look at this list of subscribing institutions and contacts.)
I can't really understand
what the objection is to assessment (or evaluation). The
problem may have something to do with our definitions, but I'm not sure.
- "Evaluation" and
"assessment" have an amazing number of clashing
definitions. For some people their definitions are
loaded with value - positive or negative. "Assessment is
something you do for yourself while evaluation is done
to you (often in a bid for power over you)," is how some
people define these terms. Meanwhile, other people assume that assessment (or evaluation) is an inherently virtuous activity, essential for ethical, effective teaching.
And so on. There's nothing wrong with words having
more than one definition: the dictionary is full of such
words! But sometimes unnecessary arguments are ignited
because people are each using the same word in their
discussion without realizing they're defining that word
differently. We call such terms "confusors" and
here's a list
of them, with their clashing definitions. The safest
route: define your terms and, if you're in a
conversation, ask others to do the same. Then, if the
definitions clash, decide whether you're each going to
stick with your own definitions (reminding each other
from time to time that you're doing so) or whether
you're all going to use the same definition during this
particular conversation.
Trying to measure X, and then using those
findings to make decisions, will just corrupt the
measure and probably corrupt X, too. One important
outcome of college is the friends you make for life.
But if we quizzed and graded students every month on how
many friends they'd made, the findings would be increasingly
meaningless. And the friendships would be, too.
- This concern is sometimes called "Campbell's
Law," and it's a limit that every evaluator needs to
consider: Campbell wrote in 1976, "The more any
quantitative social indicator is used for social
decisionmaking, the more subject it will be to
corruption pressures and the more apt it will be to
distort and corrupt the social processes it is intended
to monitor." Evaluators have been more prone to
consider the other side: that continual observation can
influence behavior for the better, not just because of
the value of the evaluative feedback but also because
people may sometimes behave better if they know they're being evaluated. But the opposite, Campbell's concern, is a legitimate worry and should be discussed openly. It may justify rethinking, or even cancelling, plans for an evaluation or assessment.
We're just now investing
in technology X. We should wait until it's running well
before we put time and money into evaluation of its
contribution to better educational practice or outcomes.
- If you're just about to
make a new investment in technology, or have just made one, now is an ideal time to begin a series of studies. The value of technology stems from how it's used, not just whether it's available. For example, two classic ways of using technology to improve learning outcomes involve improving student-faculty interaction and active learning. So now is a good time to begin
assessing student-faculty interaction and active
learning (including the roles played by current
technology). Later, when the new investments are in
place, another evaluation can help you discover whether
and how much that investment is indeed helping to
improve faculty-student interaction and active learning
(or whatever other educational activity or outcome is to
be aided).
People might say it's
virtuous to study educational uses of technology. But no one
around here has seen a single study that produced findings
useful enough to justify the time and money needed to do the
study.
- That's often true, and a
little puzzling because many useful studies have been
done. (See, for example, the lists of
articles and
case studies on the
Flashlight Web site.)
It's not possible (or
especially important) to do a study of costs.
Everyone will just think it's an excuse to cut jobs
or make them work harder.
- That may be true. Most
institutions would benefit in the long term by doing
fewer (but more carefully chosen and designed) studies.
As for cost studies, the chief element of cost in education is how people use their time. So a
"cost study" is often a study of how people can use
their time (and their budgets, and space) in more
productive and satisfying ways. The most successful
cost studies are usually done by a group of individuals
and units and focus on a process in which they all
participate and whose costs no single person
understands, e.g., the costs of helping faculty use
technology in their courses. Cost studies can be helpful
because so many of the costs are ordinarily hidden. If
the cost study helps people understand the total
activity in which they all have been taking part,
including the ways in which that activity may be sapping
their energy and budgets, it can be the first step
toward redesigning the process so it works better while
taking less of a toll on people and budgets.
Our board (or donors, or
legislature) don't understand what we're doing with
technology, and don't want to. At most they want to know
numbers: how many machines we have, and the like.
No evaluation is necessary.
- That also may be true.
But it's likely that at least some of them would like
help in understanding the real benefits of that money for students and alumni. And it can be in the institution's
long-term interest to use evaluative findings, including
case studies, to help them understand. Good results can
help fuel enthusiasm. Equally important, a track record
of using studies to find problems and make improvements
can help build confidence that, when you seek the next big infusion of capital, you're not flying blind.
Our faculty will object to
evaluation. They know more about good teaching than any
course evaluation instrument can tell them. Anyway we
already have a course evaluation system.
(Although it can probably be distorted by being a "nice guy"
and giving mostly high grades!)
- Flashlight was designed
to help faculty, their departments and institutions
conduct studies of teaching-learning strategies and
support services. Flashlight studies of courses are
usually done by faculty members in order to improve
their own courses, and to contribute to the scholarship
of teaching. Institutional studies using Flashlight
tools aim to help create a more successful
academic program.
By the time the data come
in, the findings will be obsolete: we'll have changed
technology, or the courses will be different, or both.
- Flashlight studies focus
on activities (e.g., student-student collaboration,
"library-type" research, time on task). These activities
were important twenty years ago and they will be
important twenty years from now, no matter what the
discipline, level of education, type of technology, or
details of teaching-learning strategy.
Flashlight Online
surveys can help us gather information from students but
that's just anecdotal information and can't be trusted.
- Flashlight Online (and
other Flashlight tools) focus on descriptive information ("How often did the respondent do something?") and expert judgment ("When you tried to use this software to do that task, how well did it work?"). If an institution
wants data about how often students interact with one
another using technology, students provide the single
best source of data. To complement the student data, the
Flashlight Faculty Inventory provides some parallel
questions to ask faculty members about student-student
collaboration and other educationally important
activities.
Studies like this are
inappropriate. They apply models from fields like physics
(controlled studies, quantitative data) to education and
that just distorts education while not proving anything.
- Flashlight tools and
items can be used for many kinds of study designs. It's
most common to begin with interviews and searching
discussions, in order to focus the study. Investigators
then often use designs that derive more from the
discipline of history than physics: they use interviews
and discussion to trace a chain of events showing how the technology is to be used, and how that use may
lead to hoped-for or feared outcomes. For example, a
study of a course may begin with the hope that students
will work together, face to face and online, in doing
homework and projects and that this collaborative
learning will increase student interest in the course.
This interest and energy are meant to lead to more time
spent studying and better test scores. A Flashlight-type
study might scan this whole chain of events to see whether any of the links are "weak." For example, one of the early "links" is student use of e-mail to do homework; if students don't do this, then perhaps the chain breaks. A
deeper investigation might focus on why so few students
are using e-mail to do homework together: problems with
Internet access? lack of training in using attachments?
Have they had bad experiences with "freeloaders" so that
they are now reluctant to team up? Do they think that
the course is graded on a curve so that helping others
may harm their own grades? Feedback like this can help
fix the problem and make it more likely that the end of
the chain (improved learning outcomes) will be
reached.
We don't have the staff to
carry out studies or to help others do them.
- This is a very common
problem, but increasingly institutions are assigning this duty to current staff and/or hiring new staff. If you don't have people who take leadership responsibility,
it is extremely difficult for the institution to carry
out studies that affect practice.
We only have a half-time
person available to do this.
- Even a half-timer with a
budget can help many other people do studies, so long as
the work is really shared, especially if the half-timer
can occasionally help the researchers get released
time. A half-time person can organize brown-bag lunches
among people doing studies and people interested in doing them, make sure that support services are available, and help publicize findings. A half-timer is even more
effective if supported by a TLT Roundtable or other
leadership unit in the institution.
If you have other "Frequently
Made Objections" or Responses to add to the list, please
share them with us by sending them to
Ehrmann@tltgroup.org
. If this page was useful, you might want to see another article on this site on this
topic, "Diagnosing and Responding
to Resistance to Evaluation."
Finally, here's a
model form for workshops and courses, listing a few such
objections. The form is designed to be administered before a
workshop or course on assessment and evaluation (e.g., for
faculty) in order to begin a process of discussing the
reasons for such objections, and how to respond.
-Stephen C. Ehrmann,
Director, The Flashlight Program