A rubric is an explicit set of criteria used for assessing a particular type of work or performance. A rubric usually also includes levels of potential achievement for each criterion, and sometimes also includes work or performance samples that typify each of those levels. Levels of achievement are often given numerical scores. A summary score for the work being assessed may be produced by adding the scores for each criterion. The rubric may also include space for the judge to describe the reasons for each judgment or to make suggestions for the author.
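For readers who like to see the arithmetic spelled out, here is a minimal sketch in Python of an assessment recorded criterion by criterion and summed into a single summary score. The criterion names, scores, and comments are invented for illustration.

```python
# A hypothetical rubric-based assessment: each criterion gets a numerical
# level of achievement plus space for the judge's reasons.
assessment = {
    "use of sources":    {"score": 3, "comment": "Relies on only two sources."},
    "academic argument": {"score": 4, "comment": "Clear thesis, well defended."},
    "use of English":    {"score": 4, "comment": "A few awkward transitions."},
}

# Summary score: the sum of the per-criterion scores.
summary = sum(entry["score"] for entry in assessment.values())
print(f"Summary score: {summary}")  # Summary score: 11
```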
Rubric Tools: First generation tools (starting with word processors and including free web-based rubric generators such as Rubistar and sites such as Teachnology) produce a rubric that one person can use to judge one assignment, project, or set of performances at a time. In contrast, a second generation tool such as Flashlight Online 2.0 enables an author, or set of authors, to create or collect a bank of criteria (e.g., A-J), choose a subset of those criteria to judge each project (judge one assignment with criteria A, B, and D, while later judging another assignment with criteria B, C, and E), and then analyze data gathered cumulatively, criterion by criterion, across projects. With a second generation rubric tool, it's easier to use rubrics to assess progress over many projects or performances, even though each one requires a different mix of criteria. Second generation tools can also be used to provide different reports to different stakeholders. For example, educators can put a second generation tool in students' hands to gather peer critiques of drafts, while also producing reports course by course, as well as a report for departmental evaluation.
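To make the idea concrete, here is a rough sketch in Python of the kind of record-keeping a second generation tool enables. The criterion names, project names, and scores are all invented, and this is not a description of Flashlight Online's actual implementation; the point is simply that scores are stored per criterion, so any subset of a shared criterion bank can be applied to each project and the results still pooled criterion by criterion.

```python
from collections import defaultdict
from statistics import mean

# Shared bank of criteria (hypothetical criteria "A" through "E").
criteria_bank = {"A", "B", "C", "D", "E"}

# Each project is judged on its own subset of the bank.
projects = [
    {"name": "essay 1",  "scores": {"A": 3, "B": 2, "D": 4}},
    {"name": "essay 2",  "scores": {"B": 3, "C": 4, "E": 2}},
    {"name": "lab memo", "scores": {"A": 4, "C": 3, "E": 3}},
]

# Pool the data cumulatively, criterion by criterion, across projects.
by_criterion = defaultdict(list)
for project in projects:
    for criterion, score in project["scores"].items():
        by_criterion[criterion].append(score)

for criterion in sorted(criteria_bank):
    scores = by_criterion.get(criterion)
    if scores:
        print(f"Criterion {criterion}: n={len(scores)}, mean={mean(scores):.2f}")
```

Because each score is tagged with its criterion, adding a new project with a different mix of criteria requires no change to the analysis.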
Why use rubrics?
- To produce assessments that are far more descriptive than a single, holistic grade or judgment can be. Instead of merely saying that this was a "B- paper," the rubric-based assessment describes the quality of work on one or more criteria. For example, an English paper might be assessed on its use of sources, the quality of the academic argument, and its use of English (among other criteria). A department's strategic plan might be assessed using a rubric that included the clarity of its learning goals for students, the adequacy of staffing plans, the adequacy of plans for advising, and other criteria.
- To let those who are producing work ("authors") know in advance what criteria the judge or judges will apply in assessing that work.
- To provide a richer and more multidimensional description of the reasons for assigning a numerical score to a piece of work. (See, for example, these rubrics created with Flashlight Online -- each criterion is described in 2-3 different but parallel ways.)
- To enable multiple judges to apply the same criteria to assessing work. For example, student work can be assessed by faculty, by other students, and by working professionals in the discipline. If a rubric is applied to program review, a panel of visiting experts could use the same rubric to assess the program's performance. (Both of these uses of rubrics are being developed at Washington State University.)
- To enable authors to elicit formative feedback (e.g., peer critique) for drafts of their work before final submission.
- To help authors understand more clearly and completely what judges had to say about their work.
- To enable comparison of works across settings. For example, imagine an academic department trying to develop skills A-G among its students. One first year course focuses on teaching goals A, B, and D, while another first year course teaches A, C, and E. One second year course is trying to deepen skill B while introducing skill E. And so on. If faculty use the same rubrics and then pool data (which can be done with Flashlight Online), the department can monitor students' progress as they work toward graduation. It's a far more informative way to assess student progress and guide changes in the curriculum than monitoring student GPAs: faculty can see which skills are developing as hoped, and where there are systemic problems in teaching and learning.
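Here is a brief sketch of that kind of pooled analysis; the course levels, skills, and scores below are invented for illustration. Once each rubric score is tagged with the skill it measures, monitoring progress is a matter of averaging by skill and year level:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pooled rubric data: (year_level, skill, score on a 1-5 scale).
records = [
    (1, "A", 3), (1, "B", 2), (1, "D", 3),   # first-year courses
    (1, "A", 3), (1, "C", 2), (1, "E", 2),
    (2, "B", 4), (2, "E", 3), (2, "A", 4),   # second-year courses
]

# Mean score per skill at each year level, to see whether skills
# are developing as students move through the curriculum.
by_skill_year = defaultdict(list)
for year, skill, score in records:
    by_skill_year[(skill, year)].append(score)

for skill, year in sorted(by_skill_year):
    scores = by_skill_year[(skill, year)]
    print(f"Skill {skill}, year {year}: mean {mean(scores):.2f} (n={len(scores)})")
```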
In what circumstances should one not use rubrics, or be cautious about their use?
- Rubrics apply the same, preset criteria to each piece of work being assessed. It may not be appropriate to use rubrics if an assessor were to say of two different pieces of work, "They have absolutely nothing in common but they are each excellent, in different ways."
- Rubrics are ordinarily created in advance, in order to let authors know how their work will be judged. But that's not always appropriate. Sometimes judges prefer to create criteria inductively, after seeing the work. In those instances, it may still be appropriate to create the rubric as the works are being judged. The rubric would then be used to help assure that the works are being judged consistently and to communicate the reasoning to the authors.
To give you a better idea of what rubrics are and what you can do with them, this web page includes three rubrics of increasing complexity. All three of the following examples are purely conceptual; rubrics sometimes also include associated examples of work that typify each stage. For additional examples of scoring sheets in various disciplines, see Barbara Walvoord and Virginia Anderson, Effective Grading: A Tool for Learning and Assessment (Jossey-Bass, 1998).
I. Grading Sheet for Journals in Beginner's Spanish III, by Dorothy Sole, Univ. Cincinnati
4 - The content of the journal is by and large comprehensible. Although there are errors, verb tenses, sentence structure, and vocabulary are in the main correctly used. The author has taken some chances, employing sentence structures or expressing thoughts that are on the edge of what we have been studying. The entries are varied in subject and form.
3 - There is some use of appropriate verb tenses and correct Spanish structure and vocabulary, but incorrect usage and/or vocabulary interferes with the reader's comprehension.
2 - The reader finds many of the entries difficult to understand, and/or many entries are simplistic and/or repetitious.
1 - The majority of the entries are virtually incomprehensible.
In addition to this scale, part of the grade is based on the number of entries and their length.
II. Grading Sheet for First-Year Western Civilization Course Required as Part of Gen Ed, by John Breihan, History, Loyola College in Maryland
The scale describes a variety of common types of paper but may not exactly describe yours; my mark on the scale denotes roughly where it falls. More precise information can be derived from comments and conferences with the instructor. [Breihan would offer written comments on the paper, in addition to his mark on this scale.]
Grade:

F
1. The paper is dishonest.
2. The paper completely ignores the questions set.
3. The paper is incomprehensible due to errors in language or usage.
4. The paper contains very serious factual errors.

D
5. The paper simply lists, narrates, or describes historical data, and includes several factual errors.
6. The paper correctly lists, narrates, or describes historical data but makes little or no attempt to frame an argument or thesis.
7. The paper states an argument or thesis, but one that does not address the question set.

C
8. The paper states an argument or thesis, but supporting subtheses and factual evidence are:
   a. Missing
   b. Incorrect or anachronistic
   c. Irrelevant
   d. Not sufficiently specific
   e. All or partly obscured by errors in language or usage
9. The paper states an argument on the appropriate topic, clearly supported by relevant subtheses and specific factual evidence, but counterarguments and counterexamples are not mentioned or answered.

B
10. The paper contains an argument, relevant subtheses, and specific evidence; counterarguments and counterexamples are mentioned but not adequately answered:
   a. Factual evidence incorrect or missing or not specific
   b. Linking subtheses either unclear or missing
   c. Counterarguments and counterexamples not clearly stated; "strawman"

A
11. The paper adequately states and defends an argument, and answers all counterarguments and counterexamples suggested by lectures and textbook.
III. Grading Sheet for Scientific Experiment in Biology Capstone Course, by Virginia Johnson Anderson, Towson University, Towson, MD

Assignment: Semester-long assignment to design an original experiment, carry it out, and write it up in scientific report format. Students are to determine which of two brands of a commercial product (e.g., two brands of popcorn) is "best." They must base their judgment on at least four experimental factors (e.g., "% of kernels popped" is an experimental factor; price is not, because it is written on the package).
Title
5 - Is appropriate in tone and structure to science journal; contains necessary descriptors, brand names, and allows reader to anticipate design.
4 - Is appropriate in tone and structure to science journal; most descriptors present; identifies function of experimentation, suggests design, but lacks brand names.
3 - Identifies function, brand name, but does not allow reader to anticipate design.
2 - Identifies function or brand name, but not both; lacks design information or is misleading.
1 - Is patterned after another discipline or missing.
Introduction
5 - Clearly identifies the purpose of the research; identifies interested audience(s); adopts an appropriate tone.
4 - Clearly identifies the purpose of the research; identifies interested audience(s).
3 - Clearly identifies the purpose of the research.
2 - Purpose present in Introduction, but must be identified by reader.
1 - Fails to identify the purpose of the research.
Scientific Format Demands
5 - All material placed in the correct sections; organized logically within each section; runs parallel among different sections.
4 - All material placed in correct sections; organized logically within sections, but may lack parallelism among sections.
3 - Material placed in right sections but not well organized within the sections; disregards parallelism.
2 - Some materials are placed in the wrong sections or are not adequately organized wherever they are placed.
1 - Material placed in wrong sections or not sectioned; poorly organized wherever placed.
Materials and Methods Section
5 - Contains effectively, quantifiably, concisely organized information that allows the experiment to be replicated; is written so that all information inherent to the document can be related back to this section; identifies sources of all data to be collected; identifies sequential information in an appropriate chronology; does not contain unnecessary, wordy descriptions of procedures.
4 - As above, but contains unnecessary information and/or wordy descriptions within the section.
3 - Presents an experiment that is definitely replicable; all information in document may be related to this section; however, fails to identify some sources of data and/or presents sequential information in a disorganized, difficult pattern.
2 - Presents an experiment that is marginally replicable; parts of the basic design must be inferred by the reader; procedures not quantitatively described; some information in Results or Conclusions cannot be anticipated by reading the Methods and Materials section.
1 - Describes the experiment so poorly or in such a nonscientific way that it cannot be replicated.
Non-experimental Information
5 - Student researches and includes price and other non-experimental information that would be expected to be significant to the audience in determining the better product, or specifically states non-experimental factors excluded by design; interjects these at appropriate positions in text and/or develops a weighted rating scale; integrates non-experimental information in the Conclusions.
4 - Student acts as above, but is somewhat less effective in developing the significance of the non-experimental information.
3 - Student introduces price and other non-experimental information, but does not integrate them into Conclusions.
2 - Student researches and includes price effectively; does not include or specifically exclude other non-experimental information.
1 - Student considers price and/or other non-experimental variables as research variables; fails to identify the significance of these factors to the research.
Designing an Experiment
5 - Student selects experimental factors that are appropriate to the research purpose and audience; measures adequate aspects of these selected factors; establishes discrete subgroups for which data significance may vary; student demonstrates an ability to eliminate bias from the design and bias-ridden statements from the research; student selects appropriate sample size, equivalent groups, and statistics; student designs a superior experiment.
4 - As above, but student designs an adequate experiment.
3 - Student selects experimental factors that are appropriate to the research purpose and audience; measures adequate aspects of these selected factors; establishes discrete subgroups for which data significance may vary; research is weakened by bias OR by sample size of less than 10.
2 - As above, but research is weakened by bias AND inappropriate sample size.
1 - Student designs a poor experiment.
Defining Operationally
5 - Student constructs a stated comprehensive operational definition and well-developed specific operational definitions.
4 - Student constructs an implied comprehensive operational definition and well-developed specific operational definitions.
3 - Student constructs an implied comprehensive operational definition (possibly less clear) and some specific operational definitions.
2 - Student constructs specific operational definitions, but fails to construct a comprehensive definition.
1 - Student lacks understanding of operational definition.
Controlling Variables
5 - Student demonstrates, by written statement, the ability to control variables by experimental control and by randomization; student makes reference to, or implies, factors to be disregarded by reference to pilot or experience; superior overall control of variables.
4 - As above, but student demonstrates an adequate control of variables.
3 - Student demonstrates the ability to control important variables experimentally; Methods and Materials section does not indicate knowledge of randomization and/or selected disregard of variables.
2 - Student demonstrates the ability to control some, but not all, of the important variables experimentally.
1 - Student demonstrates a lack of understanding about controlling variables.
Collecting Data and Communicating Results
5 - Student selects quantifiable experimental factors and/or defines and establishes quantitative units of comparison; measures the quantifiable factors and/or units in appropriate quantities or intervals; student selects appropriate statistical information to be utilized in the results; when effective, student displays results in graphs with correctly labeled axes; data are presented to the reader in text as well as graphic forms; tables or graphs have self-contained headings.
4 - As 5 above, but the student did not prepare self-contained headings for tables or graphs.
3 - As 4 above, but data reported in graphs or tables contain materials that are irrelevant and/or not statistically appropriate.
2 - Student selects quantifiable experimental factors and/or defines and establishes quantitative units of comparison; fails to select appropriate quantities or intervals and/or fails to display information graphically when appropriate.
1 - Student does not select, collect, and/or communicate quantifiable results.
Interpreting Data: Drawing Conclusions/Implications
5 - Student summarizes the purpose and findings of the research; student draws inferences that are consistent with the data and scientific reasoning and relates these to interested audiences; student explains expected results and offers explanations and/or suggestions for further research for unexpected results; student presents data honestly, distinguishes between fact and implication, and avoids overgeneralizing; student organizes non-experimental information to support conclusion; student accepts or rejects the hypothesis.
4 - As 5 above, but student does not accept or reject the hypothesis.
3 - As 4 above, but the student overgeneralizes and/or fails to organize non-experimental information to support conclusions.
2 - Student summarizes the purpose and findings of the research; student explains expected results, but ignores unexpected results.
1 - Student may or may not summarize the results, but fails to interpret their significance to interested audiences.
Table 10.1: Student Scores on PTA for Science Reports, Before and After Anderson Made Pedagogical Changes

| Trait                     | Before | After | P Values* |
| Title                     | 2.95   | 3.22  | .24       |
| Introduction              | 3.18   | 3.64  | .14       |
| Scientific Format         | 3.09   | 3.32  | .31       |
| Methods and Materials     | 3.00   | 3.55  | .14       |
| Non-Experimental Info     | 3.18   | 3.50  | .24       |
| Designing the Experiment  | 2.68   | 3.32  | .07       |
| Defining Operationally    | 2.68   | 3.50  | .01       |
| Controlling Variables     | 2.73   | 3.18  | .10       |
| Collecting Data           | 2.86   | 3.36  | .14       |
| Interpreting Data         | 2.90   | 3.59  | .03       |
| Overall                   | 2.93   | 3.42  | .09       |
These judgments of student work in Anderson's course (both before and after) were made by external graders using Anderson's rubrics.
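For readers curious how before/after p-values like those in Table 10.1 could be produced, the sketch below assumes an independent two-sample t-test over invented trait scores; this page does not specify the exact statistical procedure used, so treat this only as one plausible approach.

```python
# A hedged sketch of computing a before/after p-value for one trait.
# Assumptions: an independent two-sample t-test and invented per-report
# scores; the original analysis may have used a different test.
from scipy import stats

before = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3]   # hypothetical "Title" scores, before
after  = [4, 3, 3, 4, 3, 4, 3, 3, 4, 3]   # hypothetical "Title" scores, after

t_stat, p_value = stats.ttest_ind(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```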
- Stephen C. Ehrmann

This page draws substantially on the work of Barbara Walvoord, University of Notre Dame.