Rubrics: Definition, Tools, Examples, References


A rubric is an explicit set of criteria used for assessing a particular type of work or performance. A rubric usually also includes levels of potential achievement for each criterion, and sometimes also includes work or performance samples that typify each of those levels.  Levels of achievement are often given numerical scores.  A summary score for the work being assessed may be produced by adding the scores for each criterion. The rubric may also include space for the judge to describe the reasons for each judgment or to make suggestions for the author.
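To make that structure concrete, here is a minimal sketch of a rubric as a data structure, in Python. The class name, criteria, and level descriptions are invented for illustration; they are not taken from any particular rubric or tool.

    # Illustrative only: names, criteria, and levels are invented.
    from dataclasses import dataclass

    @dataclass
    class Criterion:
        name: str
        levels: dict  # numerical score -> description of that level

    rubric = [
        Criterion("Use of sources", {3: "Integrates sources critically",
                                     2: "Cites sources, but uncritically",
                                     1: "Few or no sources"}),
        Criterion("Quality of argument", {3: "Clear thesis, well supported",
                                          2: "Thesis present, weakly supported",
                                          1: "No discernible thesis"}),
    ]

    # A judge records one score per criterion; a summary score, when
    # wanted, can be produced by adding the criterion scores.
    judgments = {"Use of sources": 2, "Quality of argument": 3}
    print(sum(judgments.values()))  # 5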

Rubric Tools: First generation tools (starting with word processors and including free web-based rubric generators such as Rubistar and sites such as Teachnology) produce a rubric that one person can use to judge one assignment, project, or set of performances at a time.

In contrast, a second generation tool such as Flashlight Online 2.0 enables an author, or set of authors, to create or collect a bank of criteria (e.g., A-J), choose a subset of those criteria to judge each project (judging one assignment with criteria A, B, and D, and a later assignment with criteria B, C, and E), and then analyze data gathered cumulatively, criterion by criterion, across projects. With a second generation rubric tool, it's easier to use rubrics to assess progress over many projects or performances, even though each one requires a different mix of criteria. Second generation tools can also provide different reports to different stakeholders. For example, educators can put a second generation tool in students' hands to gather peer critiques of drafts, while also producing reports course by course and a report for departmental evaluation.
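A rough sketch of that criterion-bank idea, with invented assignments and scores (this is not Flashlight Online's actual interface or API): each assignment is judged on its own subset of criteria, and the results are then pooled criterion by criterion.

    # Pooling rubric scores criterion by criterion across assignments,
    # even though each assignment used a different subset of criteria.
    # Hypothetical data for illustration only.
    from collections import defaultdict
    from statistics import mean

    scores = {
        "assignment1": {"A": 4, "B": 3, "D": 5},  # judged with A, B, D
        "assignment2": {"B": 4, "C": 2, "E": 3},  # judged with B, C, E
    }

    pooled = defaultdict(list)
    for per_criterion in scores.values():
        for criterion, score in per_criterion.items():
            pooled[criterion].append(score)

    # Cumulative report: mean score on each criterion across all projects.
    for criterion in sorted(pooled):
        print(criterion, round(mean(pooled[criterion]), 2))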

Why use rubrics?

  • To produce assessments that are far more descriptive than a single, holistic grade or judgment can be. Instead of merely saying that this was a "B- paper," the rubric-based assessment describes the quality of work on one or more criteria. For example, an English paper might be assessed on its use of sources, the quality of the academic argument, and its use of English (among other criteria). A department's strategic plan might be assessed using a rubric that includes the clarity of its learning goals for students, the adequacy of staffing plans, the adequacy of plans for advising, and other criteria.

  • To let those who are producing work ("authors") know in advance what criteria the judge or judges will apply in assessing that work.

  • To provide a richer and more multidimensional description of the reasons for assigning a numerical score to a piece of work. (See, for example, these rubrics created with Flashlight Online -- each criterion is described in two to three different but parallel ways.)

  • To enable multiple judges to apply the same criteria to assessing work. For example, student work can be assessed by faculty, by other students, and by working professionals in the discipline. If a rubric is applied to program review, a panel of visiting experts could use the same rubric to assess the program's performance. (Both of these uses of rubrics are being developed at Washington State University.)

  • To enable authors to elicit formative feedback (e.g., peer critique) on drafts of their work before final submission.

  • To help authors understand more clearly and completely what judges had to say about their work.

  • To enable comparison of works across settings. For example, imagine an academic department trying to develop skills A-G among its students. One first-year course focuses on teaching goals A, B, and D, while another first-year course teaches A, C, and E. One second-year course is trying to deepen skill B while introducing skill E. And so on. If faculty use the same rubrics and then pool the data (which can be done with Flashlight Online), the department can monitor students' progress as they work toward graduation. That is a far more informative way to assess student progress and guide changes in the curriculum than monitoring student GPAs: faculty can see which skills are developing as hoped, and where there are systemic problems in teaching and learning (see the sketch just below).
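As a sketch of that department-level monitoring (the courses and numbers below are invented, not real data): pooling rubric scores by skill shows which skills are lagging, which a single GPA-style average would hide.

    # Per-skill means across courses vs. one overall average.
    # Hypothetical course names and scores, for illustration only.
    from collections import defaultdict
    from statistics import mean

    course_scores = {  # course -> skill -> mean rubric score
        "FirstYear101":  {"A": 3.4, "B": 3.1, "D": 3.3},
        "FirstYear102":  {"A": 3.5, "C": 3.4, "E": 3.2},
        "SecondYear201": {"B": 2.4, "E": 3.3},
    }

    by_skill = defaultdict(list)
    for skills in course_scores.values():
        for skill, score in skills.items():
            by_skill[skill].append(score)

    for skill in sorted(by_skill):
        print(f"skill {skill}: mean {mean(by_skill[skill]):.2f}")

    # Skill B's mean (2.75) stands out as a problem, while the overall
    # GPA-like average (3.20) would look fine.
    overall = mean(s for skills in course_scores.values() for s in skills.values())
    print(f"overall: {overall:.2f}")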

In what circumstances should one not use rubrics, or be cautious about their use?

  • Rubrics apply the same, preset criteria to each piece of work being assessed. It may not be appropriate to use rubrics if an assessor would say of two different pieces of work, "They have absolutely nothing in common, but they are each excellent in different ways."

  • Rubrics are ordinarily created in advance, in order to let authors know in advance how their work will be judged. But that's not always appropriate. Sometimes judges prefer to create criteria inductively, after seeing the work.  In those instances, it may still be appropriate to create the rubric as the works are being judged. The rubric would then be used to help assure that the works are being judged consistently and to communicate the reasoning to the authors.

To give you a better idea of what rubrics are and what you can do with them, this web page includes three rubrics of increasing complexity. All three of the following examples present criteria and levels of achievement only; rubrics sometimes also include associated samples of work that typify each level. For additional examples of scoring sheets in various disciplines, see Barbara Walvoord and Virginia Anderson, Effective Grading: A Tool for Learning and Assessment (Jossey-Bass, 1998).

I. Grading Sheet for Journals in Beginner's Spanish III, by Dorothy Sole, University of Cincinnati

4 -        The content of the journal is by and large comprehensible. Although there are errors, verb tenses, sentence structure, and vocabulary are in the main correctly used. The author has taken some chances, employing sentence structures or expressing thoughts that are on the edge of what we have been studying. The entries are varied in subject and form.

3 -        There is some use of appropriate verb tenses and correct Spanish structure and vocabulary, but incorrect usage and/or vocabulary interferes with the reader's comprehension.

2 -        The reader finds many of the entries difficult to understand, and/or many entries are simplistic and/or repetitious.

1 -        The majority of the entries are virtually incomprehensible.

In addition to this scale, part of the grade is based on the number of entries and their length.

II. Grading Sheet for First-Year Western Civilization Course Required as Part of Gen Ed, by John Breihan, History, Loyola College in Maryland

The scale describes a variety of common types of paper but may not exactly describe yours; my mark on the scale denotes roughly where it falls.  More precise information can be derived from comments and conferences with the instructor [Breihan would offer written comments on the paper, in addition to his mark on this scale.]

Grade:

1. The paper is dishonest.

F          2. The paper completely ignores the questions set.

3. The paper is incomprehensible due to errors in language or usage.

4. The paper contains very serious factual errors.

D         5. The paper simply lists, narrates, or describes historical data, and includes several factual errors.

6. The paper correctly lists, narrates, or describes historical data but makes little or no attempt to frame an argument or thesis.

7. The paper states an argument or thesis, but one that does not address the question set.

C         8. The paper states an argument or thesis, but supporting subtheses and factual evidence are:

a. Missing

b. Incorrect or anachronistic

c. Irrelevant

d. Not sufficiently specific

e. All or partly obscured by errors in language or usage

9. The paper states an argument on the appropriate topic, clearly supported by relevant subtheses and specific factual evidence, but counterarguments and counterexamples are not mentioned or answered.

B          10. The paper contains an argument, relevant subtheses, and specific evidence; counterarguments and counterexamples are mentioned but not adequately answered:

a. Factual evidence incorrect or missing or not specific

b. Linking sub-theses either unclear or missing

c. Counterarguments and counterexamples not clearly stated; “strawman”

A         11. The paper adequately states and defends an argument, and answers all counterarguments and counterexamples suggested by lectures and textbook.

III. Grading Sheet for Scientific Experiment in Biology Capstone Course, by Virginia Johnson Anderson, Towson University, Towson, MD

Assignment: Semester-long assignment to design an original experiment, carry it out, and write it up in scientific report format. Students are to determine which of two brands of a commercial product (e.g., two brands of popcorn) is "best." They must base their judgment on at least four experimental factors (e.g., "% of kernels popped" is an experimental factor; price is not, because it is written on the package).

Title

5 -        Is appropriate in tone and structure to science journal; contains necessary descriptors, brand names, and allows reader to anticipate design.

4 -        Is appropriate in tone and structure to science journal; most descriptors present; identifies function of experimentation, suggests design, but lacks brand names.

3 -        Identifies function, brand name, but does not allow reader to anticipate design.

2 -        Identifies function or brand name, but not both; lacks design information or is misleading.

1 -        Is patterned after another discipline or missing.

Introduction

5 -        Clearly identifies the purpose of the research; identifies interested audience(s); adopts an appropriate tone.

4 -        Clearly identifies the purpose of the research; identifies interested audience(s).

3 -        Clearly identifies the purpose of the research.

2 -        Purpose present in Introduction, but must be identified by reader.

1 -        Fails to identify the purpose of the research.

Scientific Format Demands

5 -        All material placed in the correct sections; organized logically within each section; runs parallel among different sections.

4 -        All material placed in correct sections; organized logically within sections, but may lack parallelism among sections.

3 -        Material placed in the right sections but not well organized within the sections; disregards parallelism.

2 -        Some materials are placed in the wrong sections or are not adequately organized wherever they are placed.

1 -        Material placed in wrong sections or not sectioned; poorly organized wherever placed.

Materials and Methods Section

5 -        Contains effectively, quantifiably, concisely organized information that allows the experiment to be replicated; is written so that all information inherent to the document can be related back to this section; identifies sources of all data to be collected; identifies sequential information in an appropriate chronology; does not contain unnecessary, wordy descriptions of procedures.

4 -        As above, but contains unnecessary information, and/or wordy descriptions within the section.

3 -        Presents an experiment that is definitely replicable; all information in document may be related to this section; however, fails to identify some sources of data and/or presents sequential information in a disorganized, difficult pattern.

2 -        Presents an experiment that is marginally replicable; parts of the basic design must be inferred by the reader; procedures not quantitatively described; some information in Results or Conclusions cannot be anticipated by reading the Methods and Materials section.

1 -        Describes the experiment so poorly or in such a nonscientific way that it cannot be replicated.

Non-experimental Information

5 -        Student researches and includes price and other nonexperimental information that would be expected to be significant to the audience in determining the better product, or specifically states non-experimental factors excluded by design; interjects these at appropriate positions in text and/or develops a weighted rating scale; integrates nonexperimental information in the Conclusions.

4 -        Student acts as above, but is somewhat less effective in developing the significance of the non-experimental information.

3 -        Student introduces price and other non-experimental information, but does not integrate them into Conclusions.

2 -        Student researches and includes price effectively; does not include or specifically exclude other non-experimental information.

1 -        Student considers price and/or other non-experimental variables as research variables; fails to identify the significance of these factors to the research.

Designing an Experiment

5 -        Student selects experimental factors that are appropriate to the research purpose and audience; measures adequate aspects of these selected factors; establishes discrete subgroups for which data significance may vary; student demonstrates an ability to eliminate bias from the design and bias-ridden statements from the research; student selects appropriate sample size, equivalent groups, and statistics; student designs a superior experiment.

4 -        As above, but student designs an adequate experiment.

3 -        Student selects experimental factors that are appropriate to the research purpose and audience; measures adequate aspects of these selected factors; establishes discrete subgroups for which data significance may vary; research is weakened by bias OR by sample size of less than 10.

2 -        As above, but research is weakened by bias AND inappropriate sample size.

1 -        Student designs a poor experiment.

Defining Operationally

5 -        Student constructs a stated comprehensive operational definition and well-developed specific operational definitions.

4 -        Student constructs an implied comprehensive operational definition and well-developed specific operational definitions.

3 -        Student constructs an implied comprehensive operational definition (possibly less clear) and some specific operational definitions.

2 -        Student constructs specific operational definitions, but fails to construct a comprehensive definition.

1 -        Student lacks understanding of operational definitions.

Controlling Variables

5 -        Student demonstrates, by written statement, the ability to control variables by experimental control and by randomization; student makes reference to, or implies, factors to be disregarded by reference to pilot or experience; superior overall control of variables.

4 -        As above, but student demonstrates an adequate control of variables.

3 -        Student demonstrates the ability to control important variables experimentally; Methods and Materials section does not indicate knowledge of randomization and/or selected disregard of variables.

2 -        Student demonstrates the ability to control some, but not all, of the important variables experimentally.

1 -        Student demonstrates a lack of understanding about controlling variables.

Collecting Data and Communicating Results

5 -        Student selects quantifiable experimental factors and/or defines and establishes quantitative units of comparison; measures the quantifiable factors and/or units in appropriate quantities or intervals; student selects appropriate statistical information to be utilized in the results; when effective, student displays results in graphs with correctly labeled axes; data are presented to the reader in text as well as graphic forms; tables or graphs have self-contained headings.

4 -        As 5 above, but the student did not prepare self-contained headings for tables or graphs.

3 -        As 4 above, but data reported in graphs or tables contain materials that are irrelevant and/or not statistically appropriate.

2 -        Student selects quantifiable experimental factors and/or defines and establishes quantitative units of comparison; fails to select appropriate quantities or intervals and/or fails to display information graphically when appropriate.

1 -        Student does not select, collect, and/or communicate quantifiable results.

Interpreting Data: Drawing Conclusions/Implications

5 -        Student summarizes the purpose and findings of the research; student draws inferences that are consistent with the data and scientific reasoning and relates these to interested audiences; student explains expected results and offers explanations and/or suggestions for further research for unexpected results; student presents data honestly, distinguishes between fact and implication, and avoids overgeneralizing; student organizes non-experimental information to support conclusion; student accepts or rejects the hypothesis.

4 -        As 5 above, but student does not accept or reject the hypothesis.

3 -        As 4 above, but the student overgeneralizes and/or fails to organize non-experimental information to support conclusions.

2 -        Student summarizes the purpose and findings of the research; student explains expected results, but ignores unexpected results.

1 -        Student may or may not summarize the results, but fails to interpret their significance to interested audiences.

Table 10.1: Student Scores on PTA for Science Reports, Before and After Anderson Made Pedagogical Changes

Trait                       Before   After   P Values*
Title                        2.95     3.22     .24
Introduction                 3.18     3.64     .14
Scientific Format            3.09     3.32     .31
Methods and Materials        3.00     3.55     .14
Non-Experimental Info        3.18     3.50     .24
Designing the Experiment     2.68     3.32     .07
Defining Operationally       2.68     3.50     .01
Controlling Variables        2.73     3.18     .10
Collecting Data              2.86     3.36     .14
Interpreting Data            2.90     3.59     .03
Overall                      2.93     3.42     .09

These judgments of student work in Anderson's course (both before and after) were made by external graders using Anderson's rubrics.
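For readers unfamiliar with the P values in Table 10.1: the page does not say which statistical test produced them, but a plausible sketch is an independent two-sample t-test comparing the "before" and "after" scores on a single trait. The score lists below are invented for illustration; they are not Anderson's data.

    # Hypothetical two-sample t-test; the scores are invented, not Anderson's.
    from scipy.stats import ttest_ind

    before = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3]  # per-student trait scores, before
    after  = [4, 3, 4, 4, 3, 4, 3, 4, 4, 3]  # per-student trait scores, after

    t_stat, p_value = ttest_ind(after, before)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")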

 

- Stephen C. Ehrmann.
This page draws substantially on the work of Barbara Walvoord, University of Notre Dame.

 
