

Few issues have created more controversy among educators than those associated with grading and reporting student learning. Despite the many debates and multitudes of studies, however, prescriptions for best practice remain elusive. Although teachers generally try to develop grading policies that are honest and fair, strong evidence shows that their practices vary widely, even among those who teach at the same grade level within the same school. In essence, grading is an exercise in professional judgment on the part of teachers. It involves the collection and evaluation of evidence on students' achievement or performance over a specified period of time, such as nine weeks, an academic semester, or an entire school year. Through this process, various types of descriptive information and measures of students' performance are converted into grades or marks that summarize students' accomplishments. Although some educators distinguish

between grades and marks, most consider these terms synonymous. Both imply a set of symbols, words, or numbers that are used to designate different levels of achievement or performance. They might be letter grades such as A, B, C, D, and F; symbols such as ✓+, ✓, and ✓−; descriptive words such as Exemplary, Satisfactory, and Needs Improvement; or numerals such as 4, 3, 2, and 1. Reporting is the process by which these judgments are communicated to parents, students, or others.

Grading Systems

The two most common types of grading systems used at the University of Minnesota are norm-referenced and criterion-referenced. Many professors combine elements of each of these systems for determining student grades, using a system of anchoring or presetting grading criteria which are later adjusted based on actual student performance.

GRADING

To grade or not to grade: Student perceptions of the effects of grading a course in work-integrated learning

Assessment is an integral component of a student's education and is recognised as an important factor in student learning. Assessment involves making judgements about the extent to which the performance of students meets particular standards. It also plays a significant role in fostering learning and in the accreditation of students. As universities struggle to keep pace with a rapidly changing global context, assessment practices need to be reviewed and re-evaluated, particularly in relation to work integrated learning. If we establish appropriate assessment processes, effective teaching and learning will follow. The act of assessment signals the importance of what is being assessed, so assessment is a driver for learning. The purposes of assessment have been summarised in the literature, and one line of argument

considers that the answer to the first purpose may lie in formative assessment. Several aspects of assessment contribute to its formative effects. Firstly, assessment needs to be embedded into learning routines, e.g., prompting learners with questions on aspects of a problem or task. Secondly, such assessment needs to occur frequently, hence the need for embedding it in routine learning and teaching practices. Thirdly, it must be accompanied by informative feedback to learners about their learning progress. Good teachers are able to identify the gap between what students know and can do now and the goals for what students should know and be able to do. A fourth element of formative assessment is that learners must engage in the learning process. One of the most effective ways to encourage students to engage in assessment is through self-assessment, although peer-assessment is also used. As one author argued, by deliberately keeping assessment out of the hands of learners, we are denying them one of the essential tools, perhaps the essential tool,

which enables them to become lifelong learners. Assessment of work integrated learning produces different challenges for students who are accustomed to assignments and examinations. The strategies that students have used in these types of assessment may not necessarily be successful in the workplace setting. Several differences have been noted between assessment conducted during fieldwork placement and assessment in formal academic settings.

Assessing student performance in work integrated learning is a difficult task involving many decisions by a number of stakeholders. Validity and reliability are particular concerns due to the multiple variables that affect both the design and subsequent implementation of assessment practices. Criterion-referenced assessment, which compares an individual's score with a specific criterion, is the form of assessment most commonly used in work integrated learning; it considers the competency of a student along a continuum of achievement. The purpose of this process is to determine the extent to which the standards have been achieved, allowing more consistent and objective judgment. In the context of higher education, the use of grades for learning has been the subject of a long, ongoing debate. Original research associated students' pursuit of good grades with workers' performance for pay and concluded that grades reward students for high academic standards. A later study examined the semester grade point average (GPA) outcomes of students whose grades are averaged into their cumulative GPA against those who take courses on a pass/fail basis. They found that students in the former category had a mean semester GPA 11.4% above the average and that, for study-abroad students who took

courses on a pass/fail basis, the results suggested that academic incentives were adversely affected by this grade transfer policy. There is no universal agreement on the meaning of the term grading; it has been defined as the practice of assessing and reporting levels of performance in ... competency-based vocational education and training, generally used to recognise merit and excellence, and graded assessment has been defined as an approach that provides grades for combinations of demonstrated knowledge and performance. Other terms that have been utilised to describe grading include performance levels, levels of competency and levels of achievement. A comprehensive investigation into graded competency-based assessment in Australia considered the practices and policies of grading levels of performance in vocational education and training programs provided by technical and further education (TAFE) institutions. Many of the results of this study are relevant to courses taught within Australian universities. The major proponents for the use of graded assessment were private educational providers, particularly with fee-paying students enrolled in tourism and hospitality courses. Thompson et al. (1996) provided an analysis of the particular stakeholder groups which advocate specific purposes for the use of grades. They indicated that teachers or trainers supported graded

assessment for motivational purposes and also as a reward for excellence. This group also emphasised the role of graded assessment in improving the level of confidence in the assessment process, as well as providing information about the amount and quality of learning achieved. Similarly, the employer group supported the use of graded assessment for its capacity to motivate and reward, for the provision of feedback on learning outcomes and for the purposes of promotion and recognition for entry into other educational programs. Tertiary institutions also supported its use for assisting in decisions about selection, whilst community groups supported the use of grading for

providing feedback about learning achieved. However, the authors stressed that other members of these same stakeholder groups did not support grading. Opponents of graded assessment indicated that the practice was inconsistent with the principles of competency-based assessment (Thompson et al. 1996). Some respondents suggested that

grading created a competitive environment between students, where greater emphasis was placed on comparing individual students, rather than meeting an identified standard. Thompson et al. (1996) recommended further

research be conducted to develop quality instruments

and investigate the influence of graded assessment on student learning. Williams & Bateman (2003) reviewed further research conducted on the grade debate and concluded that grading added to the complexity of

assessment. They reported that the main drivers for graded assessment came from industry and students, who demonstrated dissatisfaction with competent/not-yet-competent reporting. Rumsey (1997) and Smith (2000) identified that some training providers used graded assessment as a marketing tool in the belief that dispensing a significant number of high grades makes the provider look good. Strong (1995) suggests that grades in TAFE courses are relied upon to predict success in further study. Dickson & Bloch (1999) suggested graded

assessment added value to competency standards, where these standards provided a starting point for improvement, whilst Griffin et al. (2001) indicated that the selection paradigm drives the need for graded assessment. Rumsey & Associates (2002) developed the West Australian Model, which has been a significant contributor to the Australian debate on graded assessment in competency-based programs. The main principle underpinning this model is the achievement of additional standards based on the Mayer key competencies (Maxwell, 2008). As feedback on performance can have significant influence, it is important that it should be as close as possible to true achievements. Johnson (2008) asserts that graded reporting affords such an outcome as it potentially provides more information than binary reporting techniques. Smith (2000) reported that a majority of respondents interviewed for his research suggested that (ungraded) competency-based training and assessment were promoting mediocrity in the learning process. He considered the merits of grading from

the viewpoint of an assessor and reported that grading can improve the validity and consistency of assessments because it forces assessors to analyse students' performances with greater care than in non-graded reporting systems, particularly as they have to consider the evidence of a performance in more specific detail in relation to set criteria. On the other hand, Williams and Bateman (2003) reported that lower-ability students might be adversely affected by grading. Thus, the effects of grading may not be consistent for all learners, and the characteristics of specific learner groups need to be considered. Andre (2000) indicated that the use of graded competency-based performance measures in assessing workplace performance needs

consideration. With the current international trend for nurse education and other clinical sciences to be situated within the university sector, clinical assessment based on merit rather than pass/fail or non-graded pass is

becoming more relevant. Benner (1984) suggested that varied levels of performance occur in clinical practice beyond what is regarded as an

acceptable standard. Thus, the use of pass-fail grades limits the reporting of performance standards to acceptable and non-acceptable practice. A meritorious grading system denotes standards beyond a mere pass,

including the communication of exemplary levels of performance. High



achieving students are disadvantaged by non-graded or pass/fail grading systems, as their achievements are not reported to employing bodies,

selection committees for postgraduate programs and scholarships (Biggs, 1992). Andre (2000) suggested that grading categories should be consistent with standard university graded assessment policy e.g. 85-100% would be classified as a high distinction, 75-84% as a distinction. In relation to medical education, Miller (2009) suggested that the primary purpose of any grading system is to measure the achievement of specific learning objectives. The information collected allows individual students to know where they stand in relation to the development of needed competencies, as well as supplying faculty and medical administration with information about the effectiveness of teaching strategies. Miller (2009) indicated that a traditional grade stratifies students according to their levels of achievement, can motivate students, rewards effort and may demonstrate suitability for a potential area of study, whereas a pass/fail grade indicates simply that a student has achieved an expected level of competence. She noted that if students are

hypercompetitive, it is unlikely that the grading system alone is responsible for creating that behaviour.

Similarly, if students consistently aim at minimal standards, the teaching-learning environment might lack the ingredients to inspire excellence. Competency-based assessment may differ from other forms of assessment because its outcomes may be based on observed performances carried out in a variety of contexts, such as workplaces or university-based simulations. Measures need to be taken to ensure that assessment decisions are consistent across these contexts (Johnson, 2008). Another problem is that competency-based qualification stakeholders can perceive graded assessments to be the same as norm-referenced assessments (Peddie, 1997; Williams and Bateman, 2003). This situation may lead to competent being equated with average or pass. Schofield and McDonald (2004) suggest that the status of a competent result might be devalued by graded assessments. They highlight the potential risk that competent judgments might equate with a bare minimum, rather than acknowledgement that the learner has reached a pre-determined standard. Hager, Athanasou and Gonczi (1994) suggest that it is possible to both support and oppose graded assessment, depending on the circumstances. They infer that the decision to grade or not to grade is ultimately a policy decision, which should be based on the benefits to be gained and whether grading is the most appropriate

strategy to achieve the desired benefits. Quirk (1995) adopted a similar approach, noting that the benefits and purposes must be clearly identified when making a decision to grade or not to grade.


TYPES OF GRADING

Norm-Referenced Systems

Definition


In norm-referenced systems, students are evaluated in relationship to one another (e.g., the top 10% of students receive an A, the next 30% a B). This grading system rests on the assumption that the level of student performance will not vary much from class to class. In this system, the instructor usually determines the percentage of students assigned each grade, although it may be determined (or at least influenced) by departmental policy.
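In code, a ranking scheme like this can be sketched as follows. The 10/30/40/20 split is only an illustration extending the two percentages mentioned above; the actual allocation, and how ties are broken, would be set by the instructor or department.

```python
def norm_referenced_grades(scores, bands=((0.10, "A"), (0.30, "B"),
                                          (0.40, "C"), (0.20, "D"))):
    """Assign grades by rank: the top fraction of the class in each band
    gets that band's grade.  The default bands are illustrative only."""
    ranked = sorted(scores, reverse=True)   # highest score first
    n = len(ranked)
    grade_for = {}                          # score -> grade (ties share a grade)
    start = 0
    for fraction, grade in bands:
        count = round(fraction * n)         # students falling in this band
        for s in ranked[start:start + count]:
            grade_for.setdefault(s, grade)  # a tied score keeps the higher grade
        start += count
    for s in ranked[start:]:                # anyone left over by rounding
        grade_for.setdefault(s, bands[-1][1])
    return [grade_for[s] for s in scores]
```

Note that a student's grade here depends entirely on classmates' scores: the same raw score can earn an A in a weak section and a B in a strong one, which is the objection raised under the disadvantages of norm-referenced systems.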

Criterion-Referenced Systems
Definition
Norm-referenced tests measure students relative to each other. Criterion-referenced tests measure how well individual students do relative to predetermined performance levels. Teachers use criterion-referenced tests when


they want to determine how well each student has learned specific knowledge or skills. In criterion-referenced systems, students are evaluated against an absolute scale, normally a set number of points or a percentage of the total (e.g., 95-100 = A, 88-94 = B). Since the standard in this grading system is absolute, it is possible that all students could get As or all students could get Ds.
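A minimal sketch of such an absolute scale, using the 95-100 = A and 88-94 = B cutoffs from the example; the C, D, and F lines are invented placeholders, not part of the source:

```python
def criterion_referenced_grade(score, cutoffs=((95, "A"), (88, "B"),
                                               (80, "C"), (70, "D"))):
    """Map a percentage score to a grade against fixed cutoffs.
    Only the A and B cutoffs come from the text; the rest are assumed."""
    for minimum, grade in cutoffs:
        if score >= minimum:
            return grade
    return "F"
```

Because the scale is absolute, mapping a whole class through this function can legitimately yield all As, or all Ds.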

Advantages
Students are not competing with each other and are thus more likely to actively help each other learn. A student's grade is not influenced by the caliber of the class.

Disadvantages
It is difficult to set a reasonable standard for students without a fair amount of teaching experience. Most experienced faculty set criteria based on their knowledge of how students usually perform; thus, criterion-referenced systems often become fairly similar to norm-referenced systems.


Possible modifications
Instructors sometimes choose to maintain some flexibility in their grading system by telling the class in advance that the threshold for grades may be lowered if it seems appropriate. Thus, if a first exam was more difficult for students than the instructor imagined, she/he can lower the grading criteria rather than trying to compensate for the difficulty of the first exam with an easier second exam. Raising the criteria because too many students achieved As, however, is never advisable. Another way of doing criterion-referenced grading is by listing class objectives and assigning grades based on the extent to which the student achieved them. For example, A = Student has achieved all major and minor objectives of the course; B = Student has achieved all major objectives and several minor objectives; etc. Before deciding on a criterion-referenced system, consider: How will you determine reasonable criteria for students? If you are

teaching the class for the first time, it is advisable to maintain some flexibility.

Other Systems

Some alternate systems of grading include contract grading, peer grading, and self-evaluation by students. In contract grading, instructors list activities students can participate in or objectives they can achieve, usually attaching a specified number of points for each activity (e.g., book report = 30 points, term paper = 60 points). Students select the activities and/or objectives which will give them the grade they want and a contract is signed. It is advisable to have qualitative criteria stated in the contract in addition to listing the activities. In some classes, a portion of a student's grade is determined by peers' evaluation of his/her performance. If students are told what to look for and how to grade, they generally can do a good job. The agreement between peer and instructor rating is about 80%. Peer grading is often used in composition classes and speech classes. If used, it should always be done anonymously. Students can also be asked to assess their own work in the class and their assessment can be a portion of the final grade. This method has educational value since learning to assess one's own progress contributes to the university's goal of preparing students to be lifelong learners.
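The contract-grading arithmetic described above can be sketched as follows. The activity point values echo the example in the text; the grade thresholds are assumptions for illustration only:

```python
def contract_grade(chosen, catalog, thresholds=((90, "A"), (75, "B"), (60, "C"))):
    """Total the points for the activities a student contracted to complete
    and map the total to a grade.  Thresholds are illustrative; a real
    contract would also state qualitative criteria for each activity."""
    points = sum(catalog[activity] for activity in chosen)
    for minimum, grade in thresholds:
        if points >= minimum:
            return points, grade
    return points, "below C"

# Point values from the example in the text (presentation is an added assumption)
catalog = {"book report": 30, "term paper": 60, "presentation": 20}
```

A student who contracts for the book report (30 points) and the term paper (60 points) accumulates 90 points, which under these assumed thresholds earns an A.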


A research analysis found that the percentages of self-assessors whose grades agree with those of faculty graders vary from 33% to 99%. Experienced students tend to rate themselves similarly to faculty, while less experienced students generally give themselves higher grades than a faculty grader would. Students in science classes also produced self-assessments which closely matched faculty assessments. If self-assessment is used, the instructor and student should meet to discuss the student's achievement before the self-evaluation is made.

A Brief History

Grading and reporting are relatively recent phenomena in education. In fact, prior to 1850, grading and reporting were virtually unknown in schools in the United States. Throughout much of the nineteenth century most schools grouped students of all ages and backgrounds together with one teacher in one-room schoolhouses, and few students went beyond elementary studies. The teacher reported students' learning progress orally to parents, usually during visits to students' homes.


As the number of students increased in the late 1800s, schools began to group students in grade levels according to their age, and new ideas about curriculum and teaching methods were tried. One of these new ideas was the use of formal progress evaluations of students' work, in which teachers wrote down the skills each student had mastered and those on which additional work was needed. This was done primarily for the students' benefit, since they were not permitted to move on to the next level until they demonstrated their mastery of the current one. It was also the earliest example of a narrative report card. With the passage of compulsory attendance laws at the elementary level during the late nineteenth and early twentieth centuries, the number of students entering high schools increased rapidly. Between 1870 and 1910 the number of public high schools in the United States increased from 500 to 10,000. As a result, subject area instruction in high schools became increasingly specific and student populations became more diverse. While elementary teachers continued to use written descriptions and narrative reports to document student learning, high school teachers began using percentages and other similar markings to certify students' accomplishments in different subject areas. This was the beginning of the grading and reporting systems that exist today.

The shift to percentage grading was gradual, and few American educators questioned it. The practice seemed a natural by-product of the increased demands on high school teachers, who now faced classrooms with growing numbers of students. But in 1912 a study by two Wisconsin researchers seriously challenged the reliability of percentage grades as accurate indicators of students' achievement. In their study, Daniel Starch and Edward Charles Elliott showed that high school English teachers in different schools assigned widely varied percentage grades to two identical papers from students. For the first paper the scores ranged from 64 to 98, and for the second from 50 to 97. Some teachers focused on elements of grammar and style, neatness, spelling, and punctuation, while others considered only how well the message of the paper was communicated. The following year Starch and Elliott repeated their study using geometry papers submitted to math teachers and found even greater variation in math grades. Scores on one of the math papers ranged from 28 to 95, a 67-point difference. While some teachers deducted points only for a wrong answer, many others took neatness, form, and spelling into consideration.


These demonstrations of wide variation in grading practices led to a gradual move away from percentage scores to scales that had fewer and larger categories. One was a three-point scale that employed the categories of Excellent, Average, and Poor. Another was the familiar five-point scale of Excellent, Good, Average, Poor, and Failing (or A, B, C, D, and F). This reduction in the number of score categories served to reduce the variation in grades, but it did not solve the problem of teacher subjectivity. To ensure a fairer distribution of grades among teachers and to bring into check the subjective nature of scoring, the idea of grading based on the normal probability (bell-shaped) curve became increasingly popular. By this method, students were simply rank-ordered according to some measure of their performance or proficiency. A top percentage was then assigned a grade of A, the next percentage a grade of B, and so on. Some advocates of this method even specified the precise percentages of students that should be assigned each grade, such as the 6-22-44-22-6 system. Grading on the curve was considered appropriate at that time because it was well known that the distribution of students' intelligence test scores approximated a normal probability curve. Since innate intelligence and school achievement were thought to be directly related, such a procedure


seemed both fair and equitable. Grading on the curve also relieved teachers of the difficult task of having to identify specific learning criteria. Fortunately, most educators of the early twenty-first century have a better understanding of the flawed premises behind this practice and of its many negative consequences. In the years that followed, the debate over grading and reporting intensified. A number of schools abolished formal grades altogether, believing they were a distraction in teaching and learning. Some schools returned to using only verbal descriptions and narrative reports of student achievement. Others advocated pass/fail systems that distinguished only between acceptable and failing work. Still others advocated a mastery approach, in which the only important factor was whether or not the student had mastered the content or skill being taught. Once mastered, that student would move on to other areas of study. At the beginning of the twenty-first century, lack of consensus about what works best has led to wide variation in teachers' grading and reporting practices, especially among those at the elementary level. Many elementary teachers continue to use traditional letter grades and record a single grade on the reporting form for each subject area studied. Others use numbers or


descriptive categories as proxies for letter grades. They might, for example, record a 1, 2, 3, or 4, or they might describe students' achievement as Beginning, Developing, Proficient, or Distinguished. Some elementary schools have developed standards-based reporting forms that record students' learning progress on specific skills or learning goals. Most of these forms also include sections for teachers to evaluate students' work habits or behaviors, and many provide space for narrative comments. Grading practices are generally more consistent and much more traditional at the secondary level, where letter grades still dominate reporting systems. Some schools attempt to enhance the discriminatory function of letter grades by adding plusses or minuses, or by pairing letter grades with percentage indicators. Because most secondary reporting forms allow only a single grade to be assigned for each course or subject area, however, most teachers combine a variety of diverse factors into that single symbol. In some secondary schools, teachers have begun to assign multiple grades for each course in order to separate achievement grades from marks related to learning skills, work habits, or effort, but such practices are not widespread.

Advantages


Norm-referenced systems are very easy for instructors to use. They work well in situations requiring rigid differentiation among students, where, for example, due to program size restrictions, only a certain percentage of the students can advance to higher level courses. They are generally appropriate in large courses which do not encourage cooperation among students.

Disadvantages
One objection to norm-referenced systems is that an individual's grade is determined not only by his/her achievements, but also by the achievements of others. In a large, non-selective lecture class, you can be fairly confident that the class is representative of the student population; however, in small classes (under 40) the group may not be a representative sample. One student may get an A in a low-achieving section while a fellow student with the same score in a higher-achieving section receives a B. A second objection to norm-referenced grading is that it promotes competition rather than cooperation. When students are pitted against each other for the few As to be given out, they're less likely to be helpful to each other.

Possible Modification

When using a norm-referenced system in a small class, you need to modify the allocation of grades based on the caliber of students in the class. One method of modifying a norm-referenced system is anchoring. Jacobs and Chase in Developing and Using Tests Effectively: A Guide for Faculty (1992), describe the following ways to use an anchor: "If instructors have taught a class several times and have used the same or an equivalent exam, then the distribution of test scores accumulated over many classes can serve as the anchor. The present class is compared with this cumulative distribution to judge the ability level of the group and the appropriate allocation of grades. Anchoring also works well in multi-section courses where the same text, same syllabus, and same examinations are used. The common examination can be used to reveal whether and how the class groups differ in achievement, and the grade in the individual sections can be adjusted accordingly.... If an instructor is teaching a class for the first time and has no other scores for comparison, a relevant and well-constructed teacher-made pretest may be used as an anchor." Modifying the norm-referenced system by anchoring also helps mitigate feelings of competition among students since they may feel they are not directly in competition with each other.
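One deliberately crude way to operationalize anchoring is to shift the grade cutoffs by the gap between the current class and the anchor distribution. The sketch below rests on that simplifying assumption; it is not Jacobs and Chase's own procedure, which compares full score distributions rather than means:

```python
def anchored_cutoffs(current_scores, anchor_scores, base_cutoffs):
    """Shift each grade cutoff by the difference between the current class
    mean and the anchor (historical) mean.  If this class averages 5 points
    below the anchor, every cutoff drops 5 points."""
    anchor_mean = sum(anchor_scores) / len(anchor_scores)
    current_mean = sum(current_scores) / len(current_scores)
    shift = current_mean - anchor_mean   # negative if this class scored lower
    return {grade: cutoff + shift for grade, cutoff in base_cutoffs.items()}
```

Because the cutoffs move with the whole class rather than with any one student's rank, students adjusted this way are not directly competing with each other, which is the mitigating effect described above.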


Before deciding on a norm-referenced system, consider: What is your expected class size? If your class size is smaller than 40, do not use a norm-referenced system unless you use anchoring to modify it. Is it important for students to work cooperatively in this class (e.g., form study groups or work on team projects)? If the answer is yes, a norm-referenced system is not appropriate for your class.

CHARACTERISTICS OF A GOOD GRADING SYSTEM


1. Grades should be relevant to major course objectives. Although this may seem obvious, students often complain that there is no connection between the stated course objectives and the way they are evaluated. For example, one frequent lament goes something like this: "Professor X said the most important thing he wanted us to get out of this class is to be able to think critically about the material, but our entire grade was based on two multiple choice exams which tested our memory of names, dates, and definitions!"


When preparing your grading system for a course, begin with a list of your objectives for the course. Assign relative weights to the objectives in terms of their importance. Be sure the items you are including as part of the grade (e.g., exams, papers, projects) reflect the objectives and are weighted to reflect the importance of the objectives they are measuring.

2. Grades should have recognized meaning among potential users. Because the purpose of grades is to communicate the extent to which students have learned the course materials, grades should be based primarily on the students' performance on exams, quizzes, papers, and other measures of learning specified at the beginning of the course. Items such as effort, attendance, or frequency of participation, although contributing factors to student learning, do not actually reflect the extent to which students have learned the course materials.

3. The grading process should be impartial and compare each student to the same criteria. If you are willing to offer extra credit or opportunities to retake exams or rewrite assignments, the offer should be made to the whole class rather than only to individuals who request these opportunities.


4. Grades should be based on sufficient data to permit you to make valid evaluations of student achievement. It is rarely justifiable to base students' grades solely on their performance on one or two exams. Unless the exams are extremely comprehensive, one or two exams provide an inadequate sampling of course content and objectives. There is also a likelihood that an off-day could lower a student's grade considerably and be an inaccurate reflection of how much he/she has learned. Generally speaking, the greater the number and variety of items used to determine grades, the more valid and reliable the grades will be.

5. The basis for the grading should be statistically sound. If you say that an exam is worth 15% of the total grade, use a procedure for combining scores that ensures that this will be the case. The University of Minnesota Office of Measurement Services can help you find an appropriate procedure.
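The kind of procedure item 5 asks for can be sketched as a weighted combination of normalized scores. The component names, point totals, and weights below are hypothetical:

```python
def course_grade(components):
    """Combine scores so each component contributes exactly its stated weight.
    `components` maps name -> (points_earned, points_possible, weight), with
    weights summing to 1.0.  Normalizing each score to a 0-100 scale before
    weighting is what guarantees an exam announced as 15% really counts 15%."""
    total_weight = sum(w for _, _, w in components.values())
    assert abs(total_weight - 1.0) < 1e-9, "weights must sum to 1"
    return sum(100.0 * earned / possible * weight
               for earned, possible, weight in components.values())
```

For example, an exam weighted 15% on which a student earned 45 of 50 points contributes 100 × 0.9 × 0.15 = 13.5 points to the course total, regardless of how many raw points the exam was worth.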

Research Findings

Over the years, grading and reporting have remained favorite topics for researchers. A review of the Educational Resources Information Center

(ERIC) system, for example, yields a reference list of more than 4,000 citations. Most of these references are essays about problems in grading and what should be done about them. The research studies consist mainly of teacher surveys. Although this literature is inconsistent both in the quality of studies and in results, several points of agreement exist. These points include the following:

Grading and reporting are not essential to the instructional process. Teachers do not need grades or reporting forms to teach well, and students can and do learn many things well without them. It must be recognized, therefore, that the primary purpose of grading and reporting is other than facilitation of teaching or learning. At the same time, significant evidence shows that regularly checking on students' learning progress is an essential aspect of successful teaching, but checking is different from grading. Checking implies finding out how students are doing, what they have learned well, what problems or difficulties they might be experiencing, and what corrective measures may be necessary. The process is primarily a diagnostic and prescriptive interaction between teachers and students. Grading and reporting, however,

30

typically involve judgment of the adequacy of students' performance at a particular point in time. As such, grading is primarily evaluative and descriptive. When teachers do both checking and grading, they must serve dual roles as both advocate and judge for students, roles that are not necessarily compatible. Ironically, this incompatibility is usually recognized when administrators are called on to evaluate teachers, but it is generally ignored when teachers are required to evaluate students. Finding a meaningful compromise between these dual roles is discomforting to many teachers, especially those with a child-centered orientation.

Grading and reporting serve a variety of purposes, but no one method serves all purposes well. Various grading and reporting methods are used to: (1) communicate the achievement status of students to their parents and other interested parties; (2) provide information to students for self-evaluation; (3) select, identify, or group students for certain educational paths or programs; (4) provide incentives for students to learn; and (5) document students' performance to evaluate the effectiveness of instructional programs. Unfortunately, many schools try to use a single method of grading and reporting to achieve all of these purposes and end up achieving none of them very well.

Letter grades, for example, offer parents and others a brief description of students' achievement and the adequacy of their performance. But using letter grades requires the abstraction of a great deal of information into a single symbol. In addition, the cut-offs between grades are always arbitrary and difficult to justify. Letter grades also lack the richness of other, more detailed reporting methods such as narratives or standards-based reports.

These more detailed methods also have their drawbacks, however. Narratives and standards-based reports offer specific information that is useful in documenting student achievement. But good narratives take time to prepare, and as teachers complete more narratives, their comments become increasingly standardized. Standards-based reports are often too complicated for parents to understand and seldom communicate the appropriateness of student progress. Parents often are left wondering whether their child's achievement is comparable with that of other children or in line with the teacher's expectations.

Because no single grading method adequately serves all purposes, schools must first identify their primary purpose for grading and then select or develop the most appropriate approach. This process involves the difficult task of seeking consensus among diverse groups of stakeholders.

Grading and reporting require inherently subjective judgments. Grading is a process of professional judgment, and the more detailed and analytic the grading process, the more likely it is that subjectivity will influence results. This is why, for example, holistic scoring procedures tend to have greater reliability than analytic procedures. However, being subjective does not mean that grades lack credibility or are indefensible. Because teachers know their students, understand various dimensions of students' work, and have clear notions of the progress made, their subjective perceptions can yield very accurate descriptions of what students have learned.

Negative consequences result when subjectivity translates into bias. This occurs when factors apart from students' actual achievement or performance affect their grades. Studies have shown, for example, that cultural differences among students, as well as their appearance, family backgrounds, and lifestyles, can sometimes result in biased evaluations of their academic performance. Teachers' perceptions of students' behavior can also significantly influence their judgments of academic performance. Students with behavior problems often have no chance to receive a high grade because their infractions overshadow their performance. These effects are especially pronounced in judgments of boys. Even the neatness of
students' handwriting can significantly affect teachers' judgments. Training programs help teachers identify and reduce these negative effects and can lead to greater consistency in judgments.

Grades have some value as rewards, but no value as
punishments. Although educators would undoubtedly prefer that motivation to learn be entirely intrinsic, grades and other reporting methods are important factors in determining how much effort students put forth. Most students view high grades as positive recognition of their success, and some work hard to avoid the consequences of low grades. At the same time, no studies support the use of low grades or marks as punishments. Instead of prompting greater effort, low grades usually cause students to withdraw from learning. To protect their self-image, many regard the low grade as irrelevant and meaningless. Other students may blame themselves for the low mark but feel helpless to improve.

Grading and reporting should always be done in reference to learning criteria, never "on the curve." Although using the normal probability curve as a basis for assigning grades yields highly consistent grade distributions from one teacher to the next, there is strong evidence that it is detrimental to relationships among students and between teachers and students. Grading on
the curve pits students against one another in a competition for the few rewards (high grades) distributed by the teacher. Under these conditions, students readily see that helping others threatens their own chances for success. Modern research has also shown that the seemingly direct relationship between aptitude or intelligence and school achievement depends on instructional conditions. When the quality of instruction is high and well matched to students' learning needs, the magnitude of this relationship diminishes drastically and approaches zero. Moreover, the fairness and equity of grading on the curve is a myth.

Relating grading and reporting to learning criteria, however, provides a clearer picture of what students have learned. Students and teachers alike generally prefer this approach because they consider it fairer. The types of learning criteria teachers use for grading and reporting typically fall into three general categories:

1. Product criteria are favored by advocates of standards-based approaches to teaching and learning. These educators believe the primary purpose of grading and reporting is to communicate a summative evaluation of student achievement and performance. In
other words, they focus on what students know and are able to do at a particular point in time. Teachers who use product criteria base grades exclusively on final examination scores, final products (reports or projects), overall assessments, and other culminating demonstrations of learning.

2. Process criteria are emphasized by educators who believe product criteria do not provide a complete picture of student learning. From this perspective, grading and reporting should reflect not just the final results but also how students got there. Teachers who consider effort or work habits when reporting on student learning are using process criteria. So are teachers who count regular classroom quizzes, homework, class participation, or attendance.

3. Progress criteria, often referred to as improvement scoring, learning gain, or value-added grading, consider how much students have gained from their learning experiences. Teachers who use progress criteria look at how far students have come over a particular period of time, rather than just where they are. As a result, grading criteria may be highly individualized. Most of the research evidence on progress criteria in grading and reporting comes from studies of differentially paced instructional programs and special education programs.
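The contrast between grading "on the curve" and grading against learning criteria can be made concrete with a short sketch. Norm-referenced grading assigns grades by rank within the class, while criterion-referenced grading compares each score to fixed standards. The quotas and cutoffs below are invented for illustration, not drawn from any actual grading policy.

```python
def grade_on_curve(scores):
    """Norm-referenced: a grade depends on rank within the class.

    Illustrative quotas: top 10% get A, next 20% B, next 40% C, next 20% D.
    Ties share the higher rank in this simple sketch.
    """
    ranked = sorted(scores, reverse=True)
    n = len(ranked)

    def grade(score):
        fraction_above = ranked.index(score) / n
        if fraction_above < 0.10:
            return "A"
        if fraction_above < 0.30:
            return "B"
        if fraction_above < 0.70:
            return "C"
        if fraction_above < 0.90:
            return "D"
        return "F"

    return [grade(s) for s in scores]


def grade_by_criteria(scores):
    """Criterion-referenced: a grade depends only on the student's own score
    measured against fixed cutoffs (again invented for illustration)."""
    def grade(score):
        for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
            if score >= cutoff:
                return letter
        return "F"

    return [grade(s) for s in scores]


class_scores = [95, 88, 86, 84, 72]
curve_grades = grade_on_curve(class_scores)         # ["A", "B", "C", "C", "D"]
criterion_grades = grade_by_criteria(class_scores)  # ["A", "B", "B", "B", "C"]
```

On the curve, only one student in this class can earn an A no matter how many meet a high standard, which is why helping a classmate threatens one's own grade; against fixed criteria, every student who reaches a standard earns the corresponding grade.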

Teachers who base their grading and reporting procedures on learning criteria typically use some combination of these three types. Most also vary the criteria they employ from student to student, taking into account individual circumstances. Although usually done in an effort to be fair, the result is a "hodgepodge grade" that includes elements of achievement, effort, and improvement. Researchers and measurement specialists generally recommend the use of product criteria exclusively in determining students' grades. They point out that the more process and progress criteria come into play, the more subjective and biased grades are likely to be. If these criteria are included at all, they recommend reporting them separately.

Conclusion

The issues of grading and reporting on student learning continue to challenge educators. However, more is known at the beginning of the twenty-first century than ever before about the complexities involved and how certain practices can influence teaching and learning. To develop grading and reporting practices that provide quality information about student learning requires clear thinking, careful planning, excellent communication skills, and an overriding concern for the well-being of
students. Combining these skills with current knowledge on effective practice will surely result in more efficient and more effective grading and reporting practices.

QUESTIONNAIRE

1) How valid did students consider the individual assessment items?
2) How fairly did students perceive the individual assessment items had been marked?
3) How has grading of the course affected:
4) Student motivation and effort in the course?
5) The level of student reflection and critical thinking?
6) Student interaction within the course? Group cohesion and competition?
7) Students' sense of achievement at the completion of the course?
8) The attitudes of students towards lifelong learning?
9) Student overall enjoyment of the course?
10) Other relevant learning outcomes?

