
INTRODUCTION

Testing, assessment, and evaluation are part of life in this modern era. Many schools around
the world are constantly assessed, both to observe their educational development and to
evaluate the quality of their systems. Entrance to educational establishments, to
professions and even to entire countries is sometimes controlled by tests. Tests play an essential
and controversial role in allowing access to the limited resources and opportunities that our
world provides. The importance of understanding what we test, how we test and the impact that
the use of tests has on individuals and societies cannot be overstated. Testing is more than a
technical activity; it is also an ethical enterprise. The practice of language testing draws upon,
and also contributes to, all disciplines within applied linguistics. However, there is something
fundamentally different about language testing.

Language testing is a methodology for inquiring into and investigating language ability and
learning needs. According to Suryanto (2013), language testing is an assessment of how far
students have understood the language they are learning. Being central to language teaching and
learning, it provides goals for language teaching and monitors both teachers' and learners'
success in reaching these goals. Language testing also provides a methodology for
conducting experiments and investigating both language teaching and language learning. Spolsky
(2001) states that in the course of the first 2000 years that human abilities have been assessed
formally, assessment and evaluation have become progressively more powerful. Language
testing, a sub-field within applied linguistics, has evolved and expanded in a number of ways in
the past decades. Bachman (1999) presents a brief review of language testing at the turn of the
century in the newsletter of the American Association for Applied Linguistics. He describes a
period when language testing practice, as reflected in large-scale institutional language testing
and in most language testing textbooks, was informed essentially by a theoretical view of
language ability as consisting of skills (listening, speaking, reading, and writing) and
components (e.g., grammar, vocabulary, pronunciation). The approach taken to test design was
based on the idea of testing isolated discrete points of language, while the primary concern was
psychometric reliability. Language testing research was dominated largely by the hypothesis that
language proficiency consisted of a single trait, and it relied on a quantitative statistical research
methodology.

Tests, assessment and evaluation are the parts of language testing, and they are significant
components in teaching and learning the English language. Without an effective evaluation
program it is not possible to know whether students have learned, whether teaching has been
effective, or how best to address student learning needs. The quality of the tests, assessment and
evaluation used in the educational process is closely related to student
performance.

The purpose of this paper is to explain the definitions of test, assessment,
and evaluation; the types of tests based on categories; what and how to
test in listening and reading; what and how to test in speaking and writing; and, last, the
principles of a good test. This paper can help students and teachers to
understand language testing.

The writer will first explain the definitions of test, assessment, and evaluation.
The second part explains the types of tests based on categories. The third explains what
and how to test in listening and reading, and in speaking and writing. This is followed by an
explanation of the principles of a good test, and the last part is the conclusion of this paper.

The definition of test, assessment, and evaluation

A. Test

Testing is a matter of concern to all teachers, whether they are in the classroom or engaged in
syllabus or materials design, administration or research (Suryanto, 2013). According to Carroll (1968), a
psychological or educational test is a procedure designed to elicit certain behavior from which one
can make inferences about certain characteristics of an individual. A test is therefore a
point of measurement, intended to measure a set of skills at a certain time. Spolsky (2001) further mentions
five major purposes for using tests: first, as a competitive selection
device; second, to provide information on the quality of the product to
those who are paying for an education system; third, to certify that
an individual has achieved a specific level of technical or professional skill; fourth, for
prediction or prognosis of the probable results of training; and last, as an integral
part of all good teaching.

B. Assessment

Assessment is the systematic collection, review and use of information about educational
programs to improve student learning. Gipps and Lynch (2005) describe assessment as
judging the ability of a learner, based on a test or otherwise, and using the judgment as a constructive
element in learning over time. Assessment focuses on what students know, what they are able to
do, and what values they have when they graduate. Assessment is concerned with the collective
impact of a program on student learning, and it provides feedback on knowledge,
skills, attitudes, and work products for the purpose of elevating future performance and learning
outcomes. According to Suryanto (2013), assessment is broader in scope than testing and involves
gathering information over a period of time. This information might include formal tests,
classroom observation, student self-assessments, or other data sources.

Assessment provides information on whether teaching and learning have been
successful. However, the information it provides has a number of potential audiences
whose precise requirements may vary. Classroom teachers need regular information on how
pupils' knowledge, skills and understanding are developing, both to inform how they should
adjust their teaching and to determine what kind of feedback is needed to improve pupils'
learning. On the other hand, school principals and policy makers need additional, broader
information on the quality of education in a school or country. The sort of comparative data
required for this purpose needs a high level of reliability and uniformity. In the case of language
as a school subject this requirement is challenging because it is difficult to create tests which are
manageable but at the same time faithful to the aims of the subject. Employers and society at
large also need reliable information which can help certify achievement and provide a basis for
selection. Parents too require information which can help them understand their children's
achievements. Learners also need to know how they are progressing and how to improve their
performance, but they may need to be protected from the potentially negative effects of
assessment. A starting point for resolving tensions related to matters of assessment is to develop
understanding of other points of view. A key challenge is to develop a system of assessment that
acknowledges the different functions of assessment and helps to see these as complementary
rather than in opposition to each other.

C. Evaluation

Andrews and Werner (1988) provide a fairly comprehensive definition of
evaluation. According to them, to evaluate is to make an explicit judgment about the worth of
all or part of a program by collecting evidence to determine if acceptable standards have been
met. This definition of evaluation has two key terms. Standards are ideals or desired qualities or
conditions against which actual objectives are to be measured. Evidence is information
necessary to help us confirm whether or not the required standards have been met by the
program. For example, adoption of no-till practices in a watershed is the standard, and
the percentage of farmers adopting the no-till practice within the first five years of the project is the
evidence. The standards, or desired qualities or conditions against which program outcomes are
measured, come straight from the written goals and objectives of the program.

There are some types of tests based on the categories:

1. When to test?

A. Summative Vs formative

A summative test is an attempt to summarize students' learning at some
point in time, usually at the end of the course. An example of a summative test is the UN (Ujian
Nasional). The purpose of this test is to determine the ability of students. A formative
test, meanwhile, is held to find out whether or not the program is running as intended. Its results
let the students or participants plan what they should do after the test. Examples of
formative tests are daily tests after the material has been taught and mid-semester tests.

B. Pre-test Vs post-test

A pre-test is a test given before the learning and teaching activity begins. It is a
preliminary test administered to determine a student's baseline knowledge or preparedness for
an educational experience or course of study. A post-test is a test given after the end of the learning
and teaching activity.

C. Placement

The aim of a placement test is to help sort new students into teaching groups of roughly the
same level. As they are not related to any particular course taken, these tests often start simply
and get more difficult, to cater for a range of abilities. A placement test can identify the performance
level of the students and place the test-takers at an appropriate level of instruction.

D. Aptitude

Aptitude tests are similar to skill and intelligence tests, and are used to determine an
individual's capability in performing particular tasks. Aptitude tests frequently consist of items
that are intended to evaluate the taker's special abilities in a designated area. There are many
kinds of aptitude tests that seek to gauge one's capacity in a certain area, such as verbal,
numerical, clerical, sensory, spatial or mechanical, and logic and reasoning skills.

E. Progress

A progress test consists of activities based on the material the teacher
wants to check. To evaluate it, the teacher can work out a system of points that later
make up a mark. Typically, such tests do not influence the students' final mark at the end of
the year.

F. Achievement

Achievement tests are meant to check the learners' mastery of the material covered. The
test is based on a syllabus studied or a book used during the course. This test could be described
as a fair test, for it focuses mainly on the detailed material that the students are supposed to have
studied. In other words, it is a test given to find out how much of the material the students have
mastered after taking part in the learning process or instruction.

2. How to score?

Subjective Vs Objective

A subjective test is one in which the impression or opinion of the scorer determines the score or
evaluation of performance. An objective test is one for which the scoring procedure is completely
specified, enabling agreement among different scorers. This means that an answer key can be made in
advance and a correct answer must match the one in the answer key.
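
The idea of objective scoring against an answer key prepared in advance can be sketched in a few lines of Python. This is only an illustration: the function name `score_objective`, the five-item key and the student's responses are all invented for the example and do not come from any real test.

```python
def score_objective(responses, answer_key):
    """Count the responses that exactly match the prepared answer key."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same length")
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


# Hypothetical data: the key is fixed before the test is administered.
answer_key = ["b", "d", "a", "c", "a"]
student = ["b", "d", "c", "c", "a"]  # the third answer does not match the key

# Two scorers who apply the same key cannot disagree about the result.
scorer_one = score_objective(student, answer_key)
scorer_two = score_objective(student, answer_key)
print(scorer_one, scorer_two)
```

Because the procedure leaves no room for the scorer's impression, any two scorers arrive at the same number, which is what distinguishes it from subjective scoring.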

3. How much to test?

Discrete-point Vs Integrative

According to the Longman Dictionary, a discrete-point test is a
language test that is meant to test a particular language item, e.g. tenses. The basis of this type of
test is that it can test components of the language (grammar, vocabulary, pronunciation, and
spelling) and language skills (listening, reading, speaking, and writing) separately. The discrete-point
test is a common test used by the teachers in our schools. For example, after the class has studied a
grammar topic or new vocabulary and practiced it a great deal, the teacher gives a
test based on the covered material. This test usually includes only the items that were studied and
does not include anything from a different field. According to the Longman Dictionary, the
integrative test is intended to check several language skills and language components together or
simultaneously. Hughes (1989:15) states that integrative tests display the learners'
knowledge of grammar, vocabulary and spelling together, not as separate skills or items. The
teacher should incorporate both types of testing for effective evaluation of the students' true
language abilities.

4. How real is the test situation?

Direct Vs Indirect

Direct testing is intended to test certain specific abilities, and
preparation for it usually involves persistent practice of certain skills. Indirect testing tests the
use of the language in real-life situations. Moreover, it suits all situations, whereas direct testing
is bound to certain tasks intended to check a certain skill. Hughes (ibid.) assumes that indirect
testing is more effective than direct testing, for it covers a broader part of the language. This means
that the learners are not constrained to one particular skill and a relevant exercise. They are free
to draw on all four skills; what is checked is their ability to operate with those skills and apply
them in various, even unpredictable, situations. This is the true indicator of the learners' real
knowledge of the language.

5. What are the students' scores compared to, and how?

Norm-referenced Vs Criterion-referenced

A norm-referenced test measures the knowledge of the learner and compares it with the
knowledge of other members of his or her group. The learner's score is compared with the
scores of the other students; in other words, it is a test that compares results across the population.
It has no set criteria and no clear cut score; an example is the UMPTN test.
A criterion-referenced test, by contrast, measures the knowledge of the students according to set
standards or criteria. This means that there will be certain criteria according to which the students
will be assessed. There will be various criteria for different levels of the students' language knowledge.
Here the aim of testing is not to compare the results of the students with one another. It is connected
with the learner's knowledge of the subject. An example is the TOEFL test.
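
The contrast between the two interpretations can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the group scores are invented, the cut score of 75 is arbitrary, and the percentile rank is just one simple way of expressing a learner's position within a group.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percentage of the group scoring below this score."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def meets_criterion(score, cut_score):
    """Criterion-referenced view: compare the score with a fixed standard."""
    return score >= cut_score


# Hypothetical scores for a group of ten learners.
group = [45, 52, 58, 61, 67, 70, 74, 80, 85, 91]
learner = 70

print(percentile_rank(learner, group))  # position relative to the other learners
print(meets_criterion(learner, 75))     # result against the set standard
```

The same raw score of 70 is read in two ways: against the group it sits in the middle, while against the fixed standard it falls short, regardless of how anyone else performed.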

What and how to test listening and reading (Barrett's Taxonomy)?

Language testing covers skills and sub-skills. The most popular framework is Barrett's Taxonomy. It is
the best basis for testing listening and reading. The levels in listening and reading consist of literal
comprehension, reorganization, inferential comprehension, evaluation and appreciation. The
first is appreciation (highest): students give an emotional or image-based response, with tasks such
as critique, appraise, comment and appreciate. The second is evaluation: students make judgments in
light of the material, with tasks such as analyze, appraise, evaluate, justify, reason, criticize and
judge. The third is inference: students respond to information that is implied but not directly stated,
with tasks such as predict, infer and guess. Then comes reorganization (classify, regroup, rearrange,
assemble, collect and categorize), which means that students organize or order the information in a
different way than it was presented. The last is literal comprehension (lowest), with tasks such as
label, list, name, relate, recall, repeat and state: students identify information directly stated.

a. Testing of listening skills


Listening skills consist of sub-skills and levels. The first thing to test is the students' sub-skills:
identifying main facts and details, relating cause and effect, identifying the sequence
of events, predicting outcomes, and inferring meaning from contextual clues. There are several criteria
for the selection of listening texts. The first is language, consisting of lexical complexity, semantic
complexity and syntactic complexity. The next criteria are accessibility, text length, authenticity, audio
quality, and ideas, which focus on the students' schemata, experiences, cultures, and language
proficiency. The last criterion is exploitability, consisting of adaptability and simplification. The
sources of listening texts include announcements, interviews, reports, stories, lectures, dialogues,
poems, plays, songs, advertisements, speeches, talks and news; the teacher can use all of these to test
the listening skills of students. Moreover, the teacher can test the students using listening text types
(genres) such as descriptive, narrative, expository, discussion, speech, talk, interview and poem.

b. Testing Reading Skill

There are several criteria for the selection of reading texts:
ideas, exploitability, language, accessibility and text length.
Sources of reading texts include journals, mass media, comics and
storybooks, as well as the internet,
reports, manuals and many others. Finally, the writer will explain the types
of reading test questions. These are: true or false, rearrangement,
structured (controlled, guided), open-ended (subjective response, free writing), MCQ (question,
matching and completion with some options) and cloze procedure. In testing reading skills the
teacher can use many sources of reading texts, such as theaters, comics, reports, story books,
reference books, the internet, textbooks, travel agencies, restaurants, manuals, and mass media
like newspapers, magazines, radio or TV. In addition, descriptive, poem, play, table, graph, chart,
expository, report, and narrative texts can be used as reading text types or genres.

What and how to test speaking and writing?

Speaking and writing tests use the Bloom's Taxonomy model, which was created by Benjamin
Bloom. Bloom's Taxonomy is very useful as a structure in which to categorize test questions.
The following chart lists each level of Bloom's Taxonomy with its typical verbs.

Knowledge: arrange, define, label, list, memorize, name, relate, recall, repeat, state.

Comprehension: classify, describe, discuss, explain, express, identify, indicate, locate, recognize,
report.

Application: apply, choose, demonstrate, illustrate, interpret, operate, solve, use, employ.

Analysis: analyze, appraise, calculate, categorize, contrast, criticize, differentiate, distinguish.

Synthesis: arrange, assemble, collect, compose, plan, construct, create, design, develop, propose.

Evaluation: appraise, argue, assess, justify, judge, rate, support, value, evaluate.

a. Testing of speaking skills

To test speaking skills the teacher can use Bloom's Taxonomy. There are a
number of components that need to be tested in speaking skills: language, consisting of vocabulary,
syntax, grammar and sentence complexity; organization, consisting of appropriateness and format;
pronunciation, consisting of accuracy and clarity; content or ideas, consisting of clarity, quality and
quantity; and turn-taking, fluency, confidence, eye contact and style. Examples of tests of
speaking skill are giving a speech, giving a talk and singing.

b. Testing of writing skills

The last is the test of writing skills. The skills that should be measured in testing writing
are organization (appropriateness and paragraphing), content or ideas (quantity, quality/level
and clarity) and language (vocabulary, sentence complexity, syntax and accuracy). One
example of a test of writing skill is writing a paper.

The Principles of a Good Test

There are several principles of a good test: reliability, validity, practicality and
discrimination.

Reliability

Reliability means the stability of test scores. A test cannot measure anything well
unless it measures consistently. Two somewhat different types of consistency or reliability are
involved: reliability of the test itself, and reliability of the scoring of the test. Test reliability may
be estimated in a number of ways: retesting the same individuals with the same test; using
alternate or parallel forms, that is, different versions of the same test; and giving a single
administration of one form of the test and then estimating reliability from that one set of results.
Finally, it must always be remembered that reliability refers
purely and simply to the precision with which the test measures. No matter how high the
reliability quotient, it is by no means a guarantee that the test measures what the test user wants
to measure. Data concerning what the test measures must be sought from some source outside
the test itself. This problem will be considered in the following section.
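
The first approach, retesting the same individuals with the same test, can be sketched as follows. The text does not say which statistic is used as the reliability quotient; the sketch assumes the Pearson correlation between the two sets of scores, a common choice for this purpose. The function name `pearson_r` and both score lists are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores of the same five students on two administrations.
first_sitting = [55, 62, 70, 78, 90]
second_sitting = [58, 60, 73, 80, 88]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 3))
```

A value close to 1 indicates that students kept roughly the same relative standing on both occasions. As the paragraph above stresses, it says nothing about whether the test measures what the user wants to measure.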
Validity

Empirical validity is of two general kinds, predictive and concurrent (or status) validity,
depending on whether test scores are correlated with subsequent or concurrent criterion
measures. According to Bynom (Forum, 2001), validity deals with what is tested and the degree to
which a test measures what it is supposed to measure (Longman Dictionary, LTAL). For example,
if we test the students' writing skills by giving them a composition test on "Ways of Cooking", we
cannot call such a test valid, for it can be argued that it tests not their ability to write, but their
knowledge of cooking as a skill. Publishers of standard tests should be expected to provide
evidence of the validity of their measures. Reliability also matters here, because a test is a
measurement tool whose results are used to make important decisions. It is commonly said that a
reliable score is one that gives consistent measurement results, does not fluctuate, and can be
trusted. Lionova, quoting Ebel (1986: 223), suggests that a test cannot be said to be good if it does
not show reliability.

Practicality

In writing or selecting a test, we should certainly pay some attention to how long the
administering and scoring of it will take. Our task, then, is to select an instrument which is of
sufficient length to yield dependable and meaningful results but which will also fit comfortably
into the time that can be made available for testing. However, we need to have some general
guidance as to the meaning of test scores to begin with, for without this it is extremely difficult
to use an instrument in an efficient manner.

Discrimination

Discrimination is used to distinguish the levels of students' ability. Language tests discriminate
among test-takers, and we can find out how well a test discriminates by calculating its
discrimination index.
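
The text does not give a formula for the discrimination index. One common version, assumed here, compares how many test-takers in the highest-scoring group and in the lowest-scoring group answered a given item correctly, divided by the group size. The sketch below uses the conventional 27% of test-takers for each group; the function name `discrimination_index` and all the data are invented for illustration.

```python
def discrimination_index(total_scores, item_correct, fraction=0.27):
    """Upper-lower discrimination index for a single test item.

    total_scores: each test-taker's total score on the whole test
    item_correct: 1 if that test-taker answered the item correctly, else 0
    fraction:     share of test-takers placed in each comparison group
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("both lists must describe the same test-takers")
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    if 2 * group_size > n:
        raise ValueError("too few test-takers for two separate groups")
    # Rank test-takers from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper_group = ranked[:group_size]
    lower_group = ranked[-group_size:]
    upper_correct = sum(item_correct[i] for i in upper_group)
    lower_correct = sum(item_correct[i] for i in lower_group)
    return (upper_correct - lower_correct) / group_size


# Hypothetical data for ten test-takers and one item.
totals = [95, 88, 82, 75, 70, 64, 59, 51, 43, 38]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

index = discrimination_index(totals, item)
print(round(index, 2))
```

Under this definition the index ranges from -1 to 1. A positive value means stronger students answered the item correctly more often than weaker students, and a value near zero means the item does little to separate the two groups.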
CONCLUSION

A language test is a procedure
for gathering evidence of general or specific language abilities from performance on tasks
designed to provide a basis for predictions about an individual's use of those abilities in real-world
contexts. The importance of language testing and the evolution of modern linguistics have
made teachers and testers aware of the need for a comprehensive analysis of the
language under consideration. Teachers are therefore constantly revising their teaching strategies in
the light of advances in modern linguistics and psychology, using tests, assessment and evaluation.

Language tests use models to design the level of difficulty of the test. Test design commonly
draws on Bloom's Taxonomy for speaking and writing tests and
Barrett's Taxonomy for listening and reading tests. Testers, meanwhile, are trying to
improve their techniques for testing language ability and to apply the principles of a good test,
namely validity, reliability, practicality and discrimination, in step with advances in teaching.
Certainly, not all innovations in language science have had equal or similar effects upon teaching
and testing; each has paid certain attention to the relative importance of each skill or component.
Thus, it is the responsibility of the teacher to choose the most appropriate method of estimating
learners' knowledge or ability, particularly where learning a second language is concerned.

REFERENCES

Davidson, G. F. (2007). Language Testing and Assessment. New York: British Library.

Suryanto, Jati. (2013). Bloomfield and Barrets Taxonomy as the basis for making a good
language testing. University of Muhammadiyah Yogyakarta.

Banerjee, J. C. (2011). Language testing and assessment (Part I). Cambridge University Press,
34, 213-236.

Gipps and Lynch. (2005). An introduction to (English) Language testing. Retrieved May 3, 2013,
from https://www.An+introduction+to+%28English%29+Language+testing.com.

Ozerova, Anzelika. (2004). Types of Tests Used in English Language. University of Latvia:
May 12, 2004. From www.bestreferat.ru

Martyniuk, et al. (2007). Evaluation and assessment within the domain of Language(s) of
Education. Journal of Education: November 10, 2007. From
http://www.coe.int/t/dg4/linguistic/Source/Prague07_Assessment_EN.

Spinello, Serena. (2010). The type of aptitude test. From
http://www.ehow.com/facts_4915004_types-aptitude-tests.html

Suvedi, Murari. (2011). Evaluation. From
http://hostedweb.cfaes.ohio-state.edu/brick/suved2.htm

http://www.teachers-corner.co.uk/four-types-of-tests/

http://ohmyluna.blogspot.com/2011/01/all-types-of-language-test.html

Leila, Behrahi. (2010). The history of language testing. April 21, 2010. From
http://www.lorenglish.blogfa.com/post-5.aspx

LANGUAGE TESTING

A. Introduction to Language Testing

A test is a method of measuring a person's ability, knowledge, or performance in a given
domain. A test is an instrument or procedure designed to elicit performance from learners with
the purpose of measuring their attainment of specified criteria. The method may be intuitive and
informal or may be structured and explicit (Brown, 2001).

Language testing is the administration of tests in order to assess and measure a person's language
competence and performance, that is, to test language ability. It is an evaluation of an individual's
language proficiency.

B. The Interrelation between Language Teaching and Language Testing

Tests have become a way of life in the educational world and tests are often used for pedagogical
purposes, either as a means of motivating students to study, or as a means of reviewing material
taught. Thus language tests can be valuable sources of information about the effectiveness of learning
and teaching. Language teachers regularly use tests to help diagnose students' strengths and
weaknesses, to assess student progress, and to assist in evaluating student achievement. As
sources of feedback on learning and teaching, language tests can thus provide useful input into
the process of language teaching (Bachman, 1990).

C. Purpose and Method of Language Testing

1. Purpose of Language Testing

Language tests have many uses in educational programs, and quite often the same test will be
used for two or more related purposes. There are six purposes of language testing.

a. To determine readiness for instructional programs.

It is used to separate those who are prepared for an academic or training program from those who
are not.

b. To classify or place individuals in appropriate language classes.

Other screening tests try to distinguish degrees of proficiency so that examinees may be assigned
to specific sections or activities on the basis of their current level of competence.

c. To diagnose the individual's specific strengths and weaknesses.

Such a test generally consists of several short but reliable subtests measuring different language
skills or components of a single broad skill. On the basis of the individual's performance on each
subtest, we can plot a performance profile which will show his relative strengths in the various areas
tested.

d. To measure aptitude for learning.

An aptitude test serves to indicate an individual's facility for acquiring specific skills and
learning.

e. To measure the extent of student achievement of the instructional goals.

Achievement tests are used to indicate group or individual progress toward the instructional
objectives of a specific study or training program.

f. To evaluate the effectiveness of instruction.

Achievement tests are used exclusively to assess the degree of success of the instructional
program.

For simplicity, the six categories can be grouped under three headings:

a. Aptitude

An aptitude test serves to indicate an individual's facility for acquiring specific skills and learning.

b. General Proficiency

A general proficiency test indicates what an individual is capable of doing now (as the result of his
cumulative learning experiences), though it may also serve as a basis for predicting future
attainment.

c. Achievement

An achievement test indicates the extent to which an individual has mastered the specific skills or
body of information acquired in a formal learning situation.

2. Methods of Language Testing

a. Translation

b. Dictations

Dictation is certainly a useful pedagogical device with beginning and low-intermediate-level
learners of a foreign language, and the responses that such students make to dictations will
certainly tell the teacher something about their phonological, grammatical, and lexical
weaknesses. As a testing device, however, dictation must be regarded as generally both
uneconomical and imprecise.

c. Composition

A composition test allows the examinee to compose his own relatively free and extended written
responses to problems set by the examiner. It is also called a test of written competence or a
free-response test. In foreign-language testing these responses may consist of single paragraphs or
full essays. In this case, the evaluation considers grammatical structures, the lexicon of the target
language, ideas, and organization.

The difficulties in using and assessing compositions as a measurement device are: (1) eliciting
the specific language items that the test writer particularly wishes to test; (2) finding a way to
evaluate these free responses reliably and economically.

d. Scored Interview

A scored interview is a device for assessing oral competence. It is also called a free-response test
because it allows examinees to express their answers in their own words in a relatively
unstructured testing situation.

e. Multiple-choice Items

Multiple-choice item types were developed to overcome a number of the weaknesses of the
composition test.

f. Short-answer Items

Short-answer tests combine some of the virtues of both multiple-choice and composition tests. They
require the examinee either to complete a sentence or to compose one of his own according to
very specific directions.
