
JMDE

Journal of MultiDisciplinary Evaluation


Number 2, February 2005
Editors E. Jane Davidson & Michael Scriven

Associate Editors Chris L. S. Coryn & Daniela C. Schröter

Assistant Editors Thomaz Chianca, Nadini Persaud, John S. Risley, Regina Switalski Schinker, Lori Wingate, Brandon W. Youker

Webmaster Dale Farland

Mission The news and thinking of the profession and discipline of evaluation in the world, for the world

A peer-reviewed journal published in association with The Interdisciplinary Doctoral Program in Evaluation The Evaluation Center, Western Michigan University

Editorial Board Katrina Bledsoe, Nicole Bowman, Robert Brinkerhoff, Tina Christie, J. Bradley Cousins, Lois-Ellen Datta, Stewart Donaldson, Gene Glass, Richard Hake, John Hattie, Rodney Hopson, Iraj Imam, Shawn Kana'iaupuni, Ana Carolina Letichevsky, Mel Mark, Masafumi Nagao, Michael Quinn Patton, Patricia Rogers, Nick Smith, Robert Stake, James Stronge, Dan Stufflebeam, Helen Timperley, Bob Williams

Table of Contents

PART I

Editorial
In this Issue: JMDE(2) (p. 1)
Marketing Evaluation as a Profession and a Discipline, E. J. Davidson (p. 3)

Articles
Monitoring and Evaluation for Cost-Effectiveness in Development Management, Paul Clements (p. 11)
Network Evaluation as a Complex Learning Process, Susanne Weber (p. 39)

Practical Ethics for Program Evaluation
Client Impropriety, Chris L. S. Coryn, Daniela C. Schröter, & Pamela A. Zeller (p. 72)

Ideas to Consider
Managing Extreme Evaluation Anxiety Through Nonverbal Communication, Regina Switalski Schinker (p. 76)
Is Cost Analysis Underutilized in Decision Making?, Nadini Persaud (p. 81)
Is E-Learning Up to the Mark?, Oliver Haas (p. 83)
The Problem of Free Will in Program Evaluation, Michael Scriven (p. 102)

PART II: Global Review - Regions & Events
Japan Evaluation Society: Pilot Test of an Accreditation Scheme for Evaluation Training, Masafumi Nagao (p. 105)
Aotearoa/New Zealand Starting National Evaluation Conferences, Pam Oliver, Maggie Jakob-Hoff, & Chris Mullins (p. 107)

Conference Explores the Intersection of Evaluation and Research with Practice and Native Hawaiian Culture, Matthew Corry (p. 109)
Washington, DC: Evaluation of Driver Education, Northport Associates (p. 111)
African Evaluation Association, AfrEA (p. 123)
International Association for Impact Assessment, Brandon W. Youker (p. 125)
An Update on Evaluation in Canada, Chris L. S. Coryn (p. 130)
An Update on Evaluation in Europe, Daniela C. Schröter (p. 132)
An Update on Evaluation in the Latin American and Caribbean Region, Thomaz Chianca (p. 137)

PART III: Global Review - Publications
Summary of American Journal of Evaluation, Volume 25(4), 2004, Melvin M. Mark (p. 145)
New Directions for Evaluation, John S. Risley (p. 150)
Education Update, Nadini Persaud (p. 153)
The Evaluation Exchange (Harvard Family Research Project), Brandon W. Youker (p. 157)
Canadian Journal of Program Evaluation, Volume 19(2), Fall 2004, Chris L. S. Coryn (p. 161)
Evaluation: The International Journal of Theory, Research and Practice, Volume 10(4), October 2004, Daniela C. Schröter (p. 164)
Measurement: Interdisciplinary Research and Perspectives, Volume 1(1), 2003, Chris L. S. Coryn (p. 168)

Editorial

In this Issue: JMDE(2) Michael Scriven

The journal homepage has had over 6,000 hits, and there have been around 2,500 downloads of all or part of the first issue. Our list of 932 people who want to be notified of new issues now includes residents from more than 100 countries. The current issue is a bit longer: it runs to over 170 pages, but you can download just the parts that interest you. Here are some highlights.

There is an editorial by Jane Davidson on the perception of evaluation by others, and what we can and should do about it.

One of the major articles is by Paul Clements, who raises serious concerns about the crucial matter of how the big (U.S. and other) agencies are evaluating their vast expenditures on development programs overseas. He's unlike most critics in two respects: (i) he went to Africa to check things out on the ground for himself, and (ii) he suggests a way to raise the standards considerably. You will no doubt realize that both the problem he writes about and his proposed solution have obvious generalizations to other areas of public and private investment.

The other major article is from Germany, in which Susanne Weber sets out an approach to monitoring and evaluation based on current abstract sociological theorizing. Her approach also bears on systems theory and organization learning, in case those are interests of yours. That article and the other German contribution (on the evaluation of online education) are interesting not only for their content, but for the sense they provide of how evaluation is seen by scholars in Europe.

We introduce a new feature, Ideas to Consider, for short pieces, selected by the editors and ideally just of memo length, that canvass ideas we think deserve attention from evaluators. There's a quartet of these to kick the feature off: one on the still-persisting shortage of cost analysis in published articles and reports on evaluation, one on the role of body language in creating and countering evaluation anxiety, one on approaches to evaluating online education, and one on the tricky problem of how to evaluate programs (or drugs) that depend on the motivation of users for their success (should attrition count as program failure or subject failure?).

Our strong interest in international and cross-cultural evaluation continues with an update on several of our previous articles covering evaluation in regions and publications around the world. The review of evaluation in Latin America and the Caribbean in the last issue has already been reprinted in translation, as it well deserved, and its sequel here tells an impressive story of activity in that region. The sixteen articles in this section tell a remarkable story: evaluation is changing the world and the world is changing evaluation!

MS


Editorial

Marketing Evaluation as a Profession and a Discipline E. Jane Davidson, Davidson Consulting Limited, Aotearoa/New Zealand

It can be a bit like pushing sand uphill with a pointy stick, as they say here in New Zealand. One of the great challenges in developing evaluation as a discipline is getting it recognised as distinct from the various other disciplines to which it applies. In this piece, I offer a few reflections on these challenges, recount a story in which a group of practitioners from outside the discipline actually sat up and took notice, and propose some possible solutions for moving us forward.

It's all very well for us to come together in our evaluation communities around the world and talk to each other about our unique profession. Not that there isn't a lot to talk about. After all, we are still working on building a shared understanding of what it is exactly that makes evaluation distinct from related activities such as applied research and organisational development. But with a little more application, I am hopeful we can persuade enough of a critical mass to call it a reasonable consensus.

Meanwhile, it seems to me that a more difficult yet equally important task is to articulate clearly to the outside world, to clients and to other disciplines, what it is that makes evaluation unique. Right across the social sciences, and in many other disciplines where evaluation is relevant in more than just its intradisciplinary application, it seems that the vast majority of practitioners consider it to be part of their own toolkit already, albeit often under a different name. Most of these practitioners consider evaluators delusional when we suggest that evaluation is sufficiently distinct to call a profession, let alone an autonomous discipline. Here's a fairly typical response, in this case from an industrial/organisational psychologist:

"[A] discipline evaluation is not. Disciplines are systematic, coherent, founded more often than not on sound theory, and offered as programs in accredited colleges, universities, and professional schools. Evaluation, without detracting in the least from its multitude of contributions and creative authors and practitioners, is not systematic, coherent, theory-driven, and offered (oh, perhaps with an exception here and there) as a program of study at institutions of higher learning. Evaluation is a helter-skelter mishmash, a stew of hit-or-miss procedures, notwithstanding the fact that it is a stew that has produced useful studies and results in a variety of fields, including education, mental health, and community development enterprises." [1]

Industrial and organisational psychology is a relatively young discipline itself, but obviously not quite young enough for its practitioners to recall the struggles they must have had in the late 1800s and early 1900s. Industrial psychology, which focuses primarily on personnel selection and ergonomics/human factors, grew out of a blend of industrial engineering and experimental psychology. The first doctoral degrees in industrial psychology did not emerge until about the 1920s.

[1] For the full article, see R. Perloff's (1993) "A potpourri of cursory thoughts on evaluation."


It seems likely to me that the fledgling discipline of industrial psychology had its share of critics in those days. Perhaps it was even called a helter-skelter mishmash. There was probably a lot of dissent in the ranks as to whether it really was any different from industrial engineering, measurement, experimental psychology, and a host of other disciplines. And I am sure there were furious debates about the definition of industrial psychology itself. Was it (or would it have been) reasonable to declare industrial psychology a discipline even though most insiders didn't agree on its definition, underlying logic, or the soundness of its theories? How much shared understanding constitutes a critical mass?

There's something of a chicken-and-egg argument here. It seems to me that little progress in theory or practice can be made beyond a certain point without first declaring evaluation to be a discipline and then seeing what develops. Sure, not everyone will buy the idea initially, but there's no point being put off by those who like to throw their hands in the air and declare the whole exercise impossible. These things take time, open minds, thinking and rethinking.

Whether or not we have the courage and conviction to declare ourselves a discipline at this point, I think it's fair to say we have a critical mass who are quite clear that evaluation is at least a professional practice with a unique skill set that is honed with reflective practice and other forms of learning. The challenge here is convincing non-evaluators (such as the I/O psychologist quoted earlier) of this. Consultants are a particularly hard nut to crack. Although often trained to the graduate level in business and/or the social sciences, they almost universally perceive that all one needs to do evaluation is some content expertise and perhaps a few measurement skills (and accounting skills).

What could possibly make seasoned professionals such as management consultants sit up and take notice of evaluation? Let me set the scene. A client organisation had put out an RFP asking for an independent evaluation of a leadership initiative. Interestingly, the RFP specifically stated that the client was looking for an evaluation expert with content expertise rather than a content expert (e.g., a management consultant or industrial/organisational psychologist) with evaluation experience. This is very unusual in the evaluation of leadership initiatives. Most clients are unaware that there is such a thing as evaluation expertise, as distinct from the applied research skills a well-qualified management consultant or organisational psychologist might possess.

Of 22 initial expressions of interest in the contract, just two (yes, 2!) were from people who identified as evaluators and participated actively as members of the [national and international] evaluation community. This was despite unusual efforts on the part of the client to attract expressions of interest from evaluators. Rather than simply posting the RFP on the usual electronic bulletin boards, which they had heard good evaluators do not usually respond to, they also sent out direct emails to evaluators who had been recommended by other evaluators and had the notice posted on an evaluation listserv. [It is interesting to note that the process used by the client to specifically target evaluators closely mirrors best practice for the recruitment of top-notch job candidates, especially from underrepresented groups: don't just use the regular channels that yield the same old candidate pools; go to where you know the right people are and personally encourage them to apply.]

The client in this case, under the guidance of an evaluator not bidding on the job, used a creative and unusual process to select the contractor. Rather than asking shortlisted bidders to submit the usual 20-page proposal, the selection team invited them to a face-to-face meeting where they could present their thoughts on the evaluation. This was because credibility was a key element of the evaluation, which the client felt couldn't accurately be gauged without meeting the evaluator face to face.

An added benefit of the face-to-face interview approach was that it increased the odds of both attracting and identifying a real evaluator. In a small community such as New Zealand, the vast majority of evaluators are solo practitioners who often partner with others for particular pieces of work. As such, they have inadequate resources to devote to compiling lengthy, slickly presented proposals that have less than an even chance of being successful. In contrast, larger consulting firms that do not have evaluation as their primary function are far more likely to have an extensive library of proposal templates and a number of junior staff trained in writing proposals. The standard written proposal solicitation process is therefore far more likely to yield bids from content experts than from evaluators.

Prospective contractors were asked to submit a number of supporting documents for the interview, including an outline of their quality assurance procedures. The proposed quality assurance procedures turned out to be one of the more telling pieces of information. After all, what better way to understand an evaluator's grasp of his or her profession than to ask how his or her own work should itself be evaluated? One case in point was a large, multinational business consulting firm (Firm X) whose quality assurance procedure consisted of appointing one of their independent auditors to oversee the evaluation.


In the final round, only the two actual evaluators passed the interview process and made it onto the final short-shortlist for the contract. When the final decision was made, the runner-up was told the background and qualifications of the successful bidder, and immediately recognised who the competitor was (New Zealand being a small evaluation community). By chance, the two met up a few days later and had a chuckle when they finally connected the dots.

In contrast, Firm X was, by all accounts, extremely surprised not to make even the final short-shortlist of two. To their credit, they did send a junior employee to see the client to get feedback about why their bid had been unsuccessful. They were even more surprised to be told that the main reason was that they were not evaluators. And no, audits and reviews of the type they were well versed in were not the same as high-quality evaluations. The consultants from Firm X were flummoxed! Firm X asked who had been awarded the contract to evaluate the leadership initiative, and were told. They then asked who the runner-up was. The client quietly pointed out that, if they really were evaluators (as they claimed to be), they would already have found that out through their extensive evaluation networks, in the same way as the top two contenders had found out about each other.

There is a wonderful lesson here for evaluation as it strives for recognition as a distinct profession and as a discipline. I think we've all tried convincing the colleagues in our content disciplines that what we do is unique, complex, more than just measuring a couple of variables of interest, and something worth paying attention to. And every now and then we get a breakthrough with our evaluation evangelising. But the reality is that evaluation-savvy clients will likely win us more converts among this audience than we could possibly manage for ourselves. There is nothing quite like being denied a contract for not being an actual evaluator!

What are some of the strategies we can use to educate clients? The simplest one that comes to mind is to highlight in our work what it is we are doing that is unique to evaluation. This might be serious and systematic attention to utilisation issues, the application of evaluation-specific methodologies not known to our non-evaluator colleagues, or the use of frameworks and models that have been developed specifically for evaluation. Whatever it is, we should be sure to highlight it in a way that makes it easy for a client to tell a real evaluation from the rest.

A second client education strategy is to seek opportunities to help with the development of evaluation RFPs. This was the case in the organisation I described, and it made a very substantial difference to how well the task was outlined, to the selection criteria, to the quality of the selection process, and to client satisfaction with the outcome. Although the organisation was constrained by regulations about how an RFP process could be managed, good evaluative thinking allowed individuals within the organisation to generate a creative solution that led to the right result.

The third strategy for spreading the word about evaluation would be to follow the example of the Society for Industrial and Organizational Psychology (SIOP) in the States. Like us, I/O psychologists have trouble getting the general public (especially managers in organisations) to understand what it is they are particularly skilled to do. In response to this need, SIOP has developed an extremely simple and straightforward leaflet, which it sends to members for distribution to managers they know. The goal was to have each member distribute the leaflet to five managers. A copy of the leaflet may be viewed online at http://siop.org/visibilitybrochure/siopbrochure.htm

It is likely that by directing our educational efforts outwards toward clients, we will have the side effect of creating better clarity within the evaluation profession, which will in turn let us make better sense to the outside world.


Articles

Monitoring and Evaluation for Cost-Effectiveness in Development Management Paul Clements [2]

1. Development Assistance Requires a High Analytic Standard

In the Malawi Infrastructure Project, the World Bank planned to rehabilitate 1,500 rural boreholes at a cost of $4.4 million, with an estimated economic rate of return of 20%. At the project's Midterm Review, two years later, the rate of return was reduced to 14%, but the reasons for the reduction were not clear. The plan had anticipated that 85% of project benefits would come from the value of the time villagers saved that they would otherwise have spent collecting water, and 10% from the incremental water consumed. The Midterm Review, however, attributed 31% of benefits to time savings and 56% to incremental water consumed.[3] No reason was given for reducing the estimate for time savings or for increasing the value for water consumption.

The World Bank's Fourth Population Project in Kenya aimed to decrease Kenya's total fertility rate to six births per woman by improving family planning services.
[2] Corresponding author: Paul Clements, Department of Political Science, Western Michigan University, Kalamazoo, MI 49006, e-mail: clements@wmich.edu.

[3] Carlos Alvarez, Ebenezer Aikins-Afful, Peter Pohland and Ashok Chakravarti, 1992, Malawi Infrastructure Project: Mid-Term Review Report, September 1992, Lilongwe, Malawi: The World Bank, Appendix B. The economic analysis from the project plan comes from the Malawi Infrastructure Project's Staff Appraisal Report. I was given it upon agreeing not to reference it.


The project was approved in 1990, and Kenya's total fertility rate fell from 6.4 in 1989 to 5.4 in 1993. The project's Implementation Summary Reports consistently indicated that "All development objectives are expected to be substantially achieved," and a 1995 supervision report asserted that "The project development objectives have been fully met."[4] Project activities, however, mainly supporting the National Council for Population and Development, were largely unsuccessful, and in 1994 a large part of the project budget was reallocated to the fight against AIDS. There were many other development agencies with family planning projects in Kenya, some with much stronger performance. Documents for the Fourth Population Project do not explain how its development objectives were related to the activities it funded.

The World Bank's Water Supply and Sanitation Rehabilitation Project in Uganda aimed to rehabilitate the water and sewerage system in Kampala, the capital city, and in six other major towns. Its plan calculated a 20% economic rate of return based on incremental annual water sales of $5.5 million from 1988 to 2014. The completion report estimated actual returns at 18% because water production in 1991 was 10-20% below expectations.[5] The project had indeed achieved its construction goals, but its efforts to strengthen the National Water and Sewerage Corporation (NWSC) had been undermined by the government's failure to raise water rates amidst hyperinflation and by late payments on its water bills.

[4] World Bank, 1995, Form 590 (unpublished project implementation summary for Third and Fourth Kenya Population Projects), Washington, DC: The World Bank.

[5] World Bank, 1991, Project Completion Report: Uganda Water Supply and Sanitation Rehabilitation Project (credit 1510-UG), Washington, DC: The World Bank, p. 30.


The NWSC would have been unable to maintain the system without ongoing support, and indeed by 1993, even with a major new project supporting the water company, it was once more operating in the red.[6]

These examples come from a blind selection of four World Bank projects that I studied for my doctoral dissertation.[7] What is remarkable about these inconsistencies (an economic analysis in a midterm review that does not follow from the one in the project plan, development objectives that do not reflect project activities, an economic rate of return that anticipates 23 additional years of water sales based only on the current state of the infrastructure) is that, even though at least the latter two are at face value analytically incorrect, they are presented as routine reporting information, with no attempt to hide them in obfuscating language. Indeed they reflect common analytic practice in the international development community, and this common practice reflects a structural problem of accountability. I would like to argue that the tasks undertaken by the large multilateral and bilateral donor agencies require a particularly high analytic standard, but that several incentives that influence development practice (political incentives for donor and recipient governments, organizational incentives for development agencies, and personal incentives for managers) have led to positive bias and analytic compromise.
[6] Paul Clements, 1996, Development as if Impact Mattered: A Comparative Organizational Analysis of USAID, the World Bank and CARE based on case studies of projects in Africa, doctoral dissertation for the Woodrow Wilson School of Public and International Affairs, Princeton University, p. 325.

[7] Along with four projects of the US Agency for International Development and four from CARE International, all located in Uganda, Kenya and Malawi. The projects were selected based on descriptions of less than a page with no information on results.


These incentives are structural in that they result from the pattern of the flow of resources inherent in development assistance. The problem therefore requires a structural solution, and this paper proposes a possible solution involving a dramatic improvement in the quality and consistency of project evaluations. We can be confident that such an improvement is possible, first, because the evaluation problem facing development agencies has determinate features with specific analytic implications, and second, because a similar structural problem has already been addressed in the management of public corporations.

Sooner or later, development assistance comes down to designing investments and managing projects. Unlike private sector investments, development projects aim not to make a profit but to improve conditions for a beneficiary population: to reduce poverty, or to contribute to economic growth. There is no automatic feedback such as sales figures provide, and no profit incentive to keep managers on task. Typically one needs to strengthen existing institutions or to build new ones, and/or to encourage beneficiaries to adopt new behaviors and to take on new challenges. Yet in the project environment there is likely to be weaker infrastructure, a less well-educated population, and more risk and uncertainty than in the environments facing most for-profit enterprises. Furthermore, in places that need development assistance one cannot assume that institutional partners will be competent and mission-oriented. These conditions in combination place particular demands on development managers.

Project managers need to maintain a unified conception of the project, its unfolding activities, and its relations with its various stakeholders, a conception grounded in a view of its likely impacts. Donor agency officials need a conception of the relative merits of many actual and potential projects, and an analysis that turns problems on the horizon for developing countries into programmatic opportunities.


The central challenge in the management of development assistance is to maintain this kind of consciousness, this analytic perspective, among the corps of professional staff. Some might like to think that development can be achieved by getting governments to liberalize markets or by getting local participation in project management, and these may well be important tactics. Intuition suggests and experience teaches, however, that there can be no formula for successful development. Each investment presents a unique design and management challenge. There are two problems in maintaining the will and the capacity to address this challenge: an incentive problem and one we can call intellectual or cognitive. The key to solving both problems, or so I will argue, is strong evaluation.

2. But Accountability in Development Assistance is Weak

2.1 Donor agencies are responsible for the success of their projects

According to the World Bank's procurement guidelines, "The responsibility for the execution of the project, and therefore for the award and administration of contracts under the project, rests with the Borrower."[8] One might think that a development loan to a government is like a business loan to an entrepreneur. The donor agency makes the loan, but it is entirely the responsibility of the borrower government to spend the money. Whether a government manages its projects well or poorly, one might imagine, is primarily its own affair, with the donor providing technical assistance upon request.
[8] The World Bank, 1985, Guidelines: Procurement under IBRD Loans and IDA Credits, Washington, DC: The World Bank, pp. 5-6.


We know, of course, that this image is incorrect (donor agencies typically have the predominant influence over project design, and substantial influence over project administration), but it is useful to recall why this is so. One reason is parallel to a private bank's prudential interest in the management of its loans. As the World Bank's Articles of Agreement state, "The Bank shall make arrangements to ensure that the proceeds of any loan are used only for the purposes for which the loan was granted, with due attention to considerations of economy and efficiency and without regard to political or other non-economic influences or considerations."[9] The Bank wants to be repaid, and it also has an interest in promoting economic growth and enhancing well-being in borrower countries, so it may take pains to see that its loans are well spent. Many loans are to governments with limited bureaucratic capacity in countries with inconsistent management standards, so the Bank must retain enough control to ensure that the projects it supports are properly administered. By this logic we would expect relationships with bureaucratically stronger governments to be closer to the private sector model, and indeed some governments with coherent industrial strategies (consider South Korea in the 1970s) have succeeded in using Bank loans very much for their own purposes.[10]
[9] International Bank for Reconstruction and Development, 1991, International Bank for Reconstruction and Development: Articles of Agreement (as amended effective February 16, 1989), Washington, DC: The World Bank, p. 7.

[10] See e.g. Mahn-Je Kim, 1997, The Republic of Korea's Successful Economic Development and the World Bank, in Devesh Kapur, John P. Lewis, and Richard Webb, eds., The World Bank: Its First Half Century, Volume Two, Washington, DC: Brookings Institution Press, pp. 17-48.


Many development loans, however, are for projects at the edge of the borrower's frontier of technological competence, and the Bank (like other donor agencies) is a repository of expertise in the sectors it supports. The Bank also has demanding requirements for project proposals, and many governments have been unable independently to prepare proposals that the Bank could accept, particularly in earlier years when patterns of Bank-borrower relations were established. Therefore the Bank has generally taken primary responsibility for designing the projects it funds,[11] and the responsibility that comes with authorship cannot be lightly abandoned during implementation.

A second reason that donor agencies take an interest in how their funds are spent is that donor funds come from (or are guaranteed by) governments, and, the Bank's Articles of Agreement notwithstanding, governments do not release funds without taking an interest in their disposition. On one hand this political logic reinforces the prudential logic discussed above. Donor governments want their funds to contribute to the borrower's development, so they insist that donor agencies take responsibility for project results. Foreign aid is also, on the other hand, enmeshed in donor governments' general promotion of their foreign policy agendas.[12] It matters that the United States is not indifferent as to whether and when the World Bank will make loans to Cuba, and World Bank loans to Côte d'Ivoire have been subject to particular influence from France, the country's former colonial master.[13]
[11] Warren C. Baum and Stokes M. Tolbert, 1985, Investing in Development: Lessons of World Bank Experience, New York: Oxford University Press for the World Bank, p. 353.

[12] Paul Clements, 1999, Informational Standards in Development Agency Management, World Development 27:8, 1359-1381, p. 1360.

[13] Jacques Pégatiénan and Bakary Ouayogode, 1997, The World Bank and Côte d'Ivoire, in Kapur, Lewis and Webb, eds., pp. 109-160.


Bilateral aid is even more closely linked to donor government interests than aid through multilateral institutions. Not only from the donor side but from that of recipient governments too, the parameters of development spending cannot be understood merely in terms of the requirements for maximizing development impacts.

As intermediaries between donor and recipient governments, donor agencies are required to take more responsibility than private banks for managing the loans they make. The analogy with the private sector breaks down even further, however, when we consider the incentives governing a donor agency's management of its portfolio. The main cause of the different incentive structures of donor agencies and private banks, of course, is differential exposure to financial risk. With private loans, the borrower and often the lender suffer a financial loss if the investment fails. With most development projects, by contrast, neither the donor nor the implementing agency faces a financial risk if impacts are disappointing. For projects funded by loans it is the borrower government, typically the treasury, that is responsible for payments. But the treasury seldom has control over individual development projects.

2.2 The usual watchdogs are not there to hold donor agencies accountable

The structural conditions of development assistance therefore create an accountability problem. Donor agencies have control over development monies but they face no financial liability for poor results (and no financial gain when impacts are strong). In this context their orientation to their task will depend largely on the demands and constraints routinely placed on them by other agents in their organizational environment, on the individual and corporate interests of their leaders and employees, and on the mechanisms of accountability that are institutionally (artificially) established.


In regard to external agents, Wenar notes that there has been a historical deficiency in external accountability for donor agencies. Aid organizations have evolved to a great extent unchecked by the four major checking mechanisms on bureaucratic organizations: democratic politics, regulatory oversight, press scrutiny, and academic review.[14] The electorates in donor countries want to believe that aid is helping poor people, but democratic politics also leads to pressures on donor agencies to support the agendas of well-organized interest groups.[15] Some promote humanitarian and progressive agendas, but others have aims that create tensions with development goals. Generally, since the intended beneficiaries of aid cannot vote in donor country elections, the reliability of democratic politics as a source of accountability is limited.

There has been significant regulatory oversight aiming to ensure that aid funds are not fraudulently spent, but external oversight of project effectiveness faces major practical hurdles. Aid projects are so widely dispersed, and the period between when monies are spent and when their results transpire is typically so substantial, that effective oversight would require major bureaucratic capacity. Responsibility for project evaluation, however, has normally rested with the donor agencies themselves. This clearly leads to conflicts of interest, and it is the aim of this paper to suggest how these conflicts could be, if not removed, at least substantially ameliorated.
[14] Leif Wenar, 2003, What we owe to distant others, Politics, Philosophy & Economics, 2:3, 283-304, p. 296.

[15] For example, American farmers have influenced U.S. food aid programs, which are overseen by the U.S. Department of Agriculture.


Donor agencies have not, in any case, been subject to significant external accountability by way of regulatory oversight. Press scrutiny and particularly academic review, in contrast, have been significant sources of accountability, and academic studies have contributed to many foreign aid reforms. Given the strength of the political and bureaucratic interests that drive the programming of aid, however, and the above-noted dispersal of aid projects, scholars and journalists can only be expected to hold aid agencies accountable in a limited and inconsistent manner. Also, they are largely dependent, for information on aid operations, on the donor agencies themselves.

Few who have spent much time with development agency personnel can doubt their generally admirable commitment to development goals, and the reforms this paper will propose depend heavily on the personnel's sustained interest in professionalism and effectiveness. Their behavior is also influenced, however, by their individual and corporate interests, and these interests take shape in the specific task environments that they face in their home offices and in the field. There are two aspects of the way their interests come to be constructed that are particularly relevant to the problem of accountability. First and most obviously, while institutional norms require donor agencies to maintain the appearance of a coherent system of responsibility for results, their institutional relationships require them to maintain the appearance that their operations are generally successful. They must evaluate, but it serves their individual and corporate interests if evaluation results are generally positive (or at least not often terribly negative). Since donor agencies have generally controlled their own evaluation systems, they have had the opportunity to design these systems in such a way that they would tend to reflect positively on the agencies themselves. Second, due in part to the long time span between the commitment of funds and the evaluation of results, internal personnel evaluations have tended to focus on variables only loosely correlated with good results, and sometimes on variables that conflict with good practice.

2.3 Lacking secure accountability for results, other less relevant criteria inform resource allocation decisions

We will consider some of the approaches donor agencies have taken to evaluation later in the paper. For the purposes of understanding the accountability problem in development assistance, it is enough for now to note that donor agencies have controlled their own evaluation systems. In the context of the general deficiency in external accountability, the priorities that have been enforced within donor agencies take on particular significance.

Perhaps the most longstanding and sustained critique of donor agencies' internal operations involves the imperative to move money. The classic account of the money-moving syndrome is Tendler's Inside Foreign Aid.[16] Focusing on the U.S. Agency for International Development (USAID) and the World Bank, Tendler identifies a pressure to commit resources that is exerted on a donor organization from within and without, and finds that standards of individual employee performance place high priority on the ability to move money.[17] In the context of her organizational analysis, she gives several examples of aid officials knowingly supporting weak projects in order to reach spending targets.[18]

[16] Judith Tendler, 1975, Inside Foreign Aid, Baltimore: Johns Hopkins University Press.
[17] Ibid., p. 88.
[18] Ibid., pp. 88-96.


Tendler also finds, reinforcing the present argument about evaluation, that in a political environment often hostile to foreign assistance, aid officials learned to self-censor reports that could provide ammunition for critics:

"For writing what he considered a straightforward description of a problem or a balanced evaluation of a project, an AID technician might be remonstrated with, 'What would Congress or the GAO [General Accounting Office] say if they got hold of that!?' Words were toned down, thoughts were twisted, and arguments were left out, all in order to alleviate the uncomfortable feeling of responsibility for possible betrayal. Such a situation must have resulted in a certain atrophy of the capacity for written communication and, inevitably, for all communication through language."[19]

The World Bank typically required economic analysis of proposed projects, but Tendler found that many ostensibly economic projects were selected by non-economic criteria.[20] Much of the economic analysis that was carried out amounted to a post hoc rationalization of decisions already taken.[21]

While Tendler offers several political and organizational reasons to explain the money-moving imperative, I would like to emphasize what is absent from the organizational culture she describes. We do not find a sustained effort to consider how development funds can be employed to maximize their contribution to development.
[19] Ibid., p. 51.
[20] Ibid., p. 93.
[21] Ibid., p. 95.


In such an environment we might expect well-intentioned professionals, once they win some organizational power, to act like policy entrepreneurs, promoting their individual conception of a good development agenda in large measure despite the prevailing incentives. We might expect segments of a donor agency that have strong external allies to develop coherent agendas that they can implement themselves, as I believe reproductive health professionals at USAID have done. What we cannot expect, however, is that organizational decisions will routinely be taken on the basis of expected impacts.

The World Bank's project approval culture was recognized in its internal 1992 study, Effective Implementation: Key to Development Impact (popularly called the Wapenhans Report). The report cites a pervasive preoccupation with new lending,[22] in part because signals from senior management are consistently seen by staff to focus on lending targets rather than results on the ground,[23] noting also that "[t]he methodology for project performance rating is deficient; it lacks objective criteria and transparency."[24] Although the report describes the Bank's evaluation system as independent and robust, it finds that "[l]ittle is done to ascertain the actual flow of benefits or to evaluate the sustainability of projects during their operational phase."[25]

Since the appearance of the Wapenhans Report, the Bank has moved increasingly to spending modalities that further dilute accountability for results.
[22] Portfolio Management Task Force, 1992, Effective Implementation: Key to Development Impact, Washington, DC: The World Bank, p. iii.
[23] Ibid., p. 23.
[24] Ibid., p. iv.
[25] Ibid.


The two kinds of programs that have become most central to Bank strategies, particularly in lower income countries, are adjustment loans of various kinds (structural, sectoral) and Poverty Reduction Strategy Papers (PRSPs).[26] Adjustment loans require borrowers to adopt free market reforms in order to better align economic incentives with development goals. They tend to operate on a wider scale than traditional projects, with more diffuse impacts. There is often a feeling that they are imposed, as the government receives the loan for policy changes it presumably would not otherwise have made, and they are often implemented only partially and inconsistently. These factors make them harder to evaluate. Poverty Reduction Strategy Papers typically push a larger part of the responsibility for evaluation onto the borrower government, and it seems that their more participatory approach to policy formation and implementation is intended to substitute, to some extent, for rigorous agency evaluation. They ask the government, as part of the process of generating a poverty reduction strategy, to identify a set of indicators for measuring the strategy's impacts. If the World Bank has had such a hard time ascertaining the level and sustainability of impacts from its own portfolio, however, it is questionable whether governments of low-income countries will be able to do much better.

3. Independent and Consistent Evaluation Can Improve Accountability and Learning in Development Assistance

3.1 The basic idea of the proposed evaluation approach

The problems discussed above present formidable obstacles to maintaining accountability in foreign assistance on the basis of program and project results. We should recall, however, what is at stake.
[26] David Craig and Doug Porter, 2003, Poverty Reduction Strategy Papers: A New Convergence, World Development 31:1, 53-69.


In the absence of meaningful accountability there is little to counter-balance the pressures for aid resources to support the political interests of donor and recipient governments, the organizational interests of donor and implementing agencies, and the personal interests of management stakeholders. The inconsistency and mixed reliability of evaluations have also undermined learning from experience, so the aid community has been slower than it would otherwise have been to identify successful strategies and to modify or abandon weak ones.[27]
[27] Indeed, despite development agencies consistently reporting positive results from their overall operations, there have been persistent doubts about the basic effectiveness of development assistance at improving economic and/or social conditions in recipient countries. In their comprehensive 1994 review of foreign aid on the basis of donor agency documents (Does Aid Work? Report to an Intergovernmental Task Force, second edition, Oxford, UK: Oxford University Press), Robert Cassen and associates find that most projects achieve most of their objectives and/or achieve respectable economic rates of return. A series of cross-country econometric studies, however, have failed to find evidence of positive impacts from foreign aid. These include Paul Mosley, John Hudson, and Sara Horrell, 1987, Aid, the public sector and the market in less developed countries, The Economic Journal, 97:387, 616-641; P. Boone, 1996, Politics and the effectiveness of foreign aid, European Economic Review, 40:2, 289-329; and Craig Burnside and David Dollar, 2000, Aid, policies and growth, The American Economic Review, 90:4, 847-868. These results are reviewed and contested, however, in a recent paper by Michael Clemens, Steven Radelet and Rikhil Bhavnani, 2004, Counting chickens when they hatch: The short-term effect of aid on growth, Center for Global Development Working Paper 44, http://www.cgdev.org/Publications/?PubID=130. Clemens, Radelet and Bhavnani find positive country-level economic impacts from aid based on cross-country econometric studies focusing on the approximately 53% of aid that one would expect to yield short-term economic impacts.


One way to address the historical deficit of external accountability, to push the focus of management attention forward from moving money to achieving results, and to improve the incentive and the capacity to manage for impacts, is to institute independent and consistent evaluations of the impacts and cost-effectiveness of donor-funded projects.[28] This would mark a significant departure from existing practice, so I first explain the concept, then suggest how it could be implemented, and finally consider how it compares to established evaluation approaches.

For both accountability and learning, the appropriate frame of reference is not the individual project but the donor agency's overall portfolio and, for learning, the world-wide distribution of similar and near-similar projects. The donor agency's question is how to allocate its resources so as to maximize the impacts of its overall portfolio. The project planner's or manager's question is, in light of relevant features of the beneficiary population and the project environment (and, for managers, in light of how the project is unfolding), how to configure the project design so as to maximize impacts. In both cases the relevant conception of impacts is one that supports comparisons among projects.

Both accountability and learning, for donor agencies, start from impacts and then work backwards in causality. They start, for example, with strong or weak results, and while accountability uses the discovery of causes to allocate responsibility, learning uses it to construct lessons based on schemes of similarities (so the lessons can be applied to other contexts). In this way rewards and sanctions can be allocated based on contributions to impacts, and managers can gain a feel for what is likely to work in a new situation.

[28] Impacts are defined as changes in conditions of the beneficiary population due to the project, i.e. compared to the situation one would expect in the project's absence (the counterfactual).


Now this logic may sound quite general. It applies with particular force to large donor agencies because their other sources of accountability are so sparse, the tasks they undertake are so costly and complex, and the contexts in which they work are so often difficult and demanding. Accountability and learning bear a heavier burden than in other contexts. This is why projects should be evaluated not by the extent to which they achieve their individual, idiosyncratic objectives, but in terms of impacts expressed in consistent and comparable units. An evaluation's units of analysis establish a perspective or orientation in terms of which the project and its activities come to be understood. In order to establish a consistent orientation across a donor agency's portfolio (or, for a given type of project, across countries and donor agencies), evaluations should therefore be conducted in consistent units. Accountability requires consistent units precisely because the appropriate frame of reference for accountability is the donor agency's overall portfolio.

3.2 Comparing projects in terms of cost-effectiveness

I would like to suggest that the unit that provides the appropriate frame of reference for donor agency evaluations is cost-effectiveness. Cost-effectiveness aims to achieve the greatest development impacts from the available resources. We can compare evaluation in terms of cost-effectiveness with two other approaches that donor agencies have often used. Bilateral donor agencies have typically evaluated their projects in terms of how far they have achieved their stated objectives,[29] and the multilateral development banks (such as the World Bank) have historically evaluated most of their projects in terms of their economic rates of return.

[29] This is the 'logical framework' approach.


When projects are evaluated in terms of their objectives, comparisons among projects are likely to be misleading. Some projects have ambitious objectives while others are more modest, so a project that achieves a minority of its objectives can clearly be superior to another that achieves most of its own aims. Also, the criterion of achieving objectives bears no clear relation to costs. If this criterion is taken as the basis for accountability, it establishes an incentive to set easy targets and/or to over-budget.

A project's economic rate of return (ERR) expresses the relation between the sum of the economic value of its benefits and its costs.[30] It can also be described as the return on the investment, and the World Bank has typically expected an ERR of 10% or higher from its projects in the economic sectors. The difference between cost-effectiveness as I am defining it and an ERR is that an ERR measures benefits in terms of their economic values (ideally at competitive market prices), while cost-effectiveness measures benefits in terms of the donor's willingness to pay for them. The economic analysis of projects typically does not include improvements in health or education, and benefits to the poor generally count the same as benefits to households that are already well off.[31]

[30] Specifically, the ERR is the discount rate at which the discounted sum of benefits minus costs is equal to zero.

[31] J. Price Gittinger, 1982, Economic Analysis of Agricultural Projects, second edition, Baltimore, MD: Johns Hopkins University Press.
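To make the ERR criterion concrete: footnote 30 defines the ERR as the discount rate at which the discounted sum of benefits minus costs equals zero, i.e. the internal rate of return of the project's net benefit stream. The sketch below, in Python, finds that rate by bisection for a purely hypothetical net benefit stream; the figures are illustrative only and are not drawn from any of the projects discussed in this paper.

```python
# Illustrative only: estimate a project's economic rate of return (ERR),
# i.e. the discount rate at which discounted benefits minus costs sum to zero
# (see footnote 30). The net benefit stream below is hypothetical.

def npv(rate, net_benefits):
    """Discounted sum of a net benefit stream; year 0 is undiscounted."""
    return sum(b / (1.0 + rate) ** t for t, b in enumerate(net_benefits))

def err(net_benefits, lo=-0.99, hi=10.0, tol=1e-6):
    """Find the rate at which npv() crosses zero, by bisection."""
    if npv(lo, net_benefits) * npv(hi, net_benefits) > 0:
        raise ValueError("ERR is not bracketed by the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if npv(lo, net_benefits) * npv(mid, net_benefits) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Hypothetical water project: a $4.4 million outlay followed by 15 years
# of net benefits of $0.9 million per year (all figures in millions).
stream = [-4.4] + [0.9] * 15
print(f"ERR = {err(stream):.1%}")   # about 19% for this hypothetical stream
```

For this invented stream the bisection converges to an ERR of roughly 19%, which the scale in Table 1 below would class as "Good."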


For an evaluation system based on cost-effectiveness, a donor would need to establish a table of values specifying how much it is willing to pay for each of the various benefits that it expects from its projects, including those that may be expressed in qualitative as well as in quantitative terms. In this way a basis would be established for comparing, for example, primary health care and agricultural extension projects.[32]

3.3 The proposed evaluation approach in practice

In practice, the proposed evaluation approach would work like this. At the completion of a project, an evaluator estimates the impacts attributable to the project up to the present point in time and the magnitude of impacts that can be expected in the future. The project's total impacts are compared to its costs on the basis of the donor's table of values, and on this basis the evaluator estimates the project's cost-effectiveness. To estimate impacts, the evaluator lists the relevant impacts for a project of the present type and design[33] and carries out the appropriate quantitative and/or qualitative analysis of the project's activities and their results. The evaluator assigns each form of impact a numeric value and/or a qualitative rating based on his or her judgment of the project's likely effects in the respective areas over the lifetime of the project's influence. The impacts are summed with the appropriate weights from the table of values and compared to costs, and on this basis the evaluator estimates the project's likely cost-effectiveness, for example on a scale from one to six, with one representing failure and six indicating excellence (see Table 1). The evaluator also notes her degree of confidence in the cost-effectiveness score and, if her confidence is moderate or low, indicates the range of cost-effectiveness scores within which she is confident that the true value of likely impacts lies.
[32] This approach to establishing the value of project impacts is described in Paul Clements, 1995, A Poverty Oriented Cost-Benefit Approach to the Analysis of Development Projects, World Development, 23:4, 577-592.

[33] These may be found in the project plan.


The estimate of the project's cost-effectiveness anchors the evaluator's analysis of the project's design and implementation. All four components (the analyses of impacts, cost-effectiveness, design, and implementation) serve as a basic unit to support accountability and learning within the project and the donor agency and across the development community.

Table 1: Scale of Cost-Effectiveness
Economic Rate of Return    Degree of Cost-Effectiveness    Interpretation
30% and above              6                               Excellent
20% - 29.9%                5                               Very good
10% - 19.9%                4                               Good
5% - 9.9%                  3                               Acceptable
0% - 4.9%                  2                               Disappointing
Below 0%                   1                               Failure
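To make the scoring procedure described in Section 3.3 concrete, the following sketch shows one way the calculation could be organized. The donor's table of values, the impact categories, and the project figures are hypothetical, and the single-period return computed here stands in for the discounted rate-of-return calculation an actual evaluation would use; the sketch is illustrative only, not part of the article's proposal.

```python
# Illustrative sketch only. The donor's table of values, the impact categories,
# and the project figures below are hypothetical, and the single-period return
# stands in for the discounted calculation an actual evaluation would use.

# Donor's table of values: willingness to pay (US$) per unit of each benefit.
TABLE_OF_VALUES = {
    "child_fully_immunized": 60.0,
    "household_with_safe_water": 250.0,
    "farmer_adopting_improved_practice": 120.0,
}

# Table 1 thresholds: minimum return (as a fraction) for each score from 6 to 2;
# anything below 0% scores 1 ("failure").
SCALE = [(0.30, 6), (0.20, 5), (0.10, 4), (0.05, 3), (0.00, 2)]

def cost_effectiveness_score(impacts, total_cost):
    """Weight estimated impacts by the table of values, compare the total to
    costs, and translate the implied return into a score from 1 to 6."""
    total_value = sum(TABLE_OF_VALUES[name] * units for name, units in impacts.items())
    implied_return = (total_value - total_cost) / total_cost
    for threshold, score in SCALE:
        if implied_return >= threshold:
            return score
    return 1

# Example: a project costing US$1.5 million with these estimated lifetime impacts.
impacts = {"child_fully_immunized": 12_000, "household_with_safe_water": 3_500}
print(cost_effectiveness_score(impacts, total_cost=1_500_000))  # prints 3 ("acceptable")
```

In practice the evaluator would also attach a confidence rating to the score and, where confidence is moderate or low, a range of plausible scores, as described above; the point of the sketch is only the structure of the calculation: value impacts with the donor's weights, compare to costs, read off the score.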

3.4 An evaluation association to address bias and inconsistency

While evaluations in terms of cost-effectiveness may support learning and accountability, there are (at least) three problems with the proposed evaluation approach. First, it does not address the bias arising from donor agency control of the evaluation process. Second, the estimates of impacts that it requires, including impacts in the future, present methodological challenges. There are no widely accepted methodologies for some of the required impact estimates (e.g. for reproductive health and AIDS education projects). Third, even where accepted methodologies are available (such as economic cost-benefit analysis, for economic projects), the results are often highly sensitive to minor changes in assumptions. When evaluations are contracted out on a project-by-project basis, different assumptions are likely to be applied to different evaluations, undermining the validity of comparing and aggregating their results.

There are strong parallels between the conditions for the problem of bias and inconsistency in the evaluation of foreign aid and the conditions facing public corporations in the management of their internal finances. Stockholders want corporation managers to employ the corporation's resources in such a way as to maximize profits, but managers face incentives to use the resources for their private purposes. There are elaborate rules governing how managers may appropriately use a corporation's resources, and it is the task of accountants and auditors to ensure that these rules are followed. As with evaluators in foreign aid, however, accountants and auditors are employed by the very managers whom they are expected to hold accountable. In order to protect their independence from management, and to ensure that they have mastered the relevant techniques, accountants and auditors have established professional associations. These associations establish qualifications that their members must achieve and rules that they must follow in order to retain professional membership. It is these rules that are the source of accountants' and auditors' independence from corporate management. Although independence is not maintained perfectly, the consequences of major lapses can be quite severe, as evidenced by the collapse of the international accounting firm Arthur Andersen after the accounts it managed for the Enron Corporation were found to be unreliable. Amartya Sen lists "transparency guarantees" as one of five sets of instrumental freedoms that contribute to people's overall freedom to live the way they would like to live. Society depends for its operations on some basic presumption of trust, which depends in turn on guarantees of disclosure and lucidity, especially in relations involving large and complex organizations. Sen points out that where these guarantees are weak, as they appear to be in foreign aid, society is vulnerable to corruption, financial irresponsibility, and underhand dealings.34

A professional association of development project evaluators could play a role in guaranteeing disclosure and lucidity in the management of international development assistance similar to that of associations of accountants in the management of public corporations. Such an association could also address the problems of estimating project impacts and of comparing impacts in common units. In order to address the problems of bias and inconsistency, such an association would need the same structural features as associations of accountants: qualifications for membership, a set of rules and standards governing how evaluations are to be carried out, and procedures for expelling members who fail to uphold the standards. In order to ensure that impacts are estimated and then compared in common units, the association would establish a constitutional principle asserting that each end-of-project evaluation conducted by its members would estimate the project's impacts and cost-effectiveness to the best of the evaluator's ability. One task in establishing the association would be to work out impact assessment approaches for different kinds of projects. The technical difficulties in estimating project impacts are objective problems, so it is possible to identify principles and practices for addressing them. An evaluation association would provide a forum for identifying better evaluation approaches and for ensuring consistency in their application. Over time, as its members gained experience, these approaches would be refined.

34. Amartya Sen, 1999, Development As Freedom, New York: Knopf Publishers, pp. 38-40.

There are dozens of donor agencies and many thousands of implementing agencies in the development assistance community, and each agency has its own management culture and approaches. The universe for the evaluation association's operations would be the development assistance community overall, and it would support learning about better practices, by type of project, throughout this community. Each evaluation completed by a member of the association would be indexed and saved in an online repository, which would be accessible to the entire development community. Since each evaluation estimates the project's cost-effectiveness, it would be a simple operation for someone planning, say, an urban water project, to review the approaches of the five to ten most cost-effective water projects in similar environments.

4. Monitoring and Evaluation (M&E) for Cost-Effectiveness Compared to Other M&E Approaches

4.1 Monitoring and evaluation for empowerment

To explain M&E for cost-effectiveness it is useful to compare it with other evaluation approaches. The strongest challenge to standard approaches to aid evaluation in the last two decades has involved the elaboration and application of participatory approaches.35 These have aimed to involve beneficiary populations in project management, to assist them in taking responsibility for improving their own conditions, and to incorporate them in more democratic processes of development decision making. Authors such as Korten and Chambers,36 whom Bond and Hulme describe as "purists,"37 have sought to reorient the development enterprise to support the goal of empowerment.
35. See e.g. B. E. Cracknell, 2000, Evaluating Development Aid: Issues, Problems and Solutions, Thousand Oaks, CA: Sage Publications.


36. David C. Korten, 1980, "Community Organization and Rural Development: A Learning Process Approach," Public Administration Review, 40:5, 480-511; Robert Chambers, 1994, "The Origins and Practice of Participatory Rural Appraisal," World Development, 22:7, 953-969; Robert Chambers, 1994, "Participatory Rural Appraisal (PRA): Analysis of Experience," World Development, 22:9, 1253-1268; Robert Chambers, 1994, "Participatory Rural Appraisal (PRA): Challenges, Potentials and Paradigm," World Development, 22:10, 1437-1454.

They have promoted an approach I call M&E for empowerment because it emphasizes learning at the local level, seeking to empower project beneficiaries by involving them in the evaluation process. While M&E for cost-effectiveness appreciates that empowerment is an important development goal, it identifies the locus for the primary learning that evaluation should support among those who are responsible for resource allocation decisions. Donor agency officials are the primary audience for aid evaluation because they exercise primary control over these resources. It turns out, however, that the form of evaluation that can best inform these officials will also best inform officials of developing country governments, project managers, and the overall development community, as well as, with some additional synthesis, the legislatures that appropriate aid budgets. Evaluation and empowerment goals overlap in their management implications, and empowerment was certainly neglected by the development community prior to the mid-1970s. In many instances participatory strategies are more cost-effective than projects based on so-called blueprint approaches, so M&E for cost-effectiveness would promote participation in these cases. M&E for cost-effectiveness does not assume, however, that participatory approaches are right for all projects. The empowerment of project beneficiaries is interesting from an analytic viewpoint because it can be seen both as a means to improving project designs and as an end in itself.
37. R. Bond and D. Hulme, 1999, "Process Approach to Development: Theory and Sri Lankan Practice," World Development, 27:8, 1339-1358, p. 1340.

For this reason M&E for cost-effectiveness views empowerment in a dual light. As a means, M&E for cost-effectiveness considers empowerment like any other possible means to be considered in program design. As an end, M&E for cost-effectiveness considers successful empowerment to be a benefit which must be valued and counted along with other benefits in the assessment of a project's cost-effectiveness. Under M&E for cost-effectiveness both more and less participatory projects are considered within the same evaluation framework.

4.2 Monitoring and evaluation for truth

It is possible that a great practical barrier to useful evaluation arises from some of those most knowledgeable of and committed to evaluation as a science. It has been common practice to begin discussions of aid evaluation methodology with the experimental method of the natural sciences,38 and to present the various evaluation methods as, in effect, more or less imperfect approximations to randomized and controlled double-blind experiments. This approach often uses household surveys that measure conditions that a project seeks to influence, so that through appropriate comparisons changes attributable to the intervention can be identified in a statistically rigorous manner. I call it M&E for truth because it emphasizes making statistically defensible measurements of project impacts. This approach is right to insist that projects should be assessed primarily on the basis of their impacts, and that impacts should be understood as changes in the conditions of the population compared to what would be expected in the project's absence (in evaluation jargon, as compared to the counterfactual).
38. E.g. Dennis J. Casley and Denis A. Lury, 1982, Monitoring and Evaluation of Agriculture and Rural Development Projects, Baltimore, MD: The Johns Hopkins University Press; Judy L. Baker, 2000, Evaluating the Impact of Development Projects on Poverty: A Handbook for Practitioners, Washington, DC: The World Bank.

It is arguable, however, that in its orientation to statistical rigor it has established a gold standard that many evaluators are all too quick to disavow. Only a very small proportion of project evaluations present statistically rigorous impact estimates, and evaluations that do not often use the demanding requirements of statistical rigor as an excuse not to address the question of impacts at all. Also, evaluations that adhere rigorously to the maxims of statistical rigor seldom estimate the future impacts that can be attributed to a project. Monitoring and evaluation for cost-effectiveness is methodologically eclectic in its effort to reach reliable judgments of cost-effectiveness. It is grounded not in the first instance in the scientific method, but in the causal models of change inherent in project designs. Each project design presents a hypothesis as to the changes in beneficiary conditions that can be expected from the actions the project undertakes. It is the evaluator's task to analyze how this hypothesis has unfolded, and on this basis to estimate the quantity of benefits that beneficiaries are likely to realize. A given project is taken as an instance of a project of its type, so impact estimates for other similar projects serve as a first approximation for the benefits that may be anticipated from the present project. Evaluators locate the present project along the continuum established by other similar projects based on how its design hypothesis unfolded as compared to theirs. Clearly, baseline surveys often provide critical information for estimating impacts, and statistical methodology of course provides central criteria for their analysis. As suggested above, M&E for cost-effectiveness employs participatory methodologies in many instances to elicit beneficiaries' judgments of the significance of project outputs. The evaluator's final estimate of a project's impacts and cost-effectiveness, however, is based on triangulation, taking into account all the forms of information we have so far considered.
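As a small illustration of the counterfactual comparison described above, the sketch below computes a difference-in-differences estimate from hypothetical baseline and follow-up survey means for project and comparison households. The figures and variable names are invented for the example, and difference-in-differences is only one standard way of operationalizing such a comparison, not a method the article itself prescribes.

```python
# Hypothetical survey means (e.g., household income in US$ per month); the
# numbers are invented for illustration.
project_baseline, project_followup = 82.0, 97.0
comparison_baseline, comparison_followup = 80.0, 88.0

# Change observed among project households.
project_change = project_followup - project_baseline                # 15.0
# Change among comparison households approximates the counterfactual trend.
counterfactual_change = comparison_followup - comparison_baseline   # 8.0

# Difference-in-differences: the change attributed to the project.
impact_estimate = project_change - counterfactual_change            # 7.0
print(f"Estimated impact: ${impact_estimate:.2f} per household per month")
```

The comparison group's change stands in for the counterfactual trend, so the remaining difference is the change attributed to the project.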

5. A More Analytic Development Assistance Community

Although I have described the proposed approach as monitoring and evaluation for cost-effectiveness, the discussion up to this point has focused on evaluation only. For the proposed evaluation approach to address the accountability problem in foreign aid, however, it is essential that planners and managers should know in advance that upon its completion there will be an independent evaluation of their project's impacts and cost-effectiveness. The development assistance community as we know it has evolved under conditions of inconsistent and often limited and biased evaluation, but one could anticipate, if the proposed evaluation approach were implemented, that its effects would gradually suffuse through all stages of project planning and implementation. Project planners would soon learn to include an estimate of cost-effectiveness in their project designs, and to establish monitoring systems that would track the relevant impacts (or their main contributing factors) through the life of the project. It would soon be taken for granted that when targets or systems for estimating impacts are altered during project implementation, the reasons for these changes should be clearly documented. The development assistance community would soon learn what outcomes need to be tracked for different kinds of projects to inform subsequent impact estimates. While many individuals and groups in the contemporary development community are engaged in promoting development agendas of their own conception, the proposed reforms would enhance the experience of development work as a cooperative venture with shared goals. Development professionals would become more confident that others would endorse their sound justifications for their management strategies, and management strategies would be more rigorously grounded in expected impacts.

Members of the development community generally would become more conscious of the pathways by which their actions contribute to improvements in beneficiary conditions, and their shared concern for efficiency would be enhanced. The development community would be quicker to identify and to adopt more successful strategies. Although I believe that outright corruption on any significant scale is uncommon in the development community,39 the higher analytic standard that the proposed reforms would bring about would reduce corruption even further. The general public has tended to be fairly skeptical of foreign aid, and the management standards described in this article provide good reasons for skepticism. The proposed reforms would make it straightforward to aggregate project impacts, for example, by country or by agency. The tax-paying public would receive better information on the consequences of foreign aid, and they would have better grounds for confidence in its integrity. In due course this could be expected to increase the generosity of citizens in the wealthier countries towards people in need.

39. I found large-scale corruption in only one of the 12 projects in the sample for my dissertation.

Network Evaluation as a Complex Learning Process

Susanne Weber40

The following contribution will explicate, based on an understanding of networking as a reflexive process and on an approach working from a theory of regulation, one set of criteria for the development of evaluation designs in a networking context. The need for monitoring and evaluation that is action- and future-oriented leads to further requirements already established by social-ecological planning theory. From these can be generated questions for decisions in monitoring and evaluation within complex actor settings, as well as criteria for concepts of evaluation and monitoring in a networking context. On this basis, four dimensions of network evaluation and monitoring are suggested, and they are embedded in the multi-dimensional design approach of the learning network, which puts collective competence development and future- and effect-orientation at the center of the developmental process.

1. Networks: Between myth, management and muddling through

Networks are by now being discussed in all disciplines of the social sciences as the new paradigmatic form of organization and pattern for action. There are divergent assumptions about their status and range of applicability, their application contexts can be political, economic or social, and applications serve numerous possible networking goals and purposes.
40. Corresponding author: Susanne Weber, Ph.D., University of Applied Sciences, Fulda, Department of Social Studies, Marquardstr. 35, D-36039 Fulda, Germany, telephone: 0049-6619640-224, e-mail: webers@mailer.uni-marburg.de or susanne.weber@sw.fh-fulda.de. Paper prepared for the European Evaluation Society Sixth Conference, Berlin, 2004.

For these reasons the term network is defined as a compressed term (Kappelhoff 2000:29): networking represents a perspective of hope, a factor conducive to democratization and successful cooperation, professional optimization, rationalization, and market presence, and, as a term employed almost as universally as the term system (Grunow 2000:314), it is often very nearly mythologized (Hellmer/Friese/Kollros/Krumbein 1999). It is used to represent a variety of possible meanings and forms of cooperation with different degrees of intensity: following Simmel (1908), society, for instance, is now once more increasingly explained in terms of network theory (Castells 2000; Messmer 1995; Wolf 1999), with network as one of the basic social categories. Networking is also the point of departure for more or less close forms of cooperation in a regional context, often initiated by support programs and generating research interest in practice and action, e.g. the Learning Region program. Here we are dealing not only with a clear accentuation of the term network, but with a school of thought, a line of orientation, a warmth metaphor including an accentuated demand for initiative: "regions shall be guided out of their passive role, taking on an active part in dealing with their concerns" (Gnahs 2003:100). As part of regional networking processes, intermediary agencies for regional learning networks are created which are supposed to tie different social fields together, to give "creative support" (Jutzi/Wöllert 2003:130) and to serve as bridges for the initialization of regional processes by defining needs, giving orientation, and maintaining and integrating patterns (ibid.:135). Sydow suggests a tighter definition, and thus a higher degree of intensity for networking, characterizing it from a micro-economic perspective on company networks as a form of organization of economic activity by enterprises which are independent by law but more or less interdependent economically. The relations that are introduced here are reciprocally complex and cooperative rather than competitive.

They are relatively stable, they are created endogenously or induced exogenously, and they represent more or less polycentric systems (Sydow 2001:80). They can be categorized, e.g., by their type of control and the stability of their relations (stable-dynamic) (ibid.:81). For applications in competence development, Duschek and Rometsch (2004) suggested grouping the various network types into three main types: explorative versus exploitative, hierarchic versus heterarchic, and stable versus dynamic networks (ibid.:2). Risk and conflict are inherent to the structure of such institutional and organizational cooperations (Messner 1995). Due to their structural complexity, they are not always at an advantage over other forms of organization: Steger (2003) identifies contradictions like self-interest versus collective interest: in building common structures of action, the chance to create a common space for development comes at the cost of the flexibility of individual network actors; the commitment that becomes relevant in a networking context reduces the autonomy of the individual network partners, etc. (ibid.:13f). Sydow (1999) presented the model of structural tension in network cooperation, which will be further discussed below since it can be made productive for the analysis and design of monitoring and evaluation in network cooperations. The theoretical framing of the network and of network arrangements proves to be of decisive importance for the design of network cooperation as well as for the evaluation-theoretical and conceptual position of monitoring and evaluation in a network. In this paper, network cooperation is discussed on a social-scientific basis, as a social process in which the surfacing of specific conflict potentials, risks, and tensions is to be expected. Theoretical perspectives that have complexity (Kappelhoff 2000) and structuring (Sydow 1999; Windeler 2001) as their starting point are capable of representing and analyzing this topic in a way that is adequate for design practice and network management at the same time.

2. The approach of network regulation as a theoretical foundation for monitoring and evaluation in networks

The network regulation approach offers criteria for the conceptual level of monitoring and evaluation in networking contexts. The five characteristics of network regulation point to consequences for the design of monitoring and evaluation.

Constitution

The first of the five characteristics that belong to an approach following a theory of structuring is a procedural understanding of constitution, which transcends the static look at organizational networks. The network constitutes itself in time and space via social practices, as a collective social setting. It regulates itself systemically and contextually (Windeler 2001:203f). From the perspective of a theory of structuring, monitoring and evaluation are not outside the networking activity; they are part of the system and are systemically generated by it. In the context of a regulatory system, monitoring and evaluation are also regulated and constitute themselves during that process.

Multi-dimensional regulation

The existence of different levels of actors is characteristic of networks: that of the individual, the group, the organization, the network itself and society as a whole. Multi-dimensional regulation means that divergent interests on the different levels are regarded as structurally unavoidable. The different levels of actors become relevant for the employment of complex monitoring and evaluation in networking processes.

One has to deal conceptually with the question of which levels should be included for the generation of knowledge, and how, and what consequences are intended. One has to ask what different goals and goal achievements in a multi-level context are to be analyzed and how multiple goal structures figure in monitoring and evaluation designs.

Contextualization

The third assumption of a regulatory approach is that the constitution of organizational networks is a coordination of activities in time and space. Networks are embedded in specific contexts and environments that play an important role for conditions and cultures of action. Every network will develop its own context-specific culture and specific social memory (Windeler 2001:325). Network monitoring and evaluation will be designed, and will have to be designed, according to the respective network culture. Thus we can distinguish sector-specific evaluation cultures: in the profit field we find a strong orientation toward planning, while networks close to the administration, which may e.g. be confronted with a need to legitimize their activities because they receive public funds, rely on summative and ex-post evaluation. When evaluation concepts are dealt with according to sectors, this then includes practice oriented toward planning and resources, toward process and correction, or toward summative legitimization.

Co-evolution

From a theoretical perspective of structuring (and this is the fourth aspect) the development of organizational networks can be seen as a process of co-evolution with the relevant environment. Co-evolution means that context relevancy cannot be ignored, that the embedding in institutional contexts and relevant environments has to be considered.

Not only the inner core of the network that is to change but also the participating organizations are exposed to change, so network monitoring and evaluation are capable of fulfilling a learning and development function for the inner core as well as for its environment. It remains an open question in each case to what degree the collective actors are able to reproduce their system reflexively and to establish reflexive monitoring. Subcultures, subgroups, and subunits will describe and interpret themselves and their respective situations differently. System monitoring can establish practices which throw light on the experiences and expectations of the network partners (Windeler 2001:326) and which co-evolutionarily reconnect the environment to the system's inner workings.

Networking in terrains structured by dominance

The fifth aspect of an approach according to a theory of structuring is the recognition that organizations as collective actors interact competently and powerfully on a terrain structured by dominance (ibid.:30ff). Network membership is highly intentional, discursive, strategically important and available (ibid.:251). Evaluation shows a very sensitive relation between contribution and use, and in antagonistic settings it can be contested between stakeholders. That is why procedures and programs need to be designed that analyze the contributions and potentials of the individual actors, the practices and activities, as well as the networking context as a whole. General criteria for evaluation, and responsibilities for their design, for monitoring, for compliance with these criteria, and for sanctioning, have to be developed reflexively. Network monitoring and evaluation face the challenge of analyzing not only factual dimensions like the management of business activities, but also potential power-driven roadblocks in dominance-structured fields of action, e.g. veto and blocking positions, minimal consensus in goal definition, the curtailing of autonomy of network partners, refusal to learn, and the shifting of risks onto third parties (Sydow 1999:298).

The "evaluative function" (ibid.) of network management is supposed to put the whole range of social, factual, and procedural aspects of network management to the test. Sydow bases the relationship between network management and network development on a theory of structuring (2001:82): network development is seen as observed change over time within a social system that is reproduced by relevant practices. Change takes place in a planned way, through intervention, and also in an unplanned way, through evolution. This perspective relates to the process by which network actors refer to network structures in their actions and attempts at guidance, reconstructing those structures by their actions (ibid.:83). Incorporated in this are structures, ways of development, and the possibilities of trans-organizational development, but also the possibility of failure, of unintended results, of alternative actions, of coincidence. Network development (and the effects and feedback effects it has on the organizations involved) can be described as the result of reflexive as well as non-reflexive structuring (Windeler 2001). To make networks more successful (and this is the procedural and future-oriented function of monitoring and evaluation that will be the center of our attention here) it makes sense to analyze and to design network management as reflexive network development. Monitoring and evaluation then gain central importance for network development: they facilitate the understanding of network development as a field of learning and of the collective development of competence (Weber 2002, 2003, 2004), and they suggest the importance of analyzing empirical networking projects (Weber 2001a).

What, then, should concepts and designs for network evaluation and monitoring look like? This question leads to others that are common in evaluation settings, e.g.: What information shall be generated, and how? What knowledge is needed and functional? What function should reflexivity have, and what should it achieve? Who should generate knowledge, and what should it be used for? If we take seriously the program elaborated at the 2003 DeGEval convention, namely that evaluation should lead to organizational development (Hanft 2003) and that the focus should be shifted from summative and ex-post analysis towards process monitoring and future development (Weber 2001b), then it makes sense to follow the incremental theory of planning. The social-ecological theory of planning and the 1970s criticism of classical theories of planning give us criteria that can be used for the reflection and conceptualization of evaluation designs.

3. Selection decisions for the generation of knowledge in networks

Uncertainty about the individual actors' judgement, the comprehension of the original situation, the actors' collective action, and future developments and strategies under a perspective of transformation (Schäffter 2001) can be made productive if tied to monitoring and evaluation in a view supported by a defensible theory of science. Following an objectivist or constructivist understanding of reality, we can distinguish an objectivist from a constructivist evaluation paradigm. These different understandings will now be considered in their polarity, and afterwards their functionality for monitoring and evaluation in a networking context will be discussed. Their polarity brings monitoring and evaluation into focus not just as instruments, but as networking practices.

While in classical concepts (Rossi/Freeman 1989:18) evaluative approaches were regarded as analytical instruments without the ambition of serving as theory-guided science (Kuhlmann 1998:92), we here consider concepts and approaches to monitoring and evaluation as active practice which is part of, and generates, specific network cultures. We assume that settings for communicative evaluation are not mere instruments, free of preconditions and objective, but that in reality they have a generative quality, organizing observation and knowledge production according to underlying explicit or implicit criteria and models of evaluation. Working on organizational transformation processes, Roehl and Willke have pointed out the often substantial constructedness of evaluation settings, which is brought about by the choice of instruments and criteria. Evaluation designs are always subject to leading ideas of change, which include ideas about the validity of changes and which, in a context of complex structures of decision-making, predetermine the evaluative direction (Roehl/Willke 2001:29). Drawing on the cybernetic, social-ecological or systemic criticism of planning in the 1970s and 1980s (Lau 1975, Atteslander 1976), decisions can be identified that become relevant to the selection of evaluation designs. In the 1970s criticism, for example, dimensions of subjectivity, communication, and system orientation are emphasized in the face of a rationalist, technocratic paradigm. This criticism leads to an alternative planning paradigm that includes choices that are relevant for the design of planning and monitoring, such as the following:

o Between technocratic feasibility and systemic irritation
o Between legitimization of the past and planning of the future
o Between the reproduction of the old and the generative production of the new
o Between expert objectivity and subject participation
o Between the completeness of what is known and the processing of what is not known/uncertain
o Between result measurement and the development of competence

These selective decisions can be found, in different manifestations, in today's evaluation practice in different social contexts, and their range, their deficits, and their chances for reality construction can be analyzed. The following presentation of decisions and questions relevant to evaluation in network settings does not pretend to cover all aspects comprehensively; instead it treats them by means of examples.

3.1 Evaluation knowledge between systemic irritation and technocratic feasibility

Within the evaluation community there is a tension between two contradictory approaches, each of which follows from basic questions of a theory of planning. A technocratic approach builds on the assumption that existing knowledge can be used to give an intentional design to social conditions (Herrmann 2001:1365), that is, that social processes can be rationally planned and influenced. On the other hand there is the contrary view, skeptical of a teleological, regulative approach to social processes that presupposes predictable results. This view assumes that even the most advanced and differentiated instruments of planning eventually cannot handle social reality. In the 1970s, models that take an optimistic view of regulation are increasingly opposed by regulation-skeptical models calling for more open and dynamic approaches to planning and evaluation. Early on, Lau (1975) pleads for the management of complexity through a participative concept of planning that retains a sense of flexibility.

Atteslander presents a typology of different planning models and defines a dogmatic, a technocratic and a cybernetic or systemic type (Atteslander 1976:20). The systemic-constructivist assumption of the self-organization of institutional systems leads to a concept of planning, and thus of measuring effectiveness and of evaluation, which is based on irritation rather than on technocratic feasibility. Reflexivity is encouraged and facilitated in order to partially produce uncertainty (Herrmann 2001:1365).

3.2 Evaluation knowledge between legitimization of the past and planning of the future

Another dimension pertinent to today's evaluation debate is the directedness towards past, present or future. Evaluation or monitoring designs aim, to varying degrees, to create legitimacy or change and complex transformation. The directedness of evaluation designs towards past, present or future is today influenced by sectors and organizational cultures. An evaluation culture in the sense of summative evaluation emphasizes the thorough analysis of the past, the evaluation of previous projects. Here the aim is often legitimization, and evaluation is geared rather towards a bureaucratic model of control, transparency, and the evaluation of goal attainment. The focus is set on the summative evaluation of individual measures and programs without strong references to organizational visions and goals, and the activities are only loosely synchronized strategically or oriented toward planning.

A monitoring culture, on the other hand, emphasizes process-accompanying, formative evaluation and self-evaluation. Goals connected to monitoring and evaluation are endogenous development, motivation instead of control, process orientation, and improvement on the level of professional action. Possible risks lie in conducting many parallel activities on all levels (supervision, etc.) which do not receive feedback from each other and which are not directed toward the organizational or networking goal, yet see themselves as strategically oriented. A tendency towards monitoring with self-evaluation classically corresponds with the evaluation concepts preferred by the non-profit sector. Evaluation designs which are more strongly embedded in a planning culture emphasize diagnosis, feasibility studies, and conditions for success; they do not rely very much on summative evaluation. Their focus is on future orientation, financial aspects of a cost-benefit relation, numbers, and control. The most effective interventions harmonize with the visions and strategies of the system of reference, in this case the network. The aim is not the realization of individual activities but the strategic feedback relationship of all measures, which is supposed to create a common directedness of all activities.

3.3 Evaluation knowledge between reproduction of the old and generative production of the new

There is also a tension in monitoring and evaluation between the reproduction of the old and the generative production of the new. This tension is already implicit in the demands made during the planning debate of the 1970s: instead of mechanistic models for planning, the generative production of the new was to be facilitated. Instead of prognoses of the future based on the status quo, anticipation was to be employed systematically. The inclusion of prophecies and projections of all kinds in a context of cybernetic models of planning was seen as more adequate to the challenges and demands of planning than dogmatic or technocratic models of planning (Atteslander 1976:53).

3.4 Expert objectivity or subject participation

The fourth decision in evaluation concerns the distance between evaluation by experts and evaluation by participants. Evaluation by experts is often oriented toward utilitarian-rationalist models of action and leaves responsibility in the hands of the expert. The participants tend to become objects of the evaluation, not systemic partners in collective efficiency measurement and evaluation. In a heterarchic decision-making structure, democratized expertise is a given, and the production of knowledge that becomes relevant for action has to work with network knowledge; if it does not, there are distinct risks of interest-guided dominance and colonization on the one hand, and of lack of acceptance and inner emigration by network partners on the other. Knowledge production in networks thus has to rely on the cooperative structures of participatory research (Atteslander 1976:53). The efficiency of the solution of material problems depends on the participation of those concerned, on openness to criticism, on horizontal structures of interaction, and on democratic procedures for implementation.

3.5 Completeness of what is known or processing of what is not known/uncertain

This decision is tied to different accentuations: is knowing or not-knowing the point of reference? Open and dynamic models of planning and of monitoring assume a transformable worldview and a comprehensive definition of goals. They do not presuppose knowledge but rather incomplete knowledge, or no knowledge, about the current situation and its structures.

These approaches are synthesizing: they methodically attempt to integrate ideological, technological and social aspects of networking contexts. The rationality and the kind of prognoses connected to cybernetic efficiency measurement and evaluation can be described as an operational rationality working with a combination of deductive and normative prognoses. In an incremental view, planning can never be final; it is always preliminary and influenced by a large amount of feedback (Atteslander 1976:55). It systematically needs monitoring and evaluation. Monitoring and evaluation are more than social technology in this case; they are reflexive practice and the creation of communicative contexts in which the constitution of social meaning takes center stage. The focus is not on needs presumed to be objective, but on the needs and perspectives of the network actors.

3.6 Result measurement or the development of competence

Contrary to a basic view of processes of planning, monitoring and evaluation as technology, an understanding of planning and monitoring as something to be negotiated is directed towards the development of competence. Contrary to placing planning, monitoring and evaluation before or after actual practice, an integrated view is suggested, which shifts away from a purely concept-oriented evaluation of efficiency towards one that also considers (micro- and meso-)political structures. Monitoring and evaluation are no longer primarily goal-oriented; instead the area of work becomes evaluation-oriented. Measurement of efficiency and evaluation can tune in to a daily networking routine that changes slowly. This understanding goes hand in hand with an increase in competence and with self-rationalization of the network partners. By taking into view the social aspects of the production of knowledge that is relevant for implementation, an open model aims at the development of competence, at functionality in polycentric and heterarchic structures, and at the internal democratization of expertise (Atteslander 1976).

Communicative planning concepts are process-oriented, not schematic; they follow the principle of negotiation. They do not pretend to be neutral in terms of values but facilitate working through the topic of values, the equivocal connection between ends and means in social contexts. Communicative approaches are the only planning approaches that attempt to bridge the gap between conceptual planning and practical action by conceptually integrating the problem of the implementation of planning results. Communicative planning practice further documents that such models represent adequate concepts for action within the complex and contradictory conditions and processes in areas of planning (Herrmann 1998; 2001:1378). These short sketches of considerations based on a theory of planning can be used for the design of instruments and concepts of evaluation and monitoring. They address central questions about basic assumptions, about the direction and starting points of analysis, and about the status of evaluation in networking contexts, implicitly also about instruments and procedures, and they furnish a pattern for meta-evaluation, insofar as evaluation concepts themselves become objects of evaluation with the help of certain criteria.

3.7 Consequences for the design of monitoring and evaluation in networking contexts

It has become evident that classical evaluative approaches reach their limits in networking contexts. In complex program evaluation, for example, it could be shown that the fact has been neglected that programs follow multiple, conflicting and evolving purposes (Kuhlmann 1998:97), that the context of their conception is often not sufficiently understood, that evaluation is used as a "killer," and that the views of those who are responsible for the program are taken into consideration but not the interests of those concerned (ibid.:98).

In the context of a multi-layer concept it is neither possible nor sensible to measure objective results exactly, in the sense of eternal truths (ibid.:85). Under a perspective of reflexivity in a networking context, communicative validation, process monitoring and evaluation become integral parts of network regulation as a design approach. A social-ecological planning paradigm becomes manifest, demanding a mainly communicatively oriented validation, an incremental communicative practice of planning and action that is more adequate to the necessities of the field than classical evaluation designs (Zipp 1976:77). These demands can be tied to the social-ecological approach to evaluation developed by Guba and Lincoln (1989), continued in participative approaches to evaluation (Ulrich/Wenzel 2004) and implemented empirically (Uhl/Ulrich/Wenzel 2004). Networks are exposed to structural uncertainty about the future, and in their intended and unintended reflexive practice, in the systematic form of evaluative, process-analytical and planning practice with a perspective of collective development of competence, they can be reconstructed as learning networks (Weber 2002), i.e., as a field of (among others) pedagogical rationality (Helsper/Hörster/Kade 2003). Intended and unintended qualities of learning will find their space here. Informal, quasi-evolutionary learning processes as well as orchestrated reflexive interventions can generate learning and reflexivity in a learning network. Learning (on the different levels of individual actors, groups of actors, the network structure and its relevant environment, up to the social body as a whole) is contingent and uncertain. As learning from experience it is intertwined with everyday working activity.

If the knowledge-generating practice of gaining experience on the job becomes established, systemized, and structurally put into a feedback relation with the system's practice, then orderly procedures for institutional and network learning are created. Learning on the subject level and on different system levels also becomes systemized, and monitoring and evaluation are placed in the context of a development-oriented strategy of collective learning within the network. The dimensions around which the reflexive generation of knowledge within the network can be designed are the focus of the following section.

4. Dimensions of evaluative and planning-oriented learning within a network

On the basis of empirical networking projects (Weber 2001a) and the literature on networking theory we can determine four dimensions for a strategy of collective learning. These design dimensions of system monitoring and evaluation are the social dimension, the dimension of network functions, that of structural tensions created by networking processes, and that of learning and learning arrangements (Weber 2002, Weber 2003). In our analysis of network functions and structural tensions we follow the works of Sydow (1999, 2001) and Windeler (2001). While these two aspects have already been objects of network regulation as well as of reflexive approaches, the dimension of the social process and that of learning have not yet been considered systematically from a design point of view.

4.1 Social regulating and social monitoring

The regulatory approach presupposes structuring by social actors, so the social dimension is structurally included. Cooperative inter-organizational relations are seen as based on social processes; personal and social closeness is regarded as a necessary condition for successful networking processes (Winkler 2002:37).

Network knowledge is always social; it is created by and embedded in social practice, with its individual and collective elements. As a whole, network relationships are based on exchange, which is, in turn, based on stable expectations and a norm of reciprocity. Trust is also seen as a sine qua non for successful projects (Windeler 2001). This shows that the social dimension is indeed recognized as relevant, but so far it has not been addressed in its quality as a group context. Supported by group-dynamics and team-development approaches, we can here refer to categories of the group process that Tuckman uses to describe different group and team qualities (1965). Tuckman's model assumes that the first phase of a group's encounters is friendly and noncommittal, while the second phase of the process is characterized by a struggle for social status and power within the social structure. In a third phase the group then has to come together into a functional whole, and positions in social space have been negotiated. As a fourth phase we see the performing group. Tuckman extends the group, capable of working and performing, into the future. Theoretical positions based on structuring and complexity describe networks as co-evolutionary entities (Kappelhoff 2000b:382) that do not show a linear development. Still, Tuckman's four-phase approach is useful for its qualitative criteria for the analysis and design of group-dynamical aspects. His definitions of Forming, Storming, Norming and Performing in a social context can be used for monitoring and for evaluation since they provide criteria for the analysis and design of the social context which can be found empirically with the help of indicators.

4.2 Functional network guidance: monitoring of network functions

Another dimension of monitoring and evaluation is the functional dimension of network guidance introduced by Sydow (1999). All elements of network regulation (selection, allocation, evaluation, system integration, configuration of positions, constitution of borders) can be objects of evaluation: the selection of the actors belonging to the system, the allocation of resources, the evaluation of the process, and the specifics of system integration, of the configuration of positions and of the constitution of borders (Windeler 2001:249). In a design approach, selection includes the question of who: who shall be included? This question becomes important at an early networking stage. After that, the focus shifts to the allocation of tasks and resources, the distribution of responsibility among the partners. Regulation of cooperation within the network provides for the development and implementation of rules between the organizations. Evaluation of network organizations can concern the network as a whole or just selected rules of cooperation (Sydow 1999:295f). Windeler adds two others to these four functional aspects: system integration and border management (Windeler 2001). Measures of system integration influence the selection of actors; the practices of configuring positions and of constituting borders pose particular challenges to potential newcomers, etc. (ibid.:251). These objects of network regulation are interconnected in a recursive relation. The six aspects of network guidance are open to analysis and elaboration under the focus of a functional dimension. While Sydow describes them as procedural, they do not just develop their relevance along the stages of the process but also across them: selection, allocation, regulation, evaluation, border management and system integration are necessary and have to be repeated perpetually and circularly. They offer a catalogue of questions, criteria and indicators for network monitoring and evaluation as design necessities emerge.

In Sydow's approach to network functions (1999:298), monitoring and evaluation have systematic value. The characteristics of reflexive network regulation provide a concrete basis for the function and design of monitoring and evaluation. All in all, it becomes evident that network monitoring and evaluation have to be integral parts of a complexity-oriented reconstruction of networking processes. Sydow assumes that monitoring and evaluation become important factors in the design of paths of development within reflexive network development. They furnish the informational basis for a (more) reflexive network development by network management. While evaluation aims at the contributions of individual network organizations, at the quality of the network relations that have been developed, or at the network effect, and while as a function of management it is concerned with the practice of evaluating, reflexive monitoring is designed as a tool for the supervision of one's own actions, of the conditions and effects of actions, and of the actions of others (Sydow 2001:90). From a design perspective, monitoring and evaluation facilitate the systematic regulation of networking risks and the increase of networking success.

4.3 Structural tension: monitoring of tension

A third focus of complexity-oriented network monitoring has to be the dimension of structural tension. Sydow has introduced eight lines of tension that have to be regulated in networking processes, or, if regulation is lacking, can cause a networking process to fail (Sydow 1999). They provide analytical potential and differentiating criteria for the evaluation and design of network cooperations. Messner, coming from political science, has also identified structural dilemmas of networking that have to be worked on within networking processes (Messner 1995, 1994).

The following section is based on Sydow's presentation (1999) of the lines of tension between autonomy and dependency, trust and control, cooperation and competition, flexibility and specificity, variety and unity, stability and fragility (e.g. change), formality and informality, and economic rationality and preservation of power (Sydow 1999:300).

Variety-unity: How can a balance be reached between the variety of participating actors and their integration into some kind of unity?
Flexibility-specificity: How flexible is the network in terms of its goals and self-image, and how specific is it?
Autonomy-dependency: How much autonomy is possible and what does it consist of; how much dependency exists and what does it consist of?
Trust-control: How much trust and what kind of trust is there; what is regulated by control mechanisms, and how?
Cooperation-competition: What role do cooperation and competition play? What relationship is created between them?
Stability-fragility: What role do stability and fragility play? How are they created? What regulating mechanisms exist?
Formality-informality: How is the relationship between formality and informality regulated, and what relationship do they have?
Economy-power: What relationship is there between arrangements of functionality and power? How are power patterns generated?

Windeler (2001) also refers to these lines of tension in his approach based on a theory of structuring. Within a monitoring approach they can be regarded as analytical dimensions and as design parameters. They are useful for the incorporation of reflexivity into discursive and qualitative processes of analysis, and thus for clarification and localization within the discursive context and the network's path of development.

4.4 Knowledge, communication and system reflexivity: networking as a learning process

Since networks represent dynamic rather than static arrangements of relations and cooperations, networking has to be read as a learning process. Monitoring and evaluation have the function of generating knowledge from practical experience and of reflecting on it, in order to deduce knowledge from it that may guide future actions (Uhl/Ulrich/Wenzel 2004:11). Their primary objective is thus to provide chances for learning and optimization on the system level. They are tied to system reflexivity and communication, re-entering the circle of active planning within the network. The explicit directedness of monitoring towards the design of learning contexts makes it possible to identify future-oriented developmental potentials of networking projects. Discursive reflection produces awareness of change in the first place: data-gathering procedures not only reconstruct their subject in different ways, the subject of reflection itself is changed by them (Hendrich 2003:157). In network contexts as informal learning contexts, aspects of a learning biography and the estimation of one's own competence can also be used for a kind of monitoring that is oriented to competence development. The social dimension, the functions of network guidance, the structural lines of tension, and the dimension of learning within the networking process have been suggested as dimensions for the monitoring of efficiency and for the evaluation of complex transformations (Weber 2003).

What instruments and learning arrangements can support complexity-oriented monitoring and evaluation that responds to demands on the social, functional, structural and learning dimensions in a pragmatic and manageable fashion?

5. A perspective: instruments for evaluative and planning-oriented network development

As the criticism of under-complex evaluation designs has shown, the focus must not be narrowed to a few efficiency indicators, since this carries the risk of distortions. Quantified data in particular are often endowed with a status of objectivity that makes it difficult to question the results. Under-complex designs for monitoring and evaluation have counterproductive effects when the truth production of the system generates faulty attributions and labeling, or unintended effects, e.g. in the sense of social dynamics. This means that monitoring and evaluation in a network have to be geared towards communication and complex reconstruction. In complex social systems, reflexive network monitoring will not be left exclusively to process counselors, brokers, coordinators and moderators. It will be part of everyday action and has to be functional in terms of the necessities that come with this. Below the level of external evaluation by experts, it is recommended that a discursive, procedural self-evaluation be developed. On this level, networking needs cooperative core competence to balance existing tensions. These tensions cannot be dissolved; they are part of the structural characteristics of networking and have to be dealt with productively. In this way they become accessible to process evaluation and optimization. Sydow thinks that a continuous employment and practice of reflexive monitoring can render more formal evaluative methods unnecessary (Sydow 2001:97).

Monitoring and networking in networks are instruments for the construction of reality, embedded in a heterarchic and polyvalent structure of interests and in complex transformation processes. For an integrated design of monitoring and planning it seems practical to generate open evaluation designs (Lynen von Berg/Hirseland 2004:15). These designs should take the form of participative evaluation (Oels 2003, in publication; Weber 2003; Weber/Benthin, in publication), which should be multi-layered, procedural and temporal. Design criteria for network evaluation should be a multitude of perspectives, process-, future- and identity-orientation, as well as an orientation toward a multi-layered approach. Depending on the given context of economic sectors or institutions, instruments of quality management can be employed, or self-evaluation or ex-post evaluation by experts can be seen as practical. Network monitoring and evaluation which are geared to future-oriented learning and collective development of competence will be designed in a rather decentralized, dynamic and open fashion, although the employment of quantitative methods is not excluded. Evaluative learning arrangements combine qualitative and quantitative methods, methods that generate knowledge and those that measure success, in a methodical mix, and can thus fulfill the different demands of a networking context. To deal adequately with complex relations of cause and effect, they should be represented in a complex fashion (Bangel 1999:354). Guba and Lincoln (1989) suggested an approach of stakeholder-based evaluation, one that is participant-oriented and allows the collective definition of criteria and indicators of successful cooperation. Participatory effect monitoring follows a central evaluative objective of increasing the collective capability to act, of breaking out of old ruts, and of doing things differently and possibly better than in the past (Oels 2003).

Its goal is to expand the repertoire of action (Benthin/Baumert 2001), to increase autonomy and to minimize the degree of manipulation and passivity (Oels, in publication). The approach to evaluation, monitoring and planning described as stakeholder-based evaluation is based on a constructivist paradigm and aims at addressing a large variety of perspectives, which can also be contradictory, in order to create a complex picture of the whole. Indicators and criteria for monitoring and evaluation are generated interactively with the actors concerned. Special emphasis is placed on the definition of learning objectives. On the basis of a participant-oriented approach, instruments of network management can be put to use which can take over planning, monitoring and evaluation functions and in this way fulfill the evaluative functions of understanding, legitimization and optimization. Especially in open, dialogical settings, the objects of evaluation can be regarded as dimensions of social, functional, structural and learning evaluation. For example, the social dimension in networks can be analyzed with the help of indicators; the Balanced Scorecard is an instrument for the analysis of network functions as objects of evaluation. Structural tensions can be analyzed, for example, with an appreciative evaluation approach, while the dialogical arrangements of Large Group Interventions can provide an evaluative, planning arrangement of network learning. In this way, contexts and procedures of complex (self-)evaluation are created that simultaneously cover the functions of understanding and optimization, and if needed legitimization as well (Ulrich/Wenzel 2004:28).

Large Group Interventions provide strategic agility and risk minimization in fast transformation processes with a high degree of network activity, because they make use of collective intelligence (Königswieser/Keil 2000). As procedures of transformation they follow the systemic paradigm (Bunker/Alban 1997:5). Systemic, open approaches like Large Group Interventions make it possible to regulate the network tensions brought about by system monitoring and evaluation. They create a mode of pedagogical organizing, with its quality of experimental practice (Weber 2004, in publication). A practice that is oriented towards reflexivity and knowledge generation closes the circle of knowledge provided by monitoring, evaluation and planning in the sense of an incremental, spiral-shaped model of evaluation, working with the iterative practice of producing systemic rationality. But it will never produce complete results and will always have rational and irrational parts (Windeler 2001:220). This practice of system reflexivity produces a discursive arrangement of external guidance and self-guidance in which much escapes the grip of reflexivity, in which unrecognized conditions and unintended results of actions emerge, as well as blind spots, chain reactions and reflexively influenced causal connections. So participant-oriented effect monitoring in networks will always have to try to strike a balance with that which is not known (Kade 2003). For this reason it will escape the myth of technocratic feasibility and embark on a journey of collective procedural learning.

References

Atteslander, P. (1976): Sozialwissenschaftliche Aspekte von Raumordnung und Raumplanung. In P. Atteslander (Hrsg.), Soziologie und Raumplanung. Berlin u.a.: De Gruyter. pp. 10-71.

Bangel, Bettina (1999): Evaluierung der Arbeitsmarktpolitik aus Ländersicht. Die Brandenburger Konzeption der adressatenorientierten Evaluation. In: Informationen für die Beratungs- und Vermittlungsdienste der Bundesanstalt für Arbeit. Nr. 45, pp. 3745-3754.
Benthin, Nicole; Baumert, Martina (2001): Selbstevaluation als Methode der Qualitätsentwicklung, Prozesssteuerung und summativen Evaluation. In: Weber, S. (Hrsg.): Netzwerkentwicklung in der Jugendberufshilfe. Erfahrungen mit institutioneller Vernetzung im ländlichen Raum. Opladen. pp. 261-280.
Bunker, Barbara Benedict; Alban, Billie T. (1997): Large Group Interventions. Engaging the Whole System for Rapid Change. San Francisco: Jossey-Bass.
Castells, Manuel (2000): Elemente einer Theorie der Netzwerkgesellschaft. In: Sozialwissenschaftliche Literaturrundschau. 2/2000. Heft Nr. 41. pp. 37-54.
Duschek, Stephan; Rometsch, Markus (2004): Netzwerktypologien im Anwendungsbereich Kompetenzentwicklung. In: QUEM Bulletin Berufliche Kompetenzentwicklung. Heft 3/2004. Berlin. pp. 1-7.
Gnahs, Dieter (2003): Indikatoren- und Messprobleme bei der Bestimmung der Lernhaltigkeit von Regionen. In: Brödel, Rainer; Bremer, Helmut; Chollet, Anke; Hagemann, Ina-Marie (Hrsg.): Begleitforschung in Lernkulturen. Münster / New York / München / Berlin: Waxmann. pp. 92-106.
Grunow, Dieter (2000): Netzwerkanalyse. Theoretische und empirische Implikationen. In H.-J. Dahme & N. Wohlfahrt (Hrsg.), Netzwerkökonomie im Wohlfahrtsstaat. Wettbewerb im Sozial- und Gesundheitssektor. Berlin: Edition Sigma. pp. 303-336.
Guba, Egon G.; Lincoln, Yvonna S. (1989): Fourth Generation Evaluation. London.
Hanft, Anke (2003): Evaluation und Organisationsentwicklung. Eröffnungsvortrag zur 6. Jahrestagung der Deutschen Gesellschaft für Evaluation e.V. (DeGeVal), 8.-10.11.2003 in Hamburg. EvaNet-Positionen 10/2003. http://evanet.his.de
Hellmer, F.; Friese, Chr.; Kollros, H.; Krumbein, W. (1999): Mythos Netzwerke. Regionale Innovationsprozesse zwischen Kontinuität und Wandel. Berlin: Edition Sigma.
Helsper, Werner; Hörster, Reinhard; Kade, Jochen (2003): Pädagogische Felder im Modernisierungsprozess. Weilerswist.
Hendrich, Wolfgang (2003): Ansätze interaktiver Evaluationsmethoden in der beruflichen Weiterbildung am Beispiel informell erworbener Kompetenzen. In: Brödel, Rainer; Bremer, Helmut; Chollet, Anke; Hagemann, Ina-Marie (Hrsg.): Begleitforschung in Lernkulturen. Münster / New York / München / Berlin: Waxmann. pp. 149-161.
Herrmann, Franz (1998): Jugendhilfeplanung als Balanceakt. Umgang mit Widersprüchen, Konflikten und begrenzter Rationalität. Neuwied.
Herrmann, Franz (2001): Planungstheorie. In: Otto, Hans-Uwe; Thiersch, Hans (Hrsg.): Handbuch der Sozialarbeit, Sozialpädagogik. Neuwied; Kriftel. 2. Auflage. pp. 1375-1382.
Jutzi, Karin; Wöllert, Katrin (2003): Erfahrungen und Probleme mit Handlungsforschung am Beispiel der wissenschaftlichen Begleitung intermediärer Agenturen. In: Brödel, Rainer; Bremer, Helmut; Chollet, Anke; Hagemann, Ina-Marie (Hrsg.): Begleitforschung in Lernkulturen. Münster / New York / München / Berlin: Waxmann. pp. 129-143.
Kade, Jochen (2003): Wissen - Umgang mit Wissen - Nichtwissen. Über die Zukunft pädagogischer Kommunikation. In: Gogolin, Ingrid; Tippelt, Rudolf (Hrsg.): Innovation durch Bildung. Beiträge zum 18. Kongress der Deutschen Gesellschaft für Erziehungswissenschaft. Opladen. pp. 89-108.
Kappelhoff, Peter (2000a): Der Netzwerkansatz als konzeptueller Rahmen für eine Theorie interorganisationaler Netzwerke. In J. Sydow & A. Windeler (Hrsg.), Steuerung von Netzwerken. Opladen: Westdeutscher Verlag. pp. 25-57.
Kappelhoff, Peter (2000b): Komplexitätstheorie und die Steuerung von Netzwerken. In: J. Sydow & A. Windeler (Hrsg.), Steuerung von Netzwerken. Opladen: Westdeutscher Verlag. pp. 347-389.
Kirstein, H. (o.J.): Was ich schon immer über die Balanced Scorecard wissen wollte. Unter: http://www.deming.de/efqm/balanscore-1.html (am 1.10.2003).
Königswieser, Roswita; Keil, Marion (Hrsg.) (2000): Das Feuer großer Gruppen. Konzepte, Designs, Praxisbeispiele für Großgruppenveranstaltungen. Stuttgart: Klett-Cotta.
Kuhlmann, Stefan (1998): Politikmoderation. Evaluationsverfahren in der Forschungs- und Technologiepolitik. Baden-Baden: Nomos Verlag.

Lau, Christoph (1975): Theorien gesellschaftlicher Planung. Eine Einführung. Stuttgart u.a.
Lynen von Berg, Heinz; Hirseland, Andreas (2004): Zivilgesellschaft und politische Bildung - Zur Evaluation von Programmen und Projekten. In: Uhl, Katrin; Ulrich, Susanne; Wenzel, Florian M. (Hrsg.): Evaluation politischer Bildung. Ist Wirkung messbar? Gütersloh: Verlag Bertelsmann Stiftung. pp. 15-26.
Messner, Dirk (1995): Die Netzwerkgesellschaft. Wirtschaftliche Entwicklung und internationale Wettbewerbsfähigkeit als Probleme gesellschaftlicher Steuerung. Köln: Weltforum-Verlag.
Messner, Dirk (1994): Fallstricke und Grenzen der Netzwerksteuerung. In: PROKLA - Zeitschrift für kritische Sozialwissenschaft. Nr. 97. pp. 563-596.
Oels, Angela (2000): Let's get together and feel alright! Eine kritische Untersuchung von Agenda 21-Prozessen in England und Deutschland. In H. Heinelt & E. Mühlich (Hrsg.), Lokale Agenda 21-Prozesse. Erklärungsansätze, Konzepte und Ergebnisse. Reihe Städte und Regionen in Europa, Band 7. Opladen: Leske + Budrich. pp. 182-200.
Oels, Angela (i.V.a): Großgruppenevaluation in Netzwerken. In S. Weber (Hrsg.), Netzwerklernen. Methoden, Instrumente, Erfolgsfaktoren.
Oels, Angela (2003): The Power of Visioning. An evaluation of community-based Future Search Conferences in England and Germany. Münster.
Schäffter, Ortfried (2001): Weiterbildung in der Transformationsgesellschaft. Zur Grundlegung einer Theorie der Institutionalisierung. Hohengehren.

Simmel, Georg (1908): Soziologie. Leipzig: Duncker & Humblot.
Steger, Renate (2003): Netzwerkentwicklung im professionellen Bereich - dargestellt am Modellprojekt REGINE und dem Beraternetzwerk zetTeam. Materialien aus dem Institut für empirische Soziologie an der Friedrich-Alexander-Universität Erlangen-Nürnberg. 6/2003. IfeS. ISSN 1618-6540 (Internet). www://ifes.uni-erlangen.de.
Sydow, Jörg (1999): Management von Netzwerkorganisationen. Zum Stand der Forschung. In J. Sydow (Hrsg.), Management von Netzwerkorganisationen. Wiesbaden: Gabler. pp. 279-305.
Sydow, Jörg (2001): Management von Unternehmungsnetzwerken - Auf dem Weg zu einer reflexiven Netzwerkentwicklung? In: Howaldt, Jürgen; Kopp, Ralf; Flocken, Peter (Hrsg.): Kooperationsverbände und regionale Modernisierung. Theorie und Praxis der Netzwerkarbeit. Wiesbaden: Gabler Verlag. pp. 79-102.
Tuckman, B. W. (1965): Developmental Sequences in Small Groups. Psychological Bulletin 63, pp. 384-399.
Uhl, Katrin; Ulrich, Susanne; Wenzel, Florian M. (2004): Einleitung: Evaluation und politische Bildung - was kann man messen? In: Uhl, Katrin; Ulrich, Susanne; Wenzel, Florian M. (Hrsg.): Evaluation politischer Bildung. Ist Wirkung messbar? Gütersloh: Verlag Bertelsmann Stiftung. pp. 9-12.
Ulrich, Susanne; Wenzel, Florian M. (2004): Partizipative Evaluation. In: Uhl, Katrin; Ulrich, Susanne; Wenzel, Florian M. (Hrsg.): Evaluation politischer Bildung. Ist Wirkung messbar? Gütersloh: Verlag Bertelsmann Stiftung. pp. 27-48.
Weber, Susanne (i.V.) (Hrsg.): Netzwerklernen. Instrumente, Methoden, Erfolgsfaktoren.
Weber, Susanne (2001a): Netzwerkentwicklung in der Jugendberufshilfe. Erfahrungen mit institutioneller Vernetzung im ländlichen Raum. Opladen: Leske + Budrich.
Weber, Susanne (2001b): Vom Ist zum Soll: Partizipative Verfahren und neue Planungsrationalität. In: Keiner, Edwin (Hrsg.): Evaluation (in) der Erziehungswissenschaft. Weinheim und Basel. pp. 255-272.
Weber, Susanne (2002): Vernetzungsprozesse gestalten. Erfahrungen aus der Beraterpraxis mit Großgruppen und Organisationen. Wiesbaden: Gabler.
Weber, Susanne (2003): Zur Evaluation von Großgruppenverfahren am Beispiel regionaler Vernetzung. In: Dewe, Bernd; Wiesner, Gisela; Wittpoth, Jürgen (2003): REPORT. Literatur- und Forschungsbericht Weiterbildung. Erwachsenenbildung und Demokratie. Dokumentation der Jahrestagung 2002 der DGfE-Sektion Erwachsenenbildung. pp. 110-119.
Weber, Susanne (2004a): Transformation und Improvisation. Großgruppenverfahren als Technologien des Lernens im Ungewissen. Unveröffentlichte Habilitationsschrift. Philipps-Universität Marburg.
Weber, Susanne (2004b): Organisationsnetzwerke und pädagogische Temporärorganisation. In: W. Böttcher & E. Terhart (Hrsg.), Organisationstheorie. Ihr Potential für die Analyse und Entwicklung von pädagogischen Feldern. pp. 253-269.

Weber, Susanne; Benthin, Nicole (i.V.b): Innovation, Wissen, Selbstreflexivität im Netzwerk generieren. In: Weber, Susanne: Netzwerklernen. Methoden, Instrumente, Erfolgsfaktoren.
Wenzel, Florian M. (2004): Selbstevaluation wertschätzend gestalten - methodisches Vorgehen in 6 Schritten. In: Uhl, Katrin; Ulrich, Susanne; Wenzel, Florian M. (Hrsg.): Evaluation politischer Bildung. Ist Wirkung messbar? Gütersloh: Verlag Bertelsmann Stiftung. pp. 177-196.
Windeler, Arnold (2001): Unternehmungsnetzwerke. Konstitution und Strukturation. Wiesbaden: Westdeutscher Verlag.
Winkler, Ingo (2002): Steuerung zwischenbetrieblicher Netzwerke - Koordinations- und Integrationsmechanismen. In: Freitag, Matthias; Winkler, Ingo (Hrsg.): Kooperationsentwicklung in zwischenbetrieblichen Netzwerken. Strukturierung, Koordination und Kompetenzen. Würzburg; Boston. pp. 31-55.
Wolf, Harald (1999): Arbeit und Autonomie. Ein Versuch über Widersprüche und Metamorphosen kapitalistischer Produktion. Münster: Westfälisches Dampfboot.
Zipp, Gisela (1976): Planungsziele und Planungswirklichkeit. In: Atteslander, Peter (Hrsg.): Soziologie und Raumplanung. Berlin. pp. 72-93.

Practical Ethics for Program Evaluation

Client Impropriety

Chris L. S. Coryn41, Daniela C. Schröter, and Pamela A. Zeller

Requests for proposals (RFPs) often include statements transferring ownership of the content of proposals to the requestor. Thus, evaluators are frequently faced with the problem of responding to an RFP in an unprotected manner, knowing full well that potential funding entities have the legal right to implement these ideas without the submitter's approval. In extreme cases, funding entities have even requested proposals for the purpose of idea generation only; that is, it was never the intention to fund these submissions, only to use their ideas. This kind of ethical abuse is neither new nor unique to evaluation. Allowing intellectual property to become the property of the entity requesting the proposal directly influences the evaluator's work and raises several significant ethical issues regarding the contractual statements found in most requests for proposals which give funding agencies property rights to all information and materials submitted to them. Take the following case, for example. Recently, a request for proposals was issued for an adult drug treatment program. The RFP was of the usual sort: design, expertise and experience, budget, and so on. Two proposals were ultimately selected as the final candidates: the first, a well planned, systematic evaluation with a proposed budget of just under $100,000; the second, a poorly designed effort budgeted at slightly more than $10,000.
41 Corresponding author: Chris L. S. Coryn, The Evaluation Center, Western Michigan University, 1903 West Michigan Avenue, Kalamazoo, MI 49008, e-mail: christian.coryn@wmich.edu.

Why the substantial budget differences? The first proposal was submitted by a university-based evaluation unit; the second was submitted by a university professor acting as an independent consultant, local to the city within which the program was based. As such, the second proposal included neither expenses for travel nor the indirect costs associated with university-based research units. Moreover, the independent consultant indicated within his proposal that all work would be conducted by his students as part of a class project and that these students would not be reimbursed for their work. The funding entity decided that $10,000 was more attractive than $100,000. Ultimately, the low-cost competitor was funded, but under the premise of utilizing the costlier competitor's plan and design. The following questions arise as a result of the client's decision:

1. Can the costlier competitor's plan and design be comparably implemented by the low-cost competitor at 1/10 of the price? Perhaps costs can be cut dramatically by hiring a local evaluator with access to free labor and university resources, but what the client saves monetarily may be lost in validity and credibility.

2. Does the low-cost competitor have the expertise and competency to implement the costlier competitor's plan and design? It may be reasonable to infer, in some cases, that the contracted low-cost competitor has neither the means nor the competencies necessary to effectively implement the competitor's plan and design.

The client may save as a result of funding the low-cost evaluator if that evaluator is able to implement and fulfill the contract as proposed by the high-cost competitor. However, the issues surrounding the contractual statements that allow all submitted materials to become the property of the funding agency are indeed troubling.

Given the current climate of the competitive evaluation market, proposal writers are faced with several poignant questions: How detailed and precise should evaluation proposals be if they become the intellectual property of the entity requesting them? Should the funding entity return rejected proposals? How can we as evaluators protect our intellectual property, given that funders have the right to use all proposals they receive? If the funding entity uses any or all of the rejected proposals, in full or in part, it should, for the sake of integrity, compensate the originator. This could be accomplished in several ways: (i) a fee could be provided for the use of plans and designs, (ii) the evaluator could collaborate for a consulting fee to help execute the evaluation, or (iii) the evaluator could be contracted as a metaevaluator. The aforementioned example of client/funder impropriety is utterly unacceptable, and the repercussions for the evaluation profession are profound. In addition to the impropriety of the client/funder, other relevant ethical concerns are raised. First, the professor discussed in the case example is more than likely not a member of any organized evaluation association and therefore not accountable to professional standards of conduct, yet he obviously violated the unwritten standards of conduct expected of a researcher by accepting the contract and using another's work without the consent of the proposal writer. Second, what can be done to alleviate these problems in the future? Some writers of proposals have attempted to take matters into their own hands by explicitly indicating in their proposals that no portion of the submission may be used without their express consent.

Yet if potential clients/funders willingly and knowingly use these materials, unbeknownst to the proposal writer, what can be done? As can be seen, the implications of an epidemic of this kind of client behavior are frightening. It has been suggested here at the Evaluation Center that approaching AEA might be appropriate. We might suggest developing a code of conduct for evaluation clients, and perhaps some defensive strategies such as blacklisting abusers. What do you think?

Ideas to Consider

Managing Extreme Evaluation Anxiety Through Nonverbal Communication

Regina Switalski Schinker

Many evaluative situations cause people to fear that they will be found to be deficient or inadequate by others (Donaldson, Gooler, & Scriven, 2002, p. 261). Donaldson et al. (2002) use the acronym XEA to describe excessive evaluation anxiety and explain that there are people "who are very upset by, and sometimes rendered virtually dysfunctional by, any prospect of evaluation, or who attack the evaluation without regards to how well conceived it might be" (ibid). A common technique or magic bullet to prevent excessive anxiety does not exist in program evaluation. In his EVAL 600 class (Western Michigan University, October 26, 2004), Scriven stated that evaluators should care about excessive evaluation anxiety for two reasons. First, if XEA can be quelled, getting information from participants should be more fruitful. Thus, evaluators should strive to make evaluees less fearful of the evaluation process. Second, the likelihood of implementing recommendations should be increased if impactees of the evaluation are comfortable with the evaluation process. The use of communication research may be a unique approach to relieving XEA, one aspect being nonverbal communication. How can evaluators, through unspoken messages, influence stakeholders? "A gaze broken too soon, a forced smile, a flat voice, an unreturned phone call, a conversation conducted across the barrier of an executive desk--together such nonverbal strands form the fabric of our communicative world, defining our interpersonal relationships, declaring our personal identities, revealing our emotions, governing the flow of our social encounters, and reinforcing our attempts to influence others" (Ebesu & Burgoon, 1996, p. 346).
In an instant, a stakeholder will form an impression of the professional evaluator. This impression will be based on a number of nonverbal cues: eye contact, voice pitch and speed, dress, posture, and facial expression. Credibility, trustworthiness, and expertise are often determined through nonverbal communication channels (Ebesu & Burgoon, 1996; Self, 1996).

Eye contact. Averting eyes, shifting eyes, looking at notes for an extended period of time, and blinking excessively all signal untrustworthiness, insecurity, and/or lack of credibility (Burgoon, Coker, & Coker, 1986; Fatt, 1999; Nolen, 1995). If an evaluator greets a stakeholder with a calm, consistent gaze, they convey confidence and believability.

Paralanguage. Paralanguage, or vocal cues, include such factors as volume, rate, pitch, and pronunciation (Fatt, 1999). At times, vocal cues are more important than words (Nolen, 1995).

Gestures. Gestures like smiling with head nodding, open arms, casually crossed legs, and leaning towards the person of focus (Nolen, 1995) all convey comfort and confidence. Alternately, such gestures as negative head nodding and foot movement in space signify a tense environment (Keiser & Altman, 1976).

Appearance. Nolen (1995) states that the objective of communication may determine the choice of clothing. For example, if the evaluator wants to imply receptiveness to others, he/she should mimic the dress of those being evaluated.

Similarity implies receptiveness (Nolen, 1995). If the evaluator wishes to promote an image of status and expertise (ibid), they should dress more formally than those they are evaluating.

Environmental factors. An environmental factor such as seating arrangement says a lot about the evaluator and their intentions. A person expecting to exercise leadership typically sits at the head of a table (Ebesu & Burgoon, 1996, p. 350). However, close proximity, or face-to-face communication, often leads to the most fruitful communication (Burgoon, Buller, Hale, & deTurck, 1984; Fatt, 1999).

In summary, the evaluator can use the above nonverbal cues to exhibit dominance or cultural similarity, closedness or openness. While nonverbal cues occur almost automatically (Palmer & Simmons, 1995), we must try to be cognizant of them. For a more in-depth understanding, it would be wise and worthwhile for every evaluator to take a graduate-level course in nonverbal communication to better understand the person and attitude they are portraying and how their communication cues may affect XEA. My vision for evaluation concerns communicative style. I would like to see evaluators become conscious of the nonverbal messages they are sending to stakeholders and peers. It is through an evaluator's communicative style that the image of evaluation will be formed. If evaluators are to reduce anxiety and gain a helpful reputation (Donaldson, 2001), they must approach their stakeholders with friendliness, sociability, and ease.

References

Burgoon, J. K., Buller, D. B., Hale, J. L., & deTurck, M. A. (1984). Relational messages associated with nonverbal behaviors [Electronic version]. Human Communication Research, 10(3, Spring), 351-378.
Burgoon, J. K., Coker, D. A., & Coker, R. A. (1986). Communicative effects of gaze behavior: A test of two contrasting explanations [Electronic version]. Human Communication Research, 12(4, Summer), 495-524.
Donaldson, S. I. (2001). Overcoming our negative reputation: Evaluation becomes known as a helping profession [Electronic version]. American Journal of Evaluation, 22, 355-361.
Donaldson, S. I., Gooler, L. E., & Scriven, M. (2002). Strategies for managing evaluation anxiety: Toward a psychology of program evaluation [Electronic version]. American Journal of Evaluation, 23(3), 261-272.
Ebesu, A. S., & Burgoon, J. K. (1996). Nonverbal communication. In M. B. Salwen & D. W. Stacks (Eds.), An integrated approach to communication theory and research (pp. 345-358). Mahwah, NJ: Lawrence Erlbaum Associates.
Fatt, J. P. T. (1999, June 1). It's not what you say, it's how you say it - nonverbal communication. Communication World. Retrieved November 9, 2004, from http://findarticles.com/p/articles/mi_m4422/is_6_16/ai_55580031/print
Nolen, W. E. (1995, April 1). Reading people - nonverbal communication in internal auditing. Internal Auditor. Retrieved November 9, 2004, from http://findarticles.com/p/articles/mi_m4153/is_n2_v52/ai_17003168/print

Keiser, G. J., & Altman, I. (1976). Relationship of nonverbal behavior to the social penetration process [Electronic version]. Human Communication Research, 2(2, Winter), 147-161.
Palmer, M. T., & Simmons, K. B. (1995). Communicating intentions through nonverbal behaviors: Conscious and nonconscious encoding of liking [Electronic version]. Human Communication Research, 22(1, September), 128-160.
Self, C. C. (1996). Credibility. In M. B. Salwen & D. W. Stacks (Eds.), An integrated approach to communication theory and research (pp. 345-358). Mahwah, NJ: Lawrence Erlbaum Associates.

Is Cost Analysis Underutilized in Decision Making?

Nadini Persaud

Is cost analysis underutilized in decision making? Research suggests it is. According to several authors, the use of cost analysis is still infrequent. Further, where cost analysis is conducted, it is often poorly done because many evaluators lack the necessary technical skills (Levin & McEwan, 2001). Some reasons for the underutilization of cost analysis center on difficulties associated with its use. These include: (1) unfamiliarity with the necessary analytical procedures; (2) political or moral controversies in assigning values to input/outcome measures (e.g. determining the appropriate discount rate); (3) determining the extent to which the benefits identified and quantified have been caused by the program; (4) determining who incurs the benefits and costs; (5) determining when benefits and costs occur; (6) inability to quantify all costs and benefits; (7) lack of resources to conduct long-term follow-up studies; (8) lack of data; (9) data in a form incomprehensible to the evaluator; and (10) difficulties with separating program development costs from operating costs (Alkin & Solomon, 1983; Andrieu, 1977; Berk & Rossi, 1990; Fitzpatrick et al., 2004; Rossi et al., 2004; Sewell & Marczak, 1997). The current underutilization of cost analysis should seriously concern evaluators, policy makers and society at large. Informed decisions require information on both costs and effects. Given that the ultimate societal goal is to optimize the use of scarce resources, cost analysis can play an important role in national planning.
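To make the underlying arithmetic concrete, here is a minimal sketch of a cost-effectiveness comparison with discounting. It is not drawn from this piece; the program names, cost streams, outcome counts and discount rate are all invented for illustration.

```python
# Hypothetical illustration of a simple cost-effectiveness comparison.
# All names and figures are invented; the choice of discount rate is exactly
# the kind of judgment call the text identifies as contentious.

def present_value(cash_flows, discount_rate):
    """Discount a list of yearly costs (year 0, 1, 2, ...) to present value."""
    return sum(cost / (1 + discount_rate) ** year
               for year, cost in enumerate(cash_flows))

programs = {
    # program name: (yearly costs in dollars, outcome units achieved)
    "Program A": ([60_000, 25_000, 15_000], 400),   # e.g. 400 clients treated
    "Program B": ([10_000, 5_000, 5_000], 60),
}

DISCOUNT_RATE = 0.05  # assumed; the ranking can shift if this changes

for name, (costs, outcomes) in programs.items():
    pv_cost = present_value(costs, DISCOUNT_RATE)
    ratio = pv_cost / outcomes
    print(f"{name}: PV cost = ${pv_cost:,.0f}, "
          f"cost per outcome unit = ${ratio:,.2f}")
```

Even this toy comparison illustrates the point: the ranking of alternatives can shift with the chosen discount rate and with which costs and outcomes the evaluator is able to quantify.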

The question is: Can anything be done to raise awareness of this issue? Yes! Leading evaluation textbooks and journals must take a more active role in promoting cost analysis. In addition, graduate programs and certificate programs in evaluation need to incorporate cost analysis in their course requirements. If evaluators are not exposed to such techniques and trained to use them, they will never be confident that they are conducting cost analysis competently.

References

Alkin, M. C., & Solomon, L. C. (1983). The costs of evaluation. Beverly Hills, CA: Sage.
Andrieu, M. (1977). Benefit-cost evaluation. In L. Rutman (Ed.), Evaluation research methods: A basic guide (pp. 217-232). Thousand Oaks, CA: Sage.
Berk, R. A., & Rossi, P. H. (1990). Thinking about program evaluation. Newbury Park, CA: Sage.
Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (3rd ed.). White Plains, NY: Longman Publishers.
Levin, H. M., & McEwan, P. J. (2001). Cost-effectiveness analysis: Methods and applications (2nd ed.). Thousand Oaks, CA: Sage.
Rossi, P. H., Lipsey, M. W., & Freeman, H. E. (2004). Evaluation: A systematic approach (7th ed.). Thousand Oaks, CA: Sage.
Sewell, M., & Marczak, M. (1997). Using cost analysis in evaluation. Tucson, AZ: USDA/CSREES and the University of Arizona. Retrieved October 15, 2004, from http://ag.arizona.edu/fcs/cyfernet/cyfar/Costben2.htm

Is E-Learning Up to the Mark? Fundamentals in Evaluating New and Innovative Learning Approaches Involving Information and Communication Technology

Oliver Haas42

Abstract

With the introduction of training courses and seminars via the internet, it becomes relevant to evaluate and assess these new and innovative forms of learning and teaching. The following article deals with two issues: (a) evaluation methods used in web-based learning arrangements, which depend on standards in evaluation research, and (b) the implicit logic of the internet as a communication medium. The paper concludes by illustrating methods of evaluation and assessment that correspond to both paradigms.

42 Oliver Haas (M.SocSc.) studied Sociology at the Johann Wolfgang Goethe University of Frankfurt, Germany, and the Free University of Berlin, Germany. He is currently employed as a Technical Advisor by the German Agency for Technical Cooperation (GTZ) and has worked in Russia, Tanzania, Malaysia, and South Africa, where he has been involved in Vocational Education and Training projects.

Introduction

Does the internet make us more intelligent? Do we obtain more knowledge, skills and better qualifications through its utilization compared to traditional methods of teaching and learning? Does the internet change teaching and learning concepts, or even our perception of learning? After the last big technical innovation, print media, had a significant impact on teaching and learning, more recent learning technologies43 changed learning strategies in human resource development only to a minor extent. However, the most recent development in Information and Communication Technology (ICT), the world wide web, as a universal medium for exchanging and mediating information and knowledge, has set a new yardstick regarding the accessibility of learners and the dissemination of learning content. Innovations in general, and innovative teaching and learning procedures specifically, are always under pressure to legitimize their existence and benefits. This becomes even more crucial when their application involves considerable risks and financial expenditure. Efficiency and effectiveness, quality, as well as relevance and significance, to name a few, are key criteria under which these procedures are critically assessed. Despite the recent enormous interest generated in learning via the web44, the question of acceptance, efficacy and suitability has not been answered fully.

43 E.g. language laboratories, computer-based training, etc.
44 The terms web-based learning, online learning, internet-based learning, and e-learning are used interchangeably in this paper.

Methodology-driven and comprehensible criteria-based assessments of procedures, events or actions are called evaluations. So far, evaluations of internet-based learning have mainly been applied in the public sector. Yet over the past years, the question of the efficacy and efficiency of online learning has become increasingly relevant for the private sector. With regard to in-house and on-the-job training, the common understanding is that any training is an investment in employees that needs to be justified like any other investment. While the determination of costs is relatively straightforward, the determination of economies of scale has proven to be a challenge of its own. This is where evaluation becomes relevant. Evaluating learning via the internet requires an accurate preparation of empirical research, which can only lead to useful data if meaningful criteria under which the evaluation will take place have been assigned in advance. Up to now, the evaluation of learning via the web has been a seriously neglected aspect of impact assessment in education and training. As such, theories on how to go about evaluating learning via the net, as well as on gaining empirically valid data, are difficult to find. However, evaluations in this field are not just necessary but also possible. It is the aim of this paper to provide a selection of possibilities on how to evaluate the efficiency and benefits of online learning. Due to the complexity of e-learning, this is not possible without a methodological introduction.

Components in evaluating online seminars

Evaluating learning via the internet is nothing new. Assessments of learning software (e.g. in the field of Computer Based Training) have been conducted widely and provide useful experiences as well as theoretical concepts, paradigms and procedures. These are methodological aspects that cannot be neglected when evaluating learning via the internet.
In recent years, highly acclaimed work has been done in the development of methodical standards and instruments for evaluating the efficacy of education and training. However, these standards, consisting of models, methods, and instruments, can also be utilized in other contexts. Evaluation methods and criteria are often explicitly created for a specific evaluation purpose. This suggests that evaluation, or better evaluation research, should be considered an applied science following a specific need and demand.

Areas and criteria of evaluation

The setting of criteria should be the first step when assessing online training courses. Criteria need to be defined for each and every aspect of the learning scenario. This involves all participants of the training course, the utilized learning material, the pedagogic approach and technological aspects, as well as guidance in the learning process and technical-administrative support as part of the training course. Criteria are directly related to the quality of online seminars and constitute the foundation of any evaluative approach to online learning. Thus, a clear statement and definition of evaluation criteria is of crucial importance for the whole evaluation process. Often several areas of evaluation are strongly interlinked and have a significant impact on each other.45 However, for a differentiated assessment of online learning courses it is essential to select evaluation criteria for each evaluation area separately. The decision on evaluation areas and their corresponding evaluation criteria should be made at the beginning of the overall evaluation and needs to be specified for each learning program. Yet, there are some criteria that could be defined as typical in evaluating online learning. The following chart provides a selection of these criteria:
45 For example, the influence of the pedagogic approach or the technology on the students' motivation as well as learning success.

Chart 1: Evaluation areas and evaluation criteria for online learning

Participants/students: acceptance of the training course; drop-out rate; degree of collaborative learning46; rate and intensity of interaction with the learning content; learning success; communication among students; transfer and utilization of learning content at the workplace.

Pedagogical approach: learning and teaching methods; didactic of activation47; didactic of enabling48; degree of blend in the pedagogic approach49.

Learning material: editing and processing of learning content; comprehensibility, coverage, correctness and time sensitivity of the learning material.

Technical system: quality and reliability of connectivity; technical infrastructure at the learning site (e.g. internet accessibility); collaboration and communication tools.

Support and administration: registration and financing; online support; offline support; technical support.

46 Users of training courses can collaboratively work on tasks independent of time and space.
47 For a detailed explanation of the term, see the corresponding section below.
48 See above.
49 Blended learning is an integrated learning concept that combines Information and Communication Technology (ICT) with traditional learning methods and media in a single learning arrangement.

Participants

When evaluating online training courses, it is crucial to remember that it is not the learner that is evaluated but the delivery of the learning content! However, the learner plays an important role as a resource person for evaluating the overall training program. The learner's behavior, learning success, and transfer of learning content to the workplace provide important empirical information with regard to the quality of the training course.

Pedagogic approach

Even though internet-based learning has been, especially in its early years, strongly related to technology, it is still about the provision of learning content and qualifying people in order to create employability. Therefore web-based learning also, or even more so, has to rest on a pedagogic foundation, generally provided through the curriculum. In education and training one distinguishes between two didactic models:

Didactic of activation50

Coming from engineering science, the didactic of activation assumes that successful learning can only take place if it is adequately planned, with learning methods selected accordingly and the sequence of learning followed rigidly. Programmatic learning and curricular planning are core determinants of this model.51

Didactic of enabling52

This model focuses on the learner and on learning success. It tries to create an enabling environment for the learner to build on existing knowledge and to expand his or her skills and competencies according to need and demand. Group work and the provision of several learning paths and methods to acquire knowledge are key features of the didactic of enabling.53

Degree of blend in the pedagogic approach

The blended learning approach treats web-based learning as one way of delivering learning content. Consequently, other methods of traditional learning such as group or individual exercises remain valid and relevant.
50 German: Erzeugungsdidaktik.
51 See Arnold, Rolf/Schüssler, Ingeborg, 1998.
52 German: Ermöglichungsdidaktik.
53 The action-oriented learning approach is one relevant learning approach of this model. It is based on a holistic interpretation of technical, individual, methodological and social competence. Learners graduating through this approach are expected to have acquired not only skills and knowledge obtained from qualifications, but also key competencies such as problem-solving techniques, communication skills and the ability to work in teams (see Heitmann, Werner, 2004).

Depending on the curriculum, it has to be decided which learning content will be delivered via the web and where other forms of learning material are more suitable. The demand for content delivery via the web must result from the pedagogic approach of the overall learning program.

Learning material

The learning material provided should be a main focus of the assessment in two respects: (1) the quality of the content, and (2) the processing of learning content into learning material incorporating ICT. In terms of quality of content, it is mainly the comprehensibility, coverage, correctness and time sensitivity of the learning material that need to be evaluated. Concerning the processing of content, the evaluation should focus on the conversion of content into learning material (e.g. the utilization of text, pictures, animation, simulation, etc.).

Technical system

When dealing with web-based learning, technical aspects are of great importance. The quality and reliability of connectivity, including the time needed for loading frames and content, need to be assessed. The stability of the web server, as well as its capacity, must also be assessed. Additionally, the extent and timing of the utilization of web content and of the services provided by the learning platform can yield useful information on the suitability of the technology.

Support and administration
E-learning can take place on an independent level or as an add-on to teaching (the blended approach). The content-based support provided by tutors and the technical administration ensure a smooth operation of the e-learning course. The more complex a learning course, the more support it needs. However, support and administration can be reduced if training courses include collaboration tools with which students can interact and exchange views, ideas and information. Newsgroups and mailing lists are a very economical way of using portals of this kind.

Types of evaluation

Evaluation research deals with several different types of evaluation. Each has its own suitability for assessing the quality of projects and programs and for making recommendations for improvement. In the following sections, various types of evaluation will be outlined. The focus, however, will be on their suitability for evaluating learning arrangements in the field of web-based learning.

Formative and summative evaluations

Formative evaluations focus on the training course during its development. These types of evaluation aim at ensuring quality and provide useful suggestions for further improvement and refinement of the online course. Summative evaluations, however, examine the training course after it has been finalized. Here, the main focus is on data compilation. These data give useful hints regarding the acceptance, effectiveness and impact of online training courses.54

54 See Kromrey, 1998, p. 100.

Product evaluations and process evaluations

Product evaluations consider one specific product (e.g. a learning program) in the assessment. In contrast, process evaluations focus on the procedures, handling and utilization of these products.

Internal evaluation and external evaluation

Internal evaluation is done by those who have been actively involved in the development of the online course; they conduct the assessment and evaluation of its performance. External evaluation involves external assessors as the main resource for conducting the evaluation.

These types of evaluation provide a grid that should support the decision-making process, showing which type of evaluation is most suitable for which specific training course. Here, suitability depends very much on the type of training course, the state of its implementation, its composition, the assessor's perspective and, most importantly, the reasons for conducting the evaluation. Obviously a one-size-fits-all solution is not possible.

Evaluation methods and gathering of data

Evaluation criteria and empirical data are two central elements of every evaluation. Without evaluation criteria, the acquisition of data can quickly turn into a wild collection of data without correlations, interaction and structure. On the other hand, without any empirical data, questions and presumptions remain without an answer. In an evaluation, empirical data provide the foundation to validly answer theory-driven questions, to get clarity on assumptions and hypotheses, and to be able to make recommendations.

That is why, when evaluating online training courses, methods of empirical research are necessary in order to help collect data without interfering with the actual learning process. Generally, internet technology provides a number of possibilities for collecting empirical data. This is especially relevant for written assessment methods (e.g. questionnaires, rating scales, etc.) that are applied frequently in evaluations. Like all methods of empirical research, evaluation research can be divided into reactive and non-reactive procedures. In reactive assessment procedures (such as interviews), the interviewee is aware that the assessment is being conducted. The person assessed can therefore react to the assessment process in an unpredictable way, and the answers can mingle with the originally intended aim of the assessment in such a way that it becomes difficult to disentangle the results afterwards. Non-reactive assessment procedures (such as hidden observations) take place without the awareness of the subject of assessment. The reactive element of responding does not exist, and therefore no such disentangling of the data is necessary after the assessment has taken place. As a consequence, assessment with reactive procedures tends to produce a wild mix of interesting and uninteresting data. Nevertheless, the utilization of non-reactive assessment procedures also has its challenges: after the assessment is finalized, the subject of assessment should be informed of the objectives of the evaluation as well as of the reasons for conducting the assessment in this way. Generally, the anonymity of all gathered data should be guaranteed in reports.

Therefore, when selecting evaluation methods it must be noted that the empirical information obtained depends on the decision whether to utilize reactive or non-reactive methods of assessment. A substitution is not possible.

Analysis of documents

Texts and documents (e.g. curricular texts, teaching texts, documentation of the communication among the participants including the mentor, etc.) play a crucial role on various levels within learning arrangements involving ICT. It is self-evident that all documents used in the learning context need to be carefully tested. When dealing with online training courses this issue becomes even more relevant, and higher standards need to be set, because the mentor, who functions as a corrective element in the learning process, is not always directly available. It is advisable to make use of programmes designed for text analysis when analysing comprehensive texts.

Interviews

Interviews are the most common method of data collection. Various forms of data assessment exist here; the most popular distinction is made between oral interviews and written interviews. Oral interviews are mostly distinguished by their degree of complexity and structure. Structured interviews are based on a guideline that has been prepared beforehand. This guideline contains all relevant, necessary and already formulated questions to be asked during the interview as well as (if necessary) hints and tips with regard to the behaviour of the interviewer.

Semi-structured interviews are based on clusters and groups of questions and topics to be dealt with during the interview; a specific sequence or wording of questions is not part of the guideline. Freely structured interviews are completely free of structure. Written interviews, or questionnaires, can be divided according to the form of question used. Open questions give the interviewee the possibility to formulate the answer in an individual manner, while closed questions offer answering options to choose from. Online questionnaires should be the most common way of assessing and evaluating online learning processes.

Observation

Observations do not play a significant role when assessing online learning, since behaviour can be validly recorded through technology-based recording via the usage of the learning platform.

Recording of behaviour

Behavioural expressions in the context of e-learning can be recorded via the so-called log files found on the server where all HTML documents of an online course are stored. These are access data that provide useful information for the evaluation process (e.g. on the acceptance of specific learning content). Besides that, log files give information about the time, sequence and duration of utilization. However, one should not overestimate the role and function of these files and the information they provide.

On the one hand, they do not comprise all of the information necessary to conduct an evaluation.55 On the other hand, they only provide aggregated quantitative information that may be of limited value for the overall assessment. After all, log files only cover access to HTML documents; if other forms of communication such as collaboration tools or bulletin boards are used as part of the learning process, this will not be captured by these files.

55 If passwords have been given out, this will not be captured by log files.
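As a concrete illustration of what such access data can yield, the following sketch parses a web server access log in the common log format and summarizes, per user, how many pages were requested and over what time span. It is not part of the original article; the log format and the file name learner_logs.txt are assumptions.

```python
# Minimal sketch, assuming an Apache-style "common log format" access log
# (learner_logs.txt is a hypothetical file name). It tallies requests per
# user and the span between first and last access as a rough proxy for
# "time, sequence and duration of utilization".
import re
from collections import defaultdict
from datetime import datetime

LINE = re.compile(r'^(\S+) \S+ (\S+) \[([^\]]+)\] "GET (\S+)')

def summarize(path):
    visits = defaultdict(list)  # user (or host) -> list of (timestamp, page)
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            match = LINE.match(line)
            if not match:
                continue  # ignore lines that are not simple GET requests
            host, user, stamp, page = match.groups()
            when = datetime.strptime(stamp.split()[0], "%d/%b/%Y:%H:%M:%S")
            visits[user if user != "-" else host].append((when, page))
    for user, hits in sorted(visits.items()):
        hits.sort()
        span = hits[-1][0] - hits[0][0]
        print(f"{user}: {len(hits)} page requests over {span}")

if __name__ == "__main__":
    summarize("learner_logs.txt")  # hypothetical file
```

Session duration in a stricter sense would require further assumptions (e.g. a timeout between requests), which is one more reason not to overestimate what log files alone can show.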

Testing

In evaluation research, the term testing describes a standardized procedure to assess the occurrence of empirically defined performance characteristics. Assessment via testing is usually done in an ad hoc way and rather informally. Standardized forms of testing, however, are highly sophisticated and complex (e.g. intelligence tests). In standardized testing one distinguishes between norm-oriented and criterion-oriented methods. Norm-oriented methods assess an individual test result by comparison with a reference group. Criterion-oriented methods assess individual test results against a predefined benchmark.

Empirical research

When assessing online learning, methods of empirical social research need to be adopted and adjusted according to the specific needs and demands created by the training course. This means that the whole range of research methods is relevant for assessing online learning. Just as in any other area of empirical research, the decision on the most suitable method depends on the research question as well as on the capacity of the respective research method.

Concepts of evaluation for web-based learning

So far we have dealt with the evaluation of web-based learning from a methodical point of view. The following sections apply the outlined methods with the aim of illustrating three types of evaluation concepts, each of which has aspects relating to data survey and data assessment. In other words, after dealing with the evaluation of online learning on an operative level, the following sections provide a bird's eye view of the topic. The first two sections illustrate aspects of data survey with regard to concepts of evaluation for web-based learning.

Utilization of criteria indices

In the field of learning software, criteria indices are widespread. Generally, criteria indices can be described as checklists. Evaluation via criteria indices is based on a selection of various relevant and non-relevant criteria that have been pre-determined by experts. Due to their low cost, their easy application throughout the whole evaluation phase, and their transparency, criteria indices are very popular. Methods like these are useful for obtaining prompt results and a preliminary orientation for the overall implementation process. Furthermore, results gained through criteria indices are easy for others to understand. However, they also have risks and challenges. These indices are often not based on sound theoretical ground, which shows in uncertainties when selecting criteria as well as in the weight given to each criterion. Furthermore, it has been empirically shown that assessors who utilize criteria indices when assessing learning programmes obtain results that may deviate relatively strongly from each other.

Journal of MultiDisciplinary Evaluation: JMDE(2)

97

Ideas to Consider

learning programmes obtain results that may deviate relatively strongly from each other. Despite all criticism, one should not generally object to the method of criteria indices. They provide a grid for the area of web-based learning and a possibility for empirical research and evaluation. When expanding these indices through variables (e.g. drop-out rate), one can gain a proper instrument, with which online-seminars and web-based learning programmes can be validly evaluated. To extend this approach with research on students (e.g., individual self-assessment on learning progress) provides further possibilities of application that will be illustrated in the following section. Determination of coherence A step beyond the application of criteria indices are those concepts of evaluation, that not only deal with the existence or non-existence of characteristics, but also touch on the elaboration of coherence amongst characteristics. However, this is not free from difficulties. If coherence has been determined (e.g., between the amount of participating students and contributions in online discussions), or differences (e.g., difference between achieving a learning objective and learning groups), the finding is almost impossible to predict for other online-learning programmes if no other variables such as motivation of students are rigidly controlled. Just as with criteria indices it might also make sense to analyze coherence relations via surveys involving training participants. In fact this is a requirement for many criteria and cannot be replaced by expert surveys. Evaluations via criteria indices make it possible to allow statements on the theoretical impact of e-learning. A continuation of this approach leads to so-called

Journal of MultiDisciplinary Evaluation: JMDE(2)

98

Ideas to Consider

linear structure equalisation models, where a specific variable (e.g. estimated success in learning) will be determined from interrelations with other variables. Chart 2: Concepts of evaluation Evaluation via criteria indices Perspective of the expert Perspective of the user Learning success Drop-out rate Evaluation via analyzing relations Relation between learning success and acceptance of a specific learning/ teaching method Estimated relation between the degree of communication with other participants and own learning success

Self-estimated learning success Self-estimated degree of communication and interaction with other participants

Aspects of assessing data If the data assessment is based on criteria indices or interrelations, data alone only partially gives a statement on the quality of online-learning (including suggestions for improvement). The following section will provide an insight into various types of data that can be obtained when evaluating e-learning: Data indicating learning success A quantification of learning success only becomes a valid empirical statement through comparison (e.g. before/after-comparison) or through inclusion of analyzing relations (e.g. connection between learning success and participation in a specific learning module/ qualification module).

Journal of MultiDisciplinary Evaluation: JMDE(2)

99

Ideas to Consider

Decisions of participants/ experts Judgments made by participants or experts are a fundamental empirical finding when evaluating online-learning. Before making changes based on such statements the data need to be examined further in order to ensure reliability and interrelations with other data. Data relating to technical features As long as these data are not utilized in relation with other relevant data of the evaluation (e.g. drop-out rate, learning success), these data can be considered as insignificant. Data relating to the acceptance of the offered learning programme Except for the case when responses in this regard are not accessible data, of this kind proves to be very difficult to interpret. Assuming that out of several alternative learning modules only one is accepted by a single learner or small group, it can still be of significance and relevance for a specific learning gproject. Conclusion and discussion Despite remarkable work being done in related fields such as educational software, evaluation research in ICT is still very much in its initial steps. However, one can assess online-learning on a practical level as long as an adequate system of classification comprising all relevant and necessary evaluation aspects has been developed. The system presented in the present paper is in conformity with these functional requirements as it provides a pragmatic, criteria-based evaluation that focuses on interrelations and thus serves the purpose of verifying or falsifying hypothesis-based evaluations.

Journal of MultiDisciplinary Evaluation: JMDE(2)

100

Ideas to Consider

References Arnold, Rolf; Schssler, Ingeborg (1998): Ermglichungsdidaktik.

Erwachsenenpdagogische Grundlagen und Erfahrungen, Schneider Verlag, Baltimore. Heitmann, Werner (2004): The action-oriented learning approach for promoting occupational performance and employability. South African-German Development Co-operation, Skills Development Research Series, Book 3, Pretoria. Kromrey, Helmut (1998): Empirische Sozialforschung, 8th edition. Leske + Budrich, Opladen.

Journal of MultiDisciplinary Evaluation: JMDE(2)

101

Ideas to Consider

The Problem of Free Will in Program Evaluation Michael Scriven

A group of hard-nosed scientists who have been studying the major commercial weight-loss programs recently reported their disappointment that the proprietors of these programs refuse to release data on attrition. The evaluators, though thats not the label they use, think its obvious that this is aor perhaps thekey ratio needed to appraise the programs, and one that the FDA should require them to release. On this issue (possibly for the first time in my life), I find myself taking sides with the vendor against the would-be consumer advocate, and I think the issue has extremely general applicability. My take is that the key issue is whether the program, if followed, will produce the claimed results; and that following the program is (largely but not entirely) a matter of strength of will. Failure to stay with the programthat is, attritionis therefore (largely but not entirely) a failure on the part of the subject not the program, and the program should not be charged with it. First, heres why I think this is a very general problem that we need to deal with, in evaluation overall, not only in program evaluation. Think about the evaluation of: any chemical drug abuse program; twelve step programs like AA for alcohol and gambling abuse; distance or online education; continuing education of any kind this clearly applies to all of them. Now it also applies in some important cases outside program evaluation, ones that you might not think of immediately. Here are two: (i) it applies to standard pharmaceutical drug evaluation because there is a serious problem referred to as the fidelity or adherence problem, about the extent

Journal of MultiDisciplinary Evaluation: JMDE(2)

102

Ideas to Consider

to which patients ex-hospital do in fact take the prescribed dosage on a regular basis. In these studies we surely want to say that the merit of the drug lies in what it does if its used, not whether its used. Case (ii): in teacher evaluation, although we want to say that the teacher has some obligation to inspire interest, to motivate, as well as to teach good content well, success is clearly limited, not only by natural capacityas we all agreebut also by dogged disinterest. We dont want to blame teachers for failing to teach inherently capable students who are determinedly recalcitrant, i.e., for high failure (attrition) rates where the cause is simply refusal to try. Heres the schema I recommend for dealing with this kind of consideration. Think of a program (or drug regimen, or educational effort) as having three aspects that we need to consider in the evaluation: (A) Attractive power; (B) Supportive power; (C) Transformative power. For short: Appeal, Grip, and Impact. A is affected by presentation, marketing and perhaps allocation, and controlled by selection. The vendor or provider has the responsibility to use selection to weed out cases who are demonstrably unsuitable for the treatment; but, given the unreliability of such selection tests in the personnel area (pharmacogenomics is the subject devoted to this in the pharmaceutical area, where its considerably more successful) and the importance of giving people a chance when they want to try, one cant be very critical of high-pass filtration for weight-loss, distance ed, and twelve-step programs. Of course, high front-end loading of payments may be excessive, if theres no money-back guarantee. B is affected by support level including infrastructure (e.g., equipment, air conditioning, counseling), continuing costs (including opportunity costs and fees), and ease of use, for all of which the program is largely responsible; but of course B is also controlled by strength of will. If the support, costs, and ease of use are
Journal of MultiDisciplinary Evaluation: JMDE(2) 103

Ideas to Consider

disclosed in advance and are both reasonable and delivered as pictured and promised, willpower becomes the controlling variable. Which leaves C, the Impact issue, the real kick in the program: will it deliver as promised if we do our part, taking the pill, doing the homework, getting to the meetings? Thats the key issue. While the good evaluator absolutely must check to see if the provider has indeed provided what was promised, and that what was provided was about as good as can be provided at the cost level in question, the rest is up to the subjects. Under these conditions, easily checked and often met, attrition is your failure, not the vendors. This is an important issue because its important that evaluation not assume that these treatments are done to people, and are at fault if they dont work. The fact is that they are selected by people as something they will undertake, not undergo, and failure is often the fault of the people not the program. Even with drug treatments, the drugs have to be taken, and often taken for the rest of your life. They only work if you make them work. This is not surgery, which you do undergo, which is done to you; its something where you choose to get some help in doing something to yourself. You have to take responsibility for doing your part, and the evaluator must not take that responsibility away and say that the program failed if it didnt get you through to the Promised Land, when it was you who failed. We have free will, but that doesnt mean success is a free lunch. Free will is the freedom to start a program: will power is what it takes to complete it.

Journal of MultiDisciplinary Evaluation: JMDE(2)

104

Global Review: Regions

Japan Evaluation Society: Pilot Test of an Accreditation Scheme for Evaluation Training Masafumi Nagao

The Japan Evaluation Society (JES) has conducted a pilot test of an accreditation scheme for short-term evaluation training programs. The program chosen for this test was a 4-day school evaluation training course organized by a public teacher training center in cooperation with Hiroshima University. The course was designed to impart functional competence to school teachers for co-ordinating selfevaluation exercises in their schools. The course contents were adapted from the Essential Skills Series (ESS) program of the Canadian Evaluation Society (CES), which provided technical assistance for the formulation of the course design, preparation of the textual materials and actual conducting of the course, including on-site guidance and evaluation of its results on the basis of a formal agreement with JES. The course was given twicefirst in July 2003 and then in August 2004. A special committee established in JES elaborated tentative procedures for accreditation and examined reports submitted by the course organizer and a JES member dispatched to observe the course. The committee has judged that the course cleared the hurdle for accreditation and, based on this pilot test, is making a recommendation to JES Board that an Accreditation Scheme should be formally established. JES believes that its accreditation scheme will, by virtue of the norms it sets for evaluation training, have a direct impact on the quality of the short-term evaluation training programs conducted by public sector bodies and specialized training organizations. It also hopes that the scheme will lead to multiplication of evaluation training programs by providing qualification incentives to individuals

Journal of MultiDisciplinary Evaluation: JMDE(2)

105

Global Review: Regions

interested in evaluation training. In addition to school evaluation, aid evaluation and government performance evaluation are considered target areas for the accreditation scheme. Masafumi Nagao is a professor at the Center for the Study of International Cooperation in Education (CICE), Hiroshima University and can be contacted at nagaom@hiroshima-u.ac.jp.

Journal of MultiDisciplinary Evaluation: JMDE(2)

106

Global Review: Regions

Aotearoa/New Zealand Starting National Evaluation Conferences Pam Oliver, Maggie Jakob-Hoff, and Chris Mullins

Aotearoa/New Zealand had its first national evaluation conference in September 2004. Organised by the Auckland Evaluation Group, the meeting was held at the Tauhara Centre, near Lake Taupo (central North Island). This was chosen partly because it is a central location for North Island evaluators to get to, and partly because of the nature of the conference centre itselfa retreat and spiritual centre located in a quiet setting with native bush and overlooking the lake. The conference theme in 2004 was Radical directions, gnarly questions and halfbaked ideas and the programme included a keynote workshop, several other facilitated workshops and discussion groups, and a number of more informal discussion groups using an Open Space Technology process. Twenty-five people attended the conference and spent two and a half days engaged in discussions which participants found inspiring and highly worthwhile. Entitled Really useful stuff: A new role for evaluators in building evaluative capacity in New Zealand, the keynote workshop was presented by Bill Ryan, Associate Professor and Director of Programmes in Victoria Universitys School of Government in Wellington. It was followed by an engaging discussion which focused on participants endeavours to date at building evaluation culture into their everyday evaluation practice, including examples of both successful attempts and frustrations, and suggestions about what evaluators would like to be able to do, in an ideal evaluation environment.

Journal of MultiDisciplinary Evaluation: JMDE(2)

107

Global Review: Regions

In addition to the programmed workshopsranging from looking at ways of educating our clients around evaluation tendering, and aspects of ethical decision-making in evaluation, to building reflective practice through evaluationthe Open Space events also resulted in some focused discussion on topics of key importance to the development of the profession, such as who evaluates the evaluators? A second conference is scheduled for June 29th-July 1st, 2005 at the Tauhara Centre and is being organized jointly by members of the Auckland and Wellington evaluation groups. More information will be available in the next few weeks.

Journal of MultiDisciplinary Evaluation: JMDE(2)

108

Global Review: Regions

Conference Explores the Intersection of Evaluation and Research with Practice and Native Hawaiian Culture Matthew Corry

One of the great potential sources of innovative thinking in evaluation is events where indigenous peoples come together to explore the ways in which evaluation and research can be designed in a way that is culturally relevant and useful to the communities served by these activities. In recent years, there have been several hui (gatherings) where Native Hawaiian and New Zealand Mori researchers and evaluators have come together to explore this theme. Currently, a Mori/Native Hawaiian working group is compiling a series of papers on the topic for possible publication in a monograph series. Last October, the Policy Analysis & System Evaluation (PASE) department of Kamehameha Schools organized a conference at their Hawaii (Big Island) campus. The purpose of the gathering was to gain a better understanding of Hawaiian well-being by bringing together multiple viewpoints from diverse disciplines. The event attracted a broad cross-section of researchers, educators, and cultural practitioners from the fields of education, health, family, economics, leadership, environmental studies, cultural practices, politics, and spirituality. The variation of research presented was wonderful, said one participant. I am amazed at how far we have come [to be able to] present from a Hawaiian perspective.

Journal of MultiDisciplinary Evaluation: JMDE(2)

109

Global Review: Regions

An overarching theme throughout the conference was the charge to rely on kpuna (elders) wisdom and to balance that knowledge with scientific learning. Presenters also insisted that native voices are necessary to provide a more complete and accurate portrait of knaka maoli (indigenous activities). It was very special to have the conference on our campus and to be talking about the wellbeing of our children while they were among us building their own futures through education. It made tangible the direct connections between our research and the education of Hawaiian children, says PASE director Shawn Malia Kanaiaupuni, PhD. Everything we dothe surveys, the longitudinal studies, statistical analyses, technical reports, evaluations, and the sharing of findings with other researchersis to help achieve a better understanding of how to make a difference for the keiki (children) and families we serve. For more information about PASE, call (808) 541-5372 or visit

www.ksbe.edu/pase. To view presentations from this years conference, visit http://www.ksbe.edu/pase/researchproj-ksrschcon.php.

Journal of MultiDisciplinary Evaluation: JMDE(2)

110

Global Review: Regions

Washington, DC: Evaluation of Driver Education Northport Associates February 16-17, 2005

Project to Develop Guidelines for Improved Evaluation of Driver Education This is intended to provide a brief overview of this project and of the consultative workshop which took place last week in Washington, D.C. Guidelines Project The AAA Foundation for Traffic Safety and BMW of North America are funding a research project to develop guidelines for improving the evaluation of driver education (DE) programs. Northport Associates is conducting this project in consultation with an advisory group and other experts. Over the long history of driver education, there have been a moderate number of evaluations, including quasi-experiments, random controlled trials, and ecological time series studies. Reviews of evaluations typically conclude that young people who complete DE programs crash at about the same rate as those who do not receive formal education. Do some types of driver education programs lead to better educational outcomes and safety impacts than others? How can driver education programs be improved in order to yield safer young drivers? Lack of systematic programs of research and methodological weaknesses in previous evaluations have left these questions partially or completely unanswered.

Journal of MultiDisciplinary Evaluation: JMDE(2)

111

Global Review: Regions

Northport Associates, lead contractor on the project, is reviewing the DE evaluation literature, examining methods and theories of driver education programs, identifying and assessing evaluation methods, measures, and data sources, and preparing a final report and guidelines for future evaluations of driver education programs. Driver EducationA Challenging Evaluand Evaluating and improving DE is highly challenging, but the potential benefits are very high. Road trauma is a costly public health problem, particularly among youth. Young, inexperienced drivers are at high risk16 year olds have 10 times the crash rate per mile of experienced adults. Risk declines rapidly over the first few hundred miles of driving, but the learning curve is long, taking up to 10 years to finally level off. The limitations in skills and abilities that contribute to elevated risk are known. Novice drivers are less able to control attention, scan the environment effectively, detect potential hazards early, make critical decisions quickly, and maintain consistency in critical thought and action. They often raise their risk through overconfidence and choices such as driving too fast, accepting small gaps in traffic, and leaving inadequate safety margins. Both skill deficiencies and risky choices contribute to their excess risk. Driver education has long been seen as a societal response to the tragic losses of novice drivers. Traditional driver education takes place before the driver becomes licensed. Indeed, one of its principal purposes is to prepare beginners for license testing. Typical U.S. DE programs consist of 30 classroom hours and 6 hours in car. Content covers legal requirements, vehicle handling, interacting with traffic, and efforts to motivate beginners to fear the consequences of crashes. Classroom
Journal of MultiDisciplinary Evaluation: JMDE(2) 112

Global Review: Regions

methods most often include teacher-centred lectures, with some discussion and support with film and video, and sometimes simulators. In recent years, there have been major changes in the technological, business, and regulatory environments of driver education and also in driver licensing, with the move to graduated licensing suggesting graduated training. There appears to have been limited development of new DE content, but instructional and delivery methods are rapidly changing. Self-instruction, computer-based instruction, simulation, and web-based instruction are increasingly becoming available. In some jurisdictions, recent regulatory provisions recognize a formal course delivered by parents. While more education is always a popular prescription for improving safety, demonstrated effectiveness in reducing the risk of drivers of all ages through education alone is rare. It is widely believed, if not yet proven, that carefully designed multifaceted, multi-level behavior-change programs are required. Simply increasing knowledge and skill does not make safer driversbetter drivers do not necessarily crash less. Because the needed comprehensive programs require coordination across bureaucratic boundaries, organizational interests and behavior become issues as important as individual behavior change. Organizational constraints provide great challenges for DE program development and to their evaluation. Utilization of evaluation results is also problematical in the DE field. Disappointing evaluations led to (or justified) reduced support for driver education in the 1980s, when a more rational response would have been to redouble efforts to make the programs more effective.

Journal of MultiDisciplinary Evaluation: JMDE(2)

113

Global Review: Regions

Content, structure, standards, governance, and market incentives are critical issues for driver education globally. Significant further development is needed for DE to fully satisfy the expectations placed upon it by society. More comprehensive evaluation and continuous improvement are seen as critical to progress. Developing guidelines for evaluation of complex programs is always challenging, but seems especially so for the highly diverse driver education field, whose two main goalsindependent mobility and safetyare antithetical. Guiding the GuidelinesThe Consultation Processes Consultation for the development of guidelines consists of an internet discussion board (www.northportassociates.com/aaafts) and a two day workshop. The discussion board has been active and received contributions from driver education experts and evaluators. Much of the board discussion has focused on appropriate objectives and success criteria for DEsafety impacts or other outcomes and impacts. The consultative Workshop took place February 16-17, 2005 in the AAAs Washington, D.C. offices. Two dozen invited participants included academics, consultants and research staff from: National Highway Traffic Safety Administration National Institutes of Health Insurance Institute for Highway Safety American Driver and Traffic Safety Education Association State Departments of Motor Vehicles

Journal of MultiDisciplinary Evaluation: JMDE(2)

114

Global Review: Regions

Driving School Association of the Americas University traffic research institutes: UNC-HSRC, TTI, UMTRI Traffic Injury Research Foundation of Canada The Evaluation CenterWestern Michigan University Georgia State University, Institute Of Public Health Private sector safety research consultants The Workshop members were asked to help answer fundamental questions from their own diverse perspectives. Questions and a few representative answers are shown below. 1. What has worked in driver education evaluation? RCTs focused on safety impacts Quasi-experiments with statistical control 2. What should be done differently? More comprehensive, systematic evaluation, e.g., British Columbia GDl/DE ongoing evaluation/program development process More formative evaluation, e.g., SPC process in the 1970s Use of established evaluation standardsJoint Committee Standards for Educational Evaluation Look at different approaches, e.g., Success Case Method

Journal of MultiDisciplinary Evaluation: JMDE(2)

115

Global Review: Regions

Methodological improvements, e.g., sample sizes Hybrid designs, e.g., RCTs with modeling to compensate for limitations in control group equivalence 3. What do we want driver education evaluation to accomplish? To track DE effectiveness, see if it works To help improve DE, e.g., feedback & continuous improvement To compare performance, e.g., across states. Defend policy, e.g., choices regarding investment, etc. Recognize needs versus objectives, e.g., real driver needs versus arbitrary objectives 4. What are the key evaluation targets, indicators and measures? Product & process, e.g., needs assessment, program quality & consistency, quality management Learning outcomesknowledge & skills, e.g., risk perception, insight Behavioral outcomesrisk response, e.g., speed & space choices Societal impactssafety & mobility, e.g., licensing & crash rates 5. Who are the key users for DE evaluation guidelines? Evaluators Policy makers, legislatures
Journal of MultiDisciplinary Evaluation: JMDE(2) 116

Global Review: Regions

Parents Consumer protection Insurers & policy holders State level administrators DE program managers Researchers 6. What is the best format for the guidelines? Emulate good models, e.g., CDC, Ottawa Health Unit Moderate in size User friendly to a wide range of users, e.g., clear definitions Examples of good & bad practice Cover simple needs, e.g., program quality checklist Support higher level evaluations, e.g., data acquisition for intermediate objectives and impacts The Driver Education Evaluation Guidelines project will proceed through drafting and review of materials, as well as ongoing consultation. It is scheduled to be completed in the Fall of 2005.

Journal of MultiDisciplinary Evaluation: JMDE(2)

117

Global Review: Regions

Principal investigator: Larry Lonero, Northport Associates, e-mail: npa@eagle.ca, telephone: 905377-8883 AAAFTS Project Manager: Dr. Scott Osberg, e-mail: sosberg@aaafoundation.org, telephone: 202-6385944 Selected Bibliography Ammenwerth, E., Iller, C., & Mansmann, U. (2003). Can evaluation studies benefit from triangulation? A case study. International Journal of Medical Informatics, 70(2-3), 237-248. Anderson, D., Abdalla, A., Goldberg, C. N., Diab, T., & Pomietto, B. (2000). Young drivers: A study of policies and practices. Fairfax, VA: Center for the Advancement of Public Health, George Mason University. Donelson, A. C., & Mayhew, D. R. (1987). Driver improvement as post-licensing control: The state of knowledge. Toronto, ON: Ontario Ministry of Transportation and Communications. Fisher, D. L., Laurie, N. E., Glaser, R., Connerney, K., Pollatsek, A., Duffy, S. A., et al. (2002). Use of a fixed-base driving simulator to evaluate the effects of experience and PC-based risk awareness training on drivers' decisions. Human Factors, 44(2), 287-302. Glad, A. (1988). Phase 2 Driver Education: Effect on the risk of accident. Oslo, Norway: Norwegian Center for Research, Department of Transportation.

Journal of MultiDisciplinary Evaluation: JMDE(2)

118

Global Review: Regions

Green, L. W., & Kreuter, M. W. (1991). Health promotion planning: An educational and environmental approach. Mountain View, CA: Mayfield Publishing. Joint Committee on Standards for Educational Evaluation. (1994). The Program Evaluation Standards. How to assess evaluations of educational programs (2nd ed.). Thousand Oaks, CA: Sage Publications. Keskinen, E., Hatakka, M., & Katila, A. (1998). The Finnish way of driver improvement: Second phase of training. Paper presented at the International Driver Improvement Workshop, Bergisch Gladbach, Germany: Bundesanstalt fur Strassenwesen. Lonero, L. P., Clinton, K. M., & Black, D. (2000). Training to improve the decision making of young novice drivers. Volume II: Literature review: Consistency not capacity: Cognitive, motivational and developmental aspects of decision making in young novice drivers. Washington, DC: U.S. Department of Transportation, National Highway Traffic Safety Administration. Lonero, L. P., Clinton, K. M., Peck, R. C., Mayhew, D., Smiley, A., & Black, D. (unpublished). Driver improvement programs: State of knowledge and trends. Ottawa, Ontario: Transport Canada, Government of Canada. Lonero, L. P., Clinton, K. M., Persaud, B. N., Chipman, M. L., & Smiley, A. M. (unpublished). A longitudinal analysis of Manitoba Public Insurance Driver Education Program. Winnipeg, Manitoba: Manitoba Public Insurance. Lonero, L. P., Clinton, K. M., Wilde, G. J. S., Roach, K., McKnight, A. J., Maclean, H., et al. (1994). The roles of legislation, education, and
Journal of MultiDisciplinary Evaluation: JMDE(2) 119

Global Review: Regions

reinforcement in changing road user behaviour. Toronto, Ontario: Ontario Ministry of Transportation. Lonero, L., & Clinton, K. (1998). Changing road user behavior: What works and what doesnt, from www.drivers.com Lund, A. K., & ONeill, B. (1986). Perceived risks and driving behaviour. Accident Analysis and Prevention, 18(5), 367-370. Mayhew, D. R., & Simpson, H. M. (1997). Effectiveness and role of driver education and training in a graduated licensing system. Ottawa, Ontario: Traffic Injury Research Foundation. Patton, M. Q. (1997). Utilization-focused evaluation: The new century text (3rd ed.). Thousand Oaks, CA: Sage Publications. Rogers, P. J., Hacsi, T. A., Petrosino, A., & Huebner, T. A. (2000). Program theory in evaluation: Challenges and opportunities. New Directions for Evaluation (87). Rossi, P. H., Lipsey, M. W., & Freeman, H. E. (2004). Evaluation. A systematic approach (7th ed.). Thousand Oaks, CA: Sage Publications. Scriven, M., quoted in Davidson, E. J. (2004). Evaluation methodology basics. Thousand Oaks, CA: Sage Publications. Shope, J. T., & Molnar, L. J. (2003). Graduated driver licensing in the United States: Evaluation results from the early programs. Journal of Safety Research, 34(1S), 63-69.

Journal of MultiDisciplinary Evaluation: JMDE(2)

120

Global Review: Regions

Smiley, A., Lonero, L., & Chipman, M. (2004). Final Report. A review of the effectiveness of driver training and licensing programs in reducing road crashes. Paris, France: MAIF Foundation. Stock, J. R., Weaver, J. K., Ray, H. W., Brink, J. R., & Sadof, M. G. (1983). Evaluation of Safe Performance Secondary School Driver Education Curriculum Demonstration Project (DOT HS-806 568). Washington, DC: U.S. Department of Transportation, National Highway Traffic Safety Administration. Stufflebeam, D. L. (2001). Evaluation Models. New Directions for Evaluation (89), 1-106. THCU (The Health Communication Unit). (2002). Evaluating health promotion programs. Toronto, ON: Centre for Health Promotion, University of Toronto. Trochim, W. (2001). The research methods knowledge base. Cincinnati, OH: Atomic Dog Publishing. Turner, C., McClure, R., & Pirozzo, S. (2004). Injury and risk-taking behavior - A systematic review. Accident Analysis and Prevention, 36(1), 93-101. W.K. Kellogg Foundation. (1998). W.K. Kellogg Foundation Evaluation Handbook. Retrieved November, 2004, from http://www.wkkf.org/Pubs/Tools/Evaluation/Pub770.pdf Waller, P. F. (1975). Education for driving: An exercise in self delusion? Prepared for the Driver Research Colloquium, Highway Safety Research Institute,

Journal of MultiDisciplinary Evaluation: JMDE(2)

121

Global Review: Regions

University of Michigan, Ann Arbor, Michigan, June 4 & 5. Chapel Hill, N.C.: Highway Safety Research Center, University of North Carolina. Waller, P. F. (1983). Young drivers: Reckless or unprepared? Paper presented at the International Symposium on Young Driver Accidents: In Search of Solutions, November, Banff, Alberta.

Journal of MultiDisciplinary Evaluation: JMDE(2)

122

Global Review: Regions

African Evaluation Association AfrEA56

The African Evaluation Association (AfrEA) held its Third Conference in Cape Town, South Africa, on December 1-4, 2004. The conference was attended by 479 people from 61 countries, of which 47 were in Africa. Around 250 participants also attended the 10 pre-conference professional development workshops held on 29-30 November. The Conference was hosted in conjunction with the Public Service Commission of the South African Government, and was supported by another 21 local and international organisations, including SIDA, DFID, GTZ, the World Bank, the Nelson Mandela Foundation and the African Capacity Building Foundation. Conference proceedings will be available in the first few months of 2005, in both English and French. A narrative conference report is available at Third AfrEA Conference Report . A listserv has been established to continue communication and information sharing among those interested in monitoring and evaluation in Africa. Details can be obtained from the AfrEA Secretariat at info@afrea.org. AfrEA is planning several follow-up activities in partnership with national associations and other interested organizations before its next conference, which will be held in Niger in 2006. For additional information on evaluation in Africa

56

This piece is based on the News section of the African Evaluation Association Website.

Available at http://www.afrea.org/.

Journal of MultiDisciplinary Evaluation: JMDE(2)

123

Global Review: Regions

visit the AfrEA Website or see the first issue of JMDE (Part III: Global Review Regions) at http://evaluation.wmich.edu/jmde/JMDE_Num001.html.

Journal of MultiDisciplinary Evaluation: JMDE(2)

124

Global Review: Regions

International Association for Impact Assessment Brandon W. Youker

International Association for Impact Assessment (IAIA) defines impact assessment as the process of identifying the future consequences of current or proposed action. IAIA is a forum for advancing innovation, development and communication of best practice in impact assessment. Its international membership promotes development of local and global capacity for the application of environmental assessment in which sound science and full public participation provide a foundation for equitable and sustainable development. I. Impact Assessment and Its Relationship with Evaluation IAIA is a shadow professional organization of AEA. Like AEA members, IAIA members are concerned with evaluation issues as they pertain to predicting longterm outcomes, particularly as it relates to the human environment. Below is a discussion of the relationship between impact assessment and evaluation from the perspective of a traditional evaluator. Impact assessment, as defined by IAIA, is pertinent to evaluation; yet differences between the definition of impact by evaluation experts and impact assessment experts cause confusion. IAIA defines impact assessment as, the process of identifying the future consequences of current or proposed action. Conversely, evaluation most frequently refers to retrospective studies. In Evaluation Methodology Basics: The Nuts and Bolts of Sound Evaluation (2005), E. Jane Davidson defines impact as, change or (sometimes) lack of change caused by the evaluand. This term is similar in meaning to the terms outcome and effect. This

Journal of MultiDisciplinary Evaluation: JMDE(2)

125

Global Review: Regions

term impact is often used to refer to long-term outcomes, (Davidson, 2005, p. 241). The IAIA definition of impact assessment and the Davidson definition of impact at least seem to agree on long-term outcomes as a focus of assessing impact. However, the IAIAs definition is inconsistent with that of the Evaluation Thesaurus (1991), which defines an impact evaluation as an evaluation focused on outcomes or payoff rather than process, delivery or implementation evaluation, (Scriven, 1991, p. 190). Scrivens definition mentions nothing of long-term or future outcomes; rather, an impact evaluation is focused on actual outcomes and does not examine other program or policy components. It is clear that impact assessments would surely investigate beyond solely outcomes and also study the process and implementation of the planned intervention. Despite the confusion over the multiple meanings and uses for the term impact, several aspects of impact assessments are evaluative in nature and include evaluation-type tasks. An impact assessment is a type of evaluation and may have utility for a certain evaluators. As the author previously reported, IAIA uses impact assessment to be the study, prediction, and evaluation of long-term outcomes. Therefore, an impact assessment is a process that determines the value of policies and programs in relation to future consequences. To elaborate further, impact assessments select relevant values and determine merit criteria in evaluating both the planned intervention and several alternative interventions to find the best (greatest benefit with least cost) potential intervention. IAIAs description and definition of impact assessment leads the author to conclude that an impact assessment is in fact an evaluation. It is an evaluation of interventions (program/policy) and alternatives based on long-term or future outcomes. Furthermore, impact assessments frequently incorporate evaluation methodology, use evaluation reports, and/or conduct (the typically defined) program and policy
Journal of MultiDisciplinary Evaluation: JMDE(2) 126

Global Review: Regions

evaluations. Evaluations may be particularly germane in developing monitoring and management systems for these outcomes. Evaluation experts may find IAIA and impact assessments especially relevant if they have interest in studying the future outcomes of social; bio-physical; health; or policy, as it pertains to specific human activities. Methodology of impact assessments is beyond the scope of this paper, for additional information on impact assessment methodology, see the IAIA Website. II. IAIA Members and activities Introduction IAIA was founded in 1980 aiming to provide an international forum for researchers, practitioners and others who utilize impact assessments. Its more than 2,500 members from 100 plus-countries include corporate planners and managers, public interest advocates, government planners and officials, private consultants and policy analysts, and university professors and students. IAIA Partnerships and Interested US Agencies IAIA has strategic partnerships formed with the Canadian International Development Agency, the Netherlands Association for Environmental Professionals, the World Bank, US Council on Environmental Quality, the Canadian Environmental Assessment Agency, and various UN agencies. Examples of other US federal agencies that may conduct social impact assessments or utilize its principles: US Bureau of Reclamation, US Forest Service, US Department of Transportation, US Environmental Protection Agency, and US Council on Environmental Quality.

Journal of MultiDisciplinary Evaluation: JMDE(2)

127

Global Review: Regions

IAIA Topical Sections There are 11 sections of IAIA that provide more in depth coverage of a topical debate. The Sections of IAIA are biodiversity and ecology; environmental management systems; health impact assessment; integrated assessment of traderelated policies; indigenous peoples; strategic environmental assessment; local and regional government policy and impact assessment; disasters and conflicts; environmental legislation, institutions and policies; public participation; and social impact assessment. Additional IAIA Activities o Several topical listservs provide networking opportunities and dialogue regarding impact assessments for IAIA members. o IAIA presents an annual conference and each year IAIA chooses a different global location. Additionally, it offers regional conferences, trainings, and professional exchange opportunities. o An IAIA newsletter is published quarterly. It provides information regarding association activities and events, as well as professional news related to impact assessments. o Quarterly, IAIA produces Impact Assessment and Project Appraisal, a journal containing peer-reviewed research articles, professional practice ideas, article and book reviews, editorials, and a professional practice section. The journal focuses on the environment, social, health, technology assessment, sustainability, project appraisal, case studies, cost-benefit analysis, and other impact assessment-related material. The journal is only available to members of IAIA or vis--vis purchase.

Journal of MultiDisciplinary Evaluation: JMDE(2)

128

Global Review: Regions

o The IAIA Website offers Key Citations of background reference material related to the various areas of impact assessments. Environmental Impact Assessments index of Websites is a preliminary index of useful Internet sites used as a preliminary guide for environmental impact assessments. References Davidson, E.J. (2005). Evaluation methodology basics: The nuts and bolts of sound evaluation. Thousand Oaks, CA: Sage Publications, Inc. IAIA (2005). International Association for Impact Assessment (IAIA) Website. Available at http://www.iaia.org/ Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, CA: SAGE Publications, Inc.

Journal of MultiDisciplinary Evaluation: JMDE(2)

129

Global Review: Regions

An Update on Evaluation in Canada Chris L. S. Coryn

News and Events The most pertinent news on the Canadian front is of course the 2005 joint Canadian Evaluation Society (CES)/American Evaluation Association (AEA) conference: Crossing Borders, Crossing Boundaries. The conference will be held October 24-30, 2005 in Toronto, Ontario, Canada. Proposals for the conference are due no later than March 11, 2005. For complete information on the conference including strands, topical interest groups (TIGS), submission procedures, and general information please visit the CES/AEA 2005 conference Website. The most recent CES Newsletter (Vol. 4, September 2004) focuses on unpublished evaluation-related articles, conference presentations, book reviews, interviews, and dissertations. This newsletter concentrates on contributions from the European and Australasian Evaluation Societies (EES and AES, respectively) and is divided into five sections; (1) Benefits, Consequences, and Effects of Evaluation, (2) Evaluation Outputs, (3) Evaluator Knowledge, Skills, and Competencies, (4) Evaluator Professionalism, and (5) Evaluation Process Issues. Over 90 complete papers and presentations (primarily from the conferences of the ESS and the AES) are available through links within the newsletter. The Grey Literature Database and Evaluation Report Bank The CES Website offers a wide array of useful resources for evaluators. Of notable interest is the Grey Literature database; a fully searchable bank of almost 500
Journal of MultiDisciplinary Evaluation: JMDE(2) 130

Global Review: Regions

evaluation-related documents encompassing diverse areas of interest to evaluators; for example, theory, policy analysis, ethics, and communication and reporting. The Grey Literature database is searchable by recent conferences (from 2000-2003; including those of the European Evaluation Society, the Australasian Evaluation Society, and the Canadian Evaluation Society), topic, author, date, or language (English, French, and Spanish). These documents include, for example, full-length evaluation reports, conference presentations, and unpublished papers. Also available on the CES Website is the Evaluation Report Bank, a resource for evaluation reports related to Canadian academia, government, and the private sector. Both the Grey Literature database and Evaluation Report Bank are well organized, easy to navigate, and accessible free of charge to non-CES members.

Journal of MultiDisciplinary Evaluation: JMDE(2)

131

Global Review: Regions

An Update on Evaluation in Europe Daniela C. Schroeter

I apologize for exclusion of those national societies whose language I have not mastered and would like to encourage evaluators from the European countries to contribute a report about the state of evaluation in their country to JMDE. Details of events and news items listed below can be found on the corresponding Web sites. Evaluation and the European Union Within the European Union (EU), evaluation is promoted by the Directorate General for Budget, which commissions and conducts ex ante (prospective), interim, and ex post (retrospective) evaluations of programs, projects, and policies of the General Directorates within the EU. The Web site of the evaluation unit provides information about evaluation activities and ongoing studies, policies and procedures, guides on evaluation in different contexts (General Directorates) and on evaluation specific methodology, procurement of evaluation via calls for expressions of interest, information on evaluation networks, including links to specific commissions within the EU, the expert network of member states, evaluation societies within Europe, and links to other organizations conducting evaluation (e.g., OECD, UNDP, UNICEF, the World Bank, the GAO, and the International Monetary Fund).

Journal of MultiDisciplinary Evaluation: JMDE(2)

132

Global Review: Regions

The European Evaluation Society In October, 2004 the EES held its 6th Conference in Berlin, Germany where over 420 participants from all over the world attended. Attendees feedback and the conference program can be viewed on the EES Web site. Papers and presentations from the conference are available through links within the Canadian Evaluation Societys Newsletter (Vol. 4, September 2004). The most recent Newsletter of the EES (December, 2004) reports on the 6th conference of the EES and introduces the new board for the year 2005. Additionally, the EES Residential Summer SchoolEvaluating innovative policy instruments: change, complexity, and policy learning, which takes place September 25-30, 2005 in Seville, Spain, is announced. Between 24-30 experienced evaluators, policy designers, and change managers are invited to participate and learn from international evaluation experts. More detailed information on how to register will be posted on the EES Web site. Moreover, the EES published a summary of a proposed strategy for evaluation, training, and education in Europe (see http://www.european evaluation.org/docs/Strategy_for_Education_and_Training.pdf). The EES board promotes (i) the provision of evaluation training through the society, (ii) support and consulting services for developing evaluation training by national evaluation societies in Europe, (iii) collaboration with institutions of higher education to develop degrees in evaluation, and (iv) functioning as a connector between potential collaborators to develop evaluation training for those in need.

Journal of MultiDisciplinary Evaluation: JMDE(2)

133

Global Review: Regions

The German Evaluation Society The German Evaluation Society (DeGEval) announced various training opportunities. Of specific interest are e-learning opportunities provided by the Center for European Evaluation Expertise (Eureval-C3E). Moreover, the DeGEval announced an invitation for a conference on Multifunctionality of LandscapeAnalysis, Evaluation, and Decision Support, on May 18-19, 2005 in Giessen, Germany. Most recently, the following event were announced on the listserv of the DeGEval: EU-sponsored series of three conferences and four training opportunities on the "Evaluation of Sustainability," EASY-ECO, 2005-2007. For details and further information, please visit www.sustainability.at/easy/. The first EASY-ECO conference Impact Assessment for a New Europe and Beyond takes place June 15-17, 2005 in Manchester. The call for papers is published at: http://www.sed.manchester.ac.uk/idpm/news/#easyeco. The Spanish Evaluation Society The Spanish Evaluation Society announced: Fifth Annual Campbell Collaboration Colloquium on Supply and Demand for Evidence in Lisbon, Portugal, February 23-25, 2005 Third Argentine Congress of Public Administration in San Miguel de Tucumn, Argentina with the topic Society, State, and Administration The Swiss Evaluation Society The Swiss Evaluation Society (SEVAL) announced:
Journal of MultiDisciplinary Evaluation: JMDE(2) 134

Global Review: Regions

SEVAL congressEvaluation in Education, June 3, 2005 in Berne, Switzerland International Conference on "Visualising and Presenting Indicator Systems," March 14 16, 2005, Swiss Federal Statistical Office, Neuchtel, Switzerland Outcome Mapping: Practical, Flexible, Participatory Approach to

Monitoring and Evaluation, a 4-day workshop, Febuary 21-25, 2005 Lebensgarten, Germany Others events listed on SEVALs Web site include training opportunities provided in Switzerland. The UK Evaluation Society The UK Evaluation Society (UKES) announced various events sponsored by regional networks: London Evaluation Network: Participatory Approaches to Evaluation, February 3 or 10, 2005 in the Institute of Commonwealth Studies Scottish Evaluation Network: Whats working? Improving the contribution of research and evaluation to organizational learning, February 25, 2005, in Our Dynamic Earth, Holyrood Road, Edinburgh North West Evaluation Network: Evaluation today: emerging themes and improving practice, July 1, Manchester, NWEN annual conference. Additionally and as announced on the EvalTalk listserv, The 5th International Conference on Evaluation for Practice takes place July 13-15, 2005 at the
135

Journal of MultiDisciplinary Evaluation: JMDE(2)

Global Review: Regions

University of Huddersfield, England. The conference will focus on evaluation practice and its implications in areas such as social services, social work, education, and health services among other. The keynote speaker is Professor Michael Scriven.

Journal of MultiDisciplinary Evaluation: JMDE(2)

136

Global Review: Regions

An Update on Evaluation in the Latin American and Caribbean Region Thomaz Chianca

This short paper updates the evaluation scene in the Latin American and Caribbean (LAC) region. Specifically, it covers four aspects of that scene: (1) evidence on the growing number of job opportunities for evaluators in the region, (2) news about national-level evaluation organizations, (3) professional development courses available in 2005, and (4) the internal evaluation system supported by the Brazilian federal government. 1. Work opportunities for evaluators in LAC There is an increasing number of job postings being advertised in three major LAC evaluation discussions listsRELAC (Latin American and the Caribbean Evaluation Network), PREVAL (Program for Strengthening the Regional Capacity for Evaluation of Rural Poverty Alleviation Projects in Latin America and the Caribbean), and the Brazilian Evaluation Network. Just in the past three months, at least eleven job postings have been advertised. Ten of them were procurements for external program evaluators and one was a search for a full-time coordinator of monitoring and evaluation. Government agencies and nonprofit organizations (NGOs, foundations, and institutes) were the two main clients offering such opportunities. It is interesting to notice the broad range of areas covered by such announcements, including education, environment, child labor, agriculture, and socioeconomic development. Such augmentation in the number of job postings announced in these evaluation lists can be considered in two ways: it is not only an increase in the existing market
Journal of MultiDisciplinary Evaluation: JMDE(2) 137

Global Review: Regions

for professional evaluators in the region, but also a general indicator that local evaluators are being sought to fill the available positions. Table 1 presents some additional details on the job positions advertised. Table 1. Employment Opportunities in LAC
Sector Government Client Ministry of Education Brazil Ministry of Education Brazil Ministry of Education Brazil Brazilian Environmental and natural Resources InstituteBrazil Fideicomisos Instituidos en Relacin con la Agricultura (FIRA) Mexican Government World VisionBrazil Accin Sin Fronteras (ASF)Peru Instituto Avaliao Brazil Killefit Consult Colombia Jurez and Associates USA Position Evaluation consultant Description Assess intermediary results of methodology of Active School Programimproving the quality of public schools Develop evaluation methodology to assess strategic planning of education secretaries within the FUNDESCOLA Project Independent evaluation of GESTAR I Programimproving quality of Math and Language education Evaluation of projects related to the Promising Initiatives Component of the Pro-Vrzea Program in the Amazon region Evaluation of Mexican government programs in livelihood/agriculture. Develop monitoring system based on Logic Model for a social-economic development program Provide leadership and manage all activities related to monitoring and evaluation within ASF Evaluation of projects in the area of race-ethnicity and education Several evaluations of projects in the area of environment and rural development in LAC countries Technical assistance in monitoring and evaluation of international education initiative to combat child labor in Latin American and other countries External final evaluation of program for eradicating child labor in Guatemala

Evaluation consultant

Evaluation consultant Evaluation consultant

Evaluation consultant

Nonprofits

Evaluation consultant Coordinator for Monitoring and Evaluation Evaluation consultant Evaluation consultants Evaluation consultants

Private

Development Agency

OITInternational Labor Organization

Evaluation consultant

Journal of MultiDisciplinary Evaluation: JMDE(2)

138

Global Review: Regions

2. LAC Evaluation Organizations

Since November 2004, one new national evaluation organization has been formed and three others are taking the initial steps towards their official creation. The Nicaraguan Monitoring and Evaluation Network (RENICSE) became the fifth organization officially created in the region, joining Brazil, Colombia, Costa Rica, and Peru. Bolivia, Chile, and Venezuela have already started consistent efforts to create their own evaluation networks. RELAC, PREVAL, and UNICEF continue to play important roles in the establishment of national organizations.

While searching for information to write this update, I came across some references to an evaluation association in Argentina, the Asociación Argentina de Evaluación, including a training course it is providing (see Evaluation Training below for details). At the closing of this issue of the journal I had not received any additional detail regarding the organization from the listed contact person, María Isabel Andrés (mandal@mecon.gov.ar). We hope to have more to report on the Argentinean Evaluation Association in the next issue of JMDE.

The following are some short news items about the existing national evaluation organizations in the region.

Brazil: The Brazilian Evaluation Network now has about 300 members divided among its five state chapters in Bahia, Minas Gerais, Pernambuco, Rio de Janeiro, and São Paulo. The Network maintains a website (www.avaliabrasil.org.br) and a discussion list (ReBraMA-subscribe@yahoogrupos.com.br). The most recent activity promoted by the Network was an evaluation seminar involving professionals working in the social development area during the World Social
Forum in Porto Alegre (Brazil) in January 2005. For more information about this organization contact Rogério Silva (rrsilva@fonte.org.br).

Colombia: The Colombian Planning, Monitoring, Evaluation and Systematization Network (SIPSE) has its headquarters in Cali at the San Buenaventura University and consists of professionals working in academia, government agencies, nonprofit organizations, international cooperation agencies, and grassroots organizations. SIPSE's main goal is to apply planning, monitoring, evaluation, and systematization approaches as a way to foster participative democracy. The programs of the two national events organized by SIPSE can be requested at consorcio@consorcio.org.co. For more information about SIPSE contact Gloria Vela (gvela@cable.net.co).

Costa Rica: Costa Rica hosts the Central American Evaluation Association (ACE), the first formal evaluation organization created in LAC. ACE has recently launched its first newsletter (http://www.geolatina.org/ace) and is regaining space as a major reference for the region. More information about ACE can be obtained by contacting Welmer Ramos (ramosacu@racsa.co.cr) or Ana Laura Ibaja Jiménez (sisube@racsa.co.cr).

Nicaragua: The Nicaraguan Monitoring and Evaluation Network (RENICSE) has hosted three meetings and is working towards defining its vision, mission, and strategic objectives. Eduardo Centeno Cruz (ecenteno@ibw.com.ni) can provide further details about RENICSE.

Peru: The Peruvian Monitoring and Evaluation Network has developed its annual plan for 2005. It works via a coordinating committee, rotating every six months, that is charged with fostering participation and articulation to accomplish the annual plan. The Network has an active electronic discussion list. Additional information
on the Network's activities and accomplishments can be provided by Emma Rotondo (rotondoemma@yahoo.com.ar).

RELAC: The Latin American and Caribbean Monitoring, Evaluation and Systematization Network (RELAC) held its first international conference in Lima, Peru, in October 2004. The conference had 142 participants from 22 countries. A Coordination Committee for RELAC was formed with the support of representatives from 16 countries that already have, or are on their way to creating, national evaluation organizations. A preliminary plan of action for 2005 was approved that includes: (i) defining operational norms, (ii) defining strategies for establishing alliances, (iii) designing communication and support strategies for new national organizations, and (iv) establishing working groups (e.g., evaluation standards, systematization, public policy, and capacity building). A CD-ROM with all the material presented at the conference, including videos, as well as the strategic plan for RELAC, will soon be available upon request at preval3@desco.org.pe. For additional information on RELAC contact one of the members of the Coordination Committee: Consuelo Ballesteros, South Cone (consueloballesteros@vtr.net); Gloria Vela, Colombia (gvela@cable.net.co); Welmer Ramos, Central America (ramosacu@racsa.co.cr); Luis Soberón, Peru (lsober@terra.com.pe); Rogério Silva, Brazil (rrsilva@fonte.org.br); Marco Segone (msegone@unicef.org); Ada Ocampo (ada.ocampo@undp.org); Emma Rotondo (rotondoemma@yahoo.com.ar); and Oscar Jara (oscar.jara@alforja.or.cr).

3. Evaluation training

The availability of a number of on-line evaluation courses in Spanish is probably the most interesting development regarding evaluation training in LAC not covered in the last issue of JMDE. At least three institutions are offering such courses in 2005:

(1) In Argentina, the Universidad Nacional del Litoral and the Center for Development and Technical Assistance in Technology for Public Organizations (TOP) are offering an on-line certification course in Outcome and Impact Evaluation of Public Organizations and Programs. More information at www.top.org.ar/curso_virt6.htm

(2) PROGRESO (Social Projects, Management, and Resources), a Peruvian nonprofit organization, is offering an on-line course in Qualitative Methods for Evaluation. Access its website for additional information: www.progresoperu.org

(3) The Inter-American Development Bank (BID) is offering a series of four on-line courses covering: (i) Logic Models for Project Design, (ii) Project Monitoring and Evaluation, (iii) Evaluation of Environmental Impact, and (iv) Institutional Analysis. The courses have two quite attractive features: they are free of charge and available year-round. For details access www.iadb.org/int/rtc/ecourses/esp/index.htm

Some courses not covered in the October 2004 JMDE issue include:

The Brazilian Ministry of Social Development, the National School of Public Administration, and the National School of Public Health, in association with the Institute of Social Studies (The Netherlands), are offering a certification course on Evaluation of Social Programs targeting primarily public administrators, to run from March to June 2005 in Brasília, DF, Brazil.

The Center for Studies in Economic Development (CEDE) at the University of Los Andes in Bogotá, Colombia, is offering a specialization course in Social Evaluation of Projects.

The Latin American Institute for Social and Economic Planning (ILPES) will be offering at least three short-term evaluation courses in 2005: (i) Use of Socioeconomic Indicators to Evaluate the Impact of Poverty Reduction Programs (July 2005, Cartagena de Indias, Colombia); (ii) Logic Model, Monitoring and Evaluation (August/September 2005, Antigua, Guatemala); and (iii) Planning and Evaluation of Public Investment Projects (October 2005, Santiago, Chile).

The Argentinean Evaluation Association (AAE) will be offering a year-long (March-December 2005) specialization course on project identification, elaboration, and evaluation as part of the Capacity Building Program of the Secretary of Economic Policy of the Argentinean Government. For details contact María Isabel Andrés (mandal@mecon.gov.ar).

4. Evaluation within the Brazilian federal government

The Brazilian government, under the leadership of the Ministry of Planning, Budget and Management, has taken important steps towards the development of an evaluation culture within the public administration system. Important efforts have been made towards establishing internal evaluation strategies for all federal programs under the umbrella of the federal Pluri-Annual Plan (PPA) for 2004-2007. An evaluation manual has been produced offering a framework to orient all program managers in assessing their efforts as a way to improve their practices. The idea is to establish a flexible monitoring system with input from the different ministries. External evaluations of four major federal programs are expected to be implemented in 2005. Studies to develop a more participative evaluation system within the federal sphere are underway. The government is also trying to learn from other countries' experiences: a group of staff members recently visited the
US, Canada, and the United Kingdom. An evaluation seminar, open to the public and including the participation of evaluation specialists from Canada, was organized to share lessons learned from such visits. The advances produced in this area in Brazil are unquestionable and show a clear interest in promoting long-lasting changes. Hopefully such efforts will culminate in the establishment of an independent evaluation agency, on the model of the American General Accounting Office (GAO), that will be able to provide candid accounts of the merit, relevance, and significance of federally funded programs. Such evaluations will help strengthen government accountability as well as provide quality information to help decision makers make better use of the scarce resources available. Additional information regarding the work done by the Brazilian Ministry of Planning can be obtained from Andréia Rodrigues dos Santos (andreia.santos@planejamento.gov.br).

Final Note

I would like to thank the many people who provided me with key information to develop this update: Ana Laura Ibaja Jiménez (Costa Rica), Andréia Rodrigues dos Santos (Brazil), Eduardo Centeno Cruz (Nicaragua), Emma Rotondo (Peru), Gloria Vela (Colombia), Rogério Renato Silva (Brazil), and Welmer Ramos (Costa Rica). If you have additional information or corrections on any of the topics covered by this article or by the previous one, or if you want to send additional contributions regarding evaluation in Latin America and the Caribbean, please do not hesitate to contact me at Thomaz.Chianca@wmich.edu.


Global Review: Publications

Summary of American Journal of Evaluation, Volume 25(4), 2004
Melvin M. Mark

The following is excerpted from the introduction to Volume 25, Issue 4 of the American Journal of Evaluation, by former AJE editor Dr. Melvin M. Mark. It is reprinted here with permission from Dr. Mark; AJE's current editor, Dr. Robin Miller; and the American Evaluation Association (AEA). The American Journal of Evaluation is the official journal of the American Evaluation Association and is distributed to AEA members as part of their membership package. To learn more about AEA and how to receive AJE, please go to www.eval.org.

In the first paper, Robert Orwin, Bernadette Campbell, Kevin Campbell, and Antoinette Krupski examine the effect of the 1997 termination of the Social Security Administration's Disability Insurance and Supplemental Security Income benefits for persons diagnosed with drug or alcohol addiction. The paper describes and illustrates innovations and recent developments in quantitative methods for evaluation, including a combination of the interrupted time series with growth curve modeling; propensity scoring analyses; the use of alternative ways to estimate the counterfactual; and sensitivity analyses to empirically assess the plausibility of validity threats. Importantly, Orwin and his colleagues go further, carefully conducting and considering the implications of a set of post-hoc exploratory analyses. These analyses suggest a far more nuanced interpretation of the effects of benefit termination than did the primary, state-of-the-art tests. The paper is a valuable example of the principled examination of quantitative data that can move us beyond overall, global estimates of an intervention's average effects.
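For readers unfamiliar with interrupted time series designs, the minimal sketch below illustrates only their basic segmented-regression logic: an intervention at a known time point may shift both the level and the slope of an outcome series. It is not drawn from Orwin and colleagues' analysis, which combined the time series with growth curve modeling and propensity scoring; all data, variable names, and effect sizes here are simulated and hypothetical.

# Illustrative sketch only: segmented regression for an interrupted time series.
# The simulated series has a baseline trend, a level drop at the intervention,
# and a change in slope afterwards; ordinary least squares recovers each piece.
import numpy as np

rng = np.random.default_rng(0)

T, T0 = 48, 24                        # 48 monthly observations, intervention at month 24
t = np.arange(T)
post = (t >= T0).astype(float)        # indicator: 1 after the intervention
t_since = np.where(t >= T0, t - T0, 0.0)

# Simulated outcome: baseline level 10, upward trend, level drop of 3, slope change of -0.10
y = 10 + 0.20 * t - 3.0 * post - 0.10 * t_since + rng.normal(0, 0.5, T)

# Design matrix: intercept, pre-intervention trend, level change, slope change
X = np.column_stack([np.ones(T), t, post, t_since])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

labels = ["intercept", "pre-trend", "level change at T0", "slope change after T0"]
for name, b in zip(labels, beta):
    print(f"{name:>22s}: {b:6.2f}")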


In the second paper, Katherine Ryan describes, illustrates, and critiques three approaches that fall under the broader umbrella of "democratic evaluation approaches". The three are the seminal democratic evaluation approach of MacDonald, the deliberative democratic evaluation approach of House and Howe, and the emerging notion of communicative evaluation advocated by Niemi and Kemmis. Ryan compares and contrasts these three approaches, in part by presenting for each a vignette describing a case in which the approach was implemented. Ryan goes beyond simply examining the three democratic evaluation approaches in the abstract. Instead, she considers the implications of these approaches in an environment in which educational accountability has been shaped by the No Child Left Behind legislation and related forces. In effect, Ryan asks how evaluators can contribute to more democratic forms of educational accountability.

In the third paper in this issue, Christine Leow, Sue Marcus, Elaine Zanutto, and Robert Boruch address the question of whether taking advanced courses in math and science improves performance on basic achievement tests. Leow and her colleagues use propensity score methods in an attempt to control for the biases that otherwise would result because of the systematic differences between students who take advanced courses and those who do not. The paper thus will be of interest to readers who would like to learn more about propensity score analyses. Perhaps of more interest, Leow and her colleagues illustrate a kind of sensitivity analysis, which allows them to examine how susceptible their findings are to what is called hidden bias, that is, the bias that might arise from background factors that are not controlled for in the analyses. Sensitivity analyses should be an important technique in the tool kit of quantitative evaluators, as a way of helping to assess how much uncertainty one should ascribe to evaluation findings.
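To make the propensity score logic concrete for readers new to the method, the following minimal sketch estimates propensity scores from observed covariates and uses them as inverse-probability weights to compare groups that select into a "treatment" (here, taking advanced courses). It is a generic, simulated illustration, not the authors' analysis, and it does not include the hidden-bias sensitivity analysis they describe; every variable name and number is hypothetical.

# Illustrative sketch only: propensity scores via logistic regression, used as
# inverse-probability weights to reduce bias from selection on observed covariates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

# Hypothetical covariates: prior achievement and a family SES index
prior = rng.normal(0, 1, n)
ses = rng.normal(0, 1, n)

# Selection into advanced courses depends on the covariates (the source of bias)
p_take = 1 / (1 + np.exp(-(0.8 * prior + 0.5 * ses)))
treated = rng.binomial(1, p_take)

# Outcome: basic achievement score with a true treatment effect of 2 points
score = 50 + 5 * prior + 2 * ses + 2.0 * treated + rng.normal(0, 3, n)

X = np.column_stack([prior, ses])
pscore = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Inverse-probability weights (one common use of the estimated propensity score)
w = np.where(treated == 1, 1 / pscore, 1 / (1 - pscore))

naive = score[treated == 1].mean() - score[treated == 0].mean()
weighted = (np.average(score[treated == 1], weights=w[treated == 1])
            - np.average(score[treated == 0], weights=w[treated == 0]))
print(f"naive difference:    {naive:5.2f}")    # inflated by selection
print(f"weighted difference: {weighted:5.2f}")  # should be much closer to the true 2.0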


As one outcome of the evidence-based practice movement, there seems to be a growing trend whereby mandates, recommendations, or incentives are put into place in an effort to lead practitioners to use programs that have passed some evaluative threshold. But this trend raises several questions, among them: How do practitioners learn about so-called evidence-based programs? What are the processes by which they adopt such programs and eliminate their current programs? Are the evidence-based programs likely to be implemented with sufficient fidelity that one would expect good outcomes? Tena St. Pierre and D. Lynne Kaltreider address these and related questions in a replicated case study investigating school adoption and implementation processes of an evidence-based substance abuse prevention program. The findings should be noteworthy to those interested in program implementation, in the way schools choose to adopt and adapt programs, and more generally in how mandates for evidence-based practice play out in real life.

Huilan Yang, Jianping Shen, Honggao Cao, and Charles Warfield address "multilevel evaluation," which arises, for example, when there are multiple site-level projects within a broader program or, as in the example Yang and colleagues discuss, three levels: project, cluster, and initiative. The authors of this paper lay out a process to facilitate multilevel evaluation alignment, that is, to facilitate congruence, compatibility, and efficiency across the evaluations at the different levels. In one sense, the process can be seen as the application of sound evaluation planning in the multilevel program context. However, Yang and her colleagues argue that the literature on multisite evaluation demonstrates the need for an alignment model specifically focused on multilevel evaluations.

In the final paper in the Articles section, Tricia Leakey, Kevin Lund, Karin Koga, and Karen Glanz address an issue of considerable importance to those who
evaluate programs based in schools or, more generally, who work with participants who are minors: obtaining parental consent. Leakey and her colleagues describe a case from their own evaluation experience examining a smoking prevention program. They employed different consent procedures at different times, and describe their experiences in this article. In the Method Notes section, Henry May addresses a classic and continuing concern for evaluators: How can we best communicate our results, especially statistical findings, to those who need to make sense of and use evaluation findings? May discusses and illustrates the use of three guidelines for formulating and presenting more meaningful statistics. These are understandability, interpretability, and comparability. May also offers several interesting and valuable examples for reporting a variety of statistics, both simple and complex, in more meaningful ways. This issue includes an atypical contribution in the Exemplars section. In the past, this section has presented a series of interviews with evaluators who discuss a specific evaluation they had conducted. In those interviews the section editor, Jody Fitzpatrick, questioned the evaluator to understand more about the various choices he or she made throughout the evaluation, from the initial steps in planning, to the involvement of stakeholders, to the data collection methods and evaluation approaches employed, and to the steps taken to disseminate findings and facilitate use. With the naming of a new editor for the Exemplars section, I invited Jody Fitzpatrick to reflect on the numerous interviews she had conducted. Such an effort to "sum up" previous work in a section of AJE is not completely new. Two years ago, Michael Morris (2002), section editor of Ethical Challenges, invited Lois-ellin Datta (2002) and Nick Smith (2002) to examine previous commentators' responses to 10 ethical challenges Morris had previously posed in the section. As
was the case with the Datta and Smith reflections, Jody Fitzpatrick has provided a fascinating piece. In essence, she treats the interviews from Exemplars as a set of case studies, allowing her to examine similarities and differences across a set of evaluators in terms of such important characteristics as preferred evaluation role, the purpose of evaluation, the factors the evaluator used to organize and frame their work, the nature of stakeholder involvement, and method choices.

Finally, after too long a delay, the Book Review section reappears. Shirley Copeland reviews a recent book by Martha Feldman, Jeannine Bell, and Michelle Berger on the process of gaining access in qualitative research. Thanks to Shirley for an informative review.

References

Datta, L-e. (2002). The case of the uncertain bridge. American Journal of Evaluation, 23, 187-197.

Morris, M. (2002). Ethical challenges. American Journal of Evaluation, 23, 183-185.

Smith, N. L. (2002). An analysis of ethical challenges in evaluation. American Journal of Evaluation, 23, 199-206.

The Oral History Project Team (2003). The oral history of evaluation, Part I. Reflections on the chance to work with great people: An interview with William Shadish. American Journal of Evaluation, 24, 261-272.

The Oral History Project Team (2003). The oral history of evaluation, Part II. An interview with Lois-ellin Datta. American Journal of Evaluation, 25, 243-253.


New Directions for Evaluation
John S. Risley

The two most recent issues of New Directions for Evaluation each cover international perspectives in the field. The Fall 2004 issue (Rugg, Peersman, and Carael) addressed Global Advances in HIV/AIDS Monitoring and Evaluation, while the Winter 2004 issue (Russon and Russon) concerned International Perspectives on Evaluation Standards.

The Fall issue covers a wide range of topics in HIV/AIDS monitoring and evaluation, including political influences, international perspectives focusing on the roles of the United Nations and the World Bank, and specific program evaluation experiences. While this issue deals mostly with subjects specific to HIV/AIDS prevention and treatment, it does offer some insight into evaluation questions with a wider impact. These questions are identified nicely by Michael Quinn Patton in his overview chapter, "A Microcosm of the Global Challenges Facing the Field: Commentary on HIV/AIDS Monitoring and Evaluation." Patton identifies issues touched on by the various authors that are seen in many evaluation contexts, such as the denial of problems despite compelling evidence, the use of evaluation for accountability vs. program improvement, and selective use of evaluation findings.

The three main critiques Patton offers are: (1) the sense that the authors are overwhelmed by numbers and fail to include stories of real people affected by HIV/AIDS, (2) the "deeply entrenched mechanistic linearity" (p. 168) in evaluation, and (3) the acceptance of unrealistic goals. He argues for including stories of real
people along with the reporting of data so that the data doesn't take on an abstract life of their own (p. 168). He criticizes the input-activities-output-outcome-impact framework presented in one chapter as the basic organizing framework endorsed by all agencies to organize the data required to monitor program progress (p. 37). Patton cites Uganda and Brazil as two successful cases of countries greatly reducing their HIV/AIDS infection rates through complex, dynamic systems change (p. 169). In situations such as these, complex systems change mapping and networking models hold more promise than do traditional linear-logic models (p. 169). Patton also contends that evaluators should not merely accept the program goals when evaluating a program. Specifically, he says overly optimistic goals, like those set by the United Nations regarding HIV/AIDS, should be questioned by evaluators.

The Winter issue reviews the development of evaluation standards in the United States, Western Europe, Africa, Australasia, and at some large international nongovernmental organizations. Craig Russon (co-editor of the issue with Gabrielle Russon) provides an overview of the development of national- and regional-level evaluation standards in the years since the Joint Committee's Program Evaluation Standards were adopted in 1994. He notes that the Joint Committee Standards were influential on all standards that followed, acting as either a point of departure or as an example of what some national and regional groups did not want their standards to be (Russon, p. 90). One such instance is addressed by Doug Fraser in his review of the Australasian Evaluation Society's (AES) ongoing process of developing a policy on standards. Fraser recounts how the Joint Committee Standards were the starting point, but they depended on a number of fundamental preconditions or assumptions that did not necessarily hold true in the environment of Australia and
New Zealand (p. 71). The Program Evaluation Standards concentrated on risks that were internal to the evaluation itself: risks of evaluators overreaching themselves, overlooking key aspects of their task, exercising bias, behaving unethically, or failing to apply an appropriate range and quality of techniques (p. 71). AES members saw the risks and threats they wished to address as being external to the process of evaluation. These risks and threats concern how evaluation is managed, planned, supported, and used. Many of these issues are controlled by those who fund and use evaluation; therefore, any standards should address these audiences, not simply practicing evaluators.

Fraser recounts how the AES has long had a practitioner code of ethics, but the process of developing a set of standards for evaluation stalled in 2001 owing to many factors. However, an Ethics and Standards Committee did prepare a draft set of standards for the society's 2001 conference. This draft included six categories: transparency, utility, practicality, cost-effectiveness, ethics, and accuracy/quality/comprehensiveness (p. 77). Fraser notes the prominence of transparency in this draft as contrasted with the Joint Committee Standards.

References

Rugg, D., Peersman, G., & Carael, M. (Eds.). (2004). Global advances in HIV/AIDS monitoring and evaluation. New Directions for Evaluation, 103.

Russon, C., & Russon, G. (Eds.). (2004). International perspectives on evaluation standards. New Directions for Evaluation, 104.


Education Update
Nadini Persaud

Websites

The World of Education (http://www.educationworld.net/): This user-friendly website provides links to jobs in education, world facts, education forums, a library, a web directory, and a bookstore. The world facts link is particularly informative; it provides a database including every country on the globe. Once a country is selected, the visitor can access a country map and brief profiles of the geography, people, government, economy, communications, transportation, military, and transnational issues for that country. The forum link directs the viewer to the Education America Network and Education Canada Network, where various workshops on teacher-to-teacher topics and on lesson plans and curricula can be found.

The Internet Public Library website provides links to subject areas in Arts and Humanities, Business, Computers, Education, Entertainment, Health, Government, Regional, Science and Technology, and Social Sciences. It also provides links to a Ready Reference database (almanacs, calendars, dictionaries) and a Reading Room (books, magazines, and newspapers). The regional link directs the viewer to databases on history and travel and tourism by continent/region.

Journal Articles

Journal of Teacher Education (Volume 56, 2005, and Volume 55, 2004) has a number of interesting articles, including:

Integrating Technology into Teacher Education: A Critical Framework for Implementing Reform, by Valerie Otero, Dominic Peressini, Kirsten Anderson Meymaris, Pamela Ford, Tabitha Garvin, Danielle Harlow, Michelle Reidel, Bryan Waite, and Carolyn Mears [PDF]

This article discusses the challenges of integrating technology into teacher education. According to the authors, teachers must be skilled in technology applications and knowledgeable about using technology in order to enhance and extend student learning. The authors present a model for technological change and also describe a critical framework to facilitate discourse among education faculty from which understandings of why, when, and how to use technology emerge. Implicit in this model for technological change is a strategy for sustainability. The authors conclude that a shared vision about the role of technology in teacher education has not yet emerged in the field of education.

Teaching Under High-Stakes Testing: Dilemmas and Decisions of a Teacher Educator, by Rosemary E. Sutton

This article reviews how an experienced teacher educator was forced to change her teaching strategies as a result of the introduction of the PRAXIS II: Principles of Learning and Teaching (PLT) tests, which are now mandated in Ohio. According to Sutton (2004), many of her students failed the PRAXIS II test when it was first introduced because they were not good standardized test takers. She explains that, as a result of the unsatisfactory pass rate on the PRAXIS II test, the Dean's Office in the College of Education and Human Services encouraged the faculty to take the tests. Sutton (2004) notes that it was only after taking the test that she realized that
she would have to alter her methods of assessment, content, and teaching strategies in her educational psychology courses to prepare students for the PRAXIS II tests. She concludes by noting that the implementation of PRAXIS II increased collaboration among faculty teaching educational psychology at her university: instructors now meet regularly to choose a common textbook, share resources, and discuss topics that should be incorporated into the educational psychology courses offered by the university.

Other interesting articles include:

- Taking Stock in 2005: Getting Beyond the Horse Race, by Marilyn Cochran-Smith [PDF]
- The Effect of Perceived Learner Advantages on Teachers' Beliefs About Critical-Thinking Activities, by Edward Warburton and Bruce Torff [PDF]
- Shifting from Developmental to Postmodern Practices in Early Childhood Teacher Education, by Sharon Ryan and Susan Grieshaber [PDF]
- Preservice Teachers Becoming Agents of Change: Pedagogical Implications for Action Research, by Jeremy N. Price and Linda Valli [PDF]
- "Nadie Me Dijó [Nobody Told Me]": Language Policy Negotiation and Implications for Teacher Education, by Manka M. Varghese and Tom Stritikus [PDF]
- Comparing PDS and Campus-Based Preservice Teacher Preparation: Is PDS-Based Preparation Really Better?, by D. Scott Ridley, Sally Hurwitz, Mary Ruth Davis Hackett, and Kari Knutson Miller [PDF]


References

Ford, P., Garvin, T., Harlow, D., Mears, C., Meymaris, K. A., Otero, V., Peressini, D., Reidel, M., & Waite, B. (2005). Integrating technology into teacher education: A critical framework for implementing reform. Journal of Teacher Education, 56(1), 8-23.

Sutton, R. E. (2004). Teaching under high-stakes testing: Dilemmas and decisions of a teacher educator. Journal of Teacher Education, 55(5), 463-475.


The Evaluation Exchange: Harvard Family Research Project
Brandon W. Youker

Harvard Family Research Project (HFRP) was founded by the Harvard Graduate School of Education in 1983. HFRP aims to help strengthen family, school, and community partnerships in early childhood care and education; promote evaluation and accountability; and offer professional development to those who work with children and/or their families. The project has aided philanthropies, policymakers, and practitioners by collecting, analyzing, and synthesizing research and information.

HFRP's goals are to:

- Develop, test, and communicate methods that promote continuous improvement and accountability
- Promote diversity, program and system complexity, and outcomes measurement and attainment through evaluation practices
- Expand and strengthen the professional development base of those who work directly with children and families
- Provide policymakers, practitioners, and foundations with research and information to guide them as they fund new strategies and strengthen existing initiatives

HFRP strives to reach its goals through providing:

- Knowledge Development
- Training and Professional Development
- Technical Assistance
- Continuous Learning and Dialogue

HFRP has two categories of research: (a) family-school-community partnerships and (b) strategy consulting and evaluation.

A partial list of HFRP funders includes:

- Carnegie Corporation of New York
- The Annie E. Casey Foundation
- The Ford Foundation
- The Heinz Endowments
- The W.K. Kellogg Foundation
- John D. & Catherine T. MacArthur Foundation
- The Charles Stewart Mott Foundation
- The Pew Charitable Trusts
- The Rockefeller Foundation


HFRP publishes an evaluation periodical, The Evaluation Exchange. The journal, published three or four times a year, addresses issues frequently encountered in program evaluation. The Evaluation Exchange emphasizes innovative methods and approaches to evaluation, emerging trends in practice, and practical applications of evaluation theory. It is designed as an ongoing discussion medium among evaluators, program practitioners, funders, and policymakers. The journal is divided into five sections: (1) Theory & Practice, (2) Promising Practices, (3) Spotlight, (4) Evaluations to Watch, and (5) Beyond Basic Training. Journal subscriptions are free and contributions are encouraged.

Examples of evaluation-related articles in the most recent issue (Volume X, No. 4, Winter 2004/2005):

- Improving Parental Involvement: Evaluating Treatment Effects in the Fast Track Program
- Parental Involvement and Secondary School Student Educational Outcomes: A Meta-Analysis
- Blending Evaluation Traditions: The Talent Development Model
- What Matters in Family Support Evaluation?
- Learning from Parents Through Reflective Evaluation Practice
- Ongoing Evaluations of Programs in Parent Leadership and Family Involvement
- Promoting Quality Outcome Measurement: A Home-Visitation Case

Past journal issues of particular relevance to evaluators:
- Vol. X, No. 3, Fall 2004: Harnessing Technology for Evaluation
- Vol. X, No. 2, Summer 2004: Early Childhood Programs and Evaluation
- Vol. X, No. 1, Spring 2004: Evaluating Out-of-School Time Program Quality
- Vol. IX, No. 4, Winter 2003/04: Reflecting on the Past and Future of Evaluation
- Vol. IV, No. 2, 1998: Evaluation in the 21st Century
- Vol. 1, No. 2, 1995: Participatory Evaluations


Canadian Journal of Program Evaluation, Volume 19(2), Fall 2004
Chris L. S. Coryn

The Canadian Journal of Program Evaluation's (CJPE) most recent issue contains 7 articles (5 in English and 2 in French), a research and practice note, and 3 book reviews.

The first article in this issue, "The Role of the Evaluator in a Political World," is by Ernie House and is loosely based upon his keynote speech from last year's CES conference (Evaluation 2004) in Saskatoon, Canada. House focuses his attention on the political struggles faced by evaluators under the current Bush administration in the United States, as well as the difficulties in balancing often conflicting, highly variable values and interests.

Following House's contribution is "Using Multi-Site Core Evaluation to Provide 'Scientific' Evidence" by Frances Lawrenz and Douglas Huffman. The authors describe how to incorporate other evaluative purposes into evaluations in which the client's central concern is effectiveness (usually goal-based evaluation requiring experimental designs, for example). Lawrenz and Huffman argue that the prevalent standards of "scientific" evidence (such as the U.S. Department of Education's priority on "scientifically-based evaluation methods" (p. 18), i.e., RCTs) have not been proven superior to other approaches. The authors demonstrate, through a case study, that scientific rigor can be maintained while including other evaluation approaches (in this case a participatory, collaborative approach) and purposes (to not only evaluate "what" happened, but "how").


Mary Sehl's piece, entitled "Stakeholder Involvement in a Government-Funded Outcome Evaluation: Lessons Learned from the Front Line," describes involving project stakeholders in the planning and decision-making process and the strengths and limitations associated with this approach.

"Le Benchmarking et l'amélioration continue" [Benchmarking and Continuous Improvement] by Marthe Hurteau describes the benchmarking process and its use to identify relevant performance indicators. Hurteau uses this article to explore the current preoccupation in the evaluation literature: the impact of organizational context on the evaluation process (p. 57).

"L'évaluation des technologies de la santé: Comment l'introduire dans les hôpitaux universitaires du Québec?" [Health Technology Assessment: How to Introduce It in Quebec's University Hospitals?] by Oliver Sossa and Pascale Lehoux describes the implementation of health technology assessment in Quebec university teaching health centers and the structures that facilitate its development.

J. Bradley Cousins, Swee C. Goh, Shannon Clark, and Linda E. Lee explore the conceptual interconnections and linkages among developments in the domains of evaluation utilization, evaluation capacity building, and organizational learning (p. 99) in their article entitled "Integrating Evaluative Inquiry into the Organizational Culture: A Review and Synthesis of the Knowledge Base." The implications for future research efforts and practice are also discussed.

In "The Analysis of Focus Groups in Published Research Articles," Geoffrey S. Wiggins critically assesses the analytic methods employed in published literature from several disciplines. The author found that fewer than half of these research articles utilized systematic analytic techniques (emergent or pre-ordinate) in assessing focus group transcripts. Even fewer utilized measures of reliability when analyzing transcripts.


The Research and Practice Note in this issue of CJPE is a piece by Allison Nichols entitled "Pre- and Post-Scenarios: Assessing Learning and Behavior Outcomes in Training Settings." Nichols frames her discussion of documenting training outcomes in learning and behavior using two case study examples.

The final section of this issue of CJPE is devoted to reviews of three books. The first, Organizational Assessment: A Framework for Improving Performance (Lusthaus, Adrien, Anderson, Carden, & Montalvan, 2002), is reviewed by Stephen M. Morabito. The second, Evaluating Social Programs and Problems: Visions for the New Millennium (Donaldson & Scriven, Eds., 2003), is reviewed by Jennifer Carey. The third, Institutionalizing IMPACT Orientation: Building a Performance Management Approach that Enhances the Impact Orientation of Research Organizations (Smith & Sutherland, 2002), is reviewed by Ronald Mackay.


Evaluation: The International Journal of Theory, Research and Practice, Volume 10(4), October 2004
Daniela C. Schröter

The most recent issue of Evaluation contains six articles, one contribution to A Visit to the World of Practice, and one to News from the Community.

In their article "Embedding Evaluation in the Swiss Federal Administration: Purpose, Institutional Design, and Utilization," Widmer and Neuenschwander discuss how evaluation is embedded within the Swiss political system and conclude that the evaluation measures currently used in the government can be improved through purposeful differentiation of evaluation types. They first summarize four purposes of evaluation (accountability, improvement, basic knowledge, and strategy); second, five uses of evaluation (instrumental, conceptual, interactive, legitimating, and tactical); and third, two predominant institutional designs in which evaluation is implemented (centralized and decentralized). Thereafter, Widmer and Neuenschwander demonstrate how evaluation is embedded within the different federal agencies of the Swiss government. Key findings of their study include that (i) accountability and improvement were the most relevant purposes in these organizational contexts; (ii) evaluation findings were most commonly utilized instrumentally, followed by legitimating and interactive uses; (iii) the institutional design was of little or no relevance; and (iv) unanticipated blends of purpose and utilization existed.

In the second article, "Utilizing Evaluation Evidence to Enhance Professional Practice," Helen Simons criticizes the current politically favored approach to
evaluation, namely evidence-based evaluation. She states that this approach to elucidating evidence fails to recognize the holistic nature of professional practice and disregards the complexity of professional decision making and action (p. 410). Qualitative forms of knowledge generation for evaluative purposes would enhance the quality of evaluation and increase the utilization of evaluation findings.

In "The Meaning Assigned to Evaluation by Project Staff: Analysis from the Project-Management Perspective in the Field of Social Welfare and Healthcare in Finland," Seppänen-Järvelä examines how evaluation is understood by project staff and management and how it influences the work environment. Seppänen-Järvelä concludes that there is a need to update the current knowledge of project staff and management about evaluation and to promote and enforce evaluation culture and capacity building within organizations in the social welfare and healthcare sector in Finland.

Oakley, Strange, Stephenson, Forrest, and Monteiro's article, "Evaluating Processes: A Case Study of a Randomized Controlled Trial of Sex Education," exemplifies the application of RCTs to evaluate how processes and outcomes are interrelated. The authors conclude that ultimately the choice of design and of quantitative or qualitative approaches is context-dependent and related to the questions asked.

Following is McNamara and O'Hara's article, "Trusting the Teacher: Evaluating Educational Innovation," in which the authors claim that the role of the external evaluator should be switched to that of an educating consultant for the teacher. The case for practitioner self-evaluation is supported by the argument that external evaluation often fails to support improvement of the evaluand. To support "sound educational values" (p. 472), evaluators should function as
facilitators and consultants in conducting the research and should enhance the credibility of self-evaluation by meta-evaluating the internal self-evaluation processes.

The last article, authored by Hanberger and Schild, discusses "Strategies to Evaluate a University-Industry Knowledge-Exchange Programme." The authors consider two management-oriented approaches to program evaluation (program theory evaluation and outcome analysis) and two non-management-oriented approaches (policy discourse analysis and qualitative network analysis). They conclude that different evaluation methods stress the values of different stakeholder groups. An integration of various methods is therefore necessary: it reduces the bias toward any one stakeholder group and increases validity in contexts where multiple stakeholder groups are present. In situations with only one target group and few stakeholders, a combination of various evaluation approaches would not be as essential.

In A Visit to the World of Practice, Farrall and Gadd address "Evaluating Crime Fears: A Research Note on a Pilot Study to Improve the Measurement of the Fear of Crime as a Performance Indicator." Instruments intended to assess fear of crime are criticized for their poor design and neglect of crucial research concerns such as frequency and intensity. The authors suggest survey questions that could be incorporated to improve instruments measuring fear of crime.

In News from the Community, Nicoletta Stame, President of the EES, reports on the 6th EES conference, entitled "Governance, Democracy and Evaluation." Issues addressed include: (i) evaluation as a tool for democratic government, (ii) the question of a European evaluation identity, (iii) European standards for evaluation, (iv) relationships among evaluation networks and associations, and (v) training, education, and professional development of evaluation in Europe. For example, the
EES is considering the establishment of European evaluation standards but does not want to conflict with or override features unique to individual national contexts. The journal concludes with French translations of the articles' abstracts and an Annual Index of articles.


Measurement: Interdisciplinary Research and Perspectives, Volume 1(1), 2003
Chris L. S. Coryn

Measurement: Interdisciplinary Research and Perspectives is a relatively new journal devoted to the interdisciplinary study of measurement in the human sciences and is intended to represent a broad range of disciplines and perspectives, including psychometrics, ethnography, social theory, psychology, economics, education, linguistics, sociology, policy studies, history, and law. Each issue is devoted to a single, provocative focus article followed by commentaries and a rejoinder article. Further information can be found at http://bearcenter.berkeley.edu/measurement/. Presently eight issues are available, covering, for example, objectivity and trust, standards-based testing, and certification testing.

The inaugural issue, Volume 1(1), 2003, was sent to us by the editors (Mark Wilson at the University of California, Berkeley; Paul De Boeck at K. U. Leuven, Belgium; and Pamela Moss at the University of Michigan) to encourage involvement, for example, by debating a focus paper or participating in a commentary.

The focus article of the inaugural issue is "On the Structure of Educational Assessments" by Robert J. Mislevy, Linda S. Steinberg, and Russell G. Almond. This article describes a framework for assessment that makes explicit the interrelations among substantive arguments, assessment designs, and operational processes. This framework, called evidence-centered assessment design (ECD),
entails the development, construction, and arrangement of specialized information elements, or assessment design objects, into specifications that embody the substantive arguments that underlie an assessment (p. 4). The authors illustrate their ideas with examples from language testing, and the article is organized to parallel the stages of the ECD design process: (1) domain analysis, (2) domain modeling, (3) the conceptual assessment framework, and (4) operational assessment (the four-process delivery system).

In the first stage of ECD design, domain analysis, information about the domain is used to organize beliefs, theories, research, subject-matter expertise, instructional materials, and exemplars. In the second stage, domain modeling, the information gathered in stage 1 is organized into three paradigms: proficiency, evidence, and tasks. The third stage, developing a conceptual assessment framework (CAF), specifies the technical details necessary for implementing the assessment: specifications, operational requirements, statistical models, and rubrics, for example. The final stage, the four-process delivery system, consists of four principal components: (1) activity selection, (2) presentation, (3) evidence identification (task-level scoring), and (4) evidence accumulation (test-level scoring).

Mislevy, Steinberg, and Almond's intricate approach emphasizes measurement models that incorporate the relationship between assessment purposes, substantive experience and theory, statistical models and task authoring schemas, and the elements and processes of operational models (p. 56).

Eight commentaries follow the focus article, ranging from the limitations of Bayesian models in assessment (Earl Hunt) to critiques and comments from a
psychometric perspective (Cees A. W. Glas), and finally to a framework for shifting from principle to practice (Richard K. Wagner). The issue concludes with Mislevy, Steinberg, and Almond's rejoinder, in which the authors address the themes of critique presented in the commentaries: constructivist and situative learning perspectives, model generalizability, user and statistical models, and implementation.
