Don't Read This If You're a Funder: Evaluation REALNESS
The other day I was writing up a detailed report about a pilot project, channeling everything I'd ever learned about utilization-focused evaluation, and channeling it hard. As someone who has done more program development than evaluation, I've probably written more grant reports than I have evaluation reports, but I've also been a consumer of consultant-led evaluations, and no one had ever delivered a report to me that was quite as useful as I wanted it to be.
Utilization-focused evaluation means, obviously, that the evaluation is useful. It's not useful to rubber-stamp a bunch of outcomes and say that everything is going great, everything is on track, and everyone is happy. Nor is it useful to mark something a failure and show every little bit of evidence why. The latter I've thankfully never received from a consultant, but the former is all too familiar. The "Everything's great!" evaluation is the kind I've often received from consultants on program work, even when I literally begged for a more critical – more useful – result.
I don't blame the consultants, though; I blame the dysfunctional relationship between philanthropists and grantees. Thus, the title of this blog post.
It's probably not useful to chicken-or-egg how this relationship came to be, but pretty much everyone I know has played out the dysfunction on one or both sides of the table, and evaluation consultants are often caught in the middle. Whether we are explicitly told or not, we know what's at stake when doing grant-funded program evaluation, and the same passion for social justice that drove us to want to support nonprofit work probably keeps some consultants from doing just that – supporting the work by providing a critical, useful evaluation. I get it. Still, as someone who also develops programs specifically to build equity, usually racial equity, I get pretty grumpy when I see tens of thousands of dollars being spent on a non-useful evaluation when that money could be going to advancements and improvements that ultimately support systems change.
Even if the program work is going great, an evaluation should provide the program developers and managers with direction for moving the work to the next stage, and that usually includes identifying ineffective methods of implementation, gaps in service, and/or weaknesses in the theory of change and logic model.
When we tie evaluation to grants in the sense of determining whether a model works or not (period) we are doing a disservice to the programs being evaluated, as well as to the broader field. As a program manager, I wanted my evaluations to show what could happen with additional resources, such as more staff time or new staff skills and increased access to new technologies. I wanted them to illuminate what, given the resources we had, would have the most impact moving forward, given our theory of change, for the people we cared about the most. I wanted them to point to weaknesses in my logic model, to give me a sense of where I need to tighten or revise so that I could move the work. Mostly what I got, though, was a collection of positive remarks by stakeholders, and maybe some verbatim critique by stakeholders. I got very little analysis, however, much less a map to inform my strategy for next-generation program development.
In retrospect, I realize that there was a communication breakdown between me and my evaluation consultants – probably because evaluations were almost always funder-initiated, and probably also because I didn't really understand evaluation and didn't know how to ask for what I wanted. Because the evaluations were funder-initiated, and because I couldn't articulate much beyond, "Funder X wants us to do an evaluation," evaluation consultants saw their purpose as delivering something in the realm of an overall summative judgment, while I, inarticulate firebrand that I was, assumed the deliverable would be a developmental evaluation that would support me in social innovation for systems change.
As I grow my own evaluation practice (which is, perhaps not surprisingly, also focused on social innovation for systems change), I often think of my favorite Marge Piercy poem, which I first read when I was in high school and have since revisited regularly to help me remember who I am at the heart of things and how I want to protect that orientation and live my beliefs to their fullest:
To be of use
The people I love the best
jump into work head first
without dallying in the shallows
and swim off with sure strokes almost out of sight.
They seem to become natives of that element,
the black sleek heads of seals
bouncing like half-submerged balls.
I love people who harness themselves, an ox to a heavy cart,
who pull like water buffalo, with massive patience,
who strain in the mud and the muck to move things forward,
who do what has to be done, again and again.
I want to be with people who submerge
in the task, who go into the fields to harvest
and work in a row and pass the bags along,
who are not parlor generals and field deserters
but move in a common rhythm
when the food must come in or the fire be put out.
The work of the world is common as mud.
Botched, it smears the hands, crumbles to dust.
But the thing worth doing well done
has a shape that satisfies, clean and evident.
Greek amphoras for wine or oil,
Hopi vases that held corn, are put in museums
but you know they were made to be used.
The pitcher cries for water to carry
and a person for work that is real.
Source: Circles on the Water: Selected Poems of Marge Piercy, Alfred A. Knopf, 1982.
I still get chills every time I finish reading that last stanza.
Fortunately for me the other day, my long and hopefully useful report didn't have me in the position of making or breaking a program's financial sustainability. No one expected anything other than Evaluation Realness from me. Purpose and audience were clear.
My job was to reflect on and evaluate a program component that I had worked on for some time. Over a year prior, I had been given a great opportunity to run a pilot project teaching other university instructors how to, essentially, run a small piece of a program evaluation by assessing student learning and critical thinking as evidenced in students' written work. And, because I was evaluating a longer-term, innovative pilot for which I'd had a consultative role, my clients expected that I would provide analysis-supported recommendations rather than a list of accolades that would impress a funder.
Still, even as I wrote the preliminary draft, I found myself sprinkling in the confectioner's sugar now and again, and then having to delete and revise, delete and revise. In this struggle, I found it handy, as I often do, to put on my imaginary WWMQPD? bracelet. That's right, my What Would Michael Quinn Patton Do? bracelet. I keep it right next to my WWFFPD? (What Would Frances Fox Piven Do?) bracelet. I pulled the 4th edition of Utilization-Focused Evaluation from my bookshelf and kept it by my side throughout the rest of the report-writing process, reviewing menus and tables as needed, particularly Patton's broad section on choices, options, and decisions as I made my way through the preliminary draft.
When I shared an early draft, my clients indicated that they were pleased with the report-end recommendations section, but they hinted at wanting more. The recommendations in some ways provided a framework moving forward, post-pilot to Year One, but, based on their feedback (and with Patton still providing parables from the text on my desk), I realized that I hadn't really taken the bull by the horns, so to speak – I hadn't really dug into the nine pilot objectives in an analytical, evaluative way. The old funder-pleaser in me had listed them at the front of the report and used them as a way of organizing the detailed documentation of the pilot, but that's it. I hadn't gone back to them and evaluated progress explicitly. Instead, I'd taken them into account "holistically" in the end (meaning, not very well, in this case). In failing to reflect on each objective specifically, I'd almost implied that we'd met all of the objectives and Year One would allow for a fresh start. But that's not what had happened.
Like all good pilots, ours had provided us with the opportunity to learn and correct course as we went along, and while those course-corrections were documented, recommendations for what to do with specific objectives were absent. WWMQPD? He'd roll up his sleeves and analyze each objective in turn, deriving useful recommendations as he went along. That made sense. I figured I'd just throw the nine objectives into a table and write a few sentences about each one. No big deal, right?
I did throw them into a table (why is that always harder than you think it's going to be?), a simple three-column, nine-row grid: objective, reflection, and recommendation for each of the nine objectives. The first one was easy – your typical numbers objective. We intended to recruit this many people. Did we recruit that many people? Why, yes, we did. Did I recommend that we do that again? Well, yes, but with some caveats …
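For fellow spreadsheet-averse evaluators, the grid above is simple enough to build programmatically. Here's a minimal sketch in Python – the objective and text shown are hypothetical stand-ins, not rows from my actual report:

```python
from dataclasses import dataclass

@dataclass
class ObjectiveReview:
    """One row of the grid: what we set out to do, what happened, what's next."""
    objective: str
    reflection: str
    recommendation: str

def to_markdown(rows):
    """Render the reviews as a three-column markdown table."""
    lines = [
        "| Objective | Reflection | Recommendation |",
        "| --- | --- | --- |",
    ]
    for r in rows:
        lines.append(f"| {r.objective} | {r.reflection} | {r.recommendation} |")
    return "\n".join(lines)

# Illustrative example only (a "typical numbers objective"):
rows = [
    ObjectiveReview(
        objective="Recruit 12 instructors",
        reflection="Met the target, though the recruitment timeline was tight",
        recommendation="Repeat, but begin outreach a month earlier",
    ),
]
print(to_markdown(rows))
```

The point of the structure, of course, isn't the tooling – it's that every objective gets its own reflection and recommendation cell, so none can be quietly skipped.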
Great. Except the reality associated with the next objective was a bit more complex, and it would need to be adjusted. And that was doubly true for the one after that, and even more so for the next one. In fact, of all nine objectives, there was only one that didn't require substantive remarks. The rest needed analysis and recommendations for moving forward. I had waded in, I realized, not jumped in head first. I'd totally dallied in the shallows! In no way was I the kind of water-buffalo ox-puller that Marge Piercy wanted to see when she wrote my favorite poem, and I was glad that I hadn't realized too late.
Revisiting the objectives with a truly analytical eye allowed me to ferret out recommendations that I had not even really thought about before – and, as it turned out, those recommendations were some of the most important in the final report. My original broad recommendations were appropriate for the use of overall program improvement. But my subsequent, objective-level recommendations enhanced the usability of the report for implementing and managing the next version of the program, dealing with problems, heeding important aspects of the complex, dynamic environment in which the program lives, and ensuring future data collections were reliable and rigorous enough to be reported to others for field-based learning.
In other words, my initial draft was weak sauce.
Over and over again in my evaluation training, I've heard from every evaluation expert not to skimp on the recommendations, and yet I'd lacked confidence in the first round to really do the thing. I imagine other evaluators find themselves in this space as well – especially when we come to evaluation having done work that must achieve a philanthropic seal of approval to survive. And yet clearly we're not doing anyone – not program managers, not end-users, not folks who will benefit from systems change, and not even funders – any favors by failing to use our knowledge and training as intended and at our full capacity. (Though, if funders were reading this, I'd urge them to think about how they can do their part, too, to stop the madness. But of course they aren't reading this, based on the title, so I won't waste my time!)
I mean, that's what drew me to evaluation as a field in the first place – that it allows me to use analytical skills to support good work for systems change. The pitcher cries for water to carry, as Piercy writes, and a person for work that is real. Me too. Never before in my lifelong work have I felt more potential to do "real work" than I have since I began to build my capacity for evaluation.
And now I look forward to the next angst-producing project!
Laurie Jones Neighbors is an independent consultant and educator who specializes in developing, implementing, and assessing programs and educational experiences in support of equitable political representation and local, regional, and national decision making by low-income communities and communities of color.