An operationalist lesson for passing judgments

A while back, I wrote a blog post on how I don’t believe in the narrative around “talent”. This post continues that line of thought and discusses the judgments of potential or effort that are so common in academia.

As I explained in that post, I was not especially good at critical thinking until the upper years of undergrad. Around that time, I was working in several psychology labs and was trained in a variety of methodologies. To remember them better, I internalized many of them as basic guidelines of behaviour. (E.g., I still make sure that, when I make requests of people, I do it in a way that makes it easy for them to decline, because that was important when soliciting consent. This is probably not a good practice for nagging purposes.)

One thing that was important for experimental psychology was operationally defining terms. For those who are not familiar, an operational definition is one where a term is defined by its measurement procedure. (If this sounds verificationist to you, there are close historical ties.) It’s important to keep in mind that a term’s colloquial meaning may differ in significant ways from its operational definitions and that operational definitions of the same term may actually pick out different things, too. For example, I can operationally define “happiness” as a person’s answer to the question “on a scale from 1 to 10, how happy are you?” I can also define it as the number of times a person smiles throughout the day. Quite obviously, these can come apart.

There are a lot of theoretical problems with the use of operational definitions, one of them being that they are not actually what we are trying to study. Practically, however, experimental observations are about what is in fact measured, not what we would like to measure.

Above is point one. Point two concerns one desideratum for getting good operational definitions. Similar to the verificationist idea that a word that applies to everything has no meaning, a good operational definition needs to have a certain level of discriminatory power. If I want to use the frequency of smiles to operationally define happiness, I not only need to justify that people who are happy smile a lot, I also need to justify that people who are unhappy don’t smile much. Consider this an analog of the negative test for the decidability condition.
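To make the two-sided requirement concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the data, the cutoff for what counts as smiling "a lot", and the assumption that we have some independent judgment of who is happy (which, in practice, is exactly what we lack). It only shows what checking both the positive and the negative side would look like.

```python
# Hypothetical toy data: (smiles per day, independently judged happy?).
# The numbers are invented purely for illustration.
observations = [
    (12, True), (9, True), (10, True), (3, True),
    (11, False), (8, False), (2, False), (1, False),
]

SMILE_THRESHOLD = 7  # assumed cutoff for "smiles a lot"


def smiles_a_lot(smiles_per_day):
    """The operational test: does this person count as 'happy' by smile count?"""
    return smiles_per_day >= SMILE_THRESHOLD


def discriminatory_power(data):
    """Return (share of happy people who smile a lot,
               share of unhappy people who do not smile a lot)."""
    happy = [s for s, is_happy in data if is_happy]
    unhappy = [s for s, is_happy in data if not is_happy]
    positive_rate = sum(smiles_a_lot(s) for s in happy) / len(happy)
    negative_rate = sum(not smiles_a_lot(s) for s in unhappy) / len(unhappy)
    return positive_rate, negative_rate


positive_rate, negative_rate = discriminatory_power(observations)
print(positive_rate, negative_rate)  # 0.75 0.5
```

In this invented sample, the smile test picks up three of the four happy people, but it only rules out half of the unhappy ones. The positive side looks decent and the negative side is no better than a coin flip, so this definition would have little discriminatory power.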

So, here’s the two-part lesson: we don’t measure the concept itself but its operational definition; an adequate operational definition needs to have both positive and negative tests.

What does this have to do with passing judgments? Since I’ve been reading and writing about measuring traits and aptitudes, I’ve become increasingly sensitive to judgments of character and potential. Surprise, surprise: there are a lot of those in academia.

Part of this is understandable. The reason why intelligence testing is such a big thing is that there is a huge market for it. When I invest resources in someone — as a student or employee — I am investing in their future performance. I need to know now that they will perform in the future — that they have “potential”. Moreover, when we don’t have enough resources for everyone, choices are inevitable even if we don’t have a good basis to make them. It’s easy to cling on to the promise of a potential-identifier even if it’s based on shady science.

Many ways in which people have conceived of such identifiers in the past have now been seen as problematic: wealth, parental education, gender, race, etc. Part of the justification (in addition to the obvious one that they are discriminatory) is that they do not actually track potential.

But one “nice” thing about these historically used identifiers is that it is usually pretty easy to tell whether someone counts as having them or not. Indeed, there is a lot of gatekeeping around these concepts whose job is to sharpen the line: who counts as “posh”; who counts as coming from a “well-respected family”, etc. I suspect part of the reason why people spend so much energy talking about the boundaries of these categories is that they serve functional roles, and so changing them brings practical consequences. They are like operational definitions in experiments: they are the things we can deal with, so we pretend that they are what we want to deal with.

These traps are easier to spot, if harder to address. But there is often a danger of replacing them with something similar: perhaps less harmful, but similarly unjustified.

To give as little context as possible: I recently heard someone say, with quite some enthusiasm, that they believe that students with grit are the most likely to succeed because they have what it takes to overcome hardship. Consequently, this person is more willing to support students who are gritty, even if they might have had lower starting points.

Now, I have a lot of suspicion about the ontological status of grit — whether there is any empirical evidence suggesting that some people just by nature are better at recovering from hardship than others. But let’s suppose it’s a thing and it makes sense to invest more resources in people who have this thing. Still, the assumption remains that this person, in order to properly support gritty students, is capable of figuring out which ones are gritty.

In the spirit of charity, I have to assume that this consequence has never occurred to this person. For things like grit or resilience or potential, it is a lot easier to look someone in the eye and say, “even though you don’t have a lot of achievements to list on paper, I believe you have what it takes.” It is quite a bit trickier to look someone in the eye and say, “you might have achieved things in the past, but I really don’t think you are capable of handling it in the future.”

Many people I know who advocate abandoning traditional criteria like grades or prestigious (and expensive) school attendance adopt alternative criteria like grit or passion. They might very happily say to some students: you have a lot of potential. And that might be good for them to hear. However, I don’t know if they are equally happy to say to other students: you don’t have potential at all.

What’s wrong with telling everyone that they have potential and resilience? Although I personally am not a big fan of such narratives (if everyone’s special, then no one is special), I agree that there is no intrinsic harm. However, these conversations very often happen in the context of distributing limited resources. If someone gets it, others do not. If the justification is that this person gets it because this person is gritty, then the assumption is that those who do not get it are not gritty.

Are they truly not gritty, according to this person’s operational definition of grit? It is possible that this person has a concrete set of assessment criteria that would allow them to reliably pronounce whether someone is gritty or not (let’s leave aside the question of whether these criteria are correct). But most likely, this person has not thought about it in this way. They have a positive test for grit — if someone behaves in a certain way or says certain things, they are deemed gritty — but not a negative test. Instead, anyone who doesn’t stand out is deemed not gritty enough.

OK, then, everything seems to rely on this person’s self-proclaimed ability to accurately identify gritty students. How accurate are they? Are they consistent over time? And if this is only a positive test, then what about those who are gritty in different ways and so got false negatives?

These are difficult questions. These are also questions I suspect many people who make judgments of potential do not think enough about.

To be clear, I am not saying that people should never make hasty judgments about other people’s characters (and this is apparently a minority view, even though everyone judges). Just like how we would much rather get rid of operational definitions in psychology, we would much rather be able to always accurately and holistically understand everyone around us. Unfortunately, both are impossible. It is really difficult to navigate the world without making judgments about what sort of people you are interacting with.

However, the fact that operational definitions are not “real” definitions means that, when we find important or unexpected results, there are extra alternative explanations we have to deal with. If a study on happiness conflicts with our widely accepted theory of how happiness works, it is worthwhile double-checking whether the operational definition used in the study is a good one. Passing judgments should have a similar failsafe. It is okay to haphazardly form an impression of a student being “uninterested” when they have never gone out of their way to seek help. But when that student puts a lot of effort into a piece of work (something an uninterested student wouldn’t do), or when important decisions may be affected by such a judgment, it is important to recognize that the original impression was based on an assessment procedure that’s probably not very good.

So, here’s the lesson: next time you’re tempted to say that someone deserves something because they have grit/potential/passion/etc., it’s worth pausing to think about what someone would have to do for you to say that they do not have grit/potential/passion/etc., and whether that’s really a good criterion.

Kino