
Saturday
Sep 20, 2014

Weekend update: Who sees what & why in acquaintance rape cases?

I've been pondering the resurgence of attention to & controversy over the standards used, in the law generally and in particular institutions such as universities, to assess complaints of sexual assault.  I'll post some reflections next week, and also a guest blog from a scholar who has done a very interesting study on how cultural norms might be constraining the effectiveness of investigations of sexual assault complaints in the military. But by way of introduction, here is an excerpt from Culture, Cognition, and Consent: Who Sees What and Why in Acquaintance Rape Cases, 158 U. Penn. L. Rev. 729, a paper from way back in 2010 that reported the results of an empirical study of how cultural norms shape perceptions of disputed facts in date rape cases and disputed empirical claims about the impact of competing legal standards for defining "consent."


Introduction

Does “no” always mean “no” to sex? More generally, what standards should the law use to evaluate whether a woman has genuinely consented to sexual intercourse or whether she could reasonably have been understood by a man to have done so? Or more basically still, how should the law define “rape”?  

These questions have been points of contention within and without the legal academy for over three decades. The dispute concerns not just the content of the law but also the nature of social norms and the interaction of law and norms. According to critics, the traditional and still dominant common law definition of rape—which requires proof of “force or threat of force” and which excuses a “reasonably mistaken” belief in consent—is founded on antiquated expectations of male sexual aggression and female submission.  Defenders of the common law reply that the traditional definition of rape sensibly accommodates contemporary practices and understandings—not only of men but of many women as well. The statement “no,” they argue, does not invariably mean “no” but rather sometimes means “yes” or at least “maybe.” Accordingly, making rape a strict-liability offense, or abolishing the need to show that the defendant used “force or threat of force,” would result in the conviction of nonculpable defendants, restrict the sexual autonomy of women as well as men, and likely provoke the refusal of prosecutors, judges, and juries to enforce the law.

This Article describes original, experimental research pertinent to the “no means . . . ?” debate. . . .

Conclusion

This Article has described a study aimed at investigating the contribution that cultural cognition makes to the controversy over how the law should respond to acquaintance rape. The results of the study suggest that common understandings of the nature of that dispute and what’s at stake in it are in need of substantial revision.

All of the major positions, the study found, misapprehend the source of the “no means ...?” debate. Disagreement over the significance the law should assign to the word “no” is not rooted in the self-serving perceptions of men conditioned to disregard women’s sexual autonomy. Nor is it a result of predictable misunderstanding incident to conventional indirection (or even misdirection) in the communication of consent to sex. Rather it is the product, primarily, of identity-protective cognition on the part of women (particularly older ones) who subscribe to a hierarchic cultural style. The status of these women is tied to their conformity to norms that forbid the indulgence of female sexual desire outside of roles supportive of, and subordinate to, appropriately credentialed men. From this perspective, token resistance is a strategy certain women who are insufficiently committed to these norms use to try to disguise their deviance. Because these women are understood to be misappropriating the status of women who are highly committed to hierarchical norms, the latter are highly motivated—more so even than hierarchical men—to see “no” as meaning “yes,” and to demand that the law respond in a way (acquittal in acquaintance-rape cases) that clearly communicates the morally deficient character of women who indulge inappropriate sexual desire.

This account also unsettles the major normative positions in the “no means . . . ?” debate. Because older, hierarchical women are the persons most likely to misattribute consent to a woman who says “no” and means it, abolishing the common law’s “force or threat of force” element and its “reasonable mistake” defense would not create tremendous jeopardy for convention-following men. Nevertheless, there is also little reason to believe that these reforms would enhance the sexual autonomy of women whose verbal resistance would otherwise be ignored. Cultural predispositions, the study found, exert such a powerful influence over perceptions of consent and other legally consequential facts that no change in the definition of rape is likely to affect results.

This conclusion, however, does not imply that the outcome of the “no means . . . ?” debate is of no moment. On the contrary, the role of cultural cognition helps to explain why the debate has persisted at such an intense level for so long. The powerful tendency of those on both sides to conform their perceptions of fact to their values suggests why thirty years’ worth of experience has not come close to forging consensus on what the consequences of reform truly are. Over the course of this period, the constancy of the cultural identities of those who plainly see one answer in the data and those who just as plainly see another has driven those on both sides to form their only shared perception: that the position the law takes will declare the winner in a battle for cultural predominance.

This particular battle, moreover, occupies only a single theater in a multifront war. Like the debate over rape-law reform, continuing disputes over the death penalty, gun control, and hate crimes all feature clashing empirical claims advanced by culturally polarized groups who see the law’s acceptance or rejection of their perceptions of how things work as a measure of where their group stands in society. Indeed, the same can be said about a wide range of environmental, public-health, economic, and national-security issues. It is impossible to formulate a satisfactory response to the debate over rape-law reform without engaging more generally the distinctive issues posed by illiberal status conflict over legally consequential facts.

Friday
Sep 19, 2014

The more you know, the more you ... Climate change vs. GM foods

A correspondent writes:

I enjoyed your recent talk at Cornell University.  I was especially interested by your data that showed the more you know about climate change, the less you believe in it (if you are on the political right).  Do you have any similar data that shows how information about GMOs shapes opinion based on political identifiers?

Would love to explore any studies you may have on GMOs

My response:

I wish!

On this topic, I've done nothing more than collect some data showing that there are no political divisions over -- or any other interesting sources of systematic variation in -- the attitudes of the general public toward GMOs.  E.g.,

Consider this (from a nationally representative sample of 1,500+ in summer 2013):

There's lots of research, though, showing that the vast majority of the public doesn't know anything of consequence about GM foods, a finding that, given efforts to rile them up, suggests a pretty ingrained lack of interest:

American consumers’ knowledge and awareness of GM foods are low. More than half (54%) say they know very little or nothing at all about genetically modified foods, and one in four (25%) say they have never heard of them.

Before introducing the idea of GM foods, the survey participants were asked simply “What information would you like to see on food labels that is not already on there?” In response, most said that no additional information was needed on food labels. Only 7% of respondents raised GM food labeling on their own. . . .

Only about a quarter (26%) of Americans realize that current regulations do not require GM products to be labeled.

Hallman, W., Cuite, C. & Morin, X. Public Perceptions of Labeling Genetically Modified Foods. Rutgers School of Environ. Sci. Working Paper 2013-2001. 

You should also take a look at this guest CCP post by Jason Delbourne, whom you might also want to contact; he discusses the invalidity of drawing inferences about public opinion from opinion surveys under such circumstances.

One additional thing:

As you imply, our research group has found that science literacy in general & climate science literacy specifically both increase polarization; they don't have any meaningful uniform effect in inducing "less belief" -- their effect is big, but depends on "what sort of person" one is.  Relevant papers are Kahan, D. M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L. L., Braman, D., & Mandel, G. (2012). The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change, 2, 732-735 & Climate Science Communication and the Measurement Problem, Advances Pol. Psych. (in press).

On "science literacy" generally, consider:

On "climate science literacy," consider:

On GM foods, data I've collected shows that partisans become mildly less concerned w/ GM food risks as their science comprehension (or science literacy or however one wants to refer to it) increases:

 

Thursday
Sep 18, 2014

Will a "knowing disbeliever" be the next President (or at least Republican nominee)?

Subjects participate anonymously in CCP studies and supply responses in a form that prevents their being identified.

Still, I have to wonder whether Gov. Jindal might not have been one of the intriguing "knowing disbelievers" featured in The Measurement Problem study.

According to Howard Fineman,

America needs a leader to bridge the widening gulf between faith and science, and Louisiana Gov. Bobby Jindal, a devout Roman Catholic with Ivy League-level science training, thinks he can be that person. . . .

On Tuesday, Jindal showed his strategy for straddling the politics of the divide -- but also the political risks of doing so -- during an hourlong Q&A with reporters at a Christian Science Monitor Breakfast, a traditional early stop on the presidential campaign circuit.

Like the experienced tennis player he is, Jindal repeatedly batted away questions about whether he believes the theory of evolution explains the existence of complex life forms on Earth. Pressed for his personal view, Jindal -- who earned a specialized biology degree in an elite pre-med program at Brown University -- declined to give one. He said only that "as a parent I want my children taught the best science." He didn’t say what that "science" was.

He conceded that human activity has something to do with climate change, but declined to agree that there is now widespread scientific consensus on the severity and urgency of the problem.

Sounds a lot like a harassed "dualist" to me.

In truth, I don't think it is very convincing to use cultural cognition & like dynamics, which are geared to making sense of the distribution of perceptions of risk and like facts in aggregate, to explain the beliefs of specific individuals, particularly politicians, whose reasoning and incentives for disclosing the same will be shaped by influences very different from those that affect ordinary members of the public.

But I think the spectacle of Jindal's predicament, including the fly-wing-plucking torment he & like-situated political figures on the right face in negotiating these issues in the media, definitely illustrates the discourse pathology diagnosed by The Measurement Problem: the relentless, pervasive pressure to force reasoning individuals to make a choice between using their reason to know what's known by science or using it to enjoy their identities as members of particular cultural communities.

There is something deeply disturbing about the demand that people give an account of how they can be "knowing disbelievers," and something deeply flawed about public institutions, whether in education or in politics, that insist on interfering with this apparently widespread and unremarkable way for people to apportion what they know and believe across the different integrated identities that they occupy. 

Escaping from this sort of dysfunction is what good educators do in order to teach evolution to culturally diverse students.  It's also what regions like S.E. Florida are doing to promote constructive political engagement with climate change among culturally diverse citizens....

But in any case, the real issue with Jindal should be how he thinks we could possibly expect nasty foreign terrorists to be afraid of us if we had a leader who insists on being called "Bobby" because his childhood hero was the youngest brother in the Brady Bunch.

 

h/t to my friend David Burns.

Saturday
Sep 13, 2014

Weekend update: geoengineering and the expanding confabulation frontier of the "climate communication" debate

Despite its astonishingly long run in grounding just-so storytelling about public risk perceptions and science communication (e.g., the Rasputin "bounded rationality" account of public apathy), the "climate debate" at some point has to get the benefit of an infusion of new material or else the players will ultimately die of terminal boredom.

That's the real potential, of course, of geoengineering.

Critics took the early lead in the "science communication confabulation game" by proclaiming with absurd overconfidence that the technology could never work: climate is a classic "chaotic system" and thus too unpredictable to admit of self-conscious management (where have I heard that before?), and even talking about it will lull the public into a narcotic state of complacency that will undermine the political will necessary to curb the selfish ethos of consumption that is the root of the problem.

But as anyone who has played the confabulation game knows, even players of modest imagination can effectively counter any move by concocting a story of equal (im)plausibility that supports the opposite conclusion.

So now we are being bombarded with a torrent of speculations on the positive effects geoengineering is likely to have on public engagement with climate science: that talk of it will scare people into taking mitigation seriously; that foreclosing its development will increase demand for adaptation alternatives that would be even more productive of action-dissipating false confidence; that implementation of geoengineering will avert the economic deadweight losses associated with mitigation, generating a social surplus that can be invested in new, lower-carbon energy sources, etc., etc.

At least some of the issues about how geoengineering research might affect public risk perceptions can be investigated empirically, of course.

In one study, CCP researchers found that exposing subjects (members of nationally representative US and English samples) to information about geoengineering offset motivated resistance among individuals culturally predisposed to reject evidence of climate change.  Accordingly, on the whole, individuals exposed to this information were more likely to credit evidence on the risks of human-caused climate change than ones exposed to information about mitigation strategies.

But just as the "knowledge deficit" theory doesn't explain the nature of public opinion on climate change, so "knowledge deficit" can't explain the nature of climate-change advocacy.  If furnishing advocates facts about the dynamics of science communication were sufficient to wean them off their self-defeating styles of engaging the public, it would have worked by now.  Evidence that doesn't suit their predispositions on how to advocate is simply ignored, and evidence-free claims that do suit them are embraced with unreasoning enthusiasm.

But it's important to realize that the spectacle of the "climate debate" is just a game.

Actually dealing with climate change isn't.  All over the place, real-world decisionmakers--from local governments to insurance companies to utilities to investors to educators, formal & informal--are making decisions in anticipation of climate change impacts and how to minimize them.

Many of these actors are using the best available evidence, not just on climate change but on climate-science communication.  And they are ignoring the game that non-actors engaged in confabulatory story-telling are engaged in.

If this were not the case--if the only game in town were the one being played by those for whom science communication is just expressive politics by other means-- the scientific study of science communication would indeed be pointless.

Friday
Sep 12, 2014

How should science museums communicate climate science? (lecture summary & slides)

I had the great privilege of participating in a conference, held at the amazing Museum of Science in Boston, on how museums can engage the public in climate science.  Below are my remarks--as best as I can remember them a week later.  Slides here.

You are experts on the design of science-museum exhibits.

I am not. Like Dietram, I study the science of science communication with empirical methods. 

I share his view that there are things he and I and others have learned that are of great importance for the design of science museum exhibits on climate change.

If you ask me, though, I won’t be able to tell you what to do based on our work—because I am not an expert at designing museum exhibits. 

But you are.

So if in fact I am right to surmise that insights gleaned from the scientific study of science communication are relevant to design of climate science exhibits, you should be able to tell me what the implications of this work are for your craft.

I will thus share with you everything I know about climate science communication.

I’ve reduced it all to one sentence (albeit one with a semi-colon):

What ordinary members of the public “believe” about climate change doesn’t reflect what they know; it expresses who they are.

The research on which this conclusion rests actually originates in the study of public opinion on evolution.

One thing such research shows is that there is in fact no correlation whatsoever between what people say they believe about evolution and what they know about it.  Those who say they “believe” in evolution are no more or less likely to understand the elements of the modern synthesis—random mutation, genetic variance, and natural selection—than those who say they “don’t.” 

Indeed, neither is likely to be able to give a sufficiently cogent account of these concepts to pass a high school biology test.

Another thing scholars have learned from studying public opinion on evolution is that what one “believes” about it has no relationship to how much one knows about science generally.

I’ll show you some evidence on that.  It consists in the results of a science literacy test that I administered to a large nationally representative sample.

As a good knowledge assessment should, this science comprehension instrument consisted of a set of questions that varied in difficulty.

Some, like “Electrons are smaller than atoms—true or false,” were relatively easy: even an individual whose score placed him or her at the mean comprehension level would have had about a 70% chance of getting that one right.

Other questions were harder: “Which gas makes up most of the Earth's atmosphere? Hydrogen, Nitrogen, Carbon Dioxide, Oxygen?”  Someone of mean science comprehension would have only about a 25% chance of getting that one.

If one looks at the item-response profile for “Human beings, as we know them today, developed from earlier species of animals—true or false?,” an item from the NSF’s Science Indicators battery, we see that it’s difficult to characterize the item as either hard or easy. Someone at the mean level of science comprehension has about a 55% chance of getting it correct, but the probability of a correct answer isn’t much different for respondents whose science comprehension levels are significantly lower or significantly higher than average.
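To make the idea of an item-response profile concrete, here is a minimal sketch using a standard two-parameter logistic (2PL) model. The discrimination and difficulty parameters are made up purely to mimic the three patterns just described (an easy item, a hard item, and a flat, non-discriminating one); they are not the model or the estimates from the actual study.

# A minimal 2PL item-response sketch with made-up parameters (illustrative only).
# p(correct) = 1 / (1 + exp(-a * (theta - b))), where theta is standardized
# science comprehension, a is item discrimination, and b is item difficulty.
import math

def p_correct(theta, a, b):
    """Probability of a correct response under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical (a, b) pairs chosen only to mimic the patterns described above.
items = {
    "easy item (~70% correct at the mean)": (1.5, -0.6),
    "hard item (~25% correct at the mean)": (1.5, 0.75),
    "non-discriminating item (flat, ~50-60%)": (0.1, -2.0),
}

for label, (a, b) in items.items():
    profile = ", ".join(f"theta {t:+d}: {p_correct(t, a, b):.2f}" for t in (-2, -1, 0, 1, 2))
    print(label + " -> " + profile)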

The reason is that the NSF Indicator Evolution item isn’t a valid measure of science comprehension for a general-population sample of test takers. 

Its item-response profile looks sort of like what one might expect of a valid measure when we examine the answers of those members of the population who are below average in religiosity (as measured by frequency of prayer, frequency of church attendance, and self-reported importance of religion): that is, the likelihood of getting it right slopes upward as science comprehension goes up.

But for respondents who are above average in religiosity, there is no relationship whatsoever between their response to the Evolution item and their science comprehension level.

For these respondents, it simply isn’t measuring the same sort of capacity that the other items on the assessment are measuring. What it’s measuring, instead, is their religious self-identity, which would be denigrated by expressing a “belief in” evolution.

One way to figure this out, researchers have learned, is to change the wording of the Evolution item: if one adds to it the simple introductory clause, “According to the theory of evolution,” then the probability of a correct response turns out to be roughly the same in relation to varying levels of science comprehension among both religious and nonreligious respondents.

The addition of those words frees a religious respondent from having to choose between expressing who she is and revealing what she knows. It turns out she knows just as much—or just as little, really, since, as I said, responses to this item, no matter how they are worded, give us zero information on what the respondent understands about the theory of evolution.

But good high school teachers, empirical research shows, can impart such an understanding just as readily in a student who says she “doesn’t believe in” evolution as those teachers can in a student who says he “does.” But the student who said she didn’t “believe in” evolution at the outset will not say she does when the course is over.

Her skillful teacher taught her what science knows; the teacher didn’t make her into someone else.

Indeed, insisting that students profess their “belief in” evolution, researchers warn, is the one thing guaranteed to prevent the religiously inclined student from forming a genuine comprehension of how evolution actually works.  If one forces a reasoning individual to elect between knowing what is known by science and being who she is, she will choose the latter.

The teacher who genuinely wants to impart understanding, then, creates a learning environment that disentangles information from identity, so that no one is put in that position.

What researchers have learned from empirical study of the teaching of evolution can be extended to the communication of climate science.

To start, just as it would be a mistake (is a mistake made over and over by people who ought to know better) to treat the fraction of the population who say they “disbelieve in” evolution as a measure of science comprehension in our society, so it is a mistake to treat the fraction who say they “disbelieve” in human-caused climate change as such a measure.

My collaborators and I have examined how people’s beliefs about climate change relate to their science comprehension, too.  Actually, there is a connection: as culturally diverse individuals’ scientific knowledge and reasoning proficiency improve, they don’t converge in their views about the impact of human activity on global temperatures.  Instead they become even more culturally polarized. 

Because what one “believes” about climate change is now widely understood to signify one’s membership in and commitment to one or another cultural group, and because their standing in these groups is important to people, individuals use all manner of critical reasoning ability, experiments show, to form and persist in beliefs consistent with their allegiances.

But that doesn’t necessarily mean that individuals who belong to opposing cultural groups differ in their comprehension of climate science.  This can be shown by examining how individuals of diverse outlooks do on a valid climate science comprehension assessment.

To design such an instrument, I followed the lead of the researchers who have studied the relationship between “belief in” evolution and science comprehension. They’ve established that one can measure what culturally diverse people understand about evolution with items that unconfound or disentangle identity and knowledge.  Like the evolution items that enable respondents to show what they know without making affirmations that denigrate who they are, the items in my climate literacy assessment focus on respondents’ understanding of the prevailing view among climate scientists and not on respondents' acceptance or rejection of climate change “positions” known to be highly correlated with cultural and political outlooks.

Some of these turn out to be very easy. Encouragingly, even the test-taker of mean climate-science comprehension is highly likely (80%) to recognize that adding  CO2 to the atmosphere increases the earth’s temperature.

Others, however, turn out to be surprisingly hard: there is only about a 30% chance that someone of average climate-science comprehension will correctly reject the claim that the CO2 emissions associated with burning fossil fuels have been shown by scientists to reduce photosynthesis in plants.

Obviously, someone who gets that  CO2 is a “greenhouse gas” but who believes that human emissions of it are toxic to the things that grow in greenhouses can’t be said to comprehend much about the mechanisms of climate science.

Nevertheless, a decent fraction of the test takers from a general population sample turned out to have a very accurate impression of climate scientists’ current best understandings of the mechanisms and consequences of human-caused global warming.  Not so surprisingly, these were the respondents who scored the highest on a general science comprehension assessment.

Moreover, there was no meaningful correlation between these individuals’ scores and their political outlooks.  “Conservative Republicans” and “liberal Democrats” who displayed a high level of general science comprehension both scored highly on the climate assessment test.

Nevertheless, those who displayed the highest scores on the test were not more likely to say they “believed in” human-caused global warming than those who scored the lowest. On the contrary, those who displayed the greatest comprehension of science’s best prevailing understandings of climate change were the most politically polarized on whether human activity is causing global temperatures to rise.

In other words, what ordinary members of the public “believe” about climate change, like what they “believe” about evolution, doesn’t reflect what they know; it expresses who they are.

The reason our society is politically divided on climate change, then, isn’t that citizens have different understandings of what climate scientists think.  It is that our political discourse, like the typical public opinion survey, frames the “climate change question” in a manner that forces them to choose between expressing who they are, culturally speaking, and revealing and acting on what they know about what is known.

This is changing, at least in some parts of the country.  Despite being as polarized as the rest of the country, for example, the residents of Southeast Florida have, through a four-county compact, converged on a comprehensive “Climate Action Plan,” consisting of 100 distinct adaptation and mitigation measures.

People in Florida know a lot about climate.  They’ve had to know a lot, and for a long time, in order to thrive in their environment.

Like the good high-school teachers who have figured out how to create a classroom environment in which curious and reflective students don’t have to choose between knowing what’s known about the natural history of humans and being who they are,  the local leaders who oversee the Southeast Florida Climate Compact have figured out how to create a political environment in which free and reasoning citizens aren’t forced to choose between using what they know and being who they are as members of culturally diverse communities.

Now what about museums?  How should they communicate climate science?

Well, I’ve told you all I know about climate science communication: that what ordinary members of the public “believe” about climate change doesn’t reflect what they know; it expresses who they are.

I’ve shown you, too, some models how of science-communication professionals in education and in politics have used evidence-based practice to disentangle facts from the antagonistic cultural meanings that inhibit free and reasoning citizens from converging on what is collectively known.

I think that’s what you have to do, too.

Using your professional expertise, you have already made museums a place where curious, reflective people of diverse outlooks go to satisfy their appetite to experience the delight and awe of apprehending what we have come to know by employing science’s signature methods of discovery.  

You now need to ensure that the museum remains a place, despite the polluted state of our science communication environment generally, where those same people can go to satisfy their appetite to participate in what science has taught us and is continuing to teach us about the workings of our climate and the impact of human activity upon it.

You need, in short, to be sure that nothing prevents them from recognizing that the museum is a place where they don’t have to choose between enjoying that experience and being who they are.

How can you do that?

I don’t know.  Because I am not an expert in the design of science museum exhibits.

But you are—and I am confident that if you draw on your professional judgment and experience, enriched with empirical evidence aimed at testing and refining your own hypotheses, you will be able to tell me.  

I have a strong hunch that what you will have to say will be something other science-communication professionals will be able to use to promote public engagement with climate science in their domains, too.

 

Sunday
Sep 7, 2014

Weekend update: Another helping of evidence on what "believers" & "disbelievers" do & don't "know" about climate science

Data collected in ongoing work to probe, refine, extend, make sense of, or demolish the "ordinary climate science intelligence" assessment featured in The Measurement Problem paper.

You tell me what it means ...

Saturday
Sep 6, 2014

Weekend update: Some research on climate literacy to check out

I have a bunch of critical administrative tasks that are due/overdue.  Fortunately, I discovered this special "climate literacy" issue of the Journal of Geoscience Education.  It'll make for a weekend's worth of great reading.

Thinking that others might be in need of the same benefit, I decided to post notice of the issue forthwith.

Reader reports on one or another of the articles are certainly welcome.

Friday
Sep 5, 2014

Teaching how to teach Bayes's Theorem (& covariance recognition) -- in less than 2 blog posts!

Adam Molnar, in front of a graphic heuristic he developed to teach (delighted) elementary school children how to solve the Riemann hypothesis

The 14.7 billion regular readers of this blog know that one of my surefire tricks for securing genuine edification for them is for me to hold myself forward as actually knowing something of importance in order to lure/provoke an actual expert into intervening to set the record straight.  It worked again!  After reading my post Conditional probability is hard -- but teaching it *shouldn't* be!, Adam Molnar, a statistician and former college stats instructor who is currently completing his doctoral studies in mathematics education at the University of Georgia, was moved to compose this great guide on teaching conditional probability & covariance detection. Score!

 

Conditional Probability: The Teaching Challenge 

Adam Molnar

A few days ago, Dan wrote a post presenting the results on how members of a 2000-person general population sample did on two problems, named BAYES and COVARY.

Dan posed the following questions: 

  1. "Which"--COVARY or BAYES--"is more difficult?"
  2. "Which is easier to teach someone to do correctly?" and
  3. "How can it be that only 3% of a sample as well educated and intelligent as the one [he] tested"--over half had a college or post graduate dagree--"can do a conditional probability problem as simple as" he understood BAYES to be. "Doesn't that mean," he asked "that too many math teachers are failing to use the empirical knowledge that has been developed by great education researchers & teachers?"

Check out this cool poster summary of the Molnar study results.

As it turns out, these are questions that figure in my own research on effective math instruction. As part of my dissertation, I conducted interviews of 25 US high school math teachers. In the interviews, I included versions of both COVARY and BAYES. My version of COVARY described a different hypothetical experiment but used the same numbers as Dan's, while BAYES had slightly different numbers (I used the version from Bar-Hillel 1980).

So with this background, I'll offer my responses to Dan's questions.

Which is more difficult?

According to actual results, Bayes by far.

Dan reports that 55% of the people in his  sample got COVARY correct, compared to 3% for BAYES.

Other studies have shown a similar gap.

In one study Dan and some collaborators conducted, 41% of a nationally diverse sample gave the correct response to a similarly constructed covariance problem. Eighty percent of the members of my math teacher sample computed the correct response.

In contrast, on conditional-probability problems similar to BAYES, samples rarely reach double digits. I got 1 correct response out of 25--4%--in my math-teacher sample. Bar-Hillel (1980) asked Israeli students on the college entrance exam and had 6% correct. Only 8% of doctors got a similar problem right (Gigerenzer, 2002).

Teaching Covary

Solving COVARY, like many problems, involves three critical steps.

Step 1 is reading comprehension.

As worded, COVARY is not a long problem, but it includes a few moderately hard words like "experiment" and "effectiveness." These phrases may not challenge the "14.6 billion" readers of this blog, but they can challenge English language learners or students with limited reading skills. Even for people who know all the words, one might misread the problem.

Step 2 is recognition. In this problem, a solver needs to compare probabilities or ratios, recognizing that "more likely to survive" calls for a likelihood, and that likelihood involves computation, not just comparing counts. Comparing counts across a row (223 against 75) or a column (223 against 107) will lead to the wrong answer.

Taking this step involves recognizing a term, "more likely to survive". Learning the term requires work, but the US education system includes this type of problem. In the Common Core adopted by most states, standard 8.SP.A.4 states "Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables." High school standard HSS.CP.A.4 repeats the tables and adds independence. Although students may not study under the Common Core, and adults had older curricula, almost everyone has seen 2 by 2 tables. Therefore, teaching the term "more likely to survive" is not a big step.

Step 3 is computation.

Dan suggested likelihood ratios, but almost all teachers will work with probabilities (relative frequencies) as mentioned in the standard. Problem solvers need to create two numbers and compare them. The basic "classical" way to create a probability is successes over total. The classical definition works as long as solvers remember to use row totals (298 and 128), not the grand total of 426. People will make errors, but as mentioned previously, most US adults have some familiarity with 2 by 2 tables. Instruction is required, but the steps do not include any brand-new techniques.
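To make the correct Step 3 computation concrete, here is a minimal sketch in Python. The cell counts are taken from the numbers quoted above (223, 75, 107, and row totals 298 and 128, so the remaining cell is 21); the row and column labels are my own placeholders, since the wording of the actual COVARY item is not reproduced in this post.

# Sketch of the COVARY computation: compare row-wise relative frequencies,
# not raw counts and not proportions of the grand total.
# Labels are placeholders; counts are inferred from the numbers quoted above.
table = {
    "group A": {"survived": 223, "died": 75},   # row total 298
    "group B": {"survived": 107, "died": 21},   # row total 128
}

for group, cells in table.items():
    row_total = sum(cells.values())             # the correct denominator
    p_survive = cells["survived"] / row_total
    print(f"{group}: {cells['survived']}/{row_total} = {p_survive:.3f}")

# Typical errors: comparing 223 with 75 or 223 with 107 directly, or dividing
# by the grand total (426) instead of by each row's total.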

Of the five errors in my sample, one came from misreading (Step 1), one came from recognition (Step 2), comparing 223 against 107, and three came from computation (Step 3), using the grand total of 426 as the denominator instead of 298 and 128.

Teaching Bayes

For BAYES, a conditional-probability problem, reading comprehension (Step 1) is more difficult than for COVARY. COVARY provides a table, while BAYES has only text. Errors will occur when transferring numbers from the sentences in the problem. Even very smart people make occasional transfer errors.

The best-performing teacher in my interviews made only one mistake--a transfer, choosing the wrong number from earlier in a problem despite verbally telling me the correct process.

As an educator, I would like to try a version of COVARY where the numbers appear in text without the table, to see how often people correctly build tables or other problem-solving structures.

Step 2, recognition, is easier. The problem explicitly asks for "chance (or likelihood)" which means probability to most people. Additionally, all numbers in the problem are expressed as percentages. These suggestions lead most people to offer some percentage or decimal number between 0 and 1. All the teachers in my study gave a number in that range.

Step 3, computation, is much, much harder.

As demonstrated in the recent sample and other research work including Bar-Hillel (1980), many people will just select a number from the problem, either the rate of correct identification or the base rate. Both values are between 0 and 1, inside the range of valid probability values, thus not triggering definitional discomfort. Neither value is correct, of course, but I am not surprised by these results. A correct solution path generally requires training.

Interestingly, the set of possible solution paths is much larger for BAYES. COVARY had probabilities and ratios; BAYES has at least eight approaches. Some options might be familiar to US adults, but none are computationally well known. In the list below, I describe each technique, comment on level of familiarity, and mention computational difficulty. (A worked sketch of two of these approaches, using illustrative numbers, appears after the list.)

  • Venn Diagrams: A majority of adults could recognize a Venn diagram, because they are useful in logic and set theory. Mathematicians like them. Although Venn diagrams are not specified in the Common Core, they have appeared in many past math classes and I suspect they will remain in schools. I do not believe a majority of adults could correctly compute probabilities with a Venn diagram, however. Doing so requires knowing conditional probability and multiplicative independence rules, plus properly accounting for the overlapping And event. Knowing how to solve the Bayes problem with a Venn diagram almost always means one knows enough to use at least one other technique on this list, such as probability tables or Bayes Theorem. Those techniques are more direct and often simpler.
  • Bayes's Theorem: (which has several different names, including formula, law, and rule; Bayes might end with 's or ' or no apostrophe at all). If you took college probability or a mathy statistics course, you likely saw this approach. When I asked statisticians in the UGA statistics education research group to work this problem, they generally used Bayes' rule. This is not a good teaching technique, however, because the computation is challenging. It requires solid knowledge of conditional probability and remembering a moderately difficult formula. Other approaches are less demanding. 
  • Bayesian updating: A more descriptive name for the approach Dan wrote about, where posterior odds = prior odds x likelihood ratio. This is even rarer than the formula version of Bayes' rule; I first saw this in my master's program. Updating is easier computationally than the formula, but I would not expect untrained people to discover it independently. 
  • Probability-based tables: Many teachers attempted this method, with some reaching a usable representation (but none correctly selecting numbers from the table). This method requires setting up table columns and rows, and then using independence to multiply probabilities and fill entries. After that, the solver needs to combine values from two boxes (True Blue and False Blue) to find the total chance that Wally perceived a blue bus, and then find the true blue probability by dividing True Blue / (True Blue + False Blue). Computation requires table manipulation, understanding independence, and knowing which numbers to divide. Choosing the correct boxes stumped the teachers most often. They tended to just answer the value of True Blue, 9% in this version.

    This approach was popular because it involves tables and probabilities, ideas teachers and students have seen. Independence is also included in the Common Core. Thus, it's not too far a stretch. The problem is difficulty, between building the table using multiplicative probability and then combining boxes in a specific way. Other approaches are easier. 
  • Probability-based trees: The excellent British mathematics teaching site NRICH has an introduction. AP Statistics students frequently learn tree diagrams. Some teachers used them, including the one teacher who got the explanation completely correct. Several other teachers made the same mistake as with probability tables; they built the representation, but only gave the True Blue probability and neglected the False Blue possibility. 

    Although trees are mentioned briefly in the Common Core as one part of one Grade 7 standard, I don't expect trees to become a popular solution. Because they were uncommon in the past, few (but not zero) non-teacher adults would attempt this approach. 
  • Grid representations: Dan cited a 2011 paper by Spiegelhalter, Pearson, and Short, but the idea is older. A reference at Illuminations, the NCTM's US website for math teaching resources, included a 1994 citation. The idea is to physically color boxes representing possibilities, which allows one to find the answer by counting boxes. At Georgia, we've successfully taught grid shading in our class for prospective math teachers. It works well and it's not very difficult. One study showed that 75% of pictorial users found the correct response (Cosmides & Tooby, 1996). Unfortunately, it's never been part of any standards I know. It also requires numbers expressible out of 100, which works in this problem but not in all cases. 
  • Frequency-based tables: In the 1990s, psychological researchers started publishing about a major realization: frequency counts are more understandable than probabilities. Classic papers include Gigerenzer (1991) and Cosmides & Tooby (1996). The basic idea is to convert probabilities to frequencies by starting with a large grand total, like 1000 or 100,000, and then multiply probabilities to find counts. The larger starting point makes it likely that all computations result in integers, avoiding one problem with the grid representation. 

    After scaling, the solver can form a table. In this problem, getting from the table to the correct answer still requires work, as one must know to divide True Blue / (True Blue + False Blue) as in the probability-based table. I know one college textbook with a "hypothetical hundred thousand table", Mind on Statistics by Utts and Heckard, which has included the idea since at least 2003. There are many college statistics textbooks, though, and frequency-based tables do not appear in US school standards. They are not commonly known. 
  • Frequency-based trees: Because tables don't make it obvious which boxes to select, a tree-based approach can combine the natural intuition of counts and the visual representation of trees. This increases teaching time because students are less familiar with trees. In exchange, the problem becomes easier to solve. This might be the most effective approach to teach, but it's very new. Great Britain has included frequency trees and tables in the 2015 version of GCSE probability standards for all Year 10 and 11 students, but they have not appeared in schools on this side of the pond.
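To make a couple of these approaches concrete, here is a minimal sketch of the frequency-based-table calculation, cross-checked against the odds-updating formula. The base rate and witness-accuracy figures are purely illustrative assumptions; they are not the numbers from Dan's BAYES item or from Bar-Hillel (1980).

# Sketch of two BAYES-style solution paths, with purely illustrative numbers
# (NOT the figures from the actual BAYES item).
base_rate = 0.10   # hypothetical: 10% of buses are blue
accuracy = 0.80    # hypothetical: the witness identifies colors correctly 80% of the time

# Frequency-based table: scale everything to a hypothetical 10,000 buses.
N = 10_000
blue, green = base_rate * N, (1 - base_rate) * N        # 1,000 blue, 9,000 green
true_blue = blue * accuracy                             # blue buses called "blue":  800
false_blue = green * (1 - accuracy)                     # green buses called "blue": 1,800
p_blue_given_report = true_blue / (true_blue + false_blue)
print(f"frequency table: {p_blue_given_report:.3f}")    # ~0.308, not 0.80 and not 0.10

# Cross-check via Bayesian updating: posterior odds = prior odds x likelihood ratio.
prior_odds = base_rate / (1 - base_rate)                # 1 to 9
likelihood_ratio = accuracy / (1 - accuracy)            # 4 to 1
posterior_odds = prior_odds * likelihood_ratio
print(f"odds updating:   {posterior_odds / (1 + posterior_odds):.3f}")  # same answer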

The Teaching Challenge

Neither COVARY nor BAYES is easy, because both require expertise beyond what was previously taught in K-12 schools.

In the current US system, looking at Common Core and other standards, COVARY will be easier to teach. COVARY requires less additional information because it can extend easily from two ideas already taught, count tables and classical relative frequency probability. It fits very well inside the Common Core standards on conditional probability.

BAYES has lots of possible approaches. Some, like grid representations and frequency trees, are less challenging than COVARY. But they are relatively new in academic terms. Many were developed outside the US and none extend easily from current US standards. I'm not even sure the sort of conditional-probability problem reflected in BAYES should be considered under Common Core (unlike the new British GCSE standards), even though I believe decision making under conditional uncertainty is a vital quantitative literacy topic. Most teachers and I believe it falls under AP Statistics.

Furthermore, educational changes take a lot of time. Hypothetically (lawyers like hypotheticals, right?), let's say that today we implement a national requirement for conditional probability. States would have to add it to their standards documents. Testing companies would need to write questions. Textbook publishers would have to create new materials. Schools would have to procure the new materials. Math teachers would need training; they're smart enough to handle the problems but don't yet have the experience.

The UK published new guidelines in November 2013 for teaching in September 2015 and exams in June 2017. In the US? 2020 would be a reasonable target.

Right now, Bayes-style conditional probability is unfamiliar to almost all adults.

In Dan's sample, over half had a college degree. That's nice, but that doesn't imply much about conditional probability.

The CBMS reports on college mathematics and statistics. A majority of college grads never take statistics. In 2010, there were about 500,000 enrollments in college statistics classes, plus around 100,000 AP Statistics test takers, but there were about 15,000,000 college students. (For comparison, there were 3,900,000 mathematics course enrollments.) Of the minority that take any statistics, most people take only one semester. Conditional probability is not a substantial part of most introductory courses; perhaps there would be 30 minutes on Bayes' rule.

Putting this together, less than 10% of 2010 college students covered conditional probability. Past numbers would not be higher, since probability and statistics have recently gained in popularity.

I think it's fair to say that less than 5% of the US adult population has ever covered the topic--making that 3% correct response rate sound logical.

In an earlier blog post, Dan wrote "If you don't get Bayes, it's not your fault. It's the fault of whoever was using it to communicate an idea to you." Yes, there are better and worse ways to solve Bayes-style problems. Teachers can and should use more effective approaches. That's what I research and try to help implement. But for the US adult population, the problem is not poor communication; rather, it's never been communicated at all.

References

Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. Acta Psychologica, 44, 211-233.

Cosmides, L., & Tooby, J. (1996). Are humans good intuitive statisticians after all?: Rethinking some conclusions of the literature on judgment under uncertainty. Cognition, 58, 1-73.

Gigerenzer, G. (1991). How to make cognitive illusions disappear: Beyond "heuristics and biases". In W. Stroebe & M. Hewstone (Eds.), European Review of Social Psychology (Vol. 2, pp. 83-115). Chichester: Wiley.

Gigerenzer, G. (2002). Calculated risks: How to know when numbers deceive you. New York: Simon & Schuster.

Spiegelhalter, D., Pearson, M., and Short, I. (2011). Visualizing Uncertainty About the Future. Science 333, 1393-1400.

Utts, J., & Heckard, R. (2012). Mind on Statistics, 4th edition. Independence, KY: Cengage Learning.

 

Thursday
Sep 4, 2014

Political psychology according to Krugman: A degenerative research programme if ever I saw one ... 

As I said, I no longer watch the show "Paul Krugman's Magic Motivated Reasoning Mirror" but do pay attention when a reflective person who still does tells me that I've missed something important.  Stats legend Andrew Gelman is definitely in that category.  He thinks the latest episode of KMMRM can't readily be "dismissed."  

So I've taken a close look.  And I just disagree.

My reasons can be efficiently conveyed by this simple reconstruction of the tortured path of illogic down which the show has led its viewers:

Krugman:  A ha! Social scientists have just discovered something I knew all along: on empirical policy issues, people fit the evidence to their political predispositions.  It’s blindingly obvious that this is why conservatives disagree with me!  And by the way, I’ve made another important related discovery about mass public opinion: the tribalist disposition of conservatives explains why they are less likely to believe in evolution.

Klein: Actually, empirical evidence shows that the tendency to fit the evidence to one's political predispositions is ubiquitous—symmetric, even: people with left-leaning proclivities do it just as readily as people with right-leaning ones.  Indeed, the more proficient people are at the sort of reasoning required to make sense of empirical evidence, the more pronounced this awful tendency is.  Therefore, people who agree with you are as likely to be displaying this pernicious tendency--motivated reasoning--as those who disagree.  This is very dispiriting, I have to say.

Empiricist: He’s right.  And by the way, your claims about political outlooks and “belief in” evolution are also inconsistent with actual data.

Krugman: Well, that’s all very interesting, but your empirical evidence doesn’t ring true to my lived experience; therefore it is not true. Republicans are obviously more spectacularly wrong.  Just look around you, for crying out loud.

Klein:  Hey, I see it, too, now that you point it out! Republicans are more spectacularly wrong than Democrats!  We’ve been told by empiricists that individual Republicans and individual Democrats reason in the same way.  Therefore, it must be that the collective entity “Republican Party” is more prone to defective reasoning than the collective entity “Democratic Party.”

Methodological individualist: Look: If you believe Republicans/conservatives don’t reason as well as Democrats/liberals, then there’s only one way to test that claim: to examine how the individuals who say “I’m a ‘liberal’ ” and the ones who say “I’m a ‘conservative’ ” actually reason.  If the evidence says “the same,” then invoking collective entities who exist independently of the individuals they comprise and who have their own “reasoning capacity” is to jump out of the empirical frying pan and into the pseudoscience fire.  I’m not going with you.

Krugman: What I said—and have clearly been saying all along—is that the incidence of delusional reasoning is higher among conservative elites than among liberal elites. I never said anything about mass political opinion!  Your misunderstanding of what I clearly said multiple times is proof of what I said at the outset: the reason non-liberals (conservatives, centrists, et al.) all disagree with me is that they are suffering from motivated reasoning.

Bored observer: What is the point of talking with you?  If you make a claim that is shown to be empirically false, you just advance a new claim for which you have no evidence.  It’s obvious that no matter what the evidence says, you will continue to say that the reason anyone disagrees with you is that they are stupid and biased.  I’m turning the channel.

Gelman: Hold on!  He’s now advanced an empirical claim for which "data are not directly available."  Because it therefore cannot be evaluated, his claim can't simply be dismissed!

Two people Gelman knows who know their shit:  Yes it can.  When people react to contrary empirical evidence by resorting to the metaphysics of supra-individual entities or by invoking new, auxiliary hypotheses that themselves defy empirical testing, they are doing pseudoscience, not genuine empiricism.  The path they are on is a dead end.

 

 

 

 

Monday
Sep 1, 2014

"Krugman's 'magic motivated reasoning mirror' show"-- I've stopped watching but not trying to learn from reflective people who still are 

So here is an interesting thing to discuss. 

A commenter on the What's to explain? Kulkarni on "knowing disbelief" post made an interesting connection between “knowing disbelief” (KD) and the “asymmetry thesis.”  The occasion for his comment, it’s apparent, was not, or not only, the Kulkarni post but rather something he saw on the show “Paul Krugman & the Magic Motivated Reasoning Mirror,” in every episode of which Krugman looks in the mirror & sees the images of those who disagree with him & never himself.

There are lots of episodes—almost as many as in Breaking Bad or 24.  Consider: 

But the "Krugman's magic motivated reasoning mirror show" is way too boring, too monotonous, too predictable.  

I’ve stopped watching – hence didn’t even bother to say anything about the most recent episode or the one before that.

But the commenter had a really interesting point that wasn't monotonous and that, far from being predictable, is bound up with things that I'm feeling quite uncertain about recently. So I've "promoted" his comment & my response to "full post status" -- & invite others to weigh in.

Mitch:

I think that this discussion skips over what is really interesting here - and which actually can be connected to what Krugman was talking about when he was so derided on this blog.

Let's consider the yellow population on the right-hand side of this chart. As presented here, these are people who are of well-above-average scientific understanding. They are therefore presumably aware of the truly vast array of evidence that supports the proposition that the earth is not 10,000 years old and that today's living creatures are descended from ancestors that were of different species.

Despite this, many in this group answer false to the first question posed (and presumably many also to the question, "True or false, the age of the earth is about 4.5 billion years").

Now this raises the question "Is there any question on which the blue population displays a like disregard of the scientific evidence of which they're aware?"

This question cannot be answered by the sorts of experiments I've seen on this blog. Having read at this point a good number of the posts, what I have seen demonstrated here is that people's minds do work in the same way - and that nobody likes to hear evidence that contradicts their beliefs. However, the question being asked is different - how is this way-of-the-mind playing out in practice by yellow and blue groups on the right-hand side of the chart?

My belief (and evidently Krugman's as well) is that *at the present moment in the US* there in fact is no symmetry. These two groups believe quite differently - one generally aligning with the scientific consensus and the other not.

I think this is a pretty reasonable question, not worthy of derision.


Me: 

@Mitch:

A. I agree the question -- of asymmetry -- is not worthy of derision. Derision, though, is worthy of derision, particularly when it assumes an answer to the question & evinces a stubborn refusal to engage with contrary evidence. There are many who subscribe to the "asymmetry thesis" who are serious and open-minded people just trying to figure things out. Krugman isn't in that category. He is an illiberal zealot & an embarrassment to critically reasoning people of all cultural & political outlooks.

B. The point you raise is for sure getting at what is "interesting here" more than most of the other comments on this & related posts. Thanks for pointing out the KD/asymmetry connection.... (But note that it would actually be a mistake to conflate NSF "human evolution disbelievers" w/ "young earth creationists"--the latter make up only a subset of the former.)

C. I admit (as I have plainly stated) that I find the relationship between KD & cultural cognition & like mechanisms unclear & even disorienting & unsettling. But I think conflating the whole lot would be a huge error. There are many forms of cultural cognition that don't reflect KD. It's also not clear -- to me at least -- that KD necessarily aggravates the pernicious aspects of cultural cognition. As in the case of the Pakistani Dr -- & the SE Floridians who don't believe in climate change but who use evidence of it for collective decisionmaking -- my hunch is that it is a resource that can be used to counteract illiberal forms of status competition that prevent diverse democratic citizens from converging on valid decision-relevant science.  Rather than extracting empty, ritualistic statements of "belief in" one or the other side's tribal symbol, the point of collective exchange should be to enable acquisition and use of genuine knowledge. It works in the classroom for teaching evolution, so why not use the same sort of approach in the town hall meeting (start there; work your way up) to get something done on climate? Take a look at The Measurement Problem & you'll see where I'm coming from. And if you see where it would make more sense for me to go instead, I'm all ears.

D. But while waiting for anything more you might say on this, let me try to put KD aside -- as I have indicated, I am using the "compartmentalization" strategy for now --& come back to Kruggie's "asymmetry thesis challenge" (made in the last episode of "PK's Magic Motivated Reasoning Mirror" that I bothered watching).

Krugman asks "what is the liberal equivalent of climate change for conservatives?"

Well, what does he mean exactly? If he means an example of an issue in which critical engagement with evidence on a consequential issue is being distorted by cultural cognition, the answer is ... climate change.

Just as there's abundant evidence that most of those who say they "believe in" evolution don't understand natural selection, random mutation & genetic variance (the elements of the modern synthesis in evolutionary science), the vast majority of those who say they "believe in" global warming don't genuinely get the most basic mechanisms of climate change (same for "nonbelievers" in both cases-- correlations between believing & understanding the evidence are zero).

It's actually okay to accept what one can't understand: in order to live well-- or just live--people need to accept as known by science much  more than they have time or capacity to comprehend! To make use of science, people use a rational faculty exquisitely calibrated to discerning who actually knows what science knows & who is full of shit.

But here's what's not okay: there's abundant evidence that those on both sides of the climate debate-- "believer" as well as "nonbeliever"--are now using their "what does science know" recognition faculty in a biased way that fits all evidence to their cultural predispositions.

That means that we have a real problem in our science communication environment--one that everyone regardless of cultural outlook has a stake in fixing.

So maybe you can see why I think it is very noxious—a sign of lack of civic virtue as well as critical reasoning ability-- to keep insisting that a conflict like climate change is a consequence of one side being "stupid" or "unreasoning" when it can be shown that both sides are processing information in the same way?  And why doing that is stupid & illiberal, and actually makes things worse by reinforcing the signals of cultural conflict that are themselves poisoning our "who knows what science knows" reasoning faculty?

Do you think I'm missing something here?

 

Thursday
Aug282014

"Is politically motivated reasoning rational?" A fragment ...

From something in the works ...

My goal in this paper is to survey existing evidence on the mechanisms of culturally motivated reasoning (CMR) and assess what that evidence implies about the relationship between CMR and rational decisionmaking.

CMR refers to the tendency of individuals to selectively credit diverse forms of information—from logical arguments to empirical data to credibility assessments to their own sensory impressions—in patterns that reflect their cultural predispositions. CMR is conventionally attributed to  over-reliance on heuristic or System 1 information processing. Like other manifestations of bounded rationality, CMR is understood to interfere with individuals’ capacity to identify and pursue courses of action suited to attainment of their personal well-being (e.g., Lodge & Taber 2013; Weber & Stern 2011; Lilienfeld, Ammirati, Landfield 2009; Sunstein 2007).

I will challenge this picture of CMR.  Numerous studies using a variety of observational and experimental designs suggest that the influence of CMR is not in fact limited to heuristic information processing.  On the contrary, these studies find that in disputes displaying pervasive CMR—for example, over the reality and consequences of global warming—individuals opportunistically employ conscious, effortful forms of information processing, reliably deciphering complicated information supportive of their predispositions and explaining away the rest.  As a result, individuals of the highest levels of science comprehension, numeracy, cognitive reflection, and other capacities identified with rational decisionmaking exhibit the greatest degree of cultural polarization on contested empirical issues (Kahan in press; Kahan, Peters, Dawson & Slovic 2013; Kahan 2013; Kahan, Peters, et al. 2012). 

Because CMR is in fact accentuated by use of the System 2 reasoning proficiencies most closely identified with rational decisionmaking, it is not plausible, as a descriptive matter, to view CMR as a product of bounded rationality.

For the same reason, it is unsatisfying to treat decisionmaking characterized by CMR as unsuited to attainment of individual ends. The compatibility of any form of information processing with instrumental rationality cannot be assessed without a defensible account of the goals an actor is seeking to achieve by engaging with information in a particular setting. To be sure, CMR is not a form of information processing conducive to maximizing accurate beliefs.  But the relationship between CMR and the forms of cognition most reliably calibrated to using information to rationally pursue one’s ends furnishes strong reason to doubt that maximizing accuracy of belief is the goal individuals should be understood to be pursuing in settings that bear the signature of pervasive CMR.

One way to make sense of the nexus between CMR and system 2 information processing, I will argue, is to see CMR as a form of reasoning suited to promoting the stake individuals have in protecting their connection to, and status within, important affinity groups.  Enjoyment of the sense of partisan identification that belonging to such groups supplies can be viewed as an end to which individuals attach value for its own sake.  But a person’s membership and good standing in such a group also confers numerous other valued benefits, including access to materially rewarding forms of social exchange (Akerlof & Kranton 2000). Thus, under conditions in which positions on societal risks and other disputed facts become commonly identified with membership in and loyalty to such groups, it will promote individuals’ ends to credibly convey (by accurately conveying (Frank 1988)) to others that they hold the beliefs associated with their identity-defining affinity groups. CMR is a form of information processing suited to attaining that purpose.

Individuals acquire this benefit at the expense of less accurate perceptions of societal risk. But holding less accurate beliefs on these issues does not diminish any individual's personal well-being. Nothing any ordinary member of the public does--as consumer, as voter, as public discussant--can have any material impact on climate change or a like societal risk.  Accordingly, no mistake he makes based on inaccurate perceptions of the facts can affect the level of risk faced by himself or anyone else he cares about. If there is a conflict between using his reasoning capacity to form truth-convergent beliefs and using it to form identity-convergent ones, it is perfectly rational for him to use it for the latter.
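
To make the arithmetic behind that claim concrete, here is a minimal sketch in Python. All of the payoff numbers are hypothetical, invented purely for illustration: the expected material value of holding an accurate belief about a societal risk is the stake discounted by the vanishingly small probability that any one person's conduct affects the outcome, while the expressive value of holding the identity-congruent belief accrues with certainty.

```python
# A minimal sketch (hypothetical numbers) of the expressive-rationality point above:
# the material payoff of an accurate belief about a societal risk is discounted by
# the probability that one person's own conduct changes the outcome, while the
# payoff of an identity-congruent belief accrues with certainty.

def expected_payoff(accurate_belief: bool,
                    identity_congruent: bool,
                    stake_in_outcome: float = 1_000_000.0,  # hypothetical value of avoiding the harm
                    p_pivotal: float = 1e-9,                # chance one person's conduct matters
                    identity_benefit: float = 100.0) -> float:  # hypothetical value of good standing
    material = stake_in_outcome * p_pivotal if accurate_belief else 0.0
    expressive = identity_benefit if identity_congruent else 0.0
    return material + expressive

# Suppose the accurate belief conflicts with the position identified with one's group:
print(expected_payoff(accurate_belief=True, identity_congruent=False))   # 0.001
print(expected_payoff(accurate_belief=False, identity_congruent=True))   # 100.0
```

The particular figures are beside the point; what matters is the structure: unless one's own conduct is pivotal with implausibly high probability, the identity term dominates.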

This account of the individual rationality of CMR, however, does not imply that this form of reasoning is socially desirable from an economic standpoint. It is reasonable to assume that accurate popular perceptions of risk and related facts will often display the features of a meta-collective good: particularly in a democratic form of government, reliable governmental action to secure myriad particular collective goods will depend on popular recognition of the best available evidence on the shared dangers and opportunities that a society confronts (Hardin 2009).  On an issue characterized by pervasive CMR, however, the members of diverse cultural groups will not converge on the best available evidence, or will not do so as quickly as they should, to secure their common interests (Kahan 2013).  Still, this threat to their well-being will not in itself alter the array of incentives that make it rational for individuals to cultivate and display a reasoning style that features CMR (Hillman 2010). Only some exogenous change in the association between positions on disputed facts and membership in identity-defining affinity groups can do that.

This conceptual framing of the tragedy of the science communications commons, the paper will suggest, is the principal contribution that economics can make to ongoing research on CMR.

 Refs

Akerlof, G. A., & Kranton, R. E. (2000). Economics and identity. The Quarterly Journal of Economics, 115(3), 715-753.

Frank, R. H. (1988). Passions within reason : the strategic role of the emotions. New York: Norton.

Hardin, R. (2009). How do you know? : the economics of ordinary knowledge. Princeton: Princeton University Press.

Hillman, A. L. (2010). Expressive behavior in economics and politics. European Journal of Political Economy, 26(4), 403-418.

Kahan, D. M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L. L., Braman, D., & Mandel, G. (2012). The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change, 2, 732-735.

Kahan, D.M.  (2013). Ideology, Motivated Reasoning, and Cognitive Reflection. Judgment and Decision Making 8, 407-424.

Kahan, D.M. (in press). Climate science communication and the Measurement Problem. Advances in Pol. Psych. 

Kahan, D.M., Peters, E., Dawson, E. & Slovic, P.  (2013). Motivated Numeracy and Enlightened Self Government. Cultural Cognition Project Working Paper No. 116.

Lilienfeld, S. O., Ammirati, R., & Landfield, K. (2009). Giving Debiasing Away: Can Psychological Research on Correcting Cognitive Errors Promote Human Welfare? Perspectives on Psychological Science, 4(4), 390-398. doi: 10.1111/j.1745-6924.2009.01144.x

Lodge, M., & Taber, C. S. (2013). The rationalizing voter. Cambridge ; New York: Cambridge University Press.

Sunstein, C. R. (2007). On the Divergent American Reactions to Terrorism and Climate Change. Columbia Law Review, 107, 503-557.

Weber, E. U., & Stern, P. C. (2011). Public Understanding of Climate Change in the United States. Am. Psychologist, 66, 315-328. doi: 10.1037/a0023253

Wednesday
Aug272014

What's to explain? Kulkarni on "knowing disbelief"

As always, the investment in asking others for help in dispelling my confusion is paying off.

As the 15.5 billion regular readers of this blog (we’re up 1.5 billion with migration of subscribers from the recent cessation of posts in Russell Johnson’s theprofessor.com) know, I’ve been trying to get a handle on a phenomenon that I’m calling—for now & for lack of a better term—“knowing disbelief” (KD).

I've gotten various helpful tips in comments to the original blog post & on a follow up, which itself featured some reflections by Steve Lewandowsky.

This time the help comes from Prajwal Kulkarni, a physicist who authors the reflective and provocative blog, “Do I need evolution?” 

I’ll tell you what he said, and what I have to say about what he said.  But first a bit of background – which, if you have seen all the relevant previous episodes, you can efficiently skip by scrolling down to the bolded red text.

1. KD consists in (a) comprehension of and assent to a set of propositions that (b) appear to entail a proposition one professes not to “believe.”

“What is going on in their heads?” (WIGOITH) is the shorthand I’m using to refer to my interest in forming a working understanding (a cogent set of plausible mechanisms that are either supported by existing evidence or admit of empirical testing) for KD.

In that spirit, I formulated a provisional taxonomy consisting of four species of KD: 

  1. FYATHYRIO (“fuck you & the horse you rode in on”), in which the agent (the subject of KD) merely feigns belief in a proposition she knows is not true for the sake of expressing an attitude, perhaps contempt or hostility to members of an opposing cultural group, the recognition of which actually depends on others recognizing that the agent doesn’t really believe it (“Obama was born in Kenya!”);

  2. compartmentalization, in which a belief, or a cluster of beliefs and evaluations (“same-sex relationships enrich my life”), and denial of the same (“homosexuality is a sin”) are both affirmed by the agent, who effortfully cordons them off through behavioral and mental habits that confine their appearance in consciousness to the discrete occasions in which he occupies unintegrated, hostile identities—a form of dissonance avoidance;

  3. partitioning, in which knowledge and styles of reasoning appropriate to the use of it are effectively indexed with situational triggers that automatically summon them to consciousness, creating the risk that the agent will “disbelieve” what she “knows” if an occasion for making use of that knowledge is not accompanied by the triggering condition (think of the expert who doesn’t recognize a problem as being of the type that demands her technical or specialized understanding); and

  4. dualism, in which the agent simultaneously “rejects” and “accepts” some proposition or set of propositions that admittedly have the same state-of-affairs referent but that constitute distinct mental objects individuated by reference to the uses he makes of them in occupying integrated identities, a task he performs without the experience of either “mistake” or “error” (a signature of the kind of bias distinctive of partitioning) or dissonance (the occasion for compartmentalization).

2. I am most interested in dualism for two reasons.  The first is that I think it is the most plausible candidate explanation for the sort of KD that I believe explains the results in the Measurement Problem (Kahan in press), which reports on a study that found that climate change “believers” and climate change “skeptics” achieve equivalent scores on a “climate science comprehension” assessment test and yet, as indicated, form opposing “beliefs” about the existence of human-caused global warming (indeed, about the existence of global warming regardless of cause).  Indeed, I believe I actually encounter dualism all the time when I observe how diverse citizens who are polarized in their "beliefs in" global warming use climate science that presupposes human-caused global warming when they make practical decisions.

The second is that I feel it is the member of the taxonomy of the psychological mechanisms that I least understand. It doesn’t answer the WIGOITH question but rather puts it for me in emphatic terms.

3. Here is where Prajwal Kulkarni helps me out.

As I adverted to, Kulkarni’s interest is in public opinion on evolution.  He has insights on KD because that’s another area in which we see KD.

Indeed, KD with respect to evolution supplies the prototype for the “dualism” variant of KD.

As I’ve discussed 439 separate times on this blog, there is zero correlation between “belief in” evolution and the most rudimentary comprehension of the mechanisms of it as represented in the dominant, “modern synthesis” account in evolutionary science.  “Disbelievers” are as likely to comprehend natural selection, random mutation, and genetic variance (and not comprehend them; most on both “sides” of the issue don’t) as “believers.” 

Nor is there any connection between “belief in” evolution and science comprehension generally.

What’s more, “disbelief” is no impediment to learning evolutionary theory. Good teachers can teach smart “disbelieving” kids as readily as they can smart “believing” ones—but doing so doesn’t transform the former into the latter (Lawson & Worsnop 1992).

Indeed, “knowing disbelievers” of evolution can use what they know about the natural history of human beings.  This is the insight (for all of those who, like me I suppose, would otherwise be too obtuse just to notice this in everyday life) of Everhart and Hameed (2013) and Hameed (2014), who document that medical doctors from Islamic cultures simultaneously “reject” evolution “at home,” when they are occupying their identity as members of a religious community, and “accept” it “at work,” when they are occupying their identity—doing their jobs—as professionals.

They are displaying the “dualism” variant of KD.

In response to my admission that they are the occasion for WIGOITH on my part, Kulkarni asks whether I and others who experience WIGOITH are just too hung up on consistency:

I wonder if the problem is that Kahan thinks such people need to be explained in the first place. But why should people be consistent? Why even have that expectation? As Kahan himself notes, even scientists sometimes exhibit cognitive dissonance.

Perhaps we should start from the premise that everyone is intellectually inconsistent at times. Knowing disbelievers should no more need a “satisfying understanding” than amazing basketball players who can’t shoot free-throws. In sports we accept that athletic ability is complicated and can manifest itself in all sorts of unpredictable ways. No one feels the need to explain it because that’s just the way it is. Why don’t we do the same for intellectual ability?

If we did, we might then conduct research to account for the handful of people who are consistent all the time. Because that’s the behavior that needs explaining.

This is a very fair question/criticism!

Or at least it is to the extent that it points out that what motivates WIGOITH generally—in all instances in which we encounter KD—is an expectation of consistency in beliefs and like intentional states. 

Descriptively, we assume that the agent who harbors inconsistent beliefs is experiencing a kind of cognitive misadventure.  If she refuses to recognize the inconsistency or consciously persists in it, we likely will view her as irrational, a characterization that is as much normative—a person ought to hold consistent beliefs—as descriptive.

Maybe that stance is unjustified (Foley 1979).  In any case, it is rarely openly interrogated and as a result might be blinding us to how living with contradiction coheres with actions and ways of life that we would recognize as perfectly sensible for someone to pursue (although I think if we came to that view, we’d definitely still not think that contradictory beliefs are the “norm”—on the contrary, we’d still likely view them as a recurring source of misadventure and error and possibly mental pathology).

Still, I don’t think any such expectation or demand for “consistency” is what’s puzzling me about dualism!

The reason is that I don’t think there necessarily is any contradiction in the beliefs and related intentional states of the dualist.  For the Pakistani Dr., “the theory of evolution” he “rejects” and the “theory of evolution” he “accepts” are “entirely different things.” 

They appear the same to us, as (obtuse?) observers, because we insist on defining his beliefs with reference solely to their state-of-affairs referents (here, the theory of human beings’ natural history that originates in the work of Darwin and culminates in the modern synthesis).

But as objects in the Dr’s inventory of beliefs, attitudes, and appraisals—as objects of reasoning that figure in his competent negotiation of the situations that confront him in one or another sphere of life—they are distinct.

Perhaps, to borrow a bit from the partitioning view, the objects are “indexed” with reference to the situational triggers that correspond to his identity “at home” as an individual with a religious identity and to his identity “at work” as a medical professional. 

But unlike the expert who as a result of partitioning fails to access the knowledge (or know-how) that she herself understands to be requisite to some task (perhaps responding to a brush fire (Lewandowsky & Kirsner 2000)), the Dr doesn’t feel he has “made a mistake” when it is brought to his attention that he has “rejected” a proposition that he also “accepts.”  He says, in effect, that you have made a serious mistake in thinking what he rejects and accepts are the same thing just because they have the same state-of-affairs referent. 

I am wondering if he is right. 

Is there a cogent account of the psychology of KD under which we can understand the mental objects of the “theory of evolution” that the Dr “rejects” and the “theory of evolution” that he “accepts” to be distinct because they are properly individuated with reference to the roles they play in his negotiating of the integrated set of identities (integrated as opposed to segregated, as in the case of the dissonance-experiencing, compartmentalizing, closeted gay man)?

If so, what is it?

Once we understand it, we can then decide what to make of this way of organizing the contents of one’s mind—whether we think it is “rational” or “irrational,” a cognitive ability that contributes to being able to live a good life or a constraining form of self-delusion & so forth.

I am grateful to Kulkarni for helping me to get clearer on this in my own thinking.

But I wonder now if he doesn’t agree that there is something very much worth explaining here.

Refs

Everhart, D. & Hameed, S. Muslims and evolution: a study of Pakistani physicians in the United States. Evo. Edu. Outreach 6, 1-8 (2013).

Foley, R. Justified inconsistent beliefs. American Philosophical Quarterly, 247-257 (1979).

Hameed, S. Making sense of Islamic creationism in Europe. Unpublished manuscript (2014).

Kahan, D. M. Climate Science Communication and the Measurement Problem, Advances in Pol. Psych. (in press).

Lawson, A.E. & Worsnop, W.A. Learning about evolution and rejecting a belief in special creation: Effects of reflective reasoning skill, prior knowledge, prior belief and religious commitment. Journal of Research in Science Teaching 29, 143-166 (1992).

Lewandowsky, S. & Kirsner, K. Knowledge partitioning: Context-dependent use of expertise. Memory & Cognition 28, 295-305 (2000).

Tuesday
Aug262014

Democracy & the science communication environment (video lecture)

Posted synopsis & slides a while back, but for anyone who wants to watch the event (Cardiff Univ., Feb. 2014), here you go!

 

Monday
Aug252014

Lewandowsky on "knowing disbelief"

So my obsession with the WIGOITH (“What is going on in their heads”) question hasn’t abated since last week. 

The question is put, essentially, by the phenomenon of “knowing disbelief.” This, anyway, is one short-hand I can think of for describing the situation of someone who, on the one hand, displays a working comprehension of and assent to some body of evidence-based propositions about how the world works but who simultaneously, on the other, expresses-- and indeed demonstrates in consequential and meaningful social engagements-- disbelief in that same body of propositions.

One can imagine a number of recognizable but discrete orientations that meet this basic description. 

I offered a provisional taxonomy in an earlier post

  • “Fuck you & the horse you rode in on” (FYATHYRIO), in which disbelief is feigned & expressed only for the sake of evincing an attitude of hostility or antagonism (“Obama was born in Kenya!”); 
  • compartmentalization, which involves a kind of mental and behavioral cordoning off of recognized contradictory beliefs or attitudes as a dissonance-avoidance strategy (think of the passing or closeted gay person inside of an anti-gay religious community);
  • partitioning, which describes the mental indexing of a distinctive form of knowledge or mode of reasoning (typically associated with expertise) via a set of situational cues, the absence of which blocks an agent’s reliable apprehension of what she “knows” in that sense; and
  • dualism, in which the propositions that the agent simultaneously “accepts” and “rejects” comprise distinct mental objects, ones that are identified not by the single body of knowledge that is their common referent but by the distinct uses the agent makes of them in inhabiting social roles that are not themselves antagonistic but simply distinct

The last of these is the one that intrigues me most. The paradigm is the Muslim physician described by Everhart & Hameed (2013): the “theory of evolution” he rejects  “at home” to express his religious identity is “an entirely different thing” from the “theory of evolution” he accepts and indeed makes use of “at work” in performing his medical specialty and in being a doctor.

But the motivation for trying to make sense of the broader phenomenon—of “knowing disbelief,” let’s call it—comes from the results  of the “climate science literacy” test—the “Ordinary climate science intelligence” assessment—described in the Measurement Problem (Kahan, in press).

Administered to a representative national sample, the OCSI assessment showed, unsurprisingly, that the vast majority of global-warming “believers” and “skeptics” alike have a painfully weak grasp of the mechanisms and consequences of human-caused climate change.

But the jolting (to me) part was the finding that the respondents who scored the highest on OCSI—the ones who had the highest degree of climate-science comprehension (and of general science comprehension, too)—were still culturally polarized in their “belief in” climate change.  Indeed, they were more polarized on whether human activity is causing global warming than were the (still very divided) low-scoring OCSI respondents.

What to make of this?

I asked this question in my previous blog post.  There were definitely a few interesting responses but, as in previous instances in which I’ve asked for help in trying to make sense of something that ought to be as intriguing and puzzling to “skeptics” as it is to “believers,” discussion in the comment section for the most part reflected the inability of those who think a lot about the “merits” of the evidence on climate change to think about anything else (or even see when it is that someone is talking about something else).

But here is something responsive. It came via email correspondence from Stephen Lewandowsky, who has done interesting work on “partitioning” (e.g., Lewandowsky & Kirsner 2000), not to mention public opinion on climate change:

1. FYATHYRIO. I think this may well apply to some people. I enclose an article [Wood, M.J., Douglas, K.M. & Sutton, R.M. Dead and Alive Beliefs in Contradictory Conspiracy Theories. Social Psychological and Personality Science 3, 767-773 (2012)] that sort of speaks to this issue, namely that people can hold mutually contradictory beliefs that are integrated only at some higher level of abstraction—in this instance, that higher level of abstraction is “fuck you” and nothing below that matters in isolation or with respect to the facts.

2. Compartmentalization. What I like about this idea is that it provides at least a tacit link to the toxic emotions that any kind of challenge will elicit from those people.

3. Partitioning. I think as a cognitive mechanism, it probably explains a lot of what’s going on, but it doesn’t provide a handle on the emotions.

4. Dualism. Neat idea, I think there may be something to that. The analogy of the Muslim physician works well, and those people clearly exist. Where it falls down is because the people engaging in dualism usually have some tacit understanding of that and can even articulate the duality. Indeed, the duality allows you to accept the scientific evidence (as your Muslim Dr hypothetically-speaking does) because it doesn’t impinge on the other belief system (religion) that one holds dear.

So what do I think? I am not sure but I can offer a few suggestions: First, I am not surprised by any sort of apparent contradiction because my work on partitioning shows that people are quite capable of some fairly deep contradictory behaviors—and that they are oblivious to it. Second, I think that different things go on inside different heads, so that some people engage in FYATHYRIO whereas others engage in duality and so on. Third, I consider people’s response to being challenged a key ingredient of trying to figure out what’s going on inside their heads. And I think that’s where the toxic emotion and frothing-at-the-mouth of people like Limbaugh and his ilk come in. I find those responses highly diagnostic and I can only explain them in two ways: Either they feel so threatened by [the mitigation of] climate change that nothing else matters to them, or they know that they are wrong and hate being called out on it—which fits right in with what we know about compartmentalization. I would love to get at this using something like an IAT

Anyhow, just my 2c worth for now..

I do find this interesting and helpful. 

But as I responded to Steve, I don't think "partitioning," which describes a kind of cognitive bias or misfire related to accessing expert knowledge, is a very likely explanation for the psychology of the "knowing disbelievers" I am interested in.

The experts who display the sort of conflict between "knowing" and "disbelieving" that Steve observes in his partitioning studies would, when the result is pointed out to them, likely view themselves as having made a mistake. I don't think that's how the high-scoring OCSI "knowing disbelievers" would see their own sets of beliefs.

And for sure, Steve's picture of the “frothing-at-the-mouth” zealot is not capturing what I'm interested in either.

He or she is a real type--and has a counterpart, too, on the "believer" side: contempt-filled and reason-free expressive zealotry is as ideologically symmetric as any other aspect of motivated reasoning.

But the “knowing disbeliever” I have in mind isn’t particularly agitated by any apparent conflict or contradiction in his or her states of belief about the science on climate change, and feels no particular compulsion to get in a fight with anyone about it.

This individual just wants to be who he or she is and make use of what is collectively known to live and live well as a free and reasoning person.

Not having a satisfying understanding of how this person thinks makes me anxious that I'm missing something very important.   

References

Everhart, D. & Hameed, S. Muslims and evolution: a study of Pakistani physicians in the United States. Evo. Edu. Outreach 6, 1-8 (2013).

Hameed, S. Making sense of Islamic creationism in Europe. Unpublished manuscript (2014).

Kahan, D. M. Climate Science Communication and the Measurement Problem. Advances in Pol. Psych. (in press).

Lewandowsky, S. & Kirsner, K. Knowledge partitioning: Context-dependent use of expertise. Memory & Cognition 28, 295-305 (2000).

 

Sunday
Aug242014

Weekend update: "Knowing disbelief in evolution"-- a fragment

Covers familiar ground for the 14.6 billion regular readers of this blog, but for the benefit of the 2 or so billion nonregulars who tune in on a given day here is a portion of the Measurement Problem paper exposing the invalidity of the NSF Science Indicators' "evolution" measure.  What is obsessing and confounding me -- as I indicated in the recent "What exactly is going on in their heads?" post--is how to understand and make sense of the perspective of the "knowing disbeliever": in that context, the individual who displays high comprehension of the mechanisms and consequences of human-caused climate change but "disbelieves it"; here, the bright student who (unlike the vast majority of people who say they "believe in" evolution) displays comprehension of the modern synthesis, and who might well go on to be a scientist or other professional who uses such knowledge, but who nevertheless "disbelieves" evolution. . . .

2.  What does “belief in evolution” measure?

But forget climate change for a moment and consider instead another controversial part of science: the theory of evolution. Around once a year, Gallup or another major commercial survey firm releases a poll showing that approximately 45% of the U.S. public rejects the proposition that human beings evolved from another species of animal. The news is inevitably greeted by widespread expressions of dismay from media commentators, who lament what this finding says about the state of science education in our country.

Actually, it doesn’t say anything. There are many ways to assess the quality of instruction that U.S. students receive in science.  But what fraction of them say they “believe” in evolution is not one of them.

Numerous studies have found that profession of “belief” in evolution has no correlation with understanding of basic evolutionary science. Individuals who say they “believe” are no more likely than those who say they “don’t” to give the correct responses to questions pertaining to natural selection, random mutation, and genetic variance—the core elements of the modern synthesis (Shtulman 2006; Demastes, Settlage & Good 1995; Bishop & Anderson 1990).

Nor can any valid inference be drawn from a U.S. survey respondent's profession of “belief” in human evolution to his or her comprehension of science generally.  The former is not a measure of the latter.

To demonstrate this point requires a measure of science comprehension.  Since Dewey (1910), general education has been understood to have the aim of imparting the capacity to recognize and use pertinent scientific information in ordinary decisionmaking—personal, professional, and civic (Baron 1993).  Someone who attains this form of “ordinary science intelligence” will no doubt have acquired knowledge of a variety of important scientific findings.  But to expand and use what she knows, she will also have to possess certain qualities of mind: critical reasoning skills essential to drawing valid inferences from evidence; a faculty of cognitive perception calibrated to discerning when a problem demands such reasoning; and the intrinsic motivation to perform the effortful information processing such analytical tasks entail (Stanovich 2011).

The aim of a valid science comprehension instrument is to measure these attributes.  Rather than certifying familiarity with some canonical set of facts or abstract principles, we want satisfactory performance on the instrument to vouch for an aptitude comprising the “ordinary science intelligence” combination of knowledge, skills, and dispositions.

Such an instrument can be constructed by synthesizing items from standard “science literacy” and critical reasoning measures (cf. Kahan, Peters et al. 2012). These include the National Science Foundation’s Science Indicators (2014) and Pew Research Center’s “Science and Technology” battery (2013), both of which emphasize knowledge of core scientific propositions from the physical and biological sciences; the Lipkus/Peters Numeracy scale, which assesses quantitative reasoning proficiency (Lipkus et al. 2001; Peters et al. 2006; Weller et al. 2012); and Frederick’s Cognitive Reflection Test, which measures the disposition to consciously interrogate intuitive or pre-existing beliefs in light of available information (Frederick 2005; Kahneman 1998).

The resulting 18-item “Ordinary Science Intelligence” scale is highly reliable (α = 0.83) and displays a unidimensional covariance structure when administered to a representative general population sample (N = 2000).[1] Scored with Item Response Theory to enhance its discrimination across the range of the underlying latent (not directly observable) aptitude that it can be viewed as measuring, OSI strongly predicts proficiency on tasks such as covariance detection, a form of reasoning elemental to properly drawing causal inferences from data (Stanovich 2009).  It also correlates (r = 0.40, p < 0.01) with Baron’s Actively Open-minded Thinking test, which measures a person’s commitment to applying her analytical capacities to find and properly interpret evidence (Haran, Ritov & Mellers 2013; Baron 2008).
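
For readers who want to see what the reliability coefficient reported here actually summarizes, the following is a minimal sketch, on simulated data rather than the study's, of how Cronbach's α is computed from a matrix of scored item responses driven by a single latent aptitude.

```python
# Sketch: computing Cronbach's alpha from an (n_respondents x n_items) matrix of
# scored item responses. The data are simulated to mimic a unidimensional scale;
# nothing here uses the actual OSI items or estimates.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]                                # number of items
    item_variances = scores.var(axis=0, ddof=1).sum()  # sum of per-item variances
    total_variance = scores.sum(axis=1).var(ddof=1)    # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_variances / total_variance)

rng = np.random.default_rng(0)
theta = rng.standard_normal((2000, 1))                 # one latent aptitude per respondent
difficulty = np.linspace(-1.5, 2.0, 18)                # 18 items spanning a range of difficulty
p_correct = 1 / (1 + np.exp(-(1.5 * theta - difficulty)))
responses = rng.binomial(1, p_correct)                 # 0/1 item scores

print(f"alpha = {cronbach_alpha(responses):.2f}")      # reliability of the simulated scale
```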

 Consistent with the goal of discerning differing levels of this proficiency (Embretson & Reise 2000), OSI contains items that span a broad range in difficulty.  For example, the NSF Indicator Item “Electrons”—“Electrons are smaller than atoms—true or false?”—is comparatively easy (Figure 1). Even at the mean level of science comprehension, test-takers from a general population sample are approximately 70% likely to get the “right” answer.  Only someone a full standard deviation below the mean is more likely than not to get it wrong.

“Nitrogen,” the Pew multiple choice item on which gas is most prevalent in the atmosphere, is relatively difficult (Figure 1).  Someone with a mean OSI score is only about 20% likely to give the correct response. A test taker has to possess an OSI aptitude one standard deviation above the mean before he or she is more likely than not to supply the correct response.
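
To make the item-response idea concrete, here is a short sketch of a standard two-parameter logistic (2PL) item response function. The discrimination and difficulty values are illustrative guesses chosen only so that the curves roughly reproduce the probabilities just described (about 70% correct at the mean for Electrons, about 20% for Nitrogen); they are not the estimates from the OSI analysis.

```python
import math

def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic IRT model: probability of a correct response given
    latent aptitude theta (in SD units), discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Illustrative parameter guesses (not the paper's estimates).
items = {
    "Electrons (easy)": {"a": 1.5, "b": -0.6},  # ~70% correct at the mean
    "Nitrogen (hard)":  {"a": 1.4, "b": 1.0},   # ~20% correct at the mean
}

for theta in (-1.0, 0.0, 1.0, 2.0):
    row = ", ".join(f"{name}: {p_correct_2pl(theta, **ps):.2f}" for name, ps in items.items())
    print(f"theta = {theta:+.1f}  ->  {row}")
```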

 “Conditional Probability” is a Numeracy battery item (Weller et al. 2012). It requires a test-taker to determine the probability that a woman who is selected randomly from the population and who tests positive for breast cancer in fact has the disease; to do so, the test-taker must appropriately combine information about the population frequency of breast cancer with information about the accuracy rate of the screening test. A problem that assesses facility in drawing the sort of inferences reflecting the logic of Bayes’s Theorem, Conditional Probability turns out to be super hard. At the mean level of OSI, there is virtually no chance a person will get this one right.  Even those over two standard deviations above the mean are still no more likely to get it right than to get it wrong (Figure 1).  
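
For anyone who wants to see why Conditional Probability is so hard, here is the Bayesian calculation the item calls for, worked through in a short snippet. The figures are hypothetical stand-ins chosen for illustration; they are not the numbers used in the actual Weller et al. (2012) item.

```python
# Worked example of the kind of calculation the "Conditional Probability" item
# demands. All figures below are hypothetical, for illustration only.
prevalence = 0.01       # base rate: share of screened women who have the disease
sensitivity = 0.80      # P(test positive | disease)
false_positive = 0.10   # P(test positive | no disease)

p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
p_disease_given_positive = (sensitivity * prevalence) / p_positive  # Bayes' theorem

print(f"P(disease | positive test) = {p_disease_given_positive:.3f}")  # ~0.075
```

The intuitive but wrong answer is something close to the test's accuracy; the correct answer is dragged down by the low base rate, which is exactly the move most test-takers fail to make.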


With this form of item response analysis (Embretson & Reise 2000), we can do two things. One is to identify invalid items—ones that don’t genuinely measure the underlying disposition in an acceptably discerning manner. We’ll recognize an invalid item if the probability of answering it correctly doesn’t bear the sort of relationship with OSI that valid items do.

The NSF Indicator’s “Evolution” item—“human beings, as we know them today, developed from earlier species of animals, true or false?”—is pretty marginal in that regard. People who vary in science comprehension, we’ve seen, vary correspondingly in their ability to answer questions that pertain to their capacity to recognize and give effect to valid empirical evidence. The probability of getting the answer “right” on “Evolution,” in contrast, varies relatively little across the range of OSI (Figure 1). In addition, the probability of getting the right answer is relatively close to 50% at both one standard deviation below and one standard deviation above the OSI mean, as well as at every point in between. The relative unresponsiveness of  the item to differences in science comprehension, then, is reason to infer that it is either not measuring anything or is measuring something that is independent of science comprehension.

Second, item-response functions can be used to identify items that are “biased” in relation to a subgroup.  “Bias” in this context is used not in its everyday moral sense, in which it connotes animus, but rather in its measurement sense, where it signifies a systematic skew toward either high or low readings in relation to the quantity being assessed.  If an examination of an item’s response profile shows that it tracks the underlying latent disposition in one group but not in another, then that item is biased in relation to members of the latter group—and thus not a valid measure of the disposition for a test population that includes them (Osterlind & Everson 2009).

That’s clearly true for the NSF’s Evolution item as applied to individuals who are relatively religious.  Such individuals—who we can identify with a latent disposition scale that combines self-reported church attendance, frequency of prayer, and perceived importance of religion in one’s life (α = 0.86)—respond the same as relatively nonreligious ones with respect to Electron, Nitrogen, and Conditional Probability. That is, in both groups, the probability of giving the correct response varies in the same manner with respect to the underlying science comprehension disposition that OSI measures (Figure 2).

Their performance on the Evolution item, however, is clearly discrepant. One might conclude that Evolution is validly measuring science comprehension for non-religious test takers, although in that case it is a very easy question:  the likelihood a nonreligious individual with a mean OSI score will get the “right” answer is 80%—even higher than the likelihood that this person would respond correctly to the relatively simple Electron item.

In contrast, for a relatively religious individual  with a mean OSI score, the probability of giving the correct response is around 30%.  This 50 percentage-point differential tells us that Evolution does not have the same relationship to the latent OSI disposition in these two groups.

Indeed, it is obvious that Evolution has no relation to OSI whatsoever in relatively religious respondents.  For such individuals, the predicted probability of giving the correct answer does not increase as individuals display a higher degree of science comprehension. On the contrary, it trends slightly downward, suggesting that religious individuals highest in OSI are even more likely to get the question “wrong.”

It should be obvious but just to be clear: these patterns have nothing to do with any correlation between OSI and religiosity. There is in fact a modest negative correlation between the two (r = -0.17, p  < 0.01).  But the “differential item function” test (Osterlind & Everson 2009) I’m applying identifies differences among religious and nonreligious individuals of the same OSI level. The difference in performance on the item speaks to the adequacy of Evolution as a measure of knowledge and reasoning capacity and not to the relative quality of those characteristics among members of the two groups.
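
For the methodologically curious: one generic way to implement a differential item functioning check of this kind is a logistic regression of the item response on the latent score, group membership, and their interaction, with a large interaction term signaling DIF. The sketch below runs that procedure on simulated data patterned loosely on the response profiles described above; it illustrates the logic and is not the paper's actual analysis or estimates.

```python
# Generic DIF sketch on simulated data: does the item track the latent score
# differently in the two groups? A significant osi:religious interaction says yes.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
osi = rng.standard_normal(n)           # latent science-comprehension score (SD units)
religious = rng.integers(0, 2, n)      # 1 = relatively religious, 0 = relatively nonreligious

# Simulate an "Evolution"-like item: it tracks OSI for the nonreligious group but is
# flat-to-slightly-negative for the religious group (hypothetical coefficients).
logit_p = np.where(religious == 1, -0.8 - 0.1 * osi, 1.4 + 1.2 * osi)
correct = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

df = pd.DataFrame({"correct": correct, "osi": osi, "religious": religious})
fit = smf.logit("correct ~ osi * religious", data=df).fit(disp=False)
print(fit.summary())                   # look at the osi:religious interaction term
```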

The bias with respect to religious individuals—and hence the invalidity of the item as a measure of OSI for a general population sample—is most striking in relation to respondents’ performance on Conditional Probability. There is about a 70% (± 10 percentage points, at the 0.95 level of confidence) probability that someone two and a quarter standard deviations above the mean on OSI will answer this extremely difficult question correctly. Of course, there aren’t many people two and a quarter standard deviations above the mean (the 99th percentile), but certainly they do exist, and they are not dramatically less likely to be above average in religiosity.  Yet if one of these exceptionally science-comprehending individuals is relatively religious, the probability that he or she will give the right answer to the NSF Evolution item is about 25% (± 10 percentage points, at the 0.95 level of confidence)—compared to 80% for the moderately nonreligious person who is merely average in OSI and whose probability of answering Conditional Probability correctly is epsilon. 

Under these conditions, one would have to possess a very low OSI score (or a very strong unconscious motivation to misinterpret these results (Kahan, Peters, et al. 2013)) to conclude that a “belief in evolution” item like the one in the NSF Indicator battery validly measures science comprehension in a general population test sample.  It is much more plausible to view it as measuring something else: a form of cultural identity that either does or does not feature religiosity (cf. Roos 2012).

One way to corroborate this surmise is to administer to a general population sample a variant of the NSF’s Evolution item designed to disentangle what a person knows about science from who he or she is culturally speaking.  When the clause, “[a]ccording to the theory of evolution  . . .” introduces the proposition “human beings, as we know them today, developed from earlier species of animals” (NSF 2006, 2014), the discrepancy between relatively religious and relatively non-religious test-takers disappears! Freed from having to choose between conveying what they understand to be the position of science and making a profession of “belief” that denigrates their identities, religious test-takers of varying levels of OSI now respond very closely to how nonreligious ones of corresponding OSI levels do. The profile of the item response curve—a positive slope in relation to OSI for both groups—supports the inference that answering this variant of Evolution correctly occupies the same relation to OSI as do the other items in the scale. However, this particular member of the scale turns out to be even easier—even less diagnostic of anything other than a dismally low comprehension level in those who get it wrong—than the simple NSF Indicator Electron item.

As I mentioned, there is no correlation between saying one “believes” in evolution and meaningful comprehension of natural selection and the other elements of the modern synthesis. Sadly, the proportion who can give a cogent and accurate account of these mechanisms is low among both “believers” and “nonbelievers,” even in highly educated samples, including college biology students (Bishop & Anderson 1990).  Increasing the share of the population that comprehends these important—indeed, astonishing and awe-inspiring—scientific insights is very much a proper goal for those who want to improve the science education that Americans receive.

The incidence of “disbelief” in evolution in the U.S. population, moreover, poses no barrier to attaining it. This conclusion, too, has been demonstrated by outstanding empirical research in the field of education science (Lawson & Worsnop 1992).  The most effective way to teach the modern synthesis to high school and college students who “do not believe” in evolution, this research suggests, is to focus on exactly the same thing one should focus on to teach evolutionary science to those who say they do “believe” but very likely don’t understand it: the correction of various naive misconceptions that concern the tendency of people to attribute evolution not to supernatural forces but to functionalist mechanisms and to the hereditability of acquired traits (Demastes, Settlage & Good 1995; Bishop & Anderson 1990).

Not surprisingly, the students most able to master the basic elements of evolutionary science are those who demonstrate the highest proficiency in the sort of critical reasoning dispositions on which science comprehension depends. Yet even among these students, learning the modern synthesis does not make a student who started out professing “not to believe in” evolution any more likely to say she now does “believe in” it (Lawson & Worsnop 1992).

Indeed, treating profession of “belief” as one of the objectives of instruction is thought to make it less likely that students will learn the modern synthesis.  “[E]very teacher who has addressed the issue of special creation and evolution in the classroom,” the authors of one study (Lawson & Worsnop 1992, p. 165) conclude,

already knows that highly religious students are not likely to change their belief in special creation as a consequence of relative brief lessons on evolution. Our suggestion is that it is best not to try to [change students’ beliefs], not directly at least. Rather, our experience and results suggest to us that a more prudent plan would be to utilize instruction time, much as we did, to explore the alternatives, their predicted consequences, and the evidence in a hypothetico-deductive way in an effort to provoke argumentation and the use of reflective thought. Thus, the primary aims of the lesson should not be to convince students of one belief or another, but, instead, to help students (a) gain a better understanding of how scientists compare alternative hypotheses, their predicated consequences, and the evidence to arrive at belief and (b) acquire skill in the use of this important reasoning pattern—a pattern that appears to be necessary for independent learning and critical thought.

This research is to the science of science communication’s “measurement problem” what the double slit experiment is to quantum mechanics’.  All students, including the ones most readily disposed to learn science, can be expected to protect their cultural identities from the threat that denigrating cultural meanings pose to it.  But all such students—all of them—can also be expected to use their reasoning aptitudes to acquire understanding of what is known to science.  They can and will do both—at the very same time.  But only when the dualistic quality of their reason as collective-knowledge acquirers and identity-protectors is not interfered with by forms of assessment that stray from science comprehension and intrude into the domain of cultural identity and expression.  A simple (and simple-minded) test can be expected to force disclosure of only one side of their reason.  And what enables the most exquisitely designed course to succeed in engaging the student’s reason as an acquirer of collective knowledge is exactly the care and skill with which the educator avoids provoking the student into using her reason for purposes of identity-protection only.

 


[1] The items comprising the OSI scale appear in the Appendix. The psychometric performance of the OSI scale is presented in greater detail in Kahan (2014).

Refs

Baron, J. Why Teach Thinking? An Essay. Applied Psychology 42, 191-214 (1993).

Bishop, B.A. & Anderson, C.W. Student conceptions of natural selection and its role in evolution. Journal of Research in Science Teaching 27, 415-427 (1990).

Demastes, S.S., Settlage, J. & Good, R. Students' conceptions of natural selection and its role in evolution: Cases of replication and comparison. Journal of Research in Science Teaching 32, 535-550 (1995).

Dewey, J. Science as Subject-matter and as Method. Science 31, 121-127 (1910).

Embretson, S.E. & Reise, S.P. Item response theory for psychologists (L. Erlbaum Associates, Mahwah, N.J., 2000).

Kahan, D.M. “Ordinary Science Intelligence”: A Science Comprehension Measure for Use in the Study of Risk Perception and Science Communication. Cultural Cognition Project Working Paper No. 112  (2014).

Kahan, D.M., Peters, E., Dawson, E. & Slovic, P. Motivated Numeracy and Enlightened Self Government. Cultural Cognition Project Working Paper No. 116 (2013).

Kahan, D.M., Peters, E., Wittlin, M., Slovic, P., Ouellette, L.L., Braman, D. & Mandel, G. The polarizing impact of science literacy and numeracy on perceived climate change risks. Nature Climate Change 2, 732-735 (2012).

Lawson, A.E. & Worsnop, W.A. Learning about evolution and rejecting a belief in special creation: Effects of reflective reasoning skill, prior knowledge, prior belief and religious commitment. Journal of Research in Science Teaching 29, 143-166 (1992).

Lipkus, I.M., Samsa, G. & Rimer, B.K. General Performance on a Numeracy Scale among Highly Educated Samples. Medical Decision Making 21, 37-44 (2001).

National Science Foundation. Science and Engineering Indicators (Wash. D.C. 2014). 

National Science Foundation. Science and Engineering Indicators (Wash. D.C. 2006). 

Osterlind, S.J. & Everson, H.T. Differential item functioning (SAGE, Thousand Oaks, Calif., 2009). 

Peters, E., Västfjäll, D., Slovic, P., Mertz, C.K., Mazzocco, K. & Dickert, S. Numeracy and Decision Making. Psychol Sci 17, 407-413 (2006).

Pew Research Center for the People & the Press. Public's Knowledge of Science and Technology (Pew Research Center, Washington D.C., 2013).

Roos, J.M. Measuring science or religion? A measurement analysis of the National Science Foundation sponsored science literacy scale 2006–2010. Public Understanding of Science  (2012).

Shuman, H. Interpreting the Poll Results Better. Public Perspective 1, 87-88 (1998).

Stanovich, K.E. What intelligence tests miss : the psychology of rational thought (Yale University Press, New Haven, 2009). 

Weller, J.A., Dieckmann, N.F., Tusler, M., Mertz, C., Burns, W.J. & Peters, E. Development and testing of an abbreviated numeracy scale: A rasch analysis approach. Journal of Behavioral Decision Making 26, 198-212 (2012).

 

Saturday
Aug232014

Weekend update: "Culture is prior to fact" & what that implies about resolving political conflict over risk

The idea that cultural cognition and related dynamics are peculiar to "unsettled" issues, or ones where the scientific evidence is not yet "clearly established," is a recurring theme.  For some reason, the recent "What exactly is going on in their heads?" post has stimulated many commentators -- in the discussion thread & in correspondence -- to advance this claim.  In fact, that view is at odds with the central tenet of cultural cognition as a research program.

The cultural cognition thesis asserts that "culture is prior to fact" in a cognitive sense: the capacity of individuals to recognize the validity of evidence on risks and like policy-relevant facts depends on cognitive faculties that themselves are oriented by cultural affiliations. Because cultural norms and practices certify that evidence has the qualities that entitle it to being credited consistent with science's criteria for valid proof, ordinary members of the public won't be able to recognize that scientific evidence is "clear" or "settled" unless doing so is compatible with their cultural identities. 

Below I reproduce one relatively early formulation of this position. It is from  Kahan, D.M. & Braman, D. Cultural Cognition of Public Policy. Yale J. L. & Pub. Pol'y 24, 147-170 (2006).  

In this essay, Don "Shotgun" Braman & I characterize the "cultural cognition thesis" as a "conjecture."  I am happy to have it continue to be characterized as such -- indeed, I prefer that it forever be referred to as "conjectural," no matter how much evidence is adduced to support it, rather than as "proven" or "established" or the like, a way of talking that reflects a vulgar heuristic substitute for science's own way of knowing, which treats every current best understanding as provisional and subject to modification and even rejection in light of additional evidence. 

But in fact, since this essay was published, the Cultural Cognition Project has conducted numerous experiments that support the "cultural cognition thesis."  These experiments present evidence on mechanisms of cognition the operation of which implies that "clear" or valid evidence can be recognized as such only when assent to it affirms rather than denigrates perceivers' cultural identities.  Such mechanisms include (1) culturally biased search and assimilation; (2) cultural source credibility; (3) the cultural availability effect; and (4) culturally motivated system 2 reasoning.  

As the excerpt emphasizes (and as is documented in its many footnotes, which are not reproduced here), all of these involve extensions of well-established existing psychological dynamics.  The nerve of the cultural cognition research program has been simply to demonstrate important interactions between known cognitive mechanisms and cultural outlooks, a process that we hypothesize accounts for persistent political conflict over risk and other policy-relevant facts that admit of scientific investigation.

Knowing what I (provisionally) do now, there are collateral elements of the account below that I would qualify or possibly even disavow! I'm sure I'll continue to discover holes and gaps and false starts in the future, too--and I look forward to that.

V. FROM HEURISTIC TO BIAS 

Public disagreement about the consequences of law is not just a puzzle to be explained but a problem to be solved. The prospects for enlightened democratic decisionmaking obviously depend on some reliable mechanism for resolving such disputes and resolving them accurately. Because such disagreements turn on empirical claims that admit of scientific investigation, the conventional prescription is the pursuit and dissemination of scientifically sound information.

The hope that democracy can be enlightened in such a straightforward manner, however, turns out to be an idle one. Like most heuristics, cultural cognition is also a bias. By virtue of the power that cultural cognition exerts over belief formation, public dispute can be expected to persist on questions like the deterrent effect of capital punishment, the danger posed by global warming, the utility or futility of gun control, and the like, even after the truth of the matter has been conclusively established.

Imagine—very counterfactually—that all citizens are perfect Bayesians. That is, whenever they are apprised of reliable information, they readily update their prior factual beliefs in a manner that appropriately integrates this new information with all existing information at their disposal.

Even under these circumstances, conclusive discovery of the truth is no guarantee that citizens will converge on true beliefs about the consequences of contested public policies. For while Bayesianism tells individuals what to do with relevant and reliable information, it doesn’t tell them when they should regard information as relevant and reliable. Individuals can be expected to give dispositive empirical information the weight that it is due in a rational-decisionmaking calculus only if they recognize sound information when they see it.

The phenomenon of cultural cognition suggests they won’t. The same psychological and social processes that induce individuals to form factual beliefs consistent with their cultural orientation will also prevent them from perceiving contrary empirical data to be credible. Cognitive-dissonance avoidance will steel individuals to resist empirical data that either threatens practices they revere or bolsters ones they despise, particularly when accepting such data would force them to disagree with individuals they respect. The cultural judgments embedded in affect will speak more authoritatively than contrary data as individuals gauge what practices are dangerous and what practices are not. And the culturally partisan foundation of trust will make them dismiss contrary data as unreliable if they perceive that it originates from persons who don’t harbor their own cultural commitments.

This picture is borne out by additional well-established psychological and social mechanisms. One constraint on the disposition of individuals to accept empirical evidence that contradicts their culturally conditioned beliefs is the phenomenon of biased assimilation. This phenomenon refers to the tendency of individuals to condition their acceptance of new information as reliable based on its conformity to their prior beliefs. This disposition to reject empirical data that contradict one’s prior belief (for example, that the death penalty does or doesn’t deter crime) is likely to be especially pronounced when that belief is strongly connected to an individual’s cultural identity, for then the forces of cognitive dissonance avoidance that explain biased assimilation are likely to be most strongly aroused.

Two additional mechanisms reinforce the tendency to see new information as unreliable when it challenges a culturally congenial belief. The first is naïve realism. This phenomenon refers to the disposition of individuals to view the factual beliefs that predominate in their own cultural group as the product of “objective” assessment, and to attribute the contrary factual beliefs of their cultural and ideological adversaries to the biasing influence of their worldviews. Under these conditions, evidence of the truth will never travel across the boundary line that separates a factually enlightened cultural group from a factually benighted one.

Indeed, far from being admitted entry, the truth will be held up at the border precisely because it originates from an alien cultural destination. The second mechanism that constrains societal transmission of truth—reactive devaluation—is the tendency of individuals who belong to a group to dismiss the persuasiveness of evidence proffered by their adversaries in settings of intergroup conflict.

We have been focusing on the impact of cultural cognition as a bias in the public’s recognition of empirically sound information. But it would be a mistake to infer that the immunity of social and natural scientists to such bias improves the prospects for truth, once discovered, to penetrate public debate.

This would be a mistake, first, because scientists aren’t immune to the dynamics we have identified. Like everyone else, scientists (quite understandably, even rationally) rely heavily on their priors when evaluating the reliability of new information. In one ingenious study, for example, scientists were asked to judge the experimental and statistical methods of what was represented to be a real study of the phenomenon of ESP. Those who received the version of the fictitious study that found evidence of ESP rated the methods to be low in quality, whereas those who received the version that found no evidence of ESP rated the methods to be high in quality, even though the methods were in fact independent of the conclusion. Other studies showing that cultural worldviews explain variance in risk perceptions not just among lay persons but also among scientists who specialize in risk evaluation fortify the conclusion that for scientists, too, cultural cognition operates as an information-processing filter.

But second and more important, any special resistance scientists might have to the biasing effect of cultural cognition is beside the point. The issue is whether the discovery and dissemination of empirically sound information can, on its own, be expected to protect democratic policymaking from the distorting effect of culturally polarized beliefs among citizens and their representatives.

Again (for the umpteenth time), ordinary citizens aren’t in a position to determine for themselves whether this or that scientific study of the impact of gun control laws, of the deterrent effect of the death penalty, of the threat posed by global warming, et cetera, is sound. Scientific consensus, when it exists, determines beliefs in society at large only by virtue of social norms and practices that endow scientists with deference-compelling authority on the issues to which they speak. When they address matters that have no particular cultural valence within the group-grid matrix—What are the relative water-repellant qualities of different synthetic fabrics? Has Fermat’s Last Theorem been solved?—the operation of these norms and practices is unremarkable and essentially invisible.

But when scientists speak to policy issues that are culturally disputed, then their truth-certifying credentials are necessarily put on trial. For many citizens, men and women in white lab coats speak with less authority than (mostly) men and women in black frocks. And even those who believe the scientists will still have to choose which scientists to believe. The laws of probability, not to mention the professional incentives toward contrarianism, assure that even in the face of widespread professional consensus there will be outliers. Citizens (again!) lack the capacity to decide for themselves whose work has more merit. They have no choice but to defer to those whom they trust to tell them which scientists to believe. And the people they trust are inevitably the ones whose cultural values they share, and who are inclined to credit or dismiss scientific evidence based on its conformity to their cultural priors.

These arguments are necessarily interpretative and conjectural. But in the spirit of (casual) empirical verification, we invite those who are skeptical to perform this thought experiment. Ask yourself whether you think there is any credible scientific ground for believing that global warming is/isn’t a serious threat; that the death penalty does/doesn’t deter; that gun control does/doesn’t reduce violent crime; that abortion is/isn’t safer than childbirth. If you believe the truth has been established on any one of these issues, ask yourself why it hasn’t dispelled public disagreement. If you catch yourself speculating about the possible hidden cognitive motivations the disbelievers might have by virtue of their cultural commitments, you may proceed to the next Part of this Essay (although not until you’ve reflected on why you think you know the truth and whether your cultural commitments might have anything to do with that belief).  If, in contrast, you are tempted to answer, “Because the information isn’t accessible to members of the public,” then please go back to the beginning of this Essay and start over.

VI. OVERCOMING CULTURAL BIAS: IDENTITY AFFIRMATION

Nothing in our account implies either that there is no truth of the matter on disputed empirical policy issues or that the public cannot be made receptive to that truth. Like at least some other cognitive biases, cultural cognition can be counteracted. . . .  

 

Friday
Aug222014

For what it's worth: breaking down "belief in" GW vs. "belief in" AGW as function of partisanship & OCSI

As a result of (a) my aggregation of responses to the two-part question used to assess "belief in" human-caused global warming and (b) my failure to indicate as much in the figure label, there was some understandable confusion in the discussion of the "What exactly is going on in their heads?" post.

This should help.

Again, the "belief in" question I used -- patterned on standard opinion polling ones used by firms like Pew & Gallup-- has two parts:

  1. "From what you’ve read and heard, is there solid evidence that the average temperature on earth has been getting warmer over the past few decades?" [YES/NO]
     
  2. If yes: "Do you believe that the earth is getting warmer (a) mostly because of human activity such as burning fossil fuels or (b) mostly because of natural patterns in the earth’s environment?"

Among the people (N = 2000, nationally representative) who took the "Ordinary climate science intelligence" assessment, here is the breakdown for question (1) for respondents defined by their scores in relation to the mean on a "right-left" outlook scale (one that combined responses to items on party allegiance and liberal-conservative ideology):

These results are consistent with what US general public opinion surveys have shown for the better part of a decade.

Here are the "item response" profiles-- plots of the predicted probability of answering these questions as indicated -- for subjects of opposing political outlooks in relation to their scores on the OCSI scale:

As can be seen, both the probability of "believing in" global warming and the probability of "believing in" human-caused global warming (among those who believe in global warming) become more politically polarized as individuals score higher on OCSI.
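For the mechanics-minded: here is a rough sketch (in Python, with hypothetical variable names rather than the actual study code or data) of how item response profiles like these can be estimated with a logistic regression that interacts OCSI scores with right-left outlooks:

    import numpy as np
    import statsmodels.api as sm

    def item_response_profile(y, ocsi, rightleft):
        """y: 0/1 responses to a single item; ocsi, rightleft: z-scored covariates."""
        X = sm.add_constant(np.column_stack([ocsi, rightleft, ocsi * rightleft]))
        fit = sm.Logit(y, X).fit(disp=False)

        grid = np.linspace(-2, 2, 41)  # range of OCSI scores to plot over
        profiles = {}
        for label, rl in [("left (-1 SD)", -1.0), ("right (+1 SD)", 1.0)]:
            Xnew = np.column_stack([np.ones_like(grid), grid,
                                    np.full_like(grid, rl), grid * rl])
            profiles[label] = fit.predict(Xnew)  # predicted probability of "yes"
        return grid, profiles

Each profile is just the model's predicted probability of a "yes" response across the range of OCSI, computed separately for respondents one standard deviation to the left and one standard deviation to the right of the mean political outlook.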

Note that OCSI itself is made up of items relating to the mechanisms and consequences of human-caused global warming.  Items on "belief in" global warming -- human or otherwise -- are not part of the scale, since the point was to see whether comprehension of the mechanisms and consequences of human-caused climate change, on the one hand, has any particular connection to "belief in" human-caused global warming, on the other. The former clearly doesn't "cause" the latter!

I've disabled comments here in order to prevent "forking" the discussion going on in connection with the "What's going on ..." post.  So feel free to dispense your wisdom on these data there.

Wednesday
Aug202014

Conditional probability is hard -- but teaching it *shouldn't* be!

So, consider these two problems: 

A. Which is more difficult?

B. Which is it easier to teach someone to do correctly?

My answers: BAYES is more difficult but also easier to teach someone to do correctly. 

Does that seem plausible to you? I won't be surprised if you say no, particularly if your answer reflects experience in seeing how poorly people do with conditional probability problems.

But if you disagree with me, I do want to challenge your sense of what the problem is.

Okay, so here are some data.

For sure, BAYES is harder.  In a diverse sample of 1,000 adults (over half of whom had either a four-year college or post-graduate degree), only 3% got the correct answer (50%). For COVARY, 55% got the correct answer (“patients administered the new treatment were not more likely to survive”).

This is not surprising. BAYES involves conditional probability, a concept that most people find very counterintuitive.  There is a strong tendency to treat the accuracy rate of the witness’s color discernment -- 90% -- as the likelihood that the bus is blue.  

That was the modal answer—one supplied by 34% of the respondents—within the sample here. This response ignores information about the base rate of blue versus green buses.  

Another 23% picked 10%--the base rate frequency of blue buses. They thus ignored the additional information associated with the witness’s perception of the color of the bus.

How to combine the base rate information with the accuracy of the witness’s perception of color (or their equivalent in other problems that involve the same general type of reasoning task) is reflected in Bayes’s Theorem, a set of logical operations that most people find utterly baffling.

COVARY is a standard “covariance detection” problem.  It’s not as hard as BAYES, but it’s still pretty difficult!

Many people (usually most; this fairly well-educated sample did better than a representative sample would) use one of two heuristics to analyze a problem that has the formal characteristics of this one (Arkes & Harkness 1983).  The first, and most common, simply involves comparing the number of “survivors” to the number of “nonsurvivors” in the treatment condition.  The second involves comparing, in addition, the number of survivors in the treatment condition to the number of survivors in the control condition.

Both of these approaches generate the wrong answer—that patients given the new treatment were more likely to survive than those who didn’t receive it—for the data generated in this hypothetical experiment.

What’s important is the ratio of survivors to nonsurvivors in the two experimental groups.  In the group whose members received the treatment, patients were about three times more likely to survive (223:75 = 2.97:1).  In the untreated group, however, patients were just over five times more likely to survive (107:21 = 5.10:1).
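For concreteness, here is a minimal Python sketch of the comparison the problem calls for, using the survivor and nonsurvivor counts from the hypothetical 2x2 table (the variable names are just for illustration):

    treated = {"survived": 223, "died": 75}
    untreated = {"survived": 107, "died": 21}

    treated_ratio = treated["survived"] / treated["died"]        # about 2.97:1
    untreated_ratio = untreated["survived"] / untreated["died"]  # about 5.10:1

    print(round(treated_ratio, 2), round(untreated_ratio, 2))
    print("treatment looks beneficial" if treated_ratio > untreated_ratio
          else "treatment does not look beneficial")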

Pretty much anyone who got the wrong answer can see why the correct one is right once the difference in the “likelihood ratios” (which is actually an important common element in conditional probability and covariance problems) is pointed out. 

The math is pretty tame (a fifth grader should be able to handle it), and the inferential logic (the essence of the sort of causal inference strategy that informs controlled experimentation) pretty much explains itself.

The reason such a significant number of people get the answer wrong is that they don’t reliably recognize that they have to compare the ratios of positive to negative outcomes. They effectively succumb to the temptation to settle for “hypothesis-confirming” evidence without probing for the disconfirming evidence that one can extract only by making use of all the available information in the 2x2 contingency table.

Now, why do I feel that it is nevertheless easier to teach people how to solve conditional probability problems of the sort reflected in BAYES than to teach them how to reliably solve covariance-detection ones of the sort reflected in COVARY?

The answer has to do with what someone has to learn to consistently get the problems right.

Doing conditional probability problems is actually easy once one grasps why the base rate matters—and enabling someone to grasp that turns out to be super easy too with the right pedagogical techniques.

The most important of these is to illustrate how a conditional probability problem can be conceived of as a population-sampling one (Spiegelhalter, Pearson & Short 2011).

In BAYES, we are told that 90% of the buses that could have struck Bill are green, and 10% of them are blue.

Accordingly, if we imagine a simulation in which Bill was hit by 100 city buses drawn at random, we’d expect him to be run down by a green bus 90 times and a blue one 10 times.

If we add Wally to the simulation, we’ll expect him correctly to perceive 81 (90%) of the 90 green buses that struck Bill to be green and incorrectly to perceive 9 (10%) of them to be blue.

Likewise, we’ll expect him to correctly perceive 9 of the 10 blue buses (90%) that hit Bill to be blue, but incorrectly perceive 1 of them (10%) to be green.

Overall, then, in 100 trials, Wally will perceive Bill to have been hit 18 times by a blue bus. Nine of those will be cases in which Wally correctly perceived a blue bus to be blue.  But nine will be cases in which Wally incorrectly perceived as blue a bus that was in fact green.

Because in our 100-trial simulation the number of times Wally was correct when he identified the bus that hit Bill as blue is exactly equal to the number of times he was incorrect, Bill will have been hit by a blue bus 50% of the time and by a green one 50% of the time in all the cases in which Wally perceives Bill was hit by a blue bus.
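Here is a minimal Python sketch of that 100-bus accounting (the numbers come straight from the problem as stated; the variable names are mine):

    blue, green = 10, 90                # out of 100 imagined accidents: 10 blue buses, 90 green
    accuracy = 0.9                      # Wally calls the color correctly 90% of the time

    blue_called_blue = blue * accuracy          # 9 accurate "blue" reports
    green_called_blue = green * (1 - accuracy)  # 9 mistaken "blue" reports

    reports_of_blue = blue_called_blue + green_called_blue  # 18 "blue" reports in all
    print(round(blue_called_blue / reports_of_blue, 3))     # 0.5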

This “natural frequency” strategy for analyzing conditional probability problems has been shown to be an effective pedagogical tool in experimental studies (Sedlmeier & Gigerenzer 2001; Kurzenhäuser & Hoffrage 2002; Wheaton, Lee & Deshmukh 2009). 

After using it to help someone grasp the conceptual logic of conditional probability, one can also connect the steps involved to a very straightforward rendering of Bayes’s Theorem: prior odds x likelihood ratio = revised (posterior) odds.

In this rendering, the base rate is represented in terms of the odds that a particular proposition or hypothesis is true: here, independently of Wally’s observation, we’d compute the odds that the bus that struck Bill was blue at 10:90 (“10 in 100”) or 1:9.

The new information or evidence is represented as a likelihood ratio, which reflects how much more consistent that evidence is with the hypothesis or proposition in question being true than with its negation (or some alternative hypothesis) being true.

Wally is able correctly to distinguish blue from green 90% of the time.

So if the bus that struck Bill was in fact blue, we’d expect Wally to perceive it as blue 9 times out of 10, whereas if the bus that struck Bill was in fact green, we’d expect Wally to perceive it as blue only 1 time out of 10. 

Because Wally is nine times (9 vs. 1 or 90% vs. 10%) more likely to perceive a bus was “blue” when it was truly blue than when it was in fact green, the likelihood ratio is 9.

“Multiplying” the prior odds by the likelihood ratio involves computing the product of (1) the element of the odds expression that corresponds to the hypothesis  and (2) the likelihood ratio value. 

Here the prior odds were 1:9 that the bus that struck Bill was blue.  Nine (the likelihood ratio) times one (from 1:9) equals 9.

The revised odds that the bus that struck Bill was blue are thus (1 x 9):9 = 9:9, or 1:1, which is equivalent to 50%.
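The same logic can be written as a small general-purpose function -- a sketch, not anyone's official implementation, and the function name is just illustrative -- that turns a prior probability and the two conditional probabilities of the evidence into a posterior probability via the odds form of Bayes's Theorem:

    def posterior_prob(prior_prob, p_evidence_if_true, p_evidence_if_false):
        """Odds-form Bayes: prior odds x likelihood ratio = posterior odds."""
        prior_odds = prior_prob / (1 - prior_prob)
        likelihood_ratio = p_evidence_if_true / p_evidence_if_false
        posterior_odds = prior_odds * likelihood_ratio
        return posterior_odds / (1 + posterior_odds)

    # the bus problem: 10% base rate of blue buses; witness right 90% of the time
    print(round(posterior_prob(0.10, 0.90, 0.10), 3))  # 0.5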

I’m not saying that one exposure to this sort of exercise will be sufficient to reliably program someone to do conditional probability problems.

But I am saying that students of even middling levels of numeracy can be expected over the course of a reasonable number of repetitions to develop a reliable facility with conditional probability. The “natural frequencies” representation of the elements of the problem makes sense, and students can see which parts of that conceptualization map onto the “prior odds x likelihood ratio = revised odds” rendering of Bayes’s theorem and why.

If you want to make it even easier for this sort of lesson to take hold, & related hardwiring to settle in, give your students this cool Bayes's calculator.

Students can’t be expected, in contrast, to see why any of the other, more complex but logically equivalent renderings of Bayes’s Theorem actually make sense.  They thus can't be expected to retain them, to become adept at heuristically deploying them, or to experience the sort of improvement in discernment and reasoning that occurs as one assimilates statistical concepts.  

Teachers who try to get students to learn to apply these formalisms, then, are doing a shitty job!

Now what about covariance?

Actually, there’s really nothing to it from an instructional point of view.  It explains itself, as I said.

But that’s exactly the problem: facility with it is not a matter of learning how to do any particular thing.

Rather it is a matter of reliably recognizing when one is dealing with a problem in which the sort of steps necessary to detect covariance have to be done.

The typical reaction of someone when it's pointed out that he or she got the covariance problem wrong is an instant recognition of the mistake, and the sense that the error was a result of an uncharacteristic lapse or even a “trick” on the part of the examiner. 

But in fact, in order to make reliable causal inferences based on observation in their everyday life, people will constantly be required to detect covariance.  If they are unable to see the need for, or just lack the motivation to perform, the necessary operations even when all the essential information has been pre-packaged for them into a 2x2 contingency table, then the likelihood that they will lapse into the defective heuristic alternative when they encounter covariance-detection problems in the wild is very very high (Stanovich 2009).

How likely someone is to get the right answer in the covariance problem is associated with their numeracy. The standard numeracy scale (e.g., Peters et al. 2006) is a measure not so much of math skill as of the capacity to reliably recognize when a quantitative reasoning problem requires one or another type of effortful analysis akin to what's involved in detecting covariance.

Frankly, I’m pessimistic that I can instill that sort of capacity in students.  That's not because I have a modest sense of my abilities as a teacher.  It’s because I have due respect for the difficulty that many indisputably great researchers and teachers have encountered in trying to come up with pedagogical techniques that are as successful at imparting critical reasoning dispositions to students as the “natural frequencies” strategy is at imparting a reliable facility with conditional probability problems.

Of course, in order for students to successfully use the “natural frequencies” strategy and—after they become comfortable with it—the prior odds x likelihood ratio = revised odds rendering of Bayes theorem, they must reliably recognize conditional probability problems when they see them. 

But in my experience, at least, that’s not a big deal. When a conditional probability problem makes its appearance, one is about as likely to overlook it as one is to fail to notice that a mother black bear w/ its cub or a snarling honey badger has appeared alongside the trail during a hike in the woods.

Which then leads me to the question: how can it be that only 3% of a sample as well educated and intelligent as the one I tested can do a conditional probability problem as simple as the one I put in this battery?

Doesn't that mean that too many math teachers are failing to use the empirical knowledge that has been developed by great education researchers & teachers?

Or am I (once again; it happens!) missing something?

References

Arkes, H.R. & Harkness, A.R. Estimates of Contingency Between Two Dichotomous Variables. J. Experimental Psychol. 112, 117-135 (1983).

Kurzenhäuser, S. & Hoffrage, U. Teaching Bayesian reasoning: an evaluation of a classroom tutorial for medical students. Medical Teacher 24, 516-521 (2002).
 

Peters, E., Västfjäll, D., Slovic, P., Mertz, C.K., Mazzocco, K. & Dickert, S. Numeracy and Decision Making. Psychol Sci 17, 407-413 (2006).

Sedlmeier, P. & Gigerenzer, G. Teaching Bayesian reasoning in less than two hours. Journal of Experimental Psychology: General 130, 380-400 (2001).

Spiegelhalter, D., Pearson, M. & Short, I. Visualizing Uncertainty About the Future. Science 333, 1393-1400 (2011).

Stanovich, K.E. What intelligence tests miss : the psychology of rational thought (Yale University Press, New Haven, 2009).

Wheaton, K.J., Lee, J. & Deshmukh, H. Teaching Bayesian Statistics To Intelligence Analysts: Lessons Learned. J. Strategic Sec. 2, 39-58 (2009).

 

 

Tuesday
Aug192014

"What exactly is going on in their heads?" (And in mine?) Explaining "knowing disbelief" of climate change

During my trip to Australia, I presented The Measurement Problem twice in one day, first at Monash University and then at RMIT University (slides here). I should have presented two separate lectures but I’m obsessed—disturbed even—by the results of the MP study so I couldn’t resist the opportunity to collect two sets of reactions.

In fact, I spent the several hours between the lectures discussing the challenges of measuring popular climate-science comprehension with University of Melbourne psychologist Yoshi Kashima, co-author of the very interesting study Guy, S., Kashima, Y., Walker, I. & O'Neill, S. Investigating the effects of knowledge and ideology on climate change beliefs. European Journal of Social Psychology 44, 421-429 (2014).

The challenges, we agreed, are two.

The first is just to do it. 

If you want to figure out what people know about the mechanisms of climate change, asking them whether they “believe in” human-caused global warming definitely doesn’t work.  The answer they give you to that question tells you who they are: it is an indicator of their cultural identity uninformed by and uncorrelated with any meaningful understanding of evidence or facts.

Same for pretty much any question that people recognize as asking them to “take a position” on climate change.

To find out what people actually know, you have to design questions that make it possible for them to reveal what they understand without having to declare whose side they are on in the pointless and demeaning cultural status competition that the “climate change question” has become in the US—and Australia, the UK, and many other liberal democracies.

This is a hard thing to do! 

[Figure: Item response curves for OCSI]

But once accomplished, the second challenge emerges: to make sense of the surprising picture that one can see after disentangling people's comprehension of climate change from their cultural identities.

As I explained in my Monash and RMIT lectures, ordinary members of the public—no matter “whose side” they are on—don’t know very much about the basic mechanisms of climate change.  That’s hardly a surprise given the polluted state of the science communication environment they inhabit.

What’s genuinely difficult to sort out, though, is how diverse citizens can actually be on different sides given how uniform their (mis)understandings are.

Regardless of whether they say they “believe in” climate change, most citizens’ responses to the “Ordinary Climate Science Intelligence” (OCSI) assessment suggest they are disposed to blame human activity for all manner of adverse climate impacts, including ones wholly at odds with the mechanisms of global warming.

This result suggests that what’s being measured when one disentangles knowledge from identity is a general affective orientation, one that in fact reflects a widespread apprehension of danger.

The only individuals whose responses don’t display this generic affective orientation are ones who score highest on a general science comprehension assessment—the “Ordinary science intelligence” scale (OSI_2.0).  These respondents can successfully distinguish the climate impacts that scientists attribute to human activity from ones they don’t.

This discriminating pattern, moreover, characterizes the responses of the most science-comprehending members of the sample regardless of their cultural or political outlooks.

Yet even those individuals still don’t uniformly agree that human activity is causing global warming.

On the contrary, these citizens—the ones, again, who display the highest degree of science comprehension generally & of the mechanisms of climate change in particular—are also the most politically polarized on whether global warming is occurring at all.

Maybe not so surprising: what people “believe” about climate change, after all, doesn’t reflect what they know; it expresses who they are.

But still, what is going on inside their heads?

This is what one curious and perceptive member of the audience asked me at RMIT.  How, he asked, can someone simultaneously display comprehension of human-caused global warming and say he or she doesn't “believe in” it?

In fact, this was exactly what Yoshi and I had been struggling with in the hours before the RMIT talk.

Because I thought the questioner and other members of the audience deserved to get the benefit of Yoshi’s expansive knowledge and reflective mind, too, I asked Yoshi to come to the front and respond, which he kindly—and articulately—did.

Now, however, I’ll try my hand. 

In fact, I don’t have an answer that I’d expect the questioner to be satisfied with. That’s because I still don’t have an answer that satisfies me.

But here is something in the nature of a report on the state of my ongoing effort to develop a set of candidate accounts suitable for further exploration and testing.

Consider these four general cases of simultaneously “knowing” and “disbelieving”:

1. “Fuck you & the horse you rode in on!” (FYATHYRIO).  Imagine someone with an “Obama was born in Kenya!” bumper sticker. He in fact doesn’t believe that assertion but is nonetheless making it to convey his antagonism toward a segment of society. Displaying the sticker is a way to participate in denigration of that group’s status. Indeed, his expectation that others (those whom he is denigrating and others who wish to denigrate them) will recognize that he knows the proposition is false is integral to the attitude he intends to convey.  There is no genuine contradiction, in this case, between any sets of beliefs in the person’s mind.

2. Compartmentalization.  In this case, there is a genuine contradiction, but it is suppressed through effortful dissonance-avoiding routines.  The paradigmatic case would be the closeted gay man (or the “passing” Jew) who belongs to a homophobic (or anti-Semitic) group.  He participates in condemnation and even persecution of gays (or Jews) in contexts in which he understands and presents himself to be a member of the persecuting group, yet in other contexts, out of the view of that group’s members, he inhabits the identity, and engages in the behavior, he condemns.  The individual recognizes the contradiction but avoids conscious engagement with it through habits of behavior and mind that rigidly separate his experience of the identities that harbor the contradictory assessments.  He might be successful in maintaining the separation or he might not, and for longer or shorter periods of time, but the effort of sustaining it will take a toll on his psychic wellbeing (Roccas & Brewer 2002).

3. Partitioning. In this case, too, the contradiction is real and a consequence, effectively, of a failure of information access or retrieval.  Think of the expert who possesses specialized knowledge and reasoning proficiencies appropriate to solving a particular type of problem.  Her expertise consists in large part in recognizing or assenting to propositions that evade the comprehension of the nonexpert.  The accessing of such knowledge, however, is associated with certain recurring situational cues; in the absence of those, the cognitive processes necessary to activate the expert’s consciousness and appropriate use of her specialized knowledge will fail. The expert will effectively believe in or assent to some proposition that is contrary to the one that she can accurately be understood to “know.”  The contradiction is thus in the nature of a cognitive bias. The expert will herself, when made aware of the contradiction, regard it as an error (Lewandowsky & Kirsner 2000).

4. Dualism. The contradiction here is once again only apparent—except that it is likely not even to appear to be one to the person holding the views in question. 

Everhart & Hameed (2013) describe the Muslim medical doctor who, when asked, states that he “rejects Darwinian evolution”: “Man was made by Allah—he did not descend from monkeys!” Nevertheless, the Dr. can readily identify applications of evolutionary science in his own specialty (say, oncology).  He also is familiar with and genuinely excited by medical science innovations, such as stem-cell therapies, that presuppose and build on the insights of evolutionary science.

With prodding, he might see that he is both “rejecting” and “accepting” a single set of propositions about the natural history of human beings.  But the identity of the propositions in this sense does not correspond to any identity of propositions within the inventory of beliefs, assessments, and attitudes that he makes use of in his everyday life.

Within that inventory, the “theory of evolution” he “rejects” and the “theory of evolution” he "accepts" are distinct mental objects (Hameed 2014).  He accesses them as appropriate to enable him to inhabit the respective identities to which they relate (D’Andrade 1981). 

Integral to the “theory of evolution” he “rejects” is a secular cultural meaning that denigrates his religious identity. His “rejection” of that object expresses—in his own consciousness, and in the perception of others—who he is as a Muslim. 

The “theory of evolution” he “accepts” is an element of the expert understandings he uses as a professional. It is also a symbol of the special mastery of his craft, a power that entitles those who practice it to esteem.  “Accepting” that object enables him to be a doctor. 

The “accepted” and “rejected” theories of evolution are understandings he accesses “at home” and “at work,” respectively.

But the context-specificity of his engagement with these understandings is not compartmentalization: there is no antagonism between the two distinct mental objects; no experience of dissonance in holding the sets of beliefs and appraisals that correspond to them; no need effortfully to cordon these sets off from one another. They are “entirely different things!” (he explains with exasperation to the still-puzzled interviewer). 

It’s actually unusual for the two mental objects to come within sight of one another. “Home” and “work” are distinct locations, not only physically but socially: negotiating them demands knowledge of, and facility with, sets of facts, appraisals, and the like suited to the activities distinctive of each.

But if the distinct mental objects that are both called "theories of evolution" are summoned to appear at once, as they might be during the interview with the researcher, there is no drama or crisis of any sort. “What in the world is the problem?” the Dr. wonders, as the seemingly obtuse interviewer continues to press him for an explanation.

So what should we make of the highly science comprehending individual who gets a perfect score on the OCSI but who, consistent with his cultural identity, states, “There is no credible evidence that human activity is causing climate change”?

I feel fairly confident that what’s “going on” in his or her head is neither FYATHYRIO nor “compartmentalization.”

I doubt, too, that this is an instance of “partitioning.”

“Dualism” seems like a better fit to me.  I think something like this occurs in Florida and other states, where citizens who are polarized on “climate change” make use of climate science in local decisionmaking.

But I do not feel particularly confident about this account—in part because even after constructing it, I still myself am left wondering, “But what exactly is going on in their heads?”

It’s not unusual—indeed, it is motivating and exhilarating—to discover that one’s understanding of some phenomenon that one is studying involves some imperfection or puzzle.

Nevertheless, in this case, I am also a bit unsettled. The thing to be explained took me by surprise, and I don’t feel that I actually have figured out the significance of it for other things that I do feel I know.

But after my talk at RMIT, I put all of this behind me, and proceeded to my next stop, where I delivered a lecture on “cultural cognition” and “the tragedy of the science communications commons.” 

You see, I am able to compartmentalize . . . .

References

D'Andrade, R.G. The cultural part of cognition. Cognitive science 5, 179-195 (1981).

Everhart, D. & Hameed, S. Muslims and evolution: a study of Pakistani physicians in the United States. Evo. Edu. Outreach 6, 1-8 (2013).

Hameed, S. Making sense of Islamic creationism in Europe. Unpublished manuscript (2014).

Kahan, D. M. Climate Science Communication and the Measurement Problem, Advances in Pol. Psych. (in press).

Lewandowsky, S. & Kirsner, K. Knowledge partitioning: Context-dependent use of expertise. Memory & Cognition 28, 295-305 (2000).

Roccas, S. & Brewer, M.B. Social identity complexity. Pers Soc Psychol Rev 6, 88-106 (2002).

Tuesday
Aug122014

I ♥ Item Response Theory -- and you can too!

As the 14 billion readers of this blog are aware, I’ve been working for the last 37 years—making steady progress all the while—on developing a “public science comprehension measure” suited for use in the study of public risk perception and science communication.

The most recent version of the resulting scale—“Ordinary Science Intelligence 2.0” (OSI_2.0)—informs the study reported in Climate Science Communication and the Measurement Problem. That paper also presents the results of a prototype public climate-science comprehension instrument, the “Ordinary Climate Science Intelligence” assessment (OCSI_0.01).

Both scales were developed and scored using Item Response Theory.

Since I’m stuck on an 18-hour flight to Australia & don’t have much else to do (shouldn’t we touch down in Macao or the Netherlands Antilles or some other place with a casino to refuel?!), I thought I’d post something (something pretty basic, but the internet is your oyster if you want more) on IRT and how cool it is.

Like other scaling strategies, IRT conceives of responses to questionnaire items as manifest or observable indicators of an otherwise latent or unobserved disposition or capacity.  When the items are appropriately combined, the resulting scale will be responsive to the items' covariance, which reflects their shared correlation with the latent disposition. At the same time, the scale will be relatively unaffected by the portions of variance in each item that are random in relation to the latent disposition and that should more or less cancel each other out when the items are aggregated.

By concentrating the common signal associated with the items and muting the noise peculiar to each, the scale furnishes a more sensitive measure than any one item (DeVellis 2012).
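A toy simulation (with made-up numbers) illustrates the point: a composite formed from several noisy indicators tracks the latent disposition better than any single indicator does.

    import numpy as np

    rng = np.random.default_rng(0)
    n_people, n_items = 5000, 8
    theta = rng.normal(size=n_people)                        # latent disposition
    noise = rng.normal(scale=1.5, size=(n_people, n_items))
    items = theta[:, None] + noise                           # each item = signal + noise

    r_single = np.corrcoef(theta, items[:, 0])[0, 1]         # one item: r of roughly 0.55
    r_scale = np.corrcoef(theta, items.mean(axis=1))[0, 1]   # 8-item composite: r close to 0.9
    print(round(r_single, 2), round(r_scale, 2))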

While various scaling methods tend to differ in the assumptions they make about the relative strength or weight of individual items, nearly all treat items as making fungible contributions to measurement of the latent variable conceived of as some undifferentiated quantity that varies across persons.

IRT, in contrast, envisions the latent disposition as a graded continuum along which individuals can be arrayed. It models the individual items as varying in measurement precision across the range of that continuum, and weights the items appropriately in aggregating responses to them to form a scale (Embretson & Reise 2000). 

The difference in these strategies will matter most when the point of making measurements is not simply to characterize the manner in which the latent disposition (“cultural individualism,” say) varies relative to one or another individual characteristic within a sample (“global warming risk concern”) but to rank particular sample members (“law school applicants”) in relation to the disposition (“critical reasoning ability”). 

In the former case, I’ll do fine with measures that enable me to sum up the “amount” of the disposition across groups and relate them to levels of some covariate of interest.  But in the latter case I’ll also value measures that enable me to discriminate between varying levels of the disposition at all the various points where accurate sorting of the respondents or test takers matter to me.

IRT is thus far and away the dominant scaling strategy in the design and grading of standardized knowledge assessments, which are all about ranking individuals in relation to some aptitude or skill of interest.

Not surprisingly, then, if one is trying to figure out how to create a valid public science comprehension instrument, one can learn a ton from looking at the work of researchers who use IRT to construct standardized assessments. 

Indeed, it’s weird to me, as I said in a previous post, that the development of public science comprehension instruments like the NSF Indicators (2014: ch. 7)—and research on public understanding of science generally—has made so little use of this body of knowledge.

I used IRT to help construct OSI_2.0.

Below are the “item response curves” of four OSI_2.0 items, calibrated to the ability level of a general population sample.  The curves (derived via a form of logistic regression) plot the probability of getting the “correct” answer to the specified items in relation to the latent “ordinary science intelligence” disposition. (If you want item wording, check out the working paper.)

One can see the relative “difficulty” of these items by observing the location of their respective “response curves” in relation to the y-axis: the further to the right, the “harder” it is.

Accordingly, “Prob1_nsf,” one of the NSF Indicators “science methods” questions is by far the easiest: a test taker has to be about one standard deviation below the mean on OSI before he or she is more likely than not to get this one wrong.

“Cond_prob,” a Bayesian conditional probability item from the Lipkus/Peters Numeracy battery, is hardest: one has to have a total score two standard deviations above the mean before one has a better than 50% chance of getting this one right (why are conditional probability problems so hard? SENCER should figure out how to teach teachers to teach Bayes’s Theorem more effectively!).

“Copernicus_nsf,” the “earth around the sun or sun around the earth?” item, and “Widgets_CRT,” a Cognitive Reflection Test item, are in between.

It's because IRT scoring weights items in relation to their difficulty—and, if one desires, in relation to their “discrimination,” which refers to the steepness of the item-response curve's slope (the steeper the curve, the more diagnostic a correct response is of the respondent's disposition level)—that one can use it to gauge a scale's variable measurement precision across the range of the relevant latent disposition.
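For readers who want the functional form: the standard way to generate curves like these is a two-parameter logistic (2PL) model, in which each item has a difficulty parameter (which shifts its curve left or right) and a discrimination parameter (which steepens or flattens it). The post says only that the curves were derived via "a form of logistic regression," so treat this particular form, and the parameter values below, as illustrative:

    import numpy as np

    def p_correct(theta, a, b):
        """2PL item response curve: a = discrimination, b = difficulty."""
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    theta = np.linspace(-3, 3, 7)
    print(np.round(p_correct(theta, a=2.0, b=-1.0), 2))  # an easy, sharply discriminating item
    print(np.round(p_correct(theta, a=1.0, b=2.0), 2))   # a hard, flatter item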

All 4 of these OSI_2.0 items are reliable indicators of the latent disposition in question (if they weren’t, the curves would be flatter).  But because they vary in difficulty, they generate more information about the relative level of OSI among heterogeneous test takers than would a scale that consisted, say, of four items of middling difficulty, not to mention four that were all uniformly easy or hard.

Indeed, consider:

The figures illustrate the variable measurement precision of two instruments: the NSF Indicators battery, formed by combining its nine “factual knowledge” and three “science methods” items; and a long (10-item) version of Frederick’s Cognitive Reflection Test (Frederick 2005). 

The “Test Information Curves” plotted in the left panel illustrate the relative measurement precision of each in relation to the latent dispositions each is measuring. Note, the disposition isn’t the same one for both scales; by plotting the curves on one graph, I am enabling comparative assessment of the measurement precision of the two instruments in relation to the distinct latent traits that they respectively assess.

“Information” units are the inverse of the scale's measurement variance—a concept that I think isn’t particularly informative (as it were) for those who haven’t used IRT extensively enough to experience the kind of cognitive rewiring that occurs as one becomes proficient with a statistical tool. 

So the right-hand panel conveys the same information for each assessment in the form of a variable “reliability coefficient.”  It’s not the norm for IRT write-ups, but I think it’s easier for reflective people generally to grasp.

The reliability coefficient is conceptualized as the proportion of the variance in the observed score that can be attributed to variance in the "true score" or actual disposition levels of the examined subjects.  A test that was perfectly reliable—that had no measurement error in relation to the latent disposition—would have a coefficient of 1.0. 

Usually 0.7 is considered decent enough, although for “high stakes” testing like the SAT, 0.8 would probably be the lowest anyone would tolerate.

Ordinarily, when one is assessing the performance of a latent-variable scale, one would have a reliability coefficient—like Cronbach’s α, something I’ve mentioned now and again—that characterizes the measurement precision of the instrument as a whole.

But with IRT, the reliability coefficient is a continuous variable: one can compute it—and hence gauge the measurement precision of the instrument—at any specified point along the range of the latent disposition the instrument is measuring.
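Here is a minimal sketch of how that computation goes, assuming a 2PL-style scale: item information is a^2 * P * (1 - P), test information is the sum across items, and -- on the common convention that the latent trait is scaled to unit variance -- the conditional reliability at any ability level can be expressed as I/(I + 1). The item parameters below are illustrative only, not the OSI_2.0 estimates.

    import numpy as np

    def p_correct(theta, a, b):
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def test_information(theta, items):
        """items: list of (discrimination, difficulty) pairs."""
        info = np.zeros_like(theta)
        for a, b in items:
            p = p_correct(theta, a, b)
            info += a ** 2 * p * (1 - p)   # 2PL item information
        return info

    items = [(1.5, -2.0), (1.2, -0.5), (1.8, 0.5), (1.4, 2.0)]  # illustrative values
    theta = np.linspace(-3, 3, 13)
    info = test_information(theta, items)
    reliability = info / (info + 1)        # conditional reliability at each theta
    print(np.round(reliability, 2))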

What one can see from the Figure, then, is that these two scales, while comparable in “reliability,” actually radically differ with respect to the levels of the latent disposition in relation to which they are meaningfully assessing individual differences. 

The NSF Indicators battery is concentrating most of its discrimination within the space between -1.0 and -2.0 SDs.  So it will do a really, really good job of distinguishing people who are merely awful from those who are outrageously awful.

You can be pretty confident that someone who scores above the mean on the test is at least average.  But the measurement beyond that is so pervaded with error as to make it completely arbitrary to treat differences in scores as representing genuinely different levels in ability.

The test is just too darn easy! 

This is one of the complaints that people who study public science comprehension have about the Indicators battery (but one they don’t voice nearly as often as they ought to).

The CRT has the opposite problem! 

If you want to separate out Albert Einstein from Johnny von Neumann, you probably can with this instrument! (Actually, you will be able to do that only if “cognitive reflection” is the construct that corresponds to what makes them geniuses; that’s doubtful.) The long CRT furnishes a high degree of measurement reliability way out into the Mr. Spock Zone of +3 SDs, where only about .01% (as in “one hundredth of one percent”) of the human population (as in 1 person in 10,000) hangs out.

In truth, I can’t believe that there really is any value in distinguishing between levels of reflection beyond +2.0 (about the 98th percentile) if one is studying the characteristics of critical reasoning in the general population. Indeed, I think you can do just fine in investigating critical reasoning generally, as opposed to grading exams or assessing admissions applications etc., with an instrument that maintains its reliability out to 1.8 (96th percentile).

There’d be plenty of value for general research purposes, however, in being able to distinguish people whose cognitive reflection level is a respectable average from those whose level qualifies them as legally brain dead.

But you can’t with this instrument: there’s zero discrimination below the population mean.

Too friggin’ hard!

The 10-item battery was supposed to remedy this feature of the standard 3-item version but really doesn't do the trick—because the seven new items were all comparably difficult to the original three.

Now, take a look at this:

These are the test information and IRT reliability coefficients for OSI 2.0 as well as for each of the different sets of items it comprises.

The scale has its highest level of precision at about +1 SD, but has relatively decent reliability continuously from -2.0 to +2.0.  It accomplishes that precisely because it combines sets of items that vary in difficulty.  This is all very deliberate: using IRT in scale development made it possible to select an array of items from different measures to attain decent reliability across the range of the latent "ordinary science intelligence" disposition.

Is it “okay” to combine the measures this way?  Yes, but only if it is defensible to understand them as measuring the same thing—a single, common latent disposition.

That’s a psychometric quality of a latent variable measurement instrument that IRT presupposes (or in any case, can’t itself definitively establish), so one uses different tools to assess that.

Factor analysis, the uses and abuses of which I’ve also discussed a bit before, is one method of investigating whether a set of indicators measure a single latent variable.

I’ve gone on too long—we are almost ready to land!—to say more about how it works (and how it doesn’t work if one has a “which button do I push” conception of statistics).  But just to round things out, here is the output from a common-factor analysis (CFA) of OSI_2.0. 

It suggests that a single factor or unobserved variable accounts for 87% of the variance in responses to the items, as compared to a residual second factor that explains another 7%. That’s pretty strong evidence that treating OSI_2.0 as a “unidimensional” scale—or a measure of a single latent disposition—is warranted.
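For the curious, here is a rough numpy sketch of one common way to generate numbers like these -- non-iterated principal-axis factoring of the items' correlation matrix. The analysis reported above may well have used different software and options, so treat this as an illustration of the idea, not a reproduction:

    import numpy as np

    def common_factor_proportions(responses):
        """responses: n_people x n_items array; returns each factor's share of common variance."""
        R = np.corrcoef(responses, rowvar=False)
        smc = 1 - 1 / np.diag(np.linalg.inv(R))    # squared multiple correlations
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, smc)           # replace 1s with communality estimates
        eigvals = np.linalg.eigvalsh(R_reduced)[::-1]   # descending eigenvalues
        positive = eigvals[eigvals > 0]
        return positive / positive.sum()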

At this point, the only question is whether what it is measuring is really “ordinary science intelligence,” or the combination of knowledge, motivations, and reasoning dispositions that I’m positing enables an ordinary citizen to recognize and give proper effect to valid scientific evidence in ordinary decisionmaking contexts.

That’s a question about the “external validity” of the scale.

I say something about that, too, in “ ‘Ordinary Science Intelligence’: A Science Comprehension Measure for Use in the Study of Risk Perception and Science Communication,” CCP Working Paper No. 112.

I won’t say more now (they just told us to turn off electronic devices. . .) except to note that to me one of the most interesting questions is whether OSI_2.0 is a measure of ordinary science intelligence or simply a measure of intelligence.

A reflective commentator put this question to me.  As I told him/her, that’s a challenging issue, not only for OSI_2.0 but for all sorts of measures that purport to be assessing one or another critical reasoning proficiency . . . .

Holy smokes--is that George Freeman?!

References

DeVellis, R.F. Scale development : theory and applications (SAGE, Thousand Oaks, Calif., 2012).

Embretson, S.E. & Reise, S.P. Item response theory for psychologists (L. Erlbaum Associates, Mahwah, N.J., 2000).

 Frederick, S. Cognitive Reflection and Decision Making. Journal of Economic Perspectives 19, 25-42 (2005).

National Science Foundation. Science and Engineering Indicators (Wash. D.C. 2014).