Check out wild & crazy "coherence-based reasoning"! Are rules of evidence "impossible"? Part 2 (another report from the Law & Cognition seminar)
A recap. Last time I outlined a conception of “the rules of evidence” I called the “Bayesian Cognitive Correction Model” or BCCM. BCCM envisions judges using the rules to “cognitively fine-tune” trial proofs in the interest of simulating/stimulating jury fact-finding more consistent with a proper Bayesian assessment of all the evidence in a case.
Cognitive dynamics like hindsight bias and identity-protective cognition can be conceptualized as inducing the factfinder to over- or undervalue evidence relative to its “true” weight—or likelihood ratio (LR). Under Rule 403, judges should thus exclude an admittedly “relevant” item of proof (Rule 401: LR ≠ 1) when that item’s tendency to induce jurors to over- or undervalue other items of proof (i.e., to assign them LRs farther from or closer to 1 than their true values) impedes verdict accuracy more than would constraining the factfinder to assign the item in question no weight at all (LR = 1).
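In odds form, the Bayesian benchmark here is simple arithmetic: posterior odds = prior odds × LR. A minimal sketch in Python (my illustration, not anything from the evidence literature):

```python
def update_odds(prior_odds, lr):
    """Odds-form Bayes: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * lr

def odds_to_prob(odds):
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1.0 + odds)

# An item of proof with LR = 4 moves a 1:1 prior to 4:1 odds, i.e. p = 0.8;
# an item with LR = 1 (not "relevant" under Rule 401) leaves the odds alone.
print(odds_to_prob(update_odds(1.0, 4.0)))  # → 0.8
print(update_odds(1.0, 1.0))                # → 1.0
```

An item’s LR is just how much more probable the evidence is under one party’s position than the other’s; “weight” in everything that follows is this multiplier.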
“Coherence based reasoning”—CBR—is one of the kinds of cognitive biases a judge would have to use the BCCM strategy to contain. This part of the series describes CBR and the distinctive threat it poses to rational factfinding in adjudication.
Today's episode. CBR can be viewed as an information-processing dynamic rooted in aversion to residual uncertainty.
A factfinder, we can imagine, might initiate her assessment of the evidence in a reasonably unbiased fashion, assigning modestly probative pieces of evidence more or less the likelihood ratios they are due.
But should she encounter a piece of evidence that is much more consistent with one party’s position, the resulting confidence in that party’s case (a state that ought to be only provisional, in a Bayesian sense) will dispose her to assign the next piece of evidence a likelihood ratio supportive of the same inference—viz., that that party’s position is “true.” As a result, she’ll be all the more confident in the merit of that party’s case—and thus all the more motivated to adjust the weight assigned the next piece of evidence to fit her “provisional” assessment, and so forth and so on (Carlson & Russo 2001).
Once she has completed her evaluation of the trial proof, moreover, she will be motivated to revisit earlier-considered pieces of evidence, readjusting the weight she assigned them so that they now fit with what has emerged as the more strongly supported position (Simon, Pham, Quang & Holyoak 2001; Holyoak & Simon 1999; Pennington & Hastie 1991). When she concludes, she will necessarily have formed an inflated assessment of the probability of the facts that support the party whose “strong” piece of evidence initiated this “likelihood ratio cascade.”
Why does this matter?
Well, to start, in the law, the party who bears the “burden of proof” will often be entitled to win only if she establishes the facts essential to her position to a heightened degree of certainty like “beyond a reasonable doubt.” One practical consequence of the overconfidence associated with CBR, then, will be to induce the factfinder to decide in favor of a party whose evidence, if evaluated in an unbiased fashion, would not have satisfied the relevant proof standard (Simon 2004). Indeed, one really cool set of experiments (Scurich 2012) suggests that "coherence based reasoning" effects might actually reflect a dissonance-avoidance mechanism that manifests itself in factfinders reducing the standard of proof after exposure to highly probative items of proof!
But even more disconcertingly, CBR makes the outcome sensitive to the order in which critical pieces of evidence are considered (Carlson, Meloy & Russo 2006).
A piece of evidence that merits considerable weight might be assigned a likelihood ratio of 1 or < 1 if the factfinder considers it after having already assigned a low probability to the position it supports. In that event, the evidence will do nothing to shake the factfinder’s confidence in the opposition position.
But had the factfinder considered that same piece of evidence “earlier”—before she had formed a confident estimation of the cumulative strength of the previously considered proof—she might well have given that piece of evidence the greater weight it was due.
If that had happened, she would then have been motivated to assign subsequent pieces of proof likelihood ratios higher than they in fact merited. Likewise, to achieve a “coherent” view of the evidence as a whole, she would have been motivated to revisit and revise upward the weight assigned to earlier considered, equivocal items of proof. The final result would thus have been a highly confident determination in exactly the opposite direction from the one she in fact reached.
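The order sensitivity can be made concrete with a toy two-item case. The bias function here (each perceived LR scaled by 2**log10(current odds)) is my own illustrative assumption, chosen so that there is no distortion at even odds:

```python
import math

def biased_posterior_odds(lrs, prior_odds=1.0):
    """Sequentially update the odds, distorting each perceived LR toward
    whichever side the current odds already favor.

    The multiplier 2 ** log10(odds) is an assumed bias form: it equals 1
    at even odds and 2 once the odds reach 10:1.
    """
    odds = prior_odds
    for lr in lrs:
        perceived_lr = lr * 2 ** math.log10(odds)
        odds *= perceived_lr
    return odds

# Two items that exactly offset under unbiased updating (10 x 0.1 = 1)...
print(biased_posterior_odds([10.0, 0.1]))  # pro-prosecution item first → 2.0
print(biased_posterior_odds([0.1, 10.0]))  # pro-defense item first → 0.5
```

Same two items of proof, opposite conclusions, purely as a function of the order in which they are considered.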
This is not the way things should work if one is engaged in Bayesian information processing—or at least any normatively defensible understanding of Bayesian information processing geared to reaching an accurate result!
Indeed, this is the sort of spectacle that BCCM directs the judge to preempt by the judicious use of Rule 403 to exclude evidence the “prejudicial” effect of which “outweighs” its “probative value.”
But it turns out that using the rules of evidence to neutralize CBR in that way is IMPOSSIBLE!
Why? I’ll explain that in Part 3!
# # #
But right now I’d like to have some more, “extra-credit”/“optional” fun w/ CBR! It turns out it is possible & very enlightening to create a simulation to model the accuracy-annihilating effects I described above.
Actually, I’m just going to model a “tame” version of CBR—what Carlson & Russo call “biased predecisional processing.” Basically, it’s the “rolling confirmation bias” of CBR without the “looping back” that occurs when the factfinder decides for good measure to reassess the more-or-less unbiased LRs she awarded to items of proof before she became confident enough to start distorting all the proof to fit one position.
Imagine that a factfinder begins with the view that the “truth” is equally likely to reside in either party’s case—i.e., prior odds of 1:1. The case consists of eight “pieces” of evidence, four pro-prosecutor (likelihood ratio > 1) and four pro-defendant (likelihood ratio <1).
The factfinder makes an unbiased assessment of the “first” piece of evidence she considers, and forms a revised assessment of the odds that reflects its “true” likelihood ratio. As a result of CBR, however, her assessment of the likelihood ratio of the next piece of evidence—and every piece thereafter—will be biased by her resulting perception that one side’s case is in fact “stronger” than the other’s.
To operationalize this, we need to specify a “CBR factor” of some sort that reflects the disposition of the factfinder to adjust the likelihood ratios of successive pieces of proof up or down to match her evolving (and self-reinforcing!) perception of the strength disparity in the parties’ cases.
Imagine the factfinder misestimates the likelihood ratio of every piece of evidence by a continuously increasing amount—one that results in her over- or undervaluing an item of proof by a factor of 2 at the point she becomes convinced that the odds of one party’s position rather than the other’s being “true” have reached 10:1.
What justifies selecting this particular “CBR factor”? Well, I suppose nothing, really, besides that it supplies a fairly tractable starting point for thinking critically about the practical upshot of CBR.
But also, it’s cool to use this function b/c it reflects a “weight of the evidence” metric developed by Turing and Good to help them break the Enigma code!
For Turing and Good, a piece of evidence with a likelihood ratio of 10 was judged to have a weight of “1 ban.” A piece of proof carrying one-tenth that weight—an LR of 10^0.1, or about 1.26—they called a “deciban,” and they were motivated to make the deciban the fundamental unit of evidentiary currency in their code-breaking system based on their seat-of-the-pants conjecture that a deciban was the smallest shift in the relative likelihoods of two hypotheses that human beings could plausibly perceive (Good 1985).
So with this “CBR factor,” I am effectively imputing to the factfinder a disposition to “add to” (or subtract from) an item of proof one “deciban”—the smallest humanly discernible “evidentiary weight,” in Turing and Good’s opinion—for every 1-unit increase (1:1 to 2:1; 2:1 to 3:1, etc.) or decrease (1:1 to 1:2; 1:2 to 1:3, etc.) in the “odds” of that party’s position being true.
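One way to operationalize a factor with those two properties (no distortion at even odds, a factor of 2 once the odds hit 10:1) is to let the bias grow geometrically with the weight of the current odds. This is my own gloss on the description above, not a canonical formula:

```python
import math

def cbr_factor(odds):
    """Assumed bias multiplier: 1.0 at even odds, 2.0 at odds of 10:1,
    0.5 at odds of 1:10 (i.e. it shrinks pro-prosecution LRs once the
    defense is ahead)."""
    return 2 ** math.log10(odds)

def weight_in_decibans(lr):
    """Turing/Good weight of evidence for a likelihood ratio, in decibans."""
    return 10.0 * math.log10(lr)

print(cbr_factor(1.0))                    # → 1.0 (no distortion at 1:1)
print(cbr_factor(10.0))                   # → 2.0 (factor-of-2 misestimate)
print(round(weight_in_decibans(2.0), 2))  # a factor of 2 ≈ 3.01 decibans
```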
And this figure illustrates how CBR exerts this distorting effect generally:
In the “unbiased” table, “prior” reflects the factfinder’s current estimate of the probability of the “prosecutor’s” position being true, and “post odds” the revised estimate based on the weight of the current “item” of proof, which is assigned the likelihood ratio indicated in the “LR” column. The “post %” column transforms the revised estimate of the probability of “guilt” into a percentage.
I’ve selected an equal number of pro-prosecution (LR >1) and pro-defense (LR<1) items of proof, and arranged them so they are perfectly offsetting—resulting in a final estimate of guilt of 1:1 or 50%.
In the “coherence based reasoning” table, “tLR” is the “true likelihood ratio” and “pLR” the perceived likelihood ratio assigned the current item of proof. The latter is derived by applying the CBR factor to the former. When the odds are 1:1, CBR is 1, resulting in no adjustment of the weight of the evidence. But as soon as the odds shift in one party’s favor, the CBR factor biases the assessment of the next item of proof accordingly.
As can be seen, the impact of CBR in this case is to push the factfinder to an inflated estimate of the strength of the prosecution’s position being true, which the factfinder puts at 29:1 or 97% by the “end” of the case.
But things could have been otherwise. Consider:
I’ve now swapped the “order” of proof items “4” and “8.” That doesn't make any difference, of course, if one is "processing" the evidence the way a Bayesian would; but it does if one is CBRing.
The reason is that the factfinder now “encounters” the defendant’s strongest item of proof -- LR = 0.1—earlier than the prosecution’s strongest—LR = 10.0.
Indeed, it was precisely because the factfinder encountered the prosecutor’s best item of proof “early” in the previous case that she was launched into a self-reinforcing spiral of overvaluation that made her convinced that a dead-heat case was a runaway winner for the prosecutor.
The effect when the proof is reordered this way is exactly the opposite: a devaluation cascade that convinces the factfinder that the odds in favor of the prosecutor’s case are infinitesimally small!
These illustrations are static, and based on “pieces” of evidence with stipulated LRs “considered” in a specified order (one that could reflect the happenstance of when particular pieces register in the mind of the factfinder, or are featured in post-trial deliberations, as well as when they are “introduced” into evidence at trial—who the hell knows!).
But we can construct a simulation that randomizes those values in order to get a better feel for the potentially chaotic effect that CBR injects into evidence assessments.
The simulation constructs trial proofs for 100 criminal cases, each consisting of eight pieces of evidence. Half of the 800 pieces of evidence reflect LRs drawn randomly from a uniform distribution between 0.05 and 0.95; these are “pro-defense” pieces of evidence. Half reflect LRs drawn randomly from a uniform distribution between 1.05 and 20. They are “pro-prosecution” pieces.
We can then compare the “true” strength of the evidence in the 100 cases —the probability of guilt determined by Bayesian weighting of each one’s eight pieces of evidence—to the “biased” assessment generated when the likelihood ratios for each piece of evidence are adjusted in a manner consistent with CBR.
This figure compares the relative distribution of outcomes in the 100 cases:
As one would expect, a factfinder whose evaluation is influenced by CBR will encounter many fewer “close” cases than will one who engages in unbiased Bayesian updating.
This tendency to form overconfident judgments will, in turn, affect the accuracy of case outcomes. Let’s assume, consistent with the “beyond a reasonable doubt” standard, that the prosecution is entitled to prevail only when the probability of its case being “true” is ≥ 0.95. In that case, we are likely to see this sort of divergence between outcomes informed by rational information processing and outcomes informed by CBR:
The overall “error rate” is “only” about 0.16. But there are 7x as many incorrect convictions as incorrect acquittals: the "false conviction" rate is 0.21, whereas the "false acquittal" rate is 0.04.
The reason for the asymmetry between false convictions and false acquittals is pretty straightforward. In the CBR-influenced cases, there are a substantial number of “close” cases that the factfinder concluded “strongly” supported one side or the other. Which side—prosecution or defense—got the benefit of this overconfidence was roughly evenly divided. However, a defendant is no less entitled to win when the factfinder assesses the strength of the evidence to be 0.5 or 0.6 than when the factfinder assesses the strength of the evidence as 0.05 or 0.06. Accordingly, in all the genuinely “close” cases in which CBR induced the factfinder to form an overstated sense of confidence in the weakness of the prosecution’s case, the resulting judgment of “acquittal” was still the correct one. But by the same token, the result was incorrect in every close case in which CBR induced the factfinder to form an exaggerated sense of confidence in the strength of the prosecution’s case. In sum, the proportion of cases in which CBR can generate a “wrong” answer is much higher among those the defendant deserves to win than among those the prosecution does.
This feature of the model is an artifact of the strong “Type 1” error bias of the “beyond a reasonable doubt” standard. The “preponderance of the evidence” standard, in contrast, is theoretically neutral between “Type 1” and “Type 2” errors. Accordingly, were we to treat the simulated cases as “civil” rather than “criminal” ones, the false “liability” outcomes and false “no liability” ones would both be closer to the overall error rate of 16%.
Okay, I did this simulation once for 100 cases. But let’s do it 1,000 times for 100 cases—so that we have a full-blown Monte Carlo simulation of the resplendent CBR at work!
These are the kernel density distributions for the “accurate outcome,” “false acquittal,” and “false conviction” rates over 1,000 trials of 100 cases each:
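For concreteness, a self-contained sketch of the Monte Carlo exercise (again, the 2**log10(odds) bias multiplier and the seed are my assumptions; the 0.95 threshold and the 1,000-trials-of-100-cases design are from the text). Note the rates here are computed per case, which may not match exactly how the figures condition them:

```python
import math
import random

BRD = 0.95  # "beyond a reasonable doubt" threshold used in the text

def case_probs(rng):
    """True and CBR-biased guilt probabilities for one eight-item case."""
    lrs = ([rng.uniform(0.05, 0.95) for _ in range(4)]
           + [rng.uniform(1.05, 20.0) for _ in range(4)])
    rng.shuffle(lrs)
    true_odds = biased_odds = 1.0
    for lr in lrs:
        true_odds *= lr
        biased_odds *= lr * 2 ** math.log10(biased_odds)  # assumed bias form
    return true_odds / (1 + true_odds), biased_odds / (1 + biased_odds)

def trial_rates(rng, n_cases=100):
    """Per-case false-conviction and false-acquittal rates for one 'trial'."""
    fc = fa = 0
    for _ in range(n_cases):
        t, b = case_probs(rng)
        if b >= BRD and t < BRD:
            fc += 1  # CBR convicts where unbiased updating would acquit
        elif b < BRD and t >= BRD:
            fa += 1  # CBR acquits where unbiased updating would convict
    return fc / n_cases, fa / n_cases

rng = random.Random(7)  # arbitrary seed
rates = [trial_rates(rng) for _ in range(1000)]
mean = lambda xs: sum(xs) / len(xs)
print(mean([fc for fc, _ in rates]), mean([fa for _, fa in rates]))
```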
Carlson, K.A., Meloy, M.G. & Russo, J.E. Leader-driven primacy: Using attribute order to affect consumer choice. Journal of Consumer Research 32, 513-518 (2006).
Good, I.J. Weight of evidence: A brief survey. In Bayesian Statistics 2: Proceedings of the Second Valencia International Meeting (Bernardo, J.M. et al. eds., 1985).
Holyoak, K.J. & Simon, D. Bidirectional reasoning in decision making by constraint satisfaction. J. Experimental Psych.: General 128, 3-31 (1999).
Simon, D. A third view of the black box: Cognitive coherence in legal decision making. Univ. Chi. L. Rev. 71, 511-586 (2004).