Saturday, June 27, 2020

The CBGM: Critically Biased?

Our guest today is Dr. Stephen Carlson of Australian Catholic University.  He is perhaps best-known to some readers due to his 2012 dissertation (at Duke University) that featured a very detailed compilation of the book of Galatians.   Dr. Carlson, thank you for joining us.

SCC: Thank you, Jim, for your interest.

JSJ:  You wrote a recent article that appeared in the Journal of Biblical Literature with a provocative title:  “A Bias at the Heart of the Coherence-Based Genealogical Method.”  Before we get to the article’s substance, could you briefly explain the claims that advocates of the Coherence-Based Genealogical Method have made about it?  What is the CBGM supposed to provide that we did not have before, and how?

SCC: The basic claim of the advocates of the CBGM is that they have a “more rigorous” way to evaluate external evidence in the textual criticism of the New Testament. External evidence, your readers may recall, is the weight we put on a particular variant reading due to the manuscripts that record it. Prior to the CBGM, the usual way to deal with external evidence was to sort the manuscripts into text types like Alexandrian, Western, and Byzantine, and then evaluate the external evidence based on how well the text types support a particular variant reading. And the CBGM folks are right that this approach is not sufficiently rigorous. Indeed, a big problem with this traditional approach is contamination, where a manuscript may obtain its readings from multiple sources. This makes it difficult to define the various text types (the rise and fall of the “Caesarean” text type in Mark is a good case in point) and hard to assign a manuscript to a particular type when it has the characteristic readings of more than one text type. In essence, the CBGM proposes to be more rigorous than this by eschewing text types altogether and looking at relations between “potential ancestors” of various manuscripts. In my article, I argue that the way that potential ancestors are identified and even defined is fundamentally flawed and that we should look for other ways of evaluating the external evidence.

JSJ:  The CBGM has a reputation for being complex and inaccessible.  But in your recent article, you state that you have been able to implement its algorithms, and that as a result, you noticed a problem.  Would it be accurate to say that you detected a built-in bias in the “genealogical coherence” aspect of the C.B.G.M. as it currently exists?  

SCC: Yes, I detected a bias in how they identify genealogical coherence. In the CBGM, genealogical coherence comes from manuscripts having a common, extant “potential ancestor” in their textual flows, and potential ancestors are identified by how much they differ from the initial text. But distance from the initial text is not a valid genealogical criterion, and it can be misled by genealogically irrelevant data. As a result, the CBGM is biased against bad copies of earlier texts and in favor of good copies of later texts. Bias is a problem, of course, because it distorts our ability to evaluate the external evidence and it gives more weight to certain manuscripts (or less weight to others) than we would if we knew the actual history of the text. The worst that can happen is that the CBGM would give apparently strong support to a late, non-initial reading, especially where the internal evidence is not decisive enough to countermand the misleading impression of the CBGM.
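To make the criterion concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model in which each witness is a list of readings at the same variation units, and the direction between two witnesses is decided purely by which one disagrees less with the initial text, as described in the answer above. The function names and the toy data are hypothetical illustrations, not the CBGM's actual software.

```python
def distance(witness, initial):
    """Count variation units where the witness departs from the initial text."""
    return sum(1 for w, a in zip(witness, initial) if w != a)


def is_potential_ancestor(candidate, other, initial):
    """Simplified criterion: the candidate is closer to the initial text."""
    return distance(candidate, initial) < distance(other, initial)


# Toy data: ten variation units; "a" marks the initial reading everywhere.
initial = list("aaaaaaaaaa")
# Hypothetical early but careless copy made directly from the initial text.
early_sloppy = list("bbbbbbaaaa")   # six errors
# Hypothetical late but careful copy, several generations removed.
late_careful = list("aaaaaaaccc")   # three errors in total

print(distance(early_sloppy, initial))   # 6
print(distance(late_careful, initial))   # 3
print(is_potential_ancestor(late_careful, early_sloppy, initial))  # True
```

Under this toy criterion the late, careful copy counts as a potential ancestor of the early, careless one, even though neither descends from the other. That is the kind of distortion the answer describes.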

JSJ:  Generally, it’s understandable to assume parsimony, but random things sometimes happen that affect the text, such as having the same scribal accident occasionally occur independently in different transmission-streams.  How are these things handled?  How many “accidental agreements” have to occur before one says, “These agreements are not accidental”?  Or to put it another way:  could you explain the concept of coherence and non-coherence?

SCC: Accidental coincidence is a major problem. In fact, I think it is the most underappreciated problem among New Testament textual critics (who tend to be more worried about contamination). The CBGM does have an approach to accidental coincidence, which its proponents tend to call “multiple emergence.” Basically, you look at all the manuscripts attesting a particular reading and sort them into groups, so that each group is coherent (that is, having a textual flow that goes through a common, potential ancestor). When the manuscripts are not coherent, they’ll be in their own group. If you have multiple groups of such manuscripts, then you have multiple emergence of the variant. Of course, if the CBGM is not able to identify correctly that a group of manuscripts attesting the same reading is actually coherent because of some bias, then the CBGM will wrongly subdivide them into several groups and suggest that some readings are coincidental when they are in fact not.
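The grouping step just described can be sketched roughly as follows. This is only an illustration under stated assumptions: each witness comes with a precomputed list of potential ancestors ranked closest first, and a witness is joined to a group when one of its closest potential ancestors attests the same reading. The function `coherent_groups`, its `connectivity` argument, and the witness names W1 to W9 are all hypothetical.

```python
def coherent_groups(attesting, ranked_ancestors, connectivity=1):
    """Partition witnesses attesting one reading into coherent groups.

    attesting: set of witness names sharing the reading.
    ranked_ancestors: dict mapping witness -> potential ancestors, closest first.
    connectivity: how many of the closest potential ancestors may be searched.
    """
    parent = {w: w for w in attesting}

    def find(w):
        # Follow links to the group representative (with path halving).
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for w in attesting:
        for anc in ranked_ancestors.get(w, [])[:connectivity]:
            if anc in attesting:
                parent[find(w)] = find(anc)  # merge the two groups
                break

    groups = {}
    for w in attesting:
        groups.setdefault(find(w), set()).add(w)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical witnesses attesting the same reading.
attesting = {"W1", "W2", "W3", "W4"}
ranked_ancestors = {
    "W2": ["W1"],
    "W3": ["W9", "W2"],   # closest potential ancestor W9 has another reading
    "W4": ["W8", "W7"],   # neither potential ancestor attests the reading
}

print(coherent_groups(attesting, ranked_ancestors, connectivity=1))
print(coherent_groups(attesting, ranked_ancestors, connectivity=2))
```

In this toy run the strict setting yields three groups (three apparent emergences of the reading), while the looser setting yields two. If the ranked lists themselves are skewed by a bias, the groups are skewed with them.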

JSJ:  Here’s a diagram [resembling Figure 4 in your article] reconstructing a simple transmission-stream.  In your article the flow is from left to right; here, it is from top to bottom, waterfall-style.  Can you tell us what this diagram is saying, and what is wrong with this picture?

SCC: This diagram is a simple stemma of a hypothetical history of textual transmission. The story here begins at the top with A, the initial text. Two copies, B and X, are made of it, and B has one error, while X has two. (This is represented in the diagram with the length of the branch between A and X being twice the length of the branch between A and B.) Likewise, two copies, C and Y, are made of X, with C being more error prone than Y. Similarly, two copies are made of Y, E and D, with D more error prone than E. If we lose A, X, and Y, can we reconstruct the true history of the text based on B, C, D, and E?
            It turns out that if we assume no contamination or coincidences, we can reconstruct the history on the traditional “common-error” principle, but under the same assumptions we cannot under the CBGM. The reason that the CBGM cannot reconstruct the true history of the text under these very ideal conditions is that it has a bias that makes accidental coincidences between B and E look coherent when they are not. And it suggests that the variants that B and E carry are better than the ones carried by C and D. For 1 John, these relations actually hold if you translate B to the fourth-century 03 (B/Vaticanus), C to the fourth-century 01 (ℵ/Sinaiticus), D to the fifth-century 02 (A/Alexandrinus), and E to the tenth-century 1739 (but a very good copy of a much earlier text). So this simple stemma does not point to a merely theoretical problem but an actual one in the transmission of 1 John.
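The stemma just described can be simulated in a few lines of Python. The sketch below assumes that every copying step introduces errors unique to that step (no coincidence, no contamination), and that each extant witness's nearest "potential ancestor" is the closest witness among those with fewer departures from A. The specific error counts for C, Y, D, and E are assumptions chosen only to respect "C more error prone than Y" and "D more error prone than E"; the helper names are hypothetical.

```python
# True stemma (hypothetical): A -> B, A -> X; X -> C, X -> Y; Y -> D, Y -> E.
# Each witness is modelled as the set of its errors against A.
def copy_of(parent_errors, label, n_new):
    """Return the parent's errors plus n_new fresh errors tagged by label."""
    return parent_errors | {f"{label}{i}" for i in range(n_new)}


A = frozenset()
B = copy_of(A, "b", 1)   # one error, as in the interview
X = copy_of(A, "x", 2)   # two errors, as in the interview
C = copy_of(X, "c", 4)   # assumed count
Y = copy_of(X, "y", 1)   # assumed count
D = copy_of(Y, "d", 4)   # assumed count
E = copy_of(Y, "e", 1)   # assumed count

extant = {"B": B, "C": C, "D": D, "E": E}   # A, X, and Y are lost


def nearest_potential_ancestor(name):
    """Closest witness (fewest disagreements) among those nearer to A."""
    me = extant[name]
    candidates = [n for n, w in extant.items() if len(w) < len(me)]
    if not candidates:
        return None
    return min(candidates, key=lambda n: len(extant[n] ^ me))


def shared_errors(m, n):
    """Errors common to two witnesses: the basis of the common-error principle."""
    return extant[m] & extant[n]


flow = {n: nearest_potential_ancestor(n) for n in extant}
print(flow)                              # {'B': None, 'C': 'E', 'D': 'E', 'E': 'B'}
print(sorted(shared_errors("D", "E")))   # ['x0', 'x1', 'y0']
print(sorted(shared_errors("C", "E")))   # ['x0', 'x1']
print(sorted(shared_errors("B", "E")))   # []
```

In this toy run the shared errors nest correctly (D and E together, then C, with B apart), which matches the true stemma. The distance-based flow instead runs B to E to C and D, so any chance agreement between B and E would sit on a direct line of apparent descent.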

JSJ:  How realistic is it, in your opinion, to use real-life manuscripts’ texts as proxies for potential ancestors of other manuscripts’ text?   Especially considering that we have a relatively small representation of surviving manuscripts, and also considering that no versional evidence and no patristic evidence is used in the  CBGM?

SCC: It’s only realistic within the Byzantine text and only if we look at a lot of them. Otherwise, it’s not realistic at all. Outside of the Byzantine text, the manuscripts are too few and too divergent from each other to be good proxies for potential ancestors. Due to the bias at the heart of the CBGM, the extent of these divergences is enough to make many of them appear to be potential descendants of more carefully copied texts, when they are in fact cousins to varying degrees. Indeed, the big problem with the potential ancestor notion in the CBGM is that it assumes that all relations between manuscripts can be characterized in terms of ancestors and descendants, instead of siblings and cousins, which are vastly more common in the historical record we actually possess. As for versional and patristic evidence, the CBGM does not even look at them, and even if it did, they may be so incomplete that it could yield nonsensical results (imagine an Old Latin manuscript turning out to be a potential ancestor of a Greek one).

JSJ:  Toward the end of your article, you pointed out that the CBGM gives an unjustifiable level of weight to a combination of witnesses – a combination that includes 1739 – in First John 1:7, where δε is not included in the text of Nestle-Aland 28 even though its support is both ancient and vast.  Again:  what’s wrong with this picture?

SCC: This is the bias in action. The CBGM really likes 1739 due to its relatively short distance to the initial text. This means that every reading it has (including its singular readings) is potentially the initial reading even when every earlier text disagrees with it. This means that the critic has to establish the text based solely on internal evidence, which is notoriously difficult in cases involving “particles and articles” that don’t really affect the propositional meaning or translation of the text. In the past, textual critics didn’t think this external evidence was good enough to warrant serious consideration; with the CBGM now they apparently do. I can only hope that our ability to evaluate the internal evidence for more substantive variant readings is good enough to overcome the CBGM’s bias.

JSJ:  Let’s look at another textual variant that was adopted in Nestle-Aland 28:  the Byzantine reading Πρεσβυτέρους τους at the beginning of First Peter 5:1.  Vaticanus, Alexandrinus, P72 and 2412 read Πρεσβυτέρους οὖν, Sinaiticus, Y, 623, and 1611 read Πρεσβυτέρους οὖν τους, and 1505 simply supports Πρεσβυτέρους.  I can see how internal arguments could lead to the adoption of τους, but how does the CBGM get there?  And how can one tell when the CBGM has had a decisive role in decision-making in NA28, and when it was not a factor?

SCC: This variation unit is one of those where the editors of the Editio Critica Maior (ECM) changed their mind. In the first edition of the ECM for 1 Peter in 2000, they went with Πρεσβυτέρους οὖν with 03 (B, Vaticanus); but in the second edition of the ECM in 2013, they went with Πρεσβυτέρους τοὺς with 1739 instead. Now, 03 and 1739 are the two closest manuscripts to the initial text for the CBGM, so their readings are always going to look good for the CBGM, particularly when the Byzantine text agrees with them. Moreover, all the variants are coherent, so there is little guidance on that front. Apparently, what happened is that the editors changed their mind on the internal evidence between the two editions. Why they did so is unclear, and I cannot find any documentation or commentary on this variant. The only clues I have are the local genealogies published on the Münster institute’s website, and they differ between the two editions. In any case, the external evidence is effectively neutralized here under the CBGM and plays no important role.

JSJ:  Do you think that the recent decision to adopt μέρει in First Peter 4:16 was made primarily due to a rethinking of internal considerations, and the CBGM was simply along for the ride?  Mink’s argument (see p. 72 of Wasserman & Gurry’s New Approach) sure sounds like it was driven by internal evidence.

SCC:  This variant gets a bit outside the scope of my paper, and it takes some explaining, but it shows a different way that the bias at the heart of the CBGM can pop up. There are two readings in 1 Pet 4:16. The older reading of the NA27 is “in this name” (ἐν τῷ ὀνόματι τούτῳ) and is supported by an all-star cast of P72, 01 (ℵ/Sinaiticus), 02 (A/Alexandrinus), 03 (B/Vaticanus), 044 (Ψ), 33, 81, 1611, 1739, Old Latins, Coptic, Syriac, Armenian, Gothic, Ethiopian, and Cyril. The testimony of the earliest and most widespread witnesses is unanimously in favor of “in this name.” And it makes good sense in light of our knowledge of the earliest persecutions against Christians. The newer reading in the NA28, “in this respect” (ἐν τῷ μέρει τούτῳ), is entirely Byzantine (049, P, 104, 180, etc.).
            It is important to note that the Byzantine text is not monolithically in favor of the second reading: there are also quite a few Byzantine manuscripts that have the “in this name” reading. In my research, this is the result of contamination, because I have ways of connecting the Byzantine manuscripts with this reading to older, non-Byzantine texts, but the CBGM can’t find this contamination because its potential ancestor formula is flawed. In fact, it gets the source relationships backwards, and is unable to recognize the actual sources of the contamination. As a result, the user of the CBGM is misled into thinking that going from “in this respect” to “in this name” is a common, independent change, when in fact the opposite was more common: correcting an older manuscript with “in this name” to “in this respect” in conformance with the more common, contemporary Byzantine reading. Consequently, I strongly suspect that the CBGM results in this case have colored the editors’ reassessment of the internal evidence, causing them to favor a different sense of the transcriptional probabilities than their predecessors. For a good internal analysis on the merits of the previous NA27 reading (“in this name”) see Jarrett Knight’s article in JBL last year.

JSJ:  I’ve gotten the impression that the more rival readings there are in a particular variant-unit, the less useful the CBGM becomes – downright chaotic – and the more unstable the Nestle-Aland compilation is likely to become at those points.  Have you gotten this impression, and if so, why does this seem to be the case?

SCC: There is a big issue over the size and scope of variation units that is largely ignored in our discussions to date, so there isn’t much to go on. I suspect that, as in the case of 1 Pet 5:1, when all the variants are coherent (which seems to happen more easily when there are more of them), then the CBGM does not have much to offer the textual critic for decision. But I’ll need to look at a lot more of them to be more confident.

JSJ:  When I look at things like the diagram of the textual flow for Second Peter 3:10 (on page 76 of Wasserman & Gurry’s A New Approach to Textual Criticism, about the CBGM), it looks like the CBGM began by building a line of descent for each set of rival variants in a specific variation-unit, and somewhere along the line its focus shifted, from being about relationships of readings, to something more concrete, involving relationships of manuscripts (or, manuscripts’ texts).  (See the diagrams on pages 89-91, and then, on p. 105, “The global stemma for the Harklean Group in the Catholic Letters.”)  I still don’t quite grasp how that was done – how the global stemma was made without simply ignoring some of the data.  Could you explain that in a little more detail?

SCC:  The key thing to know about the global stemma is that, aside from a few toy examples, it was never published or used to edit the text in the ECM. I spent a lot of time trying to understand it and how it relates to the textual flows, only to learn that it is still under development and irrelevant to the text of the NA28. I recommend ignoring it until it is actually implemented, because who knows how it may change. All I can say is that the portion of the global stemma published in Wasserman & Gurry defies easy historical interpretation.

JSJ:  Are there any other reasons to approach the CBGM with caution?

SCC: Let me enumerate some of them.
(1) In addition to its bias, we mentioned that the CBGM does not take into account versional and patristic evidence, an important set of evidence for the early periods of the text.
(2) The behavior of the “connectivity parameter,” which we have not discussed, seems to be affected by the sampling bias, so the number would have to be different in the well-sampled Byzantine text than outside of it, but the CBGM has no provision for this.
(3) Another issue is that the method may be too beholden to what the editors think is the initial text. For the ECM/NA28, the editors started with a subset of the NA27 for all intents and purposes, but what if they started with the Byzantine or Codex Bezae? At this point, it’s an open question.
(4) Further, I suspect that the CBGM is not even finding contamination correctly (see above for 1 Pet 4:16), but that is something under active research and a matter for a different time.
(5) Finally, the major problem I have with the ECM and the NA28 is that the editors have not adequately explained their reasoning in all the places where they changed the text. This is particularly important because the CBGM’s problems mean that the external evidence will appear less decisive than it used to, which puts a lot more pressure on getting the internal arguments right. Yet, when the internal arguments are not documented, it forces people to assume it was the CBGM that caused the change when it could have been something else. Fortunately, the decisions for Acts are better documented than those for the Catholic Letters, but even then I still want something even more thorough.

JSJ:  Thank you for sharing your thoughts with us.

SCC: You’re welcome. I hope my explanations are helpful to your readers.


Timothy Joseph said...

Thanks for the interview with Stephen! The CBGM has become the new ‘Received Method’ with little interaction with the process except by proponents of the method! I hope you and Stephen continue to interact so that the rest of us can hear more than one side!


Matthew M. Rose said...

Very interesting! Thank you gentlemen for the helpful interview.

"Now, 03 and 1739 are the two closest manuscripts to the initial text for the CBGM, so their readings are always going to look good for the CBGM, particularly when the Byzantine text agrees with them."

...This statement speaks volumes.

It will be interesting to see how the more-or-less mixed texts of the papyri and the anti-coherence 'black sheep' that is codex א are dealt with in the Gospels. I suspect more obscure 'sniper shot' variant readings and perhaps a slight readjustment towards the Byzantine;--but all in all, Codex B will probably be given precedence (time will tell). It seems (to me at least) that the Byz. Text is the most viable option as a base Text for any *new* critical edition of the Greek NT. Because--honestly, how much could the original autographic Text truly differ from the common Ecclesiastical Text that the Byz. Txt. *was/is* within the Greek manuscript tradition?

The position of Hort continues to age very badly, and CBGM appears to be no exception in this regard.

Maurice A. Robinson said...

Many thanks to Dr Carlson for his perceptive comments, particularly since for most of us CBGM remains a black box of which few beyond Mink know the actual program or methodological implementation.

What would be preferable of course would be a release of the *actual* computer code, in order that those best qualified to evaluate such (whether Carlson, Gurry, Wasserman, McCollum, or various programmers) might derive some more definitive answers as opposed to the usual technobabble: this in order that the overall GIGO effect or lack of such could legitimately be evaluated by everyone (unless, of course, Mink and/or Münster simply don’t want a fair and open evaluation; but they of course are honorable people).

As matters stand, I envision a forthcoming article (not by me of course): "The New Wardrobe of Textual Criticism: Why Your Greek New Testament is Changing and Why It Probably Shouldn't Be" (with apologies to the Emperor and Peter Gurry).