DNA as Negative Evidence, Revisited – The DNA Geek

Negative evidence is the absence of something you’d expect to see under the circumstances.  We use negative evidence every day, without even being aware of it.  For example, if your neighbor always parks in her driveway and her car isn’t there, you’d naturally assume she isn’t home.

Vincent (gold) didn’t match the Barton descendants (green). Diagram drawn in LucidChart.

We can also use negative evidence consciously and deliberately to draw conclusions about genealogy.  An earlier post described Vincent, who was expected to share DNA with a group of five people.  When he didn’t, we concluded that he was not related to them the way we assumed.

We can make inferences about our DNA matches even when we don’t have direct access to their DNA kits.  Consider Cousin Bob, who shares 650 cM with you.  This amount of DNA is well in range for a first cousin, but it’s just as likely (in fact, slightly more likely) to be a half first cousin.  Given the right DNA matches, you can use negative evidence to determine which is true.

Diagram drawn in LucidChart.

How?  If a senior 1C1R (first cousin once removed in the older generation) has tested on each side, you can use shared matches to see whether Bob matches both of them.

If he’s your full first cousin, he should.  On the other hand, if Bob matches the 1C1R on your grandmother’s side and not the one on your grandfather’s side, then he’s a half cousin rather than full.  That’s because 1C1Rs always share DNA.


Shared Matches

Shared matches allow us to draw conclusions about our matches even when we can’t see their match lists ourselves.  All the main DNA databases have some version of this.  It’s called “Relatives in Common” at 23andMe and “ICW” (in common with) at FamilyTreeDNA, but they do basically the same thing:  for any given DNA match (like Bob), the feature shows you which of your other matches also share DNA with that person.

There’s one exception, though.  AncestryDNA’s version of this feature has a threshold.  It only shows shared matches who share at least 20 cM with both you and the relative of interest.  If two matches share 19.99 cM or less, they won’t show as shared matches even if they really do match one another.

For Bob, the threshold doesn’t matter.  First cousins once removed always share more than 20 cM.  In fact, the lowest of 3,700 self-reported values in the Shared cM Project was 102 cM, and the lowest I found in 10,000 simulations using Ped-Sim was 50 cM.  (For reasons I won’t go into here, Ped-Sim slightly underestimates shared centimorgans, but it’s close enough for our purposes.)

The picture changes for second cousins once removed and more distant relationships.  It’s possible to have a true 2C1R who shares less than 20 cM, and the chances increase for 3C, 3C1R, and so on.  So how can we use negative evidence for these more distant connections when we don’t have direct access to the DNA kit in question?
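As a minimal sketch of the threshold rule just described, the check below treats a match as visible in the shared-match list only when both amounts reach 20 cM. The function and constant names are made up for illustration; this is not AncestryDNA’s code or API, just the rule as stated in the text.

```python
# Hypothetical helper illustrating the 20 cM shared-match threshold
# described in the text. Names are illustrative assumptions only.
ANCESTRY_SHARED_MATCH_MIN_CM = 20.0


def shows_as_shared_match(cm_with_you: float,
                          cm_with_relative: float,
                          threshold: float = ANCESTRY_SHARED_MATCH_MIN_CM) -> bool:
    """Return True if a match would appear in the shared-match list.

    cm_with_you      -- cM the match shares with the kit you can see
    cm_with_relative -- cM the match shares with the relative of interest
    """
    return cm_with_you >= threshold and cm_with_relative >= threshold


# A true relative sharing 19.99 cM with the other person stays hidden.
print(shows_as_shared_match(250.0, 19.99))
# A match at or above the threshold on both sides is shown.
print(shows_as_shared_match(250.0, 102.0))
```

The point of the sketch is that a missing shared match only counts as negative evidence when the expected relationship would reliably clear the threshold.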


Was Mae William’s Daughter?

Consider William and Nancy.  Nancy had a daughter named Mae who was born suspiciously close to the time Nancy married William.  Was William Mae’s biological father?

Jacob, the grandson of William and Nancy, has tested, as has Mae’s great-granddaughter, Justine.  They share 117 cM.  Jacob and Justine would be 1C2R if William were Mae’s father and half 1C2R if not.  The total of 117 cM favors half 1C2R, but with a WATO score of only 4.  This is weak evidence.  Can we do better?

William had two other wives, and at least 14 descendants from those other two marriages have tested at AncestryDNA.  If William were Mae’s father (that is, if Hypothesis 1 is true), these matches would represent two half 1C2R, two half 2C1R, five half 3C, and five half 3C1R to Justine.  None of them are shared matches with her.  (I have access to Jacob’s kit but not Justine’s.)

Diagram drawn in LucidChart.

I feel pretty comfortable concluding that William was not, in fact, Mae’s father.  But how confident can we be?  Technically, it’s possible for each of these relationships to share less than 20 cM, but all 14 of them?  What are the odds of that happening?

Dr. Andrew Millard, a professor at Durham University in the UK, kindly performed some simulations to arrive at an answer.  He came up with probabilities that a match of known relationship would share less than 20 cM.

Now let’s work some math mojo on this problem!  If Hypothesis 2 is true and William was not Mae’s father, there’s a 100% chance that Justine will share less than 20 cM with the 14 matches, or a probability PH2 = 1.  In fact, she won’t share any DNA with them at all, and AncestryDNA’s threshold doesn’t matter.

If instead Hypothesis 1 is true and Mae was William’s daughter, the probability that Justine would share less than 20 cM with both of the half 1C2Rs is PH1 = 0.029 x 0.029 = 0.00084.  We can convert to a WATO score (an odds ratio) by dividing PH2 by PH1:  1/0.00084 = 1,190.  That’s a pretty convincing score!

We don’t have to stop there.  We can include the other 12 descendants of William in the calculations.  (Remember:  The probability of multiple events all happening is the product of the individual probabilities.)  When we do this, we get a combined probability of 3.52 x 10^-7 for Hypothesis 1 and an odds ratio of 1/(3.52 x 10^-7) = 2,839,994.  Wowza!

Even though WATO was inconclusive and we didn’t have access to Justine’s match list, negative evidence gave us overwhelming support for the conclusion that William was not Mae’s father.
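The arithmetic above can be sketched in a few lines of Python. The function name is made up for illustration, and the only per-relationship probability used is the 0.029 quoted for a half 1C2R; the values for the other relationships are not given in this post, so they are deliberately left out rather than guessed.

```python
import math


def odds_ratio_for_h2(p_below_threshold_if_h1):
    """Odds ratio favoring Hypothesis 2 (not related through William).

    p_below_threshold_if_h1 -- for each independent match, the probability
    of sharing less than 20 cM if Hypothesis 1 were true.
    """
    # Under H1, all matches falling below the threshold is the product
    # of the individual probabilities (this assumes independence).
    p_h1 = math.prod(p_below_threshold_if_h1)
    # Under H2 there is no shared DNA at all, so the probability is 1.
    p_h2 = 1.0
    return p_h2 / p_h1


# The two half 1C2R matches, using the 0.029 value from the text.
print(round(odds_ratio_for_h2([0.029, 0.029])))

# Converting the rounded combined probability for all 14 matches.
print(round(1 / 3.52e-7))
```

Without intermediate rounding, the two-match case comes out near 1,189 rather than 1,190, and the rounded 3.52 x 10^-7 gives roughly 2.84 million. The small differences from the figures in the text are presumably down to where the rounding happens.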


So far, I’ve neglected to mention an important statistical consideration:  independence.  If two events are correlated with one another, you can’t use both in the calculations.  In the Vincent example, notice that Barton 2’s child has also tested but is labeled “Ignored Barton”.  That’s because how much DNA the child shares with Vincent depends on how much Barton 2 shares.  The child is not independent.

In the Mae example, there were actually 10 other descendants of William that I excluded from the analysis, because they were closely related to another match.  That is, they were not independent.  I chose to draw the line at matches who were more closely related than first cousins (children, grandchildren, siblings, and niblings were ignored).

In the future, we’ll have tools that can account for non-independence automatically.  Until then, be sure to exclude closely related matches from calculations like these.
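One way to picture this exclusion step is a simple greedy filter: walk through the matches and keep one only if it is not closely related to a match already kept. This is a sketch under assumptions, not the procedure actually used for the Mae analysis. The match names and the set of “close” pairs below are invented, and deciding which pairs count as closer than first cousins still has to be done by hand from the tree.

```python
def select_independent(matches, close_pairs):
    """Keep matches that are not closely related to any match already kept.

    matches     -- match names, in the order you prefer to keep them
    close_pairs -- set of frozensets, each holding two matches judged to be
                   closer than first cousins (parent/child, siblings, etc.)
    """
    kept = []
    for candidate in matches:
        # Drop the candidate if it pairs closely with anything kept so far.
        if all(frozenset((candidate, k)) not in close_pairs for k in kept):
            kept.append(candidate)
    return kept


# Hypothetical example: B is A's child, and D is C's sibling.
matches = ["A", "B", "C", "D", "E"]
close_pairs = {frozenset(("A", "B")), frozenset(("C", "D"))}
print(select_independent(matches, close_pairs))
```

Because the filter is greedy, the order of the input list decides which member of a close family group survives, so it makes sense to list the matches you most want to keep first.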



This blog post is based on a discussion in The DNA Roundtable Facebook group.  Many thanks to all who participated, especially Dr. Andrew Millard and Malcolm Peach, who both performed computer modeling to arrive at the probabilities.  Special thanks to William Best for reminding me to explain the importance of independence in statistical analysis.

Updates to this Post

15 August 2022: Added an explanation of independence.


