There are several important questions that face Project Administrators of Surname DNA Projects:
- Why should I group people together?
- How should I group people together?
- What does each group tell me?
As an Administrator of 15 DNA Projects for a variety of Irish Surnames, I’ve pondered these issues, explored different alternatives, fallen down rabbit-holes, and revised my thinking. So here is my current streamlined approach; no doubt it will evolve further as time goes by. These are just my own personal musings, and other admins may differ in their approach (and that is fine, as there is no right way or wrong way to run a project). And the discussion below applies only to Surname DNA Projects. Other DNA Projects will have different reasons for grouping and therefore different grouping strategies.
I offer these thoughts and ideas so that project members may get a better understanding of the thinking behind the process of grouping people, and so that project administrators might pick up a few useful tips. Please take what you like and discard the rest.
So let’s explore each of these topics in turn.
Why should I group people together?
For me, the purpose of a Surname DNA Project is to study the surname. That may seem obvious but it has several important implications.
Firstly, fixed inherited surnames arose in Ireland about 1000 years ago and in England & Scotland about 800 years ago. Wales was a bit later still (with some parts of Wales not adopting the practice of a fixed inherited surname until the 1850s). This defines the period of study as being roughly the last 1000 years. And therefore, we should aim to create subgroups of people who are related to each other within that timeframe.
For Irish and Scottish surnames at least, anything beyond 1000 years ago steps into the realm of Clan history, and that in itself is a fascinating area of research, but one that falls more under the remit of geographic projects (e.g. the Munster Irish project), haplogroup projects (e.g. R-L226 project), or even specific clan projects (e.g. Historical Breifne Clans project).
So for surname projects, we should be aiming to identify groups of people with the same surname who are likely to be related to each other within the last 1000 years. Such groups are likely to descend from a single individual who was the progenitor of the surname for that group.
And if we’re lucky, we may be able to make a case for having identified the genetic signature of the first person to bear the name 1000 years ago. For Irish surnames, we may even be able to link this to some of the Traditional Genealogies and therefore to a specific Irish clan, thus connecting project members with a much deeper part of their ancestral heritage.

How should I group people together?
Some years ago I developed the concept of Markers of Potential Relatedness (MPR). Simply stated, these are markers that point you toward the conclusion that two or more people are related to each other. And by “related” I mean within the last 1000 years.
These Markers of Potential Relatedness help us to identify people who may be related within the last 1000 years and who therefore belong within the same subgroup.
You can see a presentation that takes a deep dive into this concept in this video here, but the most useful MPRs in practice (and the main ones I use for grouping people together) are as follows:
- a known relationship
- same downstream SNP
- close Genetic Distance to people with the same surname
- same USP (Unique STR Pattern)
Let’s go through each in turn.
A Known Relationship
The first one is obvious: if two people have a known relationship, then clearly they are “related within the last 1000 years” and therefore belong in the same group. Some people may not know that they are related (e.g. 4th cousins) but have the same common ancestor showing up in the “Paternal Ancestor Name” column on the project’s Results Page. A little communication between these project members can confirm the connection and justify their being grouped together.

Same downstream SNP
If two people share the same “downstream” SNP (i.e. close to 1000 years old or less), then I group them together, especially if they have the same surname.
Rob Spencer’s Admin Utilities tool is a great way of seeing exactly where a particular SNP sits and what SNPs sit above it. Entering any SNP will generate the SNP Sequence for that SNP.

TMRCA dates for downstream SNPs can be checked by simply googling the SNP name and YFULL.
People with the same downstream SNP but a different surname may be an indication that a Surname Change has occurred at some point in the past. The trouble is that without other information, you won’t know on whose ancestral line the change occurred. Then you are faced with the classic question: which came first, the Fry chicken or the Boylan egg?
The Big Y test gives much more definitive data than SNP Packs or single SNP tests and is my preferred (and recommended) method of SNP-testing.
Close Genetic Distance to people with the same surname
When a new member joins one of my projects, the first thing I do is check whether or not he has the surname being studied (or one of its possible variants). I then check his Y-STR Matches and see if he matches any other project members; if he does, I assign him to the same group that they are in. I will also double check that any downstream SNP data he has is consistent with the SNP results of other members of that group. And I may also check to see if he shares any Unique STR Pattern that characterises that particular group (see below).
Most of the time this criterion is perfectly fine for grouping people together, but we can run into major difficulties if there is significant Convergence present, i.e. just by chance, the genetic profile of a person is similar to the profile of many other “non-related” people. This has been a big issue with the M222 groups in some of my projects.
You can recognise when Convergence is likely to be present by looking at the number of matches: if a project member has a huge number of matches to a wide variety of different surnames, then Convergence is likely and most of these will be “false positive” matches. Yes, he does share a common ancestor with every single match, but this may be thousands of years ago rather than hundreds. In other words, the connection is a lot further back than it appears to be. And it may be well beyond the arbitrary 1000 year threshold we have set for defining subgroups.
In this situation, I would group everyone with the same surname (or variant) into the same large overarching group (call it, say, Group 3). All of these people may or may not be related within the last 1000 years.
Then within this large group, I would create subgroups (3a, 3b, etc) of people with known downstream SNP data that places them on a downstream branch of the Tree of Mankind close to our 1000 year threshold. I may look up the age of the SNP on YFULL to check that the TMRCA date is roughly somewhere between 1000-1500 years ago.

Having created these SNP-defined subgroups, I would then add in non-SNP-tested individuals based on much more restrictive Genetic Distance criteria than those used for “declaring a match”, i.e. 2/37, 4/67 and 5/111 as opposed to 4/37, 7/67 and 10/111. This approach minimises the risk of inappropriate grouping but does not get rid of it entirely. Ultimately the only way of being sure that someone has been placed in the correct subgroup is for that person to do the Big Y test to identify their SNP profile. This is the recommended course of action for anyone who has not managed to make it into one of the SNP-defined subgroups.
Figure: Members who do not meet the criteria for a subgroup are left in the overarching group (Group 1 in this example).
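For admins who keep their match lists in a spreadsheet or script, the stricter and looser Genetic Distance thresholds described above could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the function name `classify` and the three labels it returns are made up for the example, and only the threshold numbers come from the text.

```python
# Thresholds taken from the text: the stricter set is for adding
# non-SNP-tested members to an SNP-defined subgroup, the looser set is
# the one used for "declaring a match".
SUBGROUP_MAX_GD = {37: 2, 67: 4, 111: 5}
MATCH_MAX_GD = {37: 4, 67: 7, 111: 10}


def classify(markers_compared: int, genetic_distance: int) -> str:
    """Label one comparison (hypothetical helper, not an FTDNA function).

    Returns "subgroup" if the stricter threshold is met, "match-only" if
    only the looser threshold is met, and "no-match" otherwise.
    """
    if markers_compared not in SUBGROUP_MAX_GD:
        raise ValueError(f"unsupported marker level: {markers_compared}")
    if genetic_distance <= SUBGROUP_MAX_GD[markers_compared]:
        return "subgroup"
    if genetic_distance <= MATCH_MAX_GD[markers_compared]:
        return "match-only"
    return "no-match"
```

A member labelled "match-only" would stay in the overarching group until Big Y results are available, which is the cautious outcome the stricter thresholds are meant to produce.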
A good example of this process in practice is from my O’Malley DNA Project. Many Mayo O’Malleys test positive for the M222 SNP marker. I placed them in Group 3, a large overarching group for all M222+ O’Malleys. So far, downstream SNP testing has identified 6 subgroups below this. The common ancestor for all 6 subgroups lived about 2000 years ago (the TMRCA for the M222 SNP Block), and the common ancestor for each subgroup lived about 1000 years ago or less. You can read a detailed account of this particular example in this blog post here.
Figure: The common ancestor for each of the 6 individual subgroups is within the last 1000 years.
Same USP (Unique STR Pattern)
When a group of people have the same value for several specific STR markers, this can indicate a particular “signature” for that group, and anyone with the same signature can be deemed to be “related” and thus should be grouped with them. The number of STR markers that make up a Unique STR Pattern varies a lot, but the more markers involved, the more robust the USP.
USPs were easy to spot on the Results Pages of the old WorldFamilies.net (WFN) website (sadly now defunct) and a similar scheme would be most welcome on FTDNA’s Results Pages. The WFN website compared each group’s genetic signature against the signature (modal haplotype) of an upstream branch of the Tree of Mankind and thus identified any USPs and presented them as coloured columns on their Results Pages. The coloured pattern in the diagram below beautifully portrays the Unique STR Pattern within different subgroups of the Gleason DNA Project.
It is much more difficult to see USPs on the FTDNA pages because they are not highlighted in colour. You would need to use Dave Vance’s SAPP Programme or Chase Ashley’s Y-DNA Grouping App to highlight any USPs.
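The comparison described above, a group’s signature set against the modal haplotype of an upstream branch, can be sketched in a few lines of Python. This is a rough illustration only and is not how WFN, SAPP or the Y-DNA Grouping App actually work. The function names are hypothetical, each haplotype is assumed to be a simple dictionary of marker name to repeat value, and the marker values used in the usage example are invented.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common value at each marker across a list of {marker: value} dicts."""
    markers = haplotypes[0].keys()
    return {m: Counter(h[m] for h in haplotypes).most_common(1)[0][0] for m in markers}


def unique_str_pattern(group, upstream_modal):
    """Markers where the group's modal value differs from the upstream modal."""
    group_modal = modal_haplotype(group)
    return {
        m: v for m, v in group_modal.items()
        if m in upstream_modal and upstream_modal[m] != v
    }


def shared_usp_markers(haplotype, usp):
    """Count how many of the USP markers one member's haplotype matches."""
    return sum(1 for m, v in usp.items() if haplotype.get(m) == v)


# Usage with invented values (not real project data):
group = [
    {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
    {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11},
]
upstream = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
usp = unique_str_pattern(group, upstream)
```

In line with the point that more markers make a more robust USP, a pattern found on only one or two markers in a sketch like this would be weak evidence on its own.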
So these are the main methods I use for assigning project members to a particular group.
In addition, I have some general advice on formatting the name for each group:
- number each group (01, 02, 03, etc); it makes it easier to refer to when writing about it or discussing it with project members.
- include the possible ancestral location (this may be obvious from the MDKA information)
- include the abbreviated SNP Sequence (get it from Rob Spencer’s Admin Utilities)
- include any specific guidance (e.g. if R-M269, upgrade to Big Y) or point members toward additional information (e.g. see Updates tab in About section for Next Steps). This may include links to haplogroup, geographic & clan projects that they should join, as well as useful general information (e.g. how to get the most out of your Y-DNA test, essential things everyone should do).
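If group names are built in a script, the four elements in the list above could be assembled along these lines. This is a minimal sketch: the function name, the `" | "` separator and the ordering are assumptions made for the example, not a layout prescribed here or by FTDNA.

```python
def format_group_name(number, location="", snp_sequence="", guidance=""):
    """Build a group name from the elements listed above (hypothetical layout).

    number       -- group number, zero-padded to two digits (01, 02, 03, ...)
    location     -- possible ancestral location, if known
    snp_sequence -- abbreviated SNP Sequence, if known
    guidance     -- any specific guidance for members of the group
    """
    parts = [f"{number:02d}"]
    # Only include the optional elements that were actually supplied.
    for part in (location, snp_sequence, guidance):
        if part:
            parts.append(part)
    return " | ".join(parts)
```

For example, `format_group_name(3, "Mayo", "M269 > L21 > M222", "upgrade to Big Y")` gives a name starting with `03`, and leaving an element out simply drops it from the name.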

What does each group tell me?
Much more has been written about how to group project members than about how to analyse the resultant groups. The grouping process only takes you half-way … you then have to analyse each group in turn. If the overall objective of a Surname DNA Project is to study the surname, then grouping merely lays the foundation upon which subsequent analysis is based.
The kind of questions that can be explored in any analysis of a particular group include: where is the group from? does this link us to the known history of the surname? how old is the group? what is the branching structure? how did the name evolve over time? is there an association with a pre-surname clan?
A practical example of how to approach analysis of individual groups is detailed in this video here (delivered at the O’Malley Clan Gathering in 2019).

Having a clear picture of the desired outcomes of your research allows you to create more specific project goals. Thus the aims for any surname study may include the following:
- To identify distinct genetic groups of people carrying surname X (or one of its variants)
- To analyse each genetic group and assess where it came from, how old it is, and whether there is any connection to a pre-surname “clan”
- To communicate the conclusions of the analysis for each genetic group
- To help focus project members on specific directions for their own ongoing genealogical research
After all this work, you will need an effective way of communicating it to your project members. Different admins use different methods. Some post regular updates on the project website on FTDNA. Others create a separate website or blog or newsletter or annual report. Whatever method you choose, you should plan to keep your project members informed about the current status of the project and any new developments affecting specific groups. Also remember that you will eventually have to pass this task on to a successor, so it is wise to design your communication strategy with this in mind.
I hope you find something of use among these hints and tips.