Page Title: Managing your DNA matches with Excel | learnalittleeveryday

  • This webpage makes use of the TITLE meta tag - this is good for search engine optimization.

Page Description: Normally when I write something here it's because I think I have something useful to say. This time I want to write about how I record my DNA matches in Excel. I'll be honest, I'm sure I'm not the first person to do this, and I'm not suggesting that this is the best method, but…

  • This webpage makes use of the DESCRIPTION meta tag - this is good for search engine optimization.

Page Keywords:

  • This webpage DOES NOT make use of the KEYWORDS meta tag - whilst search engines nowadays do not put too much emphasis on this meta tag including them in your website does no harm.

Page Text: Lets start with the basic concept: Each row on your spreadsheet represents one DNA match AND the “most-likely” shared ancestors(s). Your “Most-likely” Shared Ancestors I will be using the term “most-likely” quite a lot in this article. DNA helps significantly in genealogical research, but commonly doesn’t totally prove something, especially when you are dealing with distant relatives. There are two important factors here: There may be dead-ends in your family tree that hide other relationships. As an example my gg-grandmother, Ann Atkinson was illegitimate. There are no records identifying her father. Through her husband I have a number of DNA matches with his ancestors. Since both Ann and her husband came from the same small community (Hawnby in North Yorkshire) then it is possible that either there is another inheritance path that connects my matches with our “most-likely” shared ancestors OR I share other unknown ancestors of my gg-grandmother with my matches. There are circumstances that may well be hidden from any documentation that you have used in your research. Small DNA matches wouldn’t tell you that, say, one of your ancestors was having an affair with the brother of her husband, or was raped by her father-in-law. What to record ? My spreadsheet is split into three “logical” parts representing different things I want to store about my DNA matches. Each row of the spreadsheet represents one connection between myself and my DNA match. Part 1 – Details about the Match This is everything you want to know about the match themselves. Obviously you can record what information you want. I record the following Match Name (or their Username, or their Initials) Relationship. Although I enter this manually it’s based on calculated information in the second part of the spreadsheet. More later. DNA testing company used (ancestry, ftDNA, MyHeritage , GEDmatch etc.). I colour-code these details. It wsometimes makes it easier to find matches. Amount of shared DNA (both the centiMorgan total and the number of matches segments). Whilst the number of segments isn’t so interesting the amount shared is. Whether they match on my Mum’s or Dad’s side of the family Part 2 – Relationship details Here I record their degree of Cousinship, normally in a shorthand form eg. 3C1R for a third cousin once removed. I also have a number of other fields I use to calculate and validate this information. As ever, a picture paints a thousand words. below is a skeleton spreadsheet that I’ve prepared. Sample Excel spreadsheet – this shows the details I record for each match and the calculations used to validate Cousinship Part 3 – the Names of our ancestors up to a common match It’s probably easiest if I describe this part of the spreadsheet as a Christmas Tree. At the core I have a “central” column where I record the most-likely common ancestor and then progress with the children of both my match and myself until they are all recorded. I use one column for each generation. Again an illustration is better: The ancestor “Christmas Tree”. My ancestors, Common Ancestry, DNA Matches ancestors. Some points to note: To add new DNA matches to this, right-click on the row number below the point where you want to insert a new. match. Our most-likely shared ancestor(s) are recorded in black. I’ve use blue and pink to identify male and female ancestors. The colours help you spot direct paternal/maternal lines that may be interesting for additional yDNA/mitochondrial DNA testing. For all ancestors I record their year of birth and a location. The year of birth helps me spot any errors in the paths to a shared ancestor. The locations really helps me remember where my relatives migrated to. Since I’ve already recorded whether a match is on my paternal or maternal side (in the “Details of the Match” section) I don’t bother to re-record my Mum or Dad in this part of the spreadsheet. There is a visual trade off between how many columns of data you allocate to record the descendants of your shared ancestors. The more columns, the greater the amount of time you spend scrolling. I’ve allocated 6 columns for “my side” ancestors. This allows me to record up to 7th Cousins. (My most extreme match is currently a 7th cousin twice removed). I order matches based on which side of the family they come. Matches on my Mum’s side go at the top, matches on my Dad’s side are at the both. Since both my parents were “only-childs” my nearest relatives are going to be second cousins, sharing one of 4 sets of great grandparents which I use for clustering matches. Sometimes, thanks to cousin marriages, you may be related to someone in more than one way. To deal with this I record both paths of inheritance, but skip repeating some of the “Details of the Match” data to avoid screwing up some of my stats. For ease of use I’ve split and frozen the spreadsheet after column B. This mans I always see the match name and their relationship to myself. I often end up having to put in male or female “unknowns” in my DNA matches list. These are normally living parents and grandparents of my DNA match, which Ancestry hides for living people. Excel Calculations This is where the difficult bits come on. These are the five columns G to K in my spreadsheet. Each column is explained separately. Count of Generations to Myself (column G). I can count the number of direct ancestors I have recorded between myself and my common shared ancestor.  The Excel function COUNTA does this by counting the non-blank cells in a range. Since I don’t record myself or my parents in the ancestors I add 2 to the COUNTA value to give me the total number of generations. Count of Generations to my DNA match (column H) Again the COUNTA function counts the number of generations between the shared ancestor and the DNA match. Since I have recorded all the generations to the match I dont need to add additional generations like I did with my own calculations. Generations Difference (column I) This is the result of subtracting column G (Generations to Myself) from column H (Generations to my DNA match). A positive number means that they have more generations to our shared ancestor. I use Excel’s ability to do Conditional Formatting to help highlight this information. As an aside I’ve found that in most cases where there is a difference it is my DNA Match that has more generations. My ancestors were slow to procreate! Cousinship (column J) I normally calculated Cousinship based on the minimum number of generations from either my DNA match or myself. The Excel function MIN does this. Given that first cousins are two generations back to a common ancestor I need to deduct one from this figure. Number of Generations Removed (column K) This is basically a count of the number of generations difference between myself and my DNA match. I’ve already, in column I calculated this, but since it’s always a positive number I use the Excel ABS function to calculate this. Once I have all these fields calculated my relationship with my DNA match is column I and column J. I’m sure I could use Excel to enter this information directly in column B (the Relationship) field, however I’ve yet to work on this. To summarise, below is a table of the formulas I use on in the spreadsheet. Column Description

  • This webpage has 1201 words which is between the recommended minimum of 250 words and the recommended maximum of 2500 words - GOOD WORK.

Header tags:

  • It appears that you are using header tags - this is a GOOD thing!

Spelling errors:

  • This webpage has no spelling errors that we can detect - GOOD WORK.

Broken links:

  • This webpage has no broken links that we can detect - GOOD WORK.

Broken image links:

  • This webpage has no broken image links that we can detect - GOOD WORK.

CSS over tables for layout?:

  • It appears that this page uses DIVs for layout this is a GOOD thing!

Last modified date:

  • We were unable to detect what date this page was last modified

Images that are being re-sized:

  • This webpage has no images that are being re-sized by the browser - GOOD WORK.

Images that are being re-sized:

  • This webpage has no images that are missing their width and height - GOOD WORK.

Mobile friendly:

  • After testing this webpage it appears to be mobile friendly - this is a GOOD thing!

Links with no anchor text:

  • This webpage has 1 links that do not contain anchor text - this is NOT a good thing.

W3C Validation:

  • This webpage has no W3C validation errors - GOOD WORK.

Print friendly?:

  • It appears that the webpage does NOT use CSS stylesheets to provide print functionality - this is a BAD thing.

GZIP Compression enabled?:

  • It appears that the serrver does NOT have GZIP Compression enabled - this is a NOT a good thing!