QuickLesson 21: Citing DNA Evidence: Five Ground Rules

Citing Genetic Sources

Oh, the confusion! It’s tough enough for most history-minded people to wrap their neurons around all the scientific jargon and concepts involved in genetic research. But when we attempt to use it in a historical context, to prove the identity of a long-gone person and connect that person to a biological family, we seem to have more questions than chromosomes.

Not surprisingly, most of those questions come in to EE waving a flag that says “Help! How do I cite this?” If truth be told, most of the confusion stems not from how to cite but how to handle the evidence.

Let’s take five basic ground rules for citing and using sources—and then see how they apply to our use of genetic test results. First the ground rules:

1. There’s a difference between evidence and citation.

Evidentiary issues are what we discuss in our narrative. Citations identify the sources of our evidence.

2. DNA is just another form of evidence.

Ergo, citations to sources of DNA evidence  follow the same rules for citing any other kind of evidence.  There’s no special set of do’s and don’ts for creating citations to identify genetic sources.

3. Citations are created to support an assertion.

It matters not whether we are asserting the involvement of a certain person or military unit in a particular battle, or asserting from DNA evidence that Sally Hemings’s son Madison carried the Jefferson Y and did not carry the Carr Y of Thomas Jefferson’s devilish nephews. The specifics of the evidence differ, but the task of citing the source is the same.

4. The format of a citation depends upon the physical qualities of the source.

DNA evidence may be brand-spanking new in the history-research world, but it comes to us in old familiar formats: analytical reports, certificates, databases, finding aids, and instructional articles.

5. Source citations often have to carry explanations.

When the assertion we’ve made in our text is based on simple direct evidence, all our citation needs to do is cite the source. Sometimes, however, a simple citation is not self-explanatory. If, for example, we discuss a will and then cite a deed book as our source, readers may assume we’ve made an error if we don’t add an explanation. When our assertion is based on complex evidence (indirect, conflicting, or negative), we often have to add an explanation to a source citation to help readers understand how that particular source supports our assertion. If our source misidentifies the person about whom we’ve asserted something, that error needs to be pointed out in the citation. Similarly, with citations to genetic evidence, identifying the “item of interest” often means we add a free-form explanation of the who and what that are involved in the database or report that we cite.

So ...  how do we apply these ground rules to our genetic research?  Erick M., in our Citation Issues Forum, has posed several useful questions that we’ll summarize here:


How do I compare multiple atDNA matches? Do I create a separate reference note for each comparison to a constant or is there a simpler way to cite multiple comparisons?


Let’s start with the point that citations are not attached to matches; they are attached to assertions (Rule 3, above). What we cite depends upon what we assert. If we assert that one person’s atDNA does not match that shared by the two “constants” to whom we compare that person, then our citation will identify the database in which we made those comparisons and it will identify the specific items of interest within that database. In this case, those “items of interest” are people. Each citation will need to identify those people to whom the assertion applies.


For an Ancestry DNA Circle with 10 or more testers, should I create a citation for each match within the Circle, or can I just cite the Circle (which by its nature may be continuously changing)?


Let’s relate this question to a simple situation we all know well. If our narrative made an assertion that involved ten random pages from a book, would it suffice to cite just the book and not say which ten pages support our assertion?  No. With DNA evidence, if our assertion involves ten people from a group but excludes others, our “item of interest” field would normally identify only those ten people.


Should I bother to cite testers that are within a Circle, but do not fall within the parameter of matching both constants to whom I am comparing?


Again, let’s draw upon common practice with other sources. If we use a source that yields no evidence, do we need to cite it in our narrative discussion of our findings?  Sometimes, but usually not. The choice we make here depends upon what assertion we have made. If a negative finding is relevant, per se, then we discuss it in our narrative and we cite (identify) the body of records, search terms, etc., that yielded the negative results. If a negative finding is not relevant, we don’t discuss it and we don’t cite the body of records in which it did not appear or the research strategies we used.


What about privacy issues? Is the use of their various kit numbers a privacy issue? At what point does the reader just have to take my word for the matches, since few will be able to access the data to see for themselves?


Question 1: It depends.

Question 2: Sometimes.

Question 3: Never! 

Protecting privacy:  As with conventional history research, privacy issues can conflict with our need for full disclosure of evidence. Owners of a set of records we’ve used may forbid us to identify them. That raises issues of provenance, authenticity, and credibility. In that case, the common solution is to use the evidence from those records as clues to find other records with which we can build the case for a conclusion that otherwise lacks credible direct evidence.

Kit numbers?  These can be a useful compromise when individuals don’t want to be named in a discussion of our genetic evidence. But, as with names, we should ask before using.

“Just trust me”:  Never, ever, should we expect a reader to just “take our word” for whatever we assert. If we cannot present evidence others can verify, then we have not made our case or proved our conclusion.

In the best of cases, testers are willing to be identified. More commonly, some may be willing and some may not be. In the latter case, if the evidence from those willing to be identified is not sufficient to prove our case, then we have two choices: (a) extend our research to include other willing testers until we have an adequate pool; or (b) suspend our research and declare no conclusion until and unless other willing-to-go-public matches happen to materialize in one of the databases we are using.

Given the newness of this type of research and the paucity of pedagogical materials, our best guidance for these ethical issues comes from

  • analyzing the studies published in peer-reviewed journals.  As a genealogist, your natural choice would be genetic-evidence cases published over the last several years in the National Genealogical Society Quarterly, but valuable guidance can also be gleaned from genetic journals. If you are writing for a journal, your peer reviewers may offer suggestions and your editor will likely have a policy in place.
  • following the guidelines of the Genetic Genealogy Standards that were formulated by leaders in that specialty—online at http://www.thegeneticgenealogist.com/wp-content/uploads/2015/01/Genetic-Genealogy-Standards.pdf.   Of specific interest is Item 9.

“When lecturing or writing about genetic genealogy, genealogists respect the privacy of others. Genealogists privatize or redact the names of living genetic matches from presentations unless the genetic matches have given prior permission or made their results publicly available. Genealogists share DNA test results of living individuals in a work of scholarship only if the tester has given permission or has previously made those results publicly available. Genealogists may confidentially share an individual’s DNA test results with an editor and/or peer-reviewer of a work of scholarship.”

As with everything else in life, things new are often old things repackaged. They may include new quirks, but the basic issues often have been around forever. With historical evidence, applying the solutions and the workarounds we’ve already learned will usually help us through the new issues that, these days, happen far faster than anybody can develop standards and best practices for handling.

And, oh yes. As Erick pointed out, EE has a new QuickSheet: Citing Genetic Sources for History Research, Evidence Style (http://amzn.to/2yLvMcz). That 4-page guide provides an overview of genetic testing, the basic terminology, the core standards, the typical types of reports and, of course, many citation examples covering different types of genetic sources, different forms of reporting, different companies, and different online tools.

PHOTO CREDIT: Elizabeth Shown Mills, "Citing Genetic Sources," a collage assembled from a variety of copyright-free images used under license.

How to Cite This Lesson

Elizabeth Shown Mills, “QuickLesson 21: Citing DNA Evidence: Five Ground Rules,” Evidence Explained: Historical Analysis, Citation & Source Usage (https://www.evidenceexplained.com/content/quicklesson-21-citing-dna-evidence-five-ground-rules : posted 29 June 2015; accessed [date]).
