Advice needed for referencing clusters

I'm looking for any advice on referencing clusters (or groups) of people when assembling units of evidence into a write-up. I need a convenient and meaningful reference system because of the large number of group references in this case.

I'm working on a huge problem to prove the identity of a single lady, but it has required me to use a cluster approach to piece together different families, and their generational relationships, from the groups described by the different sources.

Although the FAN Quicksheet doesn't give me an answer, it's not quite a FAN approach since the lady in question has apparently no documented connections to her birth family. I'm hoping, therefore, to work inwards by identifying everyone else and arguing a connection to who's left, and that's why there will be so many references.


Submitted by yhoitink on Thu, 04/07/2016 - 10:08

It sounds similar to a problem I had trying to identify the location of a house in a record. In that time and place (Bredevoort, Gelderland, Netherlands, early 1600s), deeds only identified the location of a house in terms of its neighbors, e.g. "With the front to the street, between the houses of so and so, with the back towards so and so's land."

I would find whole groups of records that referred to a chain of houses, where the first record identified the right neighbor, the right neighbor's record identified the same left neighbor and another right neighbor, and so on. So I ended up with about 100 clusters of houses for that town, of which one cluster included the house that I was trying to locate.

In the end I just numbered the clusters. I then made charts of each cluster to chart how they related to other clusters, e.g. cluster 1 is across the street from cluster 5, which is to the back of cluster 99. And I created a table of attributes for each cluster, e.g. cluster 1 has three corner houses, and only five in total. 
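For readers who keep their abstracts in a spreadsheet or script, the chaining-and-numbering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deed fields (`house`, `left`, `right`), the owner names, and the function `build_clusters` are all invented here, and identifying a house by its owner's name is a simplification that real records (repeated names, changing owners) would not tolerate.

```python
from collections import defaultdict

# Hypothetical deed abstracts; every name below is invented for illustration.
deeds = [
    {"house": "Jansen", "left": "Peters", "right": "Willems"},
    {"house": "Willems", "left": "Jansen", "right": "Gerrits"},
    {"house": "Hendriks", "left": "Arents", "right": None},
]


def build_clusters(deeds):
    """Number the clusters of houses linked by any chain of neighbor references."""
    adjacency = defaultdict(set)
    for deed in deeds:
        adjacency[deed["house"]]  # register the house even if no neighbor is named
        for side in ("left", "right"):
            neighbor = deed[side]
            if neighbor:
                adjacency[deed["house"]].add(neighbor)
                adjacency[neighbor].add(deed["house"])

    clusters = {}
    seen = set()
    number = 0
    for start in sorted(adjacency):  # sorted so the numbering is repeatable
        if start in seen:
            continue
        number += 1
        members = set()
        stack = [start]
        while stack:  # walk every house reachable from this starting house
            house = stack.pop()
            if house in seen:
                continue
            seen.add(house)
            members.add(house)
            stack.extend(adjacency[house] - seen)
        clusters[number] = sorted(members)
    return clusters


clusters = build_clusters(deeds)
for number, members in clusters.items():
    print(f"cluster {number}: {', '.join(members)}")
```

The numbers carry no meaning of their own; they are just stable handles, which is why a separate table of attributes per cluster is still needed.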

I then created a custom map based on contemporary maps that showed the blocks of houses on the map, and named those from A to P. I then started to map each of the numbered clusters onto the blocks. E.g., if I had a triangular block with five plots, that was a candidate for cluster 1. I was able to locate some of the blocks because they used absolute references (next to a named gate, next to the church) and that allowed me to put other clusters on the map. By the time I was done, I had placed all 100 clusters on the map and identified the owners of each of the circa 200 houses within the town walls. And we could organize a grand unveiling of the location of the house where Hendrickje Stoffels, the mistress of Rembrandt van Rijn (and my 11th great-aunt) was born. 

Perhaps a similar approach would work for you? You could just assign a number to each cluster of family members, to make it easy to refer to them. 

Also, have you read Anderson's Elements of Genealogical Analysis? His discussion of linkage bundles sounds similar to what you are trying to do.

By the way, I don't quite understand your remark that "it's not quite a FAN approach since the lady in question has apparently no documented connections to her birth family." Isn't the whole idea of researching friends, associates and neighbors that you can tie somebody to her birth family indirectly by tying them to their FANs and then tie the FANs to their birth family? So why would you need documented connections to her birth family if you use the FAN approach?

Submitted by EE on Thu, 04/07/2016 - 12:25

Tony, Yvette's closing question is mine also. This case study, for example, covers a situation in which the birth family was totally unidentified for several generations of women in a mitochondrial line—in each case, no surviving record "documented a connection" to her. The identities were definitely resolved using the FAN, with the resulting proof arguments tested against DNA.  QuickLesson 11 deals with the same situation, but is simpler because it involves only one woman.

That said, I don't understand your own question about how to "referenc[e] clusters (or groups) of people when assembling units of evidence into a write-up." You say you "need a convenient and meaningful reference system because of the large number of group references in this case."

  • If, by references, you mean citations, then a normal citation would provide evidence for the assertion you are making, rather than being a reference for a "person" or a group of people.
  • If by references, you mean "mentions" of them in the narrative, there are several ways that might be handled, depending upon the situation.
  • Is it possible that your dilemma centers upon an effort to work within a relational database rather than working with traditional research reports?

Can you give us an example of your quandary?


Submitted by ACProctor on Thu, 04/07/2016 - 16:19

Sorry for the delay responding here, and any typos I might introduce. I had a hand operation today so I have a plaster cast and can only type with one hand.

Yvette: Your approach sounds quite similar to mine. This lady was born c1810 -- before any English census or civil reg -- but the first mention of her is when she married one of my Proctor ancestors. Hence, I know her maiden name but there is no corresponding baptism. The marriage witnesses were not from her family, and she seems to have had no obvious connection with them thereafter. I've ruled out any previous marriage, and illegitimacy, but not a change of given name. There is very limited scope for following her FAN connections outward to identify her birth family (this is a long-standing brick wall), so I'm now looking further afield (other people of same surname, occupation, parish, etc) and working my way inwards. A numeric reference would be easy (e.g. "Table 1") but it does not meet my "meaningful" criterion. Having to continually scroll back to see what I'm referring to can become very tedious for a reader.

Editor: I don't mean citations here; I'm referring to information extracted from the relevant sources. For instance, I may have a transcription of some pertinent source information, or tabulated information assembled from a given source. When I first enter that information into my narrative (I do not use any database!) then I would cite where it came from. However, I will typically have frequent references to those extracted details as I build complex arguments for how they relate to each other. I'm therefore looking for guidance or advice on how to reference them that would help the reader follow those arguments.

Tony

Submitted by EE on Fri, 04/08/2016 - 09:30

Tony, I feel your pain. I've just struggled with the writing of a case study that required 1000+ hours of FAN work to reach a conclusion. Much, much, much of what I developed along the way could not be covered in the proof argument.  This point prompts me to wonder about the "narrative" you are writing and what stage you are at in this project. 

From your description, it seems that you are not yet at the proof argument stage—rather, you are still in the process of developing evidence.  If so, this is the stage at which research reports provide me with the best framework.  For each new area in which I do research, each new hypothesis I explore, each new family I investigate, I set up a report in which I record my findings (positive and negative), my analyses of each record or detail, and the new hypotheses on which I will build.  (I also clearly separate my analysis from the transcription or abstract of the document I am analyzing, so my thought process does not "corrupt" the actual details of the document.) A reader of that particular report would then have all the detail needed to follow that block of findings and the analyses I had made. For issues that involve a prior report, my analytical comments would refer back to the earlier report by name and page or—if it would enhance clarity—repeat or summarize a prior document and explain how the new finding fits into or contradicts this earlier evidence.  

At the end of a project like this, after I feel I have resolved the issue, I then create a totally new narrative in the form of a proof argument. Portions of some few documents might then be quoted but, for the most part, I'm simply citing documents whose details I've used to assemble each point of the argument. Then, when the case study is published, the underlying research reports can be posted at my website so that others can explore the full body of evidence.

Submitted by ACProctor on Fri, 04/08/2016 - 10:23

I've been amassing information on this for several years, and even spent a long time chasing a lady of the same name in the next county (only to prove she wasn't the right one), but not with much success generally. Only recently have some parts started to come together and bear fruit. That's caused a slow cascade of other inferences and so I've decided to write up selected parts to help me retain it better. Even putting together generations of the same family will not be obvious and so I can already write up how and why I believe that to be the case -- although the ultimate goal is still some distance away. These are not really as large as "prior reports" and would be consulted frequently as I forge ahead. Eventually, I'd like to weave together as many of those small write-ups as possible and publish it on my blog; even splitting it up across multiple blog-posts, though, will require a lot to stay on the "cutting-room floor". Finding (or developing) a scheme for referencing these partial results is as much for my benefit as for the final reader. If, as a simple instance, you'd tabulated baptisms for the same two parents (correctly cited, of course, so that the original data could be examined) and then wanted to reference your tabulated work in various subsequent arguments, how would you label or identify it? Do you simply use a numeric scheme as Yvette suggested?


Submitted by EE on Fri, 04/08/2016 - 20:13

Tony, if I 'tabulated baptisms for the same two parents (correctly cited, of course, so that the original data could be examined) and then wanted to reference my tabulated work in various subsequent arguments'—assuming you mean an analytical tabulation, not just a simple list—I would likely create that analytical tabulation as a report. Any report or narrative that referred to it would then refer to the report by name and page. However, if that "tabulation" were a simple compilation of children produced by the couple, based on relatively uncomplicated evidence, then my subsequent work would simply identify the children right there in the new narrative and cite the evidence for each. For uncomplicated evidence, there would have to be some special reason for requiring a reader to go back to an earlier report for the citations they need in order to find the originals.

Submitted by ACProctor on Sat, 04/09/2016 - 05:04

Even the tabulation of the baptisms is not necessarily obvious, though, and different entries at least have to be correlated on parish, abode, occupation of father, date range, and any similar information from death registers, in addition to parent names. Having established such a reliable step, I would probably need to refer to it again in order to build other steps (e.g. related to the appearance of common names in generations). The number of these steps, and the frequency of reference, might be a possible difference here.
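As a rough illustration of that correlation step, and of a label that says what a tabulation contains rather than just "Table 1", here is a minimal Python sketch. Everything in it is an assumption made for the example: the entries are invented, the field names and the `Bap:` label format are hypothetical, and requiring an exact match on occupation is far stricter than real registers would justify.

```python
from collections import defaultdict

# Invented baptism entries for illustration only; these are not real records.
baptisms = [
    {"child": "Thomas", "year": 1815, "father": "John Smith", "mother": "Mary",
     "parish": "Parish A", "occupation": "weaver"},
    {"child": "Ann", "year": 1812, "father": "John Smith", "mother": "Mary",
     "parish": "Parish A", "occupation": "weaver"},
    {"child": "Jane", "year": 1814, "father": "John Smith", "mother": "Mary",
     "parish": "Parish A", "occupation": "farmer"},
]


def group_baptisms(entries):
    """Group entries that agree on parents, parish and occupation, under a readable label."""
    groups = defaultdict(list)
    for entry in entries:
        key = (entry["father"], entry["mother"], entry["parish"], entry["occupation"])
        groups[key].append(entry)

    labelled = {}
    for (father, mother, parish, occupation), members in groups.items():
        # Hypothetical label format: the label itself names the correlating details.
        label = f"Bap:{father}+{mother}/{parish}/{occupation}"
        labelled[label] = sorted(members, key=lambda entry: entry["year"])
    return labelled


tabulations = group_baptisms(baptisms)
for label, members in tabulations.items():
    print(label, [f"{m['child']} {m['year']}" for m in members])
```

Here the same parents' names split into two groups because the father's occupation differs, which is exactly the kind of case that needs a written argument before the groups are merged or kept apart.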

I'm wondering if the production of separate reports is just obscuring the deeper issues. I'm trying to pull stuff together into a single (or nearly so) readable narrative report. If your "prior reports" were included in this way then you would be using section headings rather than report titles -- very similar, in principle. Having physically separate reports is not helpful to the casual reader (and I'm unlikely to get academic readers) so my labelling of prior steps has to help them follow an essentially non-sequential process of deduction in a sequential reading (i.e. without too much scrolling back). I received criticism for a previous article, which should have been a lot simpler than this larger project, because it was too difficult to follow in a single read. My father's comments are not even printable.

Taking a step back, and acknowledging that my background is not one of historical writing, I'm wondering if my approach might seem a bit foreign. I thought that labelling of prior steps -- be they tabulations or inferences -- would be helpful in terms of readability and in following the overall argument. Compare this with, say, a mathematical project where different steps such as lemmas, or even just prior equations, would be labelled and used to build further steps. Mathematics has quite a vocabulary of such steps, but I think lemmas could have a correspondence in establishing a conclusion from historical evidence.

Am I out on a limb here? Is this not something that real genealogists try to accomplish?

(thanks for listening to my rambles)


Submitted by EE on Sat, 04/09/2016 - 09:10

Tony, there's one good way to find out if that approach works.  Use it. Create your final narrative, then submit it to a peer-reviewed journal. While it's difficult to discuss a framework that exists in theory, applying it to actual historical research and testing the response of skilled readers (at the review level) and those of mixed levels (if it reaches publication) will be a lot more informative.

Submitted by EE on Sat, 04/09/2016 - 09:33

Tony, re your comment:

"Having physically separate reports is not helpful to the casual reader (and I'm unlikely to get academic readers)"

On this point I strongly disagree. Historical researchers, for generations, have written reports for individuals who are 'casual readers,' not academics. In today's world, I'd argue, most individuals who are doing quality historical research for themselves are not academics and most of the public sector who commission historical research are not academics.

Most quality historical research is done in organized segments, with the end-product being a report that is a standalone account of one focused question. Customarily it begins by identifying the question to be answered, the premises on which the investigation is made,  and the people involved. It then covers the records used, the information found, the analysis of that information, and the hypothesis or work plan to be used for subsequent investigation. Each new report then builds on prior work or takes a new tack suggested by prior work.

Eventually, when a final conclusion is justified, then an over-arching narrative is written to bring together the evidence that most effectively makes the case. Each time it is necessary to cite an underlying argument or tabulation, then the specific report would be cited by name and page, or by section as you propose.

The mathematical processes you describe are sound ones in the field of math, where theorems and lemmas are expected to work for many types of problems. But—I'd argue—the vagaries of history, human behavior, and record survival pose a quite different world for those who attempt to reconstruct the lives of those who lived in ages past.