Citing Online Resources

I'm trying to resurrect, as best I can, points made in a previous forum discussion that was lost during an outage of the Amazon Cloud server around 4 Aug 2016 (see https://www.facebook.com/evidenceexplained/posts/1095977760477605).

The original question involved (I believe) citing a FamilySearch family tree, but a discussion about the difference between a "database" and a "content management system" arose, with contributions from both myself and Robert Raymond. Put simply, is there a difference between citing a Web site as one or the other, and how would you tell them apart?

Firstly, there are a couple of things to note about the technologies involved:

1) A database is a tool, and normally doesn't have much of a user-interface beyond accepting a query (e.g. a name search) and presenting the details for one or more database entries. For citations, there are certain 'derivative types' that are regularly used in this general area:

a) Index -- An index is also a tool, but a tool used simply to help find something. An index is not necessarily a digital tool, and a good example might be the GRO (General Register Office) Index of vital events for England and Wales. A citation for that index might be:

Transcribed GRO Index for England and Wales (1837–1983), database, FreeBMD (http://freebmd.rootsweb.com/cgi/seach.pl : accessed 12 Aug 2016), marriage entry for Owen Polin and Rosanna Carr; citing Nottingham, 1843, Jun [Q2], vol. 15:833.

Note that this is actually citing a digital version of the paper index, and that is considered a database because it includes transcribed information that was originally being indexed (e.g. the registration district). Normally, though, an index simply leads you to a source, and you would be more likely to consult and cite that source rather than the intermediate index (see EE 2015, sec. 2.12).

b) Database -- A database is a digital tool that includes transcribed information and multiple indexes to help locate corresponding entries given different combinations of parameters.

c) Digital images -- Scanned or photographic images of records. These may be indexed or simply browsable.

d) Database with images -- This is considered a hybrid resource that includes both a database of transcribed textual information and images of the original records. A good example might be census collections. In reality, many modern databases can store images as well as text, plus other data types, but this is rarely used except for small images (e.g. molecule diagrams); it is more likely that the images are stored in individual files and that one of the database indexes is used to help locate them. Since we're unlikely to know exactly how a given provider has organised things, we shouldn't get too hung up on the difference.
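As a rough illustration of the four derivative types listed above, here is a minimal Python sketch that picks a label purely from what the resource visibly offers, not from how the provider stores it. The function name and its two parameters are my own hypothetical choices, and the decision rules are only my reading of the list, not anything prescribed by EE.

```python
def derivative_type(has_transcribed_data: bool, has_images: bool) -> str:
    """Label a resource by what the user can see, not by its internals.

    Hypothetical helper: the two flags are assumptions about what a
    researcher can observe when consulting the resource.
    """
    if has_transcribed_data and has_images:
        return "database with images"   # hybrid, e.g. a census collection
    if has_images:
        return "digital images"         # scans only, indexed or browsable
    if has_transcribed_data:
        return "database"               # transcribed entries, searchable
    return "index"                      # only points you to a source


if __name__ == "__main__":
    # The FreeBMD example carries transcribed details but no images.
    print(derivative_type(has_transcribed_data=True, has_images=False))
```

Note that the sketch never asks where the images or text are physically held, which is the point: the label follows the appearance of the resource.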

2) OK, so what about a 'content management system' (CMS)? This is a fancy tool for managing Web pages, including creation/deletion, versioning, collaboration (i.e. multiple authors), indexing, formatting, and auxiliary resource files (e.g. scripts, CSS). While there are indexes present, this does not make it a database -- no more than the index in each folder of your local computer would constitute a database. There may even be a commercial database helping with the organisation. The essential difference, though, is that if it contains authored work then it should be cited as an electronic publication, using the 'Book' paradigm (see EE 2015, p. 657). From this point of view, a FamilySearch family tree might be cited as a database, but their "memories" pages should be cited as electronic publications.

IMHO, of course!

Tony

Submitted by rraymond on Mon, 11/14/2016 - 17:39

Tony,

I'm an electrical/computer engineer by training, so that influences my interpretation of the word database. The term database was coined in the early 1960s by computer scientists and had a particular meaning and usage. A CMS is a database (many of which utilize MySQL, I hazard to guess). So is a file system and its folders (HPFS is an example utilizing B-trees for its indexes).

When you approach a reference librarian in a university library and use the term database, it means something different.

Were you to have approached my mother, the term would have had no meaning whatsoever. My guess is that that is true for most adults without a college degree. FamilySearch mostly avoids using the term for that reason. Ancestry has databases; we have online record collections. 

So what does database mean in Evidence Style citations? I should let Elizabeth speak for herself, but I'm a fool, and well, fools rush in.... I think her background influences her interpretation of the word. Elizabeth is a well-educated individual who has spoken with many a reference librarian and used many a database, and her definition of the word reflects that. However, Elizabeth taught me an important lesson many years ago: don't try and give words special meanings. My guess is that Elizabeth's intention is that the term database means what the dictionary and most people say it means. After all, they are the audience for the citation.

So where does that bring us in a discussion of database versus CMS? I would base my decision on questions such as these: Who is the audience for your citations? What words do they know and what do those words mean to them? Should a technology under-the-covers change the citation? Does your citation communicate with clarity? Is it succinct? Does it communicate the strength of the source? Does it allow a person to straightforwardly find the source? And perhaps the source-of-the-source?

Submitted by ACProctor on Tue, 11/15/2016 - 04:44

I agree that there are different interpretations of the "database" term, Robert, and that the technical one won't be the most useful in the context of citations. Hence, my underlying point is that you cite the resource as it appears to you, rather than trying to figure out how the provider has implemented things internally. The derivative types I listed are pretty close to EE usage, albeit a bit more explicit in their descriptions.

I'm afraid that I disagree slightly with your own perceptions: a CMS is really a "database application", and includes far more than the MySQL database (or whatever) at its heart. Also, an index is different in that it simply locates something rather than contains it -- hence, a folder's directory is technically an index. And database engines (whether relational, OLAP, or OO) incorporate multiple indexes in order to achieve their data organisation.
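To make the index-versus-database point concrete, here is a small sketch using Python's built-in sqlite3 module. The table and index names are hypothetical, and the single sample row simply reuses the district and year from the FreeBMD citation earlier in the thread. It only illustrates that one table of data can carry several indexes, each of which helps locate rows without holding the record itself; it says nothing about how any real provider organises things.

```python
import sqlite3

# In-memory database; the schema below is purely illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marriages ("
    " id INTEGER PRIMARY KEY, surname TEXT, district TEXT, year INTEGER)"
)

# Two separate indexes over the same data: each one only helps to
# locate rows, while the table itself contains the transcribed details.
conn.execute("CREATE INDEX idx_surname ON marriages (surname)")
conn.execute("CREATE INDEX idx_district_year ON marriages (district, year)")

conn.execute(
    "INSERT INTO marriages (surname, district, year) VALUES (?, ?, ?)",
    ("Polin", "Nottingham", 1843),
)

# List the indexes the engine now maintains for the table.
index_names = sorted(
    row[1] for row in conn.execute("PRAGMA index_list('marriages')")
)

# A lookup by district and year; the engine may use an index to find it.
rows = conn.execute(
    "SELECT surname, district, year FROM marriages"
    " WHERE district = ? AND year = ?",
    ("Nottingham", 1843),
).fetchall()

print(index_names)
print(rows)
```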

Rather than rushing in, I believe that I've trodden a fine path between the technical realities and the recommendations already in EE, especially those relating to electronic publications. I'm trying to clarify existing recommendations rather than undo or question them.

I think we made some good points in the previous thread, and just felt that it would be a shame to let Amazon lose them forever  :-)

Tony

Submitted by EE on Tue, 11/15/2016 - 10:22

Tony, thanks for reopening this discussion. Given that my background is history, rather than IT, for the time being I plan to just sit back and learn from the experts here. 

That said, I'll introduce one tangent: the meaning of that word transcription.  Like many words, of course, it carries different meanings in different fields—music and genetics as well as history and literature. From the history/literature standpoint, the term describes a full written (or typed) copy. In history-speak, the act of picking out bits and pieces of information for insertion into fields of a {umhh} database, would be extraction. For a researcher, this is a difference of major significance: the difference between making analytical decisions on the basis of a full record vis à vis partial data that someone else has cherry-picked.

http://www.thefreedictionary.com/transcribe