Project Index





Schema for Benjamin Franklin's Correspondence Network During the London Years

Claire Arcenas and Caroline Winterer, July 2016

The data schema is a description of the author's data model. It is both a guide to understanding the values in a data set and a model that may be applied to other data sets. For example, the data schema for John Locke's correspondence network might also be applied to the correspondence of Thomas Hobbes or René Descartes. We consider the data schema an essential research product which, by itself, expresses the design of the research inquiry while also supporting effective data sharing, discovery and analysis.

The visualizations of the data tables on this page were created using Breve, a free, open source tool developed at Humanities + Design.


Source Base:

As its source base, this project used the online Papers of Benjamin Franklin (hereafter online PBF). To translate the information in each item from the online PBF into “data” accessible for large-scale analysis and visualization, we developed and then followed a series of steps, which we explain below. We checked each document from the online PBF against its corresponding document in the published editions (volumes 7 through 22) of the Papers of Benjamin Franklin (Yale, 1959-) (hereafter PBF). We also relied on the published volumes for their editorial apparatus. We do, however, include documents from the online PBF that remain “unpublished.” A document’s publication status is indicated in the title (unpublished documents will say “unpublished”) as well as in the document URL, explained below.

Data Tables:

There are three data tables: one for “documents,” one for “people,” and one for “places” associated with the documents and the people.

The documents table records every document in Franklin’s papers from January 1, 1757 to December 31, 1775—the nearly two decades with which our study is concerned. There are 3,443 documents.

For each document, we recorded the following information: document ID, document title, document URL, date, primary language, additional language, author name, recipient name, source location (smallest, city, state, country) and destination location (smallest, city, state, country).

In addition to the documents themselves, we are interested in the make-up of Franklin’s social network (i.e. his correspondents and the people and groups whose writings are among Franklin’s papers). Therefore, we created a second table, “people,” to record information about those individuals and groups who appear as either authors or recipients in the “documents” table. There are 774 individuals and groups in the “people” table.

For each individual or group, we record the following information: name; vernacular name (e.g. Lord North); gender (or status as a group); kin status; birth year; death year; birth place; birth place geo coordinates; birth place location (smallest, city, state, and country).

The places table combines the location information from the source and destination columns in the documents table with the birth location information in the people table and matches each place with its geocoordinates—its latitude and longitude.
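The way the places table is derived from the other two tables can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual workflow: the column names ("Source", "Destination", "Birthplace"), the `build_places` function, and the small gazetteer with approximate coordinates are all assumptions made for the example.

```python
from typing import Optional

Coordinates = tuple[float, float]  # (latitude, longitude)


def build_places(
    documents: list[dict[str, str]],
    people: list[dict[str, str]],
    gazetteer: dict[str, Coordinates],
) -> dict[str, Optional[Coordinates]]:
    """Collect every distinct place name and match it with coordinates.

    Column names are hypothetical. Places missing from the gazetteer
    are kept with a value of None so that the gap stays visible.
    """
    names: set[str] = set()
    for row in documents:
        names.update(row.get(col, "") for col in ("Source", "Destination"))
    for row in people:
        names.add(row.get("Birthplace", ""))
    names.discard("")  # blank cells mean "unknown", not a place
    return {name: gazetteer.get(name) for name in sorted(names)}


# Illustrative rows and approximate coordinates (assumptions for the example).
documents = [
    {"Source": "London, England", "Destination": "Philadelphia, Pennsylvania, America"},
    {"Source": "", "Destination": "London, England"},
]
people = [{"Birthplace": "Boston, Massachusetts, America"}]
gazetteer = {
    "London, England": (51.5074, -0.1278),
    "Philadelphia, Pennsylvania, America": (39.9526, -75.1652),
}

places = build_places(documents, people, gazetteer)
```

Keeping unmatched places in the result, rather than dropping them, mirrors the practice of recording unknowns alongside what is known.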


The goals underlying our methodology were threefold. First, we strove to stay as true to the primary source as possible, given the constraints of spreadsheet realities (explained below), accurately recording information extrapolated from each document. Second, we strove for consistency. In short, if we did one thing (supplied additional information or omitted uncertain information) for one document, we aimed to do the same thing for every other similar entry. Third, we strove for transparency so others can replicate each decision we made and choose to follow or revise it.

Documents Table


The enormous variety of genres these documents encompass is only a small testament to the many hats Franklin wore during the London years. These 3,443 documents consist primarily of personal letters exchanged between Franklin and his correspondents. Also included among his papers, however, are bills, receipts, deeds, petitions, reports, instructions, remonstrances, degrees, essays, sketches, recipes, and a whole host of other types of documents, including, for example, Franklin’s own Poor Richard’s Almanac and sketches of grave inscriptions made by his son William Franklin. Most of the items in Franklin’s papers were sent through the mail, but many were not. To impose our own modern conception of what did or did not constitute a letter would be to impose an artificial order onto the past. Thus, regardless of genre or type, we include all documents that are included in the online PBF. They all lend important insight into the nature and make-up of Franklin’s social network during his “London years.” Not all documents in the papers were written by or sent to Franklin. As do the editors of the PBF, we include everything.

Multiple entries within a cell (multi-value cells) are separated by “:”. E.g. A document co-authored by Franklin and John Foxcroft would appear as “Benjamin Franklin:John Foxcroft.” We do not list multiple dates or multiple locations.
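Anyone reading the tables programmatically needs to split these multi-value cells before counting authors or recipients. The helper below is a minimal sketch; the function name `split_multi_value` is an assumption made for the example. It should be applied only to the name and language columns, since other columns (the document URL, for instance) contain colons that are not separators.

```python
def split_multi_value(cell: str) -> list[str]:
    """Split a multi-value cell on ':' and trim whitespace around each entry."""
    if not cell:
        return []
    return [part.strip() for part in cell.split(":") if part.strip()]


authors = split_multi_value("Benjamin Franklin:John Foxcroft")
print(authors)
```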

Throughout, we only add information not supplied by the document itself or by the editors when we are confident, without a serious doubt, that it is correct (see location explanations below, for example). When we do not know something, or could guess, but cannot, without a serious doubt, be sure, we leave it blank, or in the case of the author and recipient, indicate that it is “unspecified.” We believe it is important to capture these unknowns as well as what we know.

Document ID In this column, we record the Document ID, a specific number associated with each document in the online PBF. These range from 623544 to 627004.

Document Title In this column, we record the title assigned to each document by the editors of PBF and label this as the Document Title. We copy these exactly from the online PBF titles, which match the published PBF titles.

Document URL This column contains the specific URL for each document. It includes the volume number and page number from the published editions; for example, a URL can indicate that a document is found in volume 7, on page 94. The “a” following the page number indicates that it is the first item on that page (“b” and “c” indicate the second and third items on the page, respectively). If a document is unpublished, the URL will include a series of numbers following the page number and locator.

Date In the format year-month-day, we record the date of each document in this column. When a range of dates is given, we follow the practice of the editors and list the earliest date. E.g. 1757-1759 will be 1757 and 13 February 1764 – 18 March 1764 will be 1764-2-13.

Many documents in the online and published PBF have a known month and year associated with them, but no day. Whenever the month and year of a document are known and the day is unknown, we record the document as having been written on the 15th of the known month in the known year. For example, “May 1770” becomes 1770-5-15. We chose the 15th because it falls near the middle of every month, whatever the month's length. If the month is unknown, we list only the year (e.g. 1770). If the year is unknown, we leave the column empty.
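The date rules described so far can be summarized in a short function. This is a sketch under stated assumptions: the function name `normalize_date` and its arguments are hypothetical, and for a date range the caller is assumed to pass the earliest date, following the editors' practice. The output is not zero-padded, matching the examples given in the text.

```python
from typing import Optional


def normalize_date(
    year: Optional[int],
    month: Optional[int] = None,
    day: Optional[int] = None,
) -> str:
    """Format a possibly incomplete date as year-month-day.

    - unknown year: empty cell
    - unknown month: year only
    - unknown day: the 15th of the known month
    """
    if year is None:
        return ""
    if month is None:
        return str(year)
    if day is None:
        day = 15
    return f"{year}-{month}-{day}"


print(normalize_date(1770, 5))       # month and year known, day unknown
print(normalize_date(1764, 2, 13))   # earliest date of a range
print(normalize_date(1757))          # only the year known
```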

From time to time, the editors of the published editions of PBF group documents together (e.g. 624029-32) under one title while the online version does not. As is our practice, we follow the online version and separate the documents, which means that we record separate dates for each subdocument under the larger heading. If the online version groups two documents together (e.g. 624581), we do the same and list the first of the multiple dates: in the case of document 624581, October 23 rather than November 7.

Given the long duration of the PBF editing project, the editors occasionally included a document in one volume and later realized that it belonged at a later date. If the date is definite, we list the revised date (e.g. 624108). If the date is not definite, we keep the date the editors assigned so that the document retains its position in the volume order (e.g. 625255).

Primary Language This column lists the language(s) of the document. If a document is in about equal parts of two languages we adopt the same multi-value cell disambiguation and list, for example, “English:French.”

Additional Language If the document is written in more than one language, this column records the additional language(s). We consider there to be an additional language when more than three words of it are written. Two words in Latin (e.g. 623879), for example, do not make Latin an “additional language.” Important future work remains to be done on the languages of Franklin’s correspondence, for example with respect to which documents in PBF are translations.

Author Name This column lists the name of the author or authors of the document or records “unspecified,” if the author is unknown. Occasionally, the editors of the published PBF will venture a guess as to the authorship of a document with uncertain authorship. We follow their guidance on the matter. If authorship is unclear and there is conclusive proof neither for nor against Franklin as the author (e.g. 624677 when the editors say it is up to the reader to determine for herself), we list Franklin as the author. We do so on the grounds that these are his papers. If the editors do not think Franklin wrote it, we either list the person they do, or leave the cell as “unspecified.”

Recipient Name In this column, we record the name of the person receiving the document. Not every document with a recipient is a letter. For example, when Franklin gives his wife, Deborah, the power of attorney, Franklin is the “author” and Deborah the “recipient.” Some documents, such as essays or Poor Richard’s Almanac, have no “recipient.” Others have Franklin as the “recipient,” even if he was not the intended or ultimate audience, but only a messenger (e.g. 624042). If there is no recipient or the recipient is unknown, we record “unspecified” in this column.

Additional Notes Regarding Names We assigned each person one name, so if someone went by multiple names or changed her name, we use the same name throughout. For example, Mary Stevenson (Polly), the daughter of Franklin’s London landlady, married William Hewson in 1770 and thus became Mary Hewson. To avoid listing her as two people, we list her as Mary Stevenson Hewson throughout. So that each person has a unique name, we sometimes include an additional, more specific identifier together with the name itself. For example, Mr. and Mrs. Lloyd are recorded as “? Lloyd (male)” and “? Lloyd (female).” Multiple people sharing a name required such disambiguation (e.g. Patrick Wilson and James Parker).

For the documents and the people tables to link in Palladio, each cell in the author and recipient columns must have a value, so listing “unspecified” here is preferable to leaving cells with unknown information empty, as we do for location information.
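Before loading the tables into a tool that links them by name, it can be useful to check that the link will actually hold. The sketch below is an illustration only: the `check_links` function is hypothetical, and it assumes rows are dictionaries keyed by the column titles used in this schema and that the placeholder “unspecified” is not expected to have its own row in the people table.

```python
PLACEHOLDER = "unspecified"
NAME_COLUMNS = ("Author Name", "Recipient Name")


def check_links(
    documents: list[dict[str, str]],
    people_names: set[str],
) -> tuple[list[tuple[str, str]], set[str]]:
    """Find problems that would break the documents-to-people link.

    Returns (empty_cells, missing_names):
    - empty_cells: (document ID, column) pairs where a name cell is blank
    - missing_names: names used in documents but absent from the people table
    """
    empty_cells: list[tuple[str, str]] = []
    missing_names: set[str] = set()
    for row in documents:
        for column in NAME_COLUMNS:
            cell = (row.get(column) or "").strip()
            if not cell:
                empty_cells.append((row.get("Document ID", ""), column))
                continue
            # Multi-value cells hold several names separated by ':'.
            for name in cell.split(":"):
                name = name.strip()
                if name and name != PLACEHOLDER and name not in people_names:
                    missing_names.add(name)
    return empty_cells, missing_names


# Placeholder IDs, not real Document IDs from the data set.
documents = [
    {"Document ID": "doc-1", "Author Name": "Benjamin Franklin:John Foxcroft",
     "Recipient Name": "unspecified"},
    {"Document ID": "doc-2", "Author Name": "", "Recipient Name": "Deborah Franklin"},
]
empty, missing = check_links(documents, {"Benjamin Franklin", "Deborah Franklin"})
```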

Source/Destination Smallest In these columns, we list the most particular source or destination location (e.g. village, town, city, or county) for a document. This includes places such as Kensington, Bromley, or Hampstead, which are now part of the Greater London area; Prestonfield, which is now part of the greater Edinburgh area; and Roxbury, which is today part of the greater Boston area. Although in some cases we know much more specific location information (e.g. Franklin’s street address or the coffee house from which he wrote something), we do not list specific addresses or neighborhoods. Exploring the importance of such specific locations would be an interesting line of inquiry for future projects.

Source/Destination City In most cases, these columns list the same location as Source Smallest. With those examples (such as Kensington, Bromley, Hampstead, Prestonfield, and Roxbury) listed above, which are in the greater metropolitan areas with which our study is most concerned, we list London or Boston, for example, as the Source City. The same goes for Fairhill, Pennsylvania—Isaac Norris’s home—which we list as Fairhill in “Source Smallest” and Philadelphia in “Source City.” Such a distinction allows us to record both the more specific locator (e.g. Kensington) and the level of metropolitan area (e.g. London) with which we are most concerned.
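The relationship between the “Smallest” and “City” columns amounts to a lookup with a fallback. The sketch below uses only the examples named in the text; the dictionary `METRO_AREA` and the function `city_for` are hypothetical names, and the real tables were presumably filled in by hand rather than by code.

```python
# Places named in the schema that roll up to a larger metropolitan area.
METRO_AREA = {
    "Kensington": "London",
    "Bromley": "London",
    "Hampstead": "London",
    "Prestonfield": "Edinburgh",
    "Roxbury": "Boston",
    "Fairhill": "Philadelphia",
}


def city_for(smallest: str) -> str:
    """Return the City value for a Smallest value.

    In most cases the two columns are identical, so any place without
    an explicit metropolitan area is returned unchanged.
    """
    return METRO_AREA.get(smallest, smallest)


print(city_for("Kensington"))
print(city_for("Fairhill"))
```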

Source/Destination State These columns list the next larger/specific locator, such as the colony (e.g. Pennsylvania), county (e.g. Derbyshire), state/district (e.g. North Holland or Hesse), or island (e.g. Antigua).

Source/Destination Country In these columns, we list the largest locator, such as America, England, Scotland, or France.

Source/Destination These columns contain the combination of the information contained in the three previous columns.

Additional notes regarding location information: As was our practice for all document information, we erred on the side of caution when supplying uncertain source and destination locations (meaning, as a general rule, we leave blank what we do not know). When, however, it seemed beyond a reasonable doubt that a document was going from or to a particular place, even if the location was not given in the document itself, we supplied it. For example, even if there was no indication on a letter from Deborah Franklin to Benjamin Franklin that she was writing from Philadelphia, when we have no indication that she was elsewhere, we list Philadelphia as the source location. When we remained unable to discern the location with confidence, we left it blank. For the sake of clarity and consistency, we adopt modern place names. For example, we include “Germany” and “Italy” as countries. Woodbridge is Woodbridge Township, Elizabeth Town is Elizabeth, and Brunswick is New Brunswick. In the instance of Coldengham, NY, we updated the spelling to Coldenham, NY.
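The two conventions described here, modern place names and a combined location column, can be illustrated as follows. This is a sketch under assumptions: the names `MODERN_NAMES`, `modernize`, and `combine_location` are hypothetical, and the comma-separated format of the combined value is a guess at a sensible layout rather than the documented format of the data set.

```python
# Historical names from the schema mapped to the modern names it adopts.
MODERN_NAMES = {
    "Woodbridge": "Woodbridge Township",
    "Elizabeth Town": "Elizabeth",
    "Brunswick": "New Brunswick",
    "Coldengham": "Coldenham",
}


def modernize(place: str) -> str:
    """Replace a historical place name with its modern form, if one is listed."""
    return MODERN_NAMES.get(place, place)


def combine_location(city: str, state: str, country: str) -> str:
    """Join city, state, and country into one value, skipping blank parts.

    Only the city is modernized, because the mapping above lists towns.
    """
    parts = [modernize(city.strip()), state.strip(), country.strip()]
    return ", ".join(part for part in parts if part)


print(combine_location("Philadelphia", "Pennsylvania", "America"))
print(combine_location("Brunswick", "New Jersey", "America"))
```

Skipping blank parts means an unknown state or city leaves no stray separator in the combined value, and a fully unknown location stays empty.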

It is important to note here, too, that the location listed as the destination, does not necessarily mean the location where the recipient of the letter actually is. Instead, it conveys the location where the person sending the document meant it to go, even if the person to whom it is going is not there at the time. 623801 provides a good example of a letter being sent to Franklin in London, even though he was not there at the time (between August 8 and November 2, Franklin was on a tour of the North, but his correspondents in America did not necessarily know this at the time). Yet, just because Franklin was not in London did not mean that he did not receive letters that were sent there. See for example, Franklin’s reply to William Strahan: “Dear Sir, Your agreeable letter of the 4th of August, is just come to hand, being sent back to me from London hither. I have been a Month on my Journey…” (Franklin to William Strahan, from Edinburgh, September 6, 1759, vol 8, page 435). Often, we do in fact know a great deal about someone’s location and movements, but cannot say with certainty where letters written to them are going. Take, for instance, George Whitefield, whose movements were closely documented by the Pennsylvania Gazette during the summer of 1764. When Franklin wrote to him on June 19, 1764, however, we do not know if the letter was going to Boston (where Whitefield had been until this point) or to New York, where Whitefield was going.

Less frequently, the source location does not convey the location of the author. For example, in January 1765, Franklin’s Poor Richard’s Almanac is listed as coming from Philadelphia (where he prepared it in the summer of 1764 and where it was printed in January of 1765), even though Franklin was in London at the time of publication.

People Table

Name This column records the name of the individual (e.g. Henry Holm) or group (e.g. Académie Royale des Sciences). For more on names, see above.

Vernacular Name/Title If a person was well known by another title, this column records that (e.g. Henry Holm’s vernacular name is Lord Kames). We also record additional known information for those people whose names are incomplete (e.g. ? Viny is Mrs. R. W. Viny) and alternative spellings. Because we consistently standardize women’s maiden and married names, we do not include those in this column.

Gender This column records whether an author or recipient in Franklin’s network was a group (G) or, if an individual, either male (M) or female (F). We designate those within Franklin’s network as either “individuals” or “groups.” For example in document 625117 Thomas Livezey wrote as an individual and in 625119 he wrote in his capacity as part of a group (the Pennsylvania Assembly Committee of Correspondence). This means that we list Thomas Livezey at 625117 and at 625119 list the Pennsylvania Assembly Committee of Correspondence. We take “group” in a broad sense, including corporate bodies, governmental bodies, institutions, societies, newspapers, and designated groups of people. In short, anything that is not an individual person is a group. This does not mean, however, that two or more individuals necessarily constitute a group. Multiple individuals and multiple groups can be authors and recipients of letters. Both multiple individuals and multiple groups are separated by “:”.

Kin In this column, we record whether or not the person is “kin” of Franklin’s. We consider kin to be by marriage or by birth. We designate this as a Y/N binary.

Birth Year Here we record the year a person was born, if known.

Death Year Here we record the year a person died, if known.

Birth Location (Smallest, State, Country, and all combined as birthplace) In these four columns we record what we know about a person’s place of birth. We follow the same methodology for determining and recording location as described above for the documents table. We do list the location of a group as its “birth location.” E.g. Oxford University has a birth location of Oxford.
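The coded columns of the people table lend themselves to a simple consistency check. The sketch below is illustrative only: the function `validate_person` is hypothetical, it assumes each row is a dictionary keyed by the column titles above with all values stored as text, and it treats blank year cells as “unknown” rather than as errors.

```python
def validate_person(row: dict[str, str]) -> list[str]:
    """Return a list of problems found in one people-table row."""
    problems: list[str] = []
    if row.get("Gender") not in {"M", "F", "G"}:
        problems.append("Gender must be M, F, or G")
    if row.get("Kin") not in {"Y", "N"}:
        problems.append("Kin must be Y or N")

    years: dict[str, int] = {}
    for column in ("Birth Year", "Death Year"):
        value = (row.get(column) or "").strip()
        if not value:
            continue  # blank means unknown, which is allowed
        if value.isdigit():
            years[column] = int(value)
        else:
            problems.append(f"{column} must be a year or blank")

    if len(years) == 2 and years["Birth Year"] > years["Death Year"]:
        problems.append("Birth Year is later than Death Year")
    return problems


# Hypothetical rows for illustration; not taken from the data set.
good = {"Name": "Example Person", "Gender": "F", "Kin": "Y",
        "Birth Year": "", "Death Year": "1774"}
bad = {"Name": "Example Group", "Gender": "X", "Kin": "N",
       "Birth Year": "1780", "Death Year": "1770"}
print(validate_person(good))
print(validate_person(bad))
```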

General Notes on People Table:

For biographical information, we rely on information supplied by the online and published PBF, the Oxford Dictionary of National Biography, and American National Biography Online. As the ____ interactive visual indicates, it is important to remember how much remains unknown or uncertain with respect to biographical information for Franklin’s correspondents.


The process of converting information from the documents contained in Benjamin Franklin’s papers to “data” presented us with important reminders of both the scope and the limits of converting messy sources and evidence into clean, “data”-filled cells of an Excel spreadsheet. Here, we offer a few reflections on these matters.


Conveying uncertainty is difficult, if not almost impossible, when working within the constraints of a data table (an Excel spreadsheet, as was the case with our project). While the editors of the published PBF could devote paragraph-long footnotes to laying out, weighing, and assessing the arguments for or against assigning authorship of a particular letter, for example, we were faced with an either/or choice: either we listed an author or we did not. We had no way to communicate the degrees of certainty or uncertainty that prose can handle so well. Nor did we have a way to communicate even a simple either/or in a single cell. (For those readers who are interested in this dilemma, see, for example, documents 624462 and 625485.) For these reasons, from time to time, consistency trumps accuracy in large-scale (large-n) studies such as this. See the Goals section above.

Subjectivity, Interpretation, and Maintaining the Spirit of Humanistic Inquiry

The “data” we present is a work in progress in some ways. At the moment of publication, it is as “accurate,” “true,” or “objective” as we have been able to make it. However, one of the reasons we believe it is important to make the tables behind our arguments and visuals public and open for examination is so that others can not only use them but also improve them. The steps we followed when transferring information from the documents to the three tables involved a large degree of interpretation, decision, weighing of options, and all the very traditional steps historians take when examining, assessing, and using evidence. Our hope is that by foregrounding the “subjectivity” of the data itself, readers will be reminded of the deep layers of messy humanistic inquiry beneath what too often can appear as a sterile set of data tables.


Cite as

Arcenas, Claire Rydell and Winterer, Caroline. (2016). Schema for Benjamin Franklin's Correspondence Network During the London Years [PDF]. Stanford Digital Repository.