Project Index





Schema for Benjamin Franklin's Correspondence Network During the London Years

Claire Rydell Arcenas, July 2016

The data schema is a description of the author's data model. It is both a guide to understanding the values in a data set and a model that may be applied to other data sets. For example, the data schema for John Locke's correspondence network might also be applied to the correspondence of Thomas Hobbes or René Descartes. We consider the data schema an essential research product which, by itself, expresses the design of the research inquiry while also supporting effective data sharing, discovery and analysis.

The visualizations of the data tables on this page were created using Breve, a free, open source tool developed at Humanities + Design.


Source Base:

As its source base, this project used the online Papers of Benjamin Franklin (hereafter online PBF). To translate the information in each item from the online PBF into metadata suitable for large-scale analysis and visualization, we developed and then followed a series of steps, explained here. We checked each document from the online PBF against its corresponding document in the published editions (volumes 7 through 22) of the Papers of Benjamin Franklin (Yale, 1959-) (hereafter PBF). We also relied on the published volumes for their editorial apparatus. We do, however, include documents from the online PBF that remain “unpublished.” A document’s publication status is indicated in the title (unpublished documents will say “unpublished”) as well as in the document URL, explained below.

Data Tables:

There are three data tables: one for “papers,” one for “people,” and one for “places” associated with the documents and the people.

The papers table records every document in Franklin’s papers from January 1, 1757 to December 31, 1775—the nearly two decades with which our study is concerned. There are 3,443 documents.

For each document, we recorded the following information: document ID, document title, document URL, date, primary language, additional language, author name, recipient name, source location (smallest, city, state, country) and destination location (smallest, city, state, country).

In addition to the documents themselves, we were interested in the make-up of Franklin’s social network (i.e. his correspondents and the people and groups whose writings are amongst Franklin’s papers). Therefore, we created a second table, the people table, to record information about the individuals and groups who appear as either authors or recipients in the papers table. There are 774 rows in the people table. Of these, 772 contain information about the individuals and groups with whom Franklin corresponded, one contains information about Franklin himself, and one is a placeholder for “unspecified” persons.

For each individual or group, we record the following information: name; vernacular name (e.g. Lord North); gender (or status as a group); kin status; birth year; death year; birth place; birth place geo coordinates; birth place location (smallest, city, state, and country).

The places table combines the location information from the source and destination columns in the papers table with the birth location information in the people table and matches each place with its GeoCoordinates (i.e. its latitude and longitude).
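As a rough illustration of how the places table relates to the other two, the following Python sketch gathers the distinct place strings from both tables and attaches coordinates where they are known. The column names ("source", "destination", "birthplace"), the function name, and the coordinate lookup are assumptions made for this example and are not taken from the project's files.

```python
def build_places(papers, people, coordinates):
    """Collect distinct place strings from both tables and attach coordinates.

    `papers` and `people` are lists of dicts; the keys used here are
    hypothetical column names, not the project's actual headers.
    """
    place_names = set()
    for row in papers:
        for column in ("source", "destination"):
            if row.get(column):  # blank cells mean "unknown" and are skipped
                place_names.add(row[column])
    for row in people:
        if row.get("birthplace"):
            place_names.add(row["birthplace"])
    # A place without a known coordinate keeps None rather than a guess.
    return {name: coordinates.get(name) for name in sorted(place_names)}


papers = [
    {"source": "London, England", "destination": "Philadelphia, Pennsylvania, America"},
    {"source": "", "destination": "London, England"},
]
people = [{"birthplace": "Boston, Massachusetts, America"}]
coordinates = {"London, England": (51.5074, -0.1278)}  # approximate, for illustration
places = build_places(papers, people, coordinates)
```

Each distinct place appears once, however many documents or people refer to it.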


The goals underlying our methodology were threefold. First, we strove to stay as true to the primary sources as possible (given the constraints of working with spreadsheets explained below) by accurately recording information extrapolated from each document. Given our overarching goal of accuracy, we sometimes privileged accuracy over precision. This meant omitting information about which we were uncertain but that might have provided additional precision had we elected to include it. Second, we strove for consistency. In short, if we chose to do one thing for one document, we aimed to do the same thing for every other similar document for which the same choice presented itself. Third, we strove for transparency, so that others can replicate each decision we made and choose to follow or revise it.

Papers Table


The enormous variety of genres these documents encompass is only a small testament to the many hats Franklin wore during the London years. These 3,443 documents consist primarily of personal letters (correspondence) exchanged between Franklin and his correspondents.

Also included among his papers, however, are bills, receipts, deeds, petitions, reports, instructions, remonstrances, degrees, essays, sketches, recipes, and, on occasion, other types of documents, including, for example, Franklin’s own Poor Richard’s Almanac and sketches of grave inscriptions made by his son William Franklin. Most of the contents of Franklin’s papers consist of things that were sent through mail or delivered. To impose our own modern conception of what constituted a letter or not was to impose an artificial order onto the past. Thus, regardless of the genre or type, we include all documents that are included in the online PBF in the papers table.

A very small minority of these documents were neither written by nor sent to Franklin. As do the editors of the PBF, however, we include everything. They all lend important insight into the nature and make-up of Franklin’s social network during his “London years.” Moreover, it is easy to limit the scope of analysis (say, to only those things written by Franklin) at any point using simple facet filters in any tool or table.

Multiple entries within a cell (multi-value cells) are separated by “:”. E.g. A document co-authored by Franklin and John Foxcroft would appear as “Benjamin Franklin:John Foxcroft.” We do not list multiple dates or multiple locations.
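A minimal Python sketch of reading such multi-value cells, and of limiting the scope of analysis to documents written by Franklin, might look like the following. The column name "author" and both function names are assumptions for illustration, and the sketch assumes that no individual name itself contains a colon.

```python
SEPARATOR = ":"  # the multi-value separator described above


def split_cell(cell):
    """Split a multi-value cell into its individual entries."""
    return [part.strip() for part in cell.split(SEPARATOR) if part.strip()]


def written_by(rows, name, column="author"):
    """Keep the rows in which `name` is one of the (possibly several) authors."""
    return [row for row in rows if name in split_cell(row.get(column, ""))]


rows = [
    {"author": "Benjamin Franklin:John Foxcroft"},
    {"author": "Deborah Franklin"},
    {"author": "unspecified"},
]
franklin_rows = written_by(rows, "Benjamin Franklin")
```

Matching on the split entries rather than on the raw cell avoids counting a co-authored document as a separate author named “Benjamin Franklin:John Foxcroft.”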

Throughout, we added information not supplied by the document itself or by the editors only when we were confident, without serious doubt, that it was correct (see the location explanations below, for example). When we did not know something, or could guess but only with reservations, we left it blank or, in the case of the author and recipient, indicated that it is “unspecified.” We believe that it is important to capture these unknowns as well as what we know.

Document ID In this column, we record the Document ID, a specific number associated with each document in the online PBF. These range from 623544 to 627004.

Document Title In this column, we record the title assigned to each document by the editors of PBF and label this as the Document Title. We copy these exactly from the online PBF titles, which match the published PBF titles.

Document URL This column contains the specific URL for each document. It includes the volume number and page number from the published editions. For example, a URL may indicate that a document can be found in volume 7, on page 94. An “a” following the page number indicates that the document is the first item on that page, while “b” and “c” indicate the second and third items on that page, respectively. If a document is unpublished, the URL will include a series of numbers following the page number and locator.

Date In the format year-month-day, we record the date of each document in this column. When a range of dates is given, we follow the practice of the editors and list the earliest date. E.g. 1757-1759 will be 1757 and 13 February 1764 – 18 March 1764 will be 1764-2-13.

Many documents in the online and published PBF have a known month and year associated with them, but no day. Whenever the month and year of a letter is known, and the day is unknown, we record the document as having been written on the 15th of the known month in the known year. For example, “ May 1770” becomes 1770-5-15. We chose the 15th because it falls nearly in the middle of each month with varying numbers of days. If the month is unknown, we list only the year (e.g. 1770). If the year is unknown, we leave the column empty.
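The date rules described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the inputs are assumed to be already-parsed integers rather than the free-text dates found in the PBF.

```python
def record_date(year=None, month=None, day=None):
    """Apply the recording rules to whatever parts of a date are known."""
    if year is None:
        return ""            # unknown year: the column is left empty
    if month is None:
        return str(year)     # unknown month: record the year only
    if day is None:
        day = 15             # known month, unknown day: use the 15th
    return f"{year}-{month}-{day}"  # unpadded, as in 1764-2-13


def record_range(start, end):
    """For a range of dates, record the earliest date; `end` is ignored."""
    return record_date(*start)


may_1770 = record_date(1770, 5)
letter_range = record_range((1764, 2, 13), (1764, 3, 18))
year_range = record_range((1757,), (1759,))
```

Note that a date recorded on the 15th cannot be told apart from a document actually dated the 15th, so any analysis at the level of days should treat those values with caution.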

From time to time, the editors of the published editions of PBF grouped together documents (e.g. 624029-32) under one title, while the online version lists them separately. As is our practice, we follow the online version and separate the documents, which means that we record separate dates for each sub document under the larger heading. If the online version groups two documents together (e.g. 624581), we do the same and list the first of multiple dates, in the case of document 624581, October 23, rather than November 7.

Given the long duration of the editing project for PBF, the editors from time to time erroneously included a document in one volume and later realized it should have gone in a different volume. If the date is definite, we list the revised date (e.g. 624108). If the date is not definite, we keep the date the editors assigned so that the document retains its position in the volume order (e.g. 625255).

Primary Language This column lists the language(s) of the document. If a document is in about equal parts of two languages we adopt the same multi-value cell disambiguation and list, for example, “English:French.”

Additional Language If the document is written in more than one language, this column records the additional language(s). We consider there to be an additional language when there are more than 3 words of it written. Two words in Latin (e.g. 623879), for example, do not make Latin an “additional language” for document 623879. There remains important future work to be done on the languages of Franklin’s correspondence, particularly with respect to what documents in PBF are translations.
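The threshold for an additional language can be expressed as a small Python sketch. It assumes that per-language word counts have already been produced by a human reader; the function name and input shape are hypothetical, and the sketch does not attempt language detection.

```python
MIN_WORDS = 3  # a language must have strictly more than this many words


def additional_languages(word_counts, primary):
    """Return the non-primary languages that pass the word threshold.

    `word_counts` maps a language name to the number of words in it;
    `primary` is a list, since the primary language may be multi-valued.
    """
    return [
        language
        for language, count in word_counts.items()
        if language not in primary and count > MIN_WORDS
    ]


two_latin_words = additional_languages({"English": 500, "Latin": 2}, ["English"])
with_french = additional_languages({"English": 500, "French": 40, "Latin": 3}, ["English"])
cell_value = ":".join(with_french)  # multi-value cells are joined with ":"
```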

Author Name This column lists the name of the author or authors of the document or records “unspecified,” if the author is unknown. Occasionally, the editors of the published PBF will venture a guess as to the authorship of a document with uncertain authorship. We follow their guidance on the matter. If authorship is unclear and there is conclusive proof neither for nor against Franklin as the author (e.g. 624677 when the editors say it is up to the reader to determine for herself), we list Franklin as the author. We do so on the grounds that these are his papers. If the editors do not think Franklin wrote it, we either list the person they list, or leave the cell as “unspecified.”

Recipient Name In this column, we record the name of the person receiving the document. Not every document with a recipient is a letter. For example, when Franklin gives his wife, Deborah, the power of attorney, Franklin is the “author” and Deborah the “recipient.” Some documents (such as essays or Poor Richard’s Almanac) have no “recipient.” Others have Franklin as the “recipient,” even if he was not the intended or ultimate audience but only a messenger (e.g. 624042). If there is no recipient or the recipient is unknown, we record “unspecified” in this column.

Additional Notes Regarding Names

We assigned each person one name, so if someone went by multiple names or changed her name, we use the same name throughout. For example, Mary Stevenson (Polly), the daughter of Franklin’s London landlady, married William Hewson in 1770 and thus became Mary Hewson. To avoid listing her as two people, we list her as Mary Stevenson Hewson throughout. So that each person has a unique name, we sometimes included an additional, more specific identifier together with the name itself. For example, Mr. and Mrs. Lloyd are recorded as “? Lloyd (male)” and “? Lloyd (female).” We disambiguate multiple people with the same name (e.g. Patrick Wilson and James Parker) using their lifespans in brackets.

For the documents and the people tables to link in Palladio, each cell in the author and recipient columns must have a value, so listing “unspecified” here is preferable to leaving the cells with unknown or no information empty as we do for location information.
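A short Python sketch of this convention follows: empty author or recipient cells are filled with the placeholder, and every name in the papers table is then checked against the people table. The column names and function names are assumptions for illustration, and the check is a generic consistency test rather than a description of how Palladio itself links tables.

```python
UNSPECIFIED = "unspecified"


def fill_unspecified(value):
    """Replace an empty or missing name cell with the placeholder."""
    return (value or "").strip() or UNSPECIFIED


def unlinked_names(papers, people_names):
    """Return author/recipient names that have no row in the people table."""
    missing = set()
    for row in papers:
        for column in ("author", "recipient"):  # hypothetical column names
            cell = fill_unspecified(row.get(column))
            for name in cell.split(":"):        # multi-value cells use ":"
                if name not in people_names:
                    missing.add(name)
    return missing


papers = [
    {"author": "Benjamin Franklin:John Foxcroft", "recipient": ""},
    {"author": None, "recipient": "Benjamin Franklin"},
]
people_names = {"Benjamin Franklin", UNSPECIFIED}
missing = unlinked_names(papers, people_names)
```

Because the people table has its own “unspecified” row, filled-in placeholders link cleanly and only genuinely missing names are reported.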

Source/Destination Smallest In these columns, we list the most particular source or destination location (e.g. village, town, city, or county) for a document. This includes places such as Kensington, Bromley, or Hampstead, which are now part of the Greater London area; Prestonfield, which is now part of the greater Edinburgh area; and Roxbury, which is today part of the greater Boston area. Although in some cases we know much more specific location information (e.g. Franklin’s street address or the coffee house from which he wrote something), we do not list specific addresses or neighborhoods. A more street-level perspective would be an interesting line of inquiry for future projects.

Source/Destination City In most cases, these columns list the same location as Source Smallest. With those examples (such as Kensington, Bromley, Hampstead, Prestonfield, and Roxbury) listed above, which are in the greater metropolitan areas with which our study is most concerned, we list London or Boston, for example, as the “Source City.” The same goes for Fairhill, Pennsylvania (Isaac Norris’s home), which we list as Fairhill in “Source Smallest” and Philadelphia in “Source City.” Such a distinction allows us to record both the more specific locator (e.g. Kensington) and the level of metropolitan area (e.g. London) with which we are most concerned.

Source/Destination State These columns list the next larger/specific locator, such as the colony (e.g. Pennsylvania), county (e.g. Derbyshire), state/district (e.g. North Holland or Hesse), or island (e.g. Antigua).

Source/Destination Country In these columns, we list the largest locator, such as America, England, Scotland, or France. For an explanation of our decision to adopt modern place names, see below.

Source/Destination These columns contain the combination of the information contained in the three previous columns.
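The relationship between the location columns can be sketched in Python as follows. The smallest-to-city pairs come from the examples given above; the function names, the order of the parts, and the ", " separator in the combined column are assumptions for illustration, since the exact format of the combined column is not specified here.

```python
# Smallest locations that roll up to a larger metropolitan area,
# taken from the examples in the text above.
SMALLEST_TO_CITY = {
    "Kensington": "London",
    "Bromley": "London",
    "Hampstead": "London",
    "Prestonfield": "Edinburgh",
    "Roxbury": "Boston",
    "Fairhill": "Philadelphia",
}


def city_for(smallest):
    """In most cases the city equals the smallest location."""
    return SMALLEST_TO_CITY.get(smallest, smallest)


def combine_location(city, state, country):
    """Join the known parts; blank (unknown) parts are skipped."""
    return ", ".join(part for part in (city, state, country) if part)


fairhill = combine_location(city_for("Fairhill"), "Pennsylvania", "America")
no_state = combine_location("London", "", "England")
```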

Additional Notes Regarding Location Information

As was our practice for all document information, with respect to the source and destination location information for each document, we erred on the side of caution when supplying uncertain information (meaning, as a general rule, we leave blank what we do not know). When, however, it seemed likely without a reasonable doubt that a document was going from or to a particular place, even if the location was not given in the document itself, we supplied it. For example, even if there was no indication in a letter from Deborah Franklin to Benjamin Franklin that she was writing from Philadelphia, when we have no indication that she was elsewhere, we list Philadelphia as the source location. When we remained unable to discern with confidence or certainty the location, we left it blank.

For the sake of clarity and consistency, we adopt modern place names. For example, we include “Germany” and “Italy” as countries, and, as mentioned above, use “America” as a largest place designation. Woodbridge is Woodbridge Township and Elizabeth Town is Elizabeth, while Brunswick is New Brunswick. In the instance of Coldengham, NY, we updated the spelling to Coldenham, NY.

It is important to note here, too, that the location listed as the destination is not necessarily the location where the recipient of the letter actually was. Instead, it conveys the location where the person sending the document meant it to go, even if the intended recipient was not there at the time. Document 623801 provides a good example of a letter being sent to Franklin in London, even though he was not there at the time. Between August 8 and November 2, Franklin was on a tour of the North, but his correspondents in America did not necessarily know this. Yet the fact that Franklin was not in London did not mean that he did not receive letters that were sent there. See, for example, Franklin’s reply to his friend, the printer William Strahan: “Dear Sir, Your agreeable letter of the 4th of August, is just come to hand, being sent back to me from London hither. I have been a Month on my Journey…” (Franklin to William Strahan, September 6, 1759).

Often, we do, in fact, know a great deal about someone’s location and movements, but cannot say with certainty where letters written to them are going. Take, for instance, George Whitefield, whose movements were closely documented by the Pennsylvania Gazette during the summer of 1764. When Franklin wrote to him on June 19, 1764, however, we do not know if the letter was going to Boston, where Whitefield had been until this point, or to New York, where Whitefield was going.

Less frequently, the source location does not convey the location of the author. For example, in January 1765, Franklin’s Poor Richard’s Almanac is listed as coming from Philadelphia (where he prepared it in the summer of 1764 and where it was printed in January of 1765), even though Franklin was in London at the time of publication.

People Table

Name This column records the name of the individual (e.g. Henry Holm) or group (e.g. Académie Royale des Sciences). For more on names, see above.

Vernacular Name/Title If a person was well known by another title, this column records that (e.g. Henry Holm’s vernacular name is Lord Kames). We also record additional known information for those people whose names are incomplete (e.g. ? Viny is Mrs. R. W. Viny) and alternative spellings. Because we consistently standardize women’s maiden and married names, we do not include those in this column.

Gender This column records whether an author or recipient in Franklin’s network was a group (G) or, if an individual, either male (M) or female (F).

For an important discussion of controlled vocabularies (such as our application of group, male, and female designations here), as well as their significant limitations for gender designations, see the recent blog post by Samantha Callaghan, “Gender and the Georgian Papers” (February 2018).

We take “group” in a broad sense, including corporate bodies, governmental bodies, organizations, institutions, societies, newspapers, and designated groups of people. In short, anything that is not an individual person is a group. This does not mean, however, that two or more individuals necessarily constitute a group. Multiple individuals and multiple groups can be authors and recipients of letters. Both multiple individuals and multiple groups are separated by “:”. At times, people within Franklin’s network wrote in individual and group capacities. For example, in document 625117 Thomas Livezey wrote as an individual and in 625119 he wrote in his capacity as part of a group (the Pennsylvania Assembly Committee of Correspondence). This means that at 625117 we listed Thomas Livezey as the author and at 625119 we listed the Pennsylvania Assembly Committee of Correspondence as the author.

Kin In this column, we record whether or not the person is “kin” of Franklin’s. We consider kin to be by marriage or by birth. We designate this with a Y/N binary.

Birth Year Here we record the year a person was born, if known.

Death Year Here we record the year a person died, if known.

Birth Location (Smallest, State, Country, and all combined as birthplace) In these four columns we record what we know about a person’s place of birth. We follow the same methodology for determining and recording location as described above for the papers table. We list the location of a group as its “birth location.” E.g. Oxford University has a birth location of Oxford.
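The controlled vocabularies described for the people table lend themselves to a simple consistency check. The Python sketch below is an illustration only: the column names, the function name, and the sample row are hypothetical, and the birth/death comparison is an extra sanity check that is not part of the schema as described.

```python
GENDER_VALUES = {"G", "M", "F"}  # group, male, female
KIN_VALUES = {"Y", "N"}


def validate_person(row):
    """Return a list of problems found in one row of the people table."""
    problems = []
    if row.get("gender") not in GENDER_VALUES:
        problems.append("gender must be G, M, or F")
    if row.get("kin") not in KIN_VALUES:
        problems.append("kin must be Y or N")
    birth, death = row.get("birth_year"), row.get("death_year")
    # Years are optional; compare them only when both are known.
    if birth and death and int(birth) > int(death):
        problems.append("birth year is later than death year")
    return problems


# A made-up row for illustration, not an entry from the data set.
example = {"name": "Example Person", "gender": "F", "kin": "N",
           "birth_year": "1700", "death_year": "1780"}
example_problems = validate_person(example)
bad_problems = validate_person({"gender": "X", "kin": "",
                                "birth_year": "1780", "death_year": "1700"})
```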

Additional Notes Regarding People Table

For biographical information, we rely on information supplied by the online and published PBF, the Oxford Dictionary of National Biography, and American National Biography Online. As the Breve visualization at the top of this page indicates, it is important to remember how much remains unknown or uncertain with respect to biographical information for members of Franklin’s social network.


The process of converting information from the documents contained in Benjamin Franklin’s papers to “data” presented us with important reminders of both the possibilities and the perils of converting messy sources into the tidy rows and columns of an Excel spreadsheet. Here we offer a few reflections on these matters.


Conveying uncertainty is difficult, if not almost impossible, when working within the constraints of a data table (an Excel spreadsheet, as was the case with our project). Whereas the editors of the published PBF could devote paragraph-long footnotes to laying out, weighing, and assessing the arguments for or against assigning authorship of a particular letter, for example, we were faced with an either/or choice: either we listed an author or we did not. We had no way to communicate the degrees of certainty or uncertainty that prose can handle so well. Nor did we have a way to communicate even a simple either/or in a single cell. (For those readers who are interested in this dilemma, see, for example, documents 624462 and 625485.) As data visualization tools continue to develop, we are eager to investigate new methods for conveying varying degrees of certainty.

Subjectivity, Interpretation, and Maintaining the Spirit of Humanistic Inquiry

The “data” we present is a work in progress. We believe it is important to make the tables behind our arguments and visuals open for examination, so others can use them, but also improve them. The steps we followed when transferring information from the documents to the three tables involved a large degree of interpretation, decision-making, weighing of options, and all the very traditional steps historians take when examining, assessing, and using evidence. Our hope is that by foregrounding the “subjectivity” of the data itself, readers will be reminded of the deep layers of messy humanistic inquiry beneath what too often can appear as a sterile set of data tables.


Cite as

Arcenas, Claire Rydell. (2016). Schema for Benjamin Franklin's Correspondence Network During the London Years [PDF]. Stanford Digital Repository.