Stefan Dumont, Susanne Haaf, Sabine Seifert

Whether letters, postcards, telegrams, e-mails, SMS, or chats: correspondence between different agents is, and for a very long time has been, a crucial part of everyday life, with just as much impact on private life as on professional and public life. Even before technology permeated our society, written correspondence took up vast amounts of people’s time. Today, the direct and simple means of digital communication are used even more extensively.

As for historical correspondence, we gain insight into this extremely productive area through the letters that have been passed down to us, written by scholars, politicians, artists, celebrities, or private persons. We learn about the impact of correspondence not only from the sheer number of documents available in archives. These documents also bear witness to people’s broad networks and to the topics that concerned them. They offer insight into private views on public affairs as well as into the everyday problems of historical persons of interest.

Thus, and not surprisingly, historical documents of communication have quite frequently become the subject of humanities research and are prepared in scholarly editions in order to enable and facilitate further research. Since editions nowadays are usually born digital and, most of the time, also remain digital, this area has increasingly attracted the attention of the Text Encoding Initiative community. Letters, for instance, may appear quite homogeneously structured at first sight but, as transient documents, tend to exhibit various peculiarities and exceptions from customary practice. More generally, the text types available for communication have changed over time, as have their specific style and structure and the contexts in which they were typically used.

As conveners of the TEI Correspondence Special Interest Group (SIG)[1] and representative for the Deutsches Textarchiv (DTA) Base Format (Haaf/Geyken/Wiegand 2014/15), we have repeatedly been asked about the TEI encoding of correspondence-specific phenomena. In 2015, the TEI Correspondence SIG had already addressed some of the problems of encoding correspondence in TEI by introducing a new correspondence model to the TEI Guidelines, consisting of <correspDesc> and its several child elements (Stadler/Illetschko/Seifert 2016).[2] The DTA Base Format, which had started off as a format for printed texts, was opened to manuscript texts in 2016 and enriched with markup for manuscript-specific phenomena. Among other things, this enables the integration of handwritten or typed (in addition to printed) letters into the DTA text corpus (Haaf/Thomas 2017).
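
To give a rough idea of the correspondence model mentioned above, a minimal sketch of a <correspDesc> might look as follows. The persons, places, and date are invented placeholders and are not taken from any edition; the sketch assumes that <correspDesc> is placed inside <profileDesc> in the TEI header and shows only a small part of what the model allows.

```xml
<!-- Illustrative sketch only: all names, places, and dates are hypothetical -->
<profileDesc>
  <correspDesc>
    <!-- Sending of the letter: who, where, when -->
    <correspAction type="sent">
      <persName>Jane Doe</persName>
      <placeName>Berlin</placeName>
      <date when="1850-03-01"/>
    </correspAction>
    <!-- Receipt of the letter -->
    <correspAction type="received">
      <persName>John Doe</persName>
      <placeName>Vienna</placeName>
    </correspAction>
  </correspDesc>
</profileDesc>
```

Each <correspAction> describes one communicative action, here sending and receiving, so that sender, addressee, places, and dates can be recorded separately from the transcription of the letter text itself.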

Apart from the annotation of correspondence itself, there has been and still is a growing demand to link correspondence editions and projects effectively in order to reveal the correspondence networks of the ‘Republic of Letters’ and beyond. This desideratum led to the creation of the Correspondence Metadata Interchange Format (CMIF), a constrained subset of the TEI centered on the <correspDesc> element, and to the development of the web service correspSearch (Dumont 2016), which uses CMIF as its exchange format. In 2018, the Rahtz Prize for TEI Ingenuity was awarded to the developers of the <correspDesc> encoding model, the CMIF, and correspSearch.[3]

However, there are still open questions on how to deal with several structural and textual phenomena, and the work of creating a comprehensive environment for the TEI encoding of letters is not yet finished. As a next step, we decided to hold a workshop on these “Challenges of Letter Encoding”, which would provide a forum for further discussion of problematic cases of correspondence encoding in TEI. The aim was to develop solutions and best practices within the range of possibilities already offered by the TEI and, if necessary, to produce proposals for potential extensions to the TEI.

The workshop was funded by CLARIN-D and was held at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW) in Berlin (Germany) in October 2018. We invited early-career researchers who deal with TEI encoding, and with correspondence encoding in particular, in the course of their daily work, as well as one member of the TEI Council for advice on proposals for TEI extensions. Some 20 participants from 15 institutions in Germany, Austria, and Switzerland came together for the workshop.

Before the event, we gathered from the participants examples of uncertainties or problems in applying TEI to correspondence texts, which were then dealt with in the course of the workshop. Based on this material, we formed five groups of (roughly) related encoding issues, and four to five participants per group were asked to discuss the given issues:

  • Group I: Text structures I (dealing with openers, closers, postscripts)
  • Group II: Text structures II (dealing with pre-printed parts like letterheads, addresses, stamps, seals, postcards etc.)
  • Group III: Metadata I (dealing with attachments, roles like author, scribe, and sender, actions like commenting etc.)
  • Group IV: Metadata II (dealing with unclear information)
  • Group V: CMIF and Metadata (dealing with the extension of CMIF, authority files and modelling correspondence in RDF)

The meeting was designed as a hands-on workshop, enabling the groups of participants to discuss the issues and material presented to them and to develop solutions. The group work sessions alternated with plenary sessions in which problems and solutions were discussed across groups. All issues discussed concerned aspects of letter or postcard encoding, from correspondence-specific text structures to correspondence metadata. After the workshop, the problems, discussions, and recommendations were summarized as handbook articles by the workshop participants. The articles underwent an internal review phase and are now being published in successive stages on this website for the community to comment on and review. The articles now online and those coming soon are all in version 1 and will remain stable as such. The public review phase lasts until 30 April 2020. After that, all articles will be revised and adapted and made available here in their final form as version 2 of the handbook.

The resulting handbook “Encoding Correspondence. A manual for encoding letters and postcards in TEI-XML and DTABf” is meant as a guide for annotators to characteristic and recurring problems of letter encoding in TEI, offering solutions and recommendations based on the TEI Guidelines as well as possible extensions to the TEI. Alongside these articles, which discuss the respective issues at some length, we plan, as a long-term objective, to provide a best practice guide that gives a quick and short overview of how to encode correspondence material.

The literature used, as well as the bibliographic information for each article itself, is additionally stored in the Zotero group "Encoding Correspondence" and embedded as "COinS" in the HTML version of this manual. The bibliographic data can therefore be added to reference management software with the appropriate browser extension (e.g. Zotero or the Citavi Picker).

As this is a community effort, we kindly invite you to comment on or review the articles, ask questions, give feedback, add examples of correspondence or encoding, or contribute whatever you find helpful for making this a valuable resource for encoders of correspondence. This can be done with the easy-to-use annotation tool or by using the e-mail button at the side of each article.

The complete handbook with all articles (but without the comments) is available for download on GitHub.




Stefan Dumont, Susanne Haaf, Sabine Seifert: Introduction. In: Encoding Correspondence. A Manual for Encoding Letters and Postcards in TEI-XML and DTABf. Edited by Stefan Dumont, Susanne Haaf, and Sabine Seifert. Berlin 2019–2020. URN: urn:nbn:de:kobv:b4-20200110163942154-1974512-7