Like many parts of the world, Japan has a history of suffering and living with epidemics. During the Edo period, when publishing culture flourished, many works appeared on how to prevent epidemics and what to do during illness. For example, in 1862, woodblock prints called Hashika-e はしか絵 were published en masse in response to a measles epidemic, and these offer us today a glimpse of the past. Transcription Project: Tackling Pandemics in Early Modern Japan (hereafter referred to as the Tackling Pandemics project) aims to produce online transcriptions of these Edo-period resources. “Transcription” is a historiographic term for the process of converting texts preserved as written artefacts into actionable materials that can then be reprinted or made available online.
Dr. Laura Moretti, Senior Lecturer in Pre-Modern Japanese Studies at Emmanuel College of the University of Cambridge, is the organizer of the Tackling Pandemics project. Normally, she would be teaching the participants of her “Summer School in Japanese Early Modern Palaeography” course (https://wakancambridge.com/) how to analyze Edo-period texts, but due to the effects of COVID-19, she was forced to cancel the course this year. The Tackling Pandemics project was designed as a fully online alternative to this summer school.
Currently, a total of 33 graduate students, researchers, and curators from educational institutions around the world, including the UK, USA, Russia, China and Japan, are involved in the project. Although their specializations may differ, they all share an interest in the culture of the Edo period and the history of Japan. Resources on smallpox, cholera, and measles from the open digital databases of the National Institute of Japanese Literature and the University of Tokyo Library are available on the project’s website (Figure 1). In addition to this, we include materials from the Victoria & Albert Museum, Dr. Moretti’s Suzuran Library, and another private collection called Ebi Bunko. The goal of this project is to complete these transcriptions with the help of the participants, and, through this process, cultivate their ability to read and comprehend the literature of the Edo period.
As a researcher and developer in the field of Digital Humanities, I work to support the technical aspects of the project. Because Dr. Moretti will be writing in greater detail on the project’s aims and significance in her upcoming Teaching Moments article, I will use this article to introduce the platform Minna de Honkoku みんなで翻刻 (https://honkoku.org/), which the participants use to input and share their transcriptions.
Minna de Honkoku was initially launched in 2017 as a platform to allow public participation in the transcription of pre-modern materials on earthquakes. The system is developed and operated by the Kyoto University Research Group for Historical Earthquakes, an interdisciplinary group of researchers in the natural sciences and the humanities, of which I have been a member since my days as a graduate student. The Research Group has been transcribing materials for nearly 10 years with the aim of applying them to earthquake and disaster prevention research. Because the occurrence of earthquakes is to some extent cyclical, the study of past earthquakes is instrumental in understanding the underlying mechanisms and thus preparing for future earthquakes. Unfortunately, it was only after the end of the 19th century that seismic observations were recorded using modern instruments. Therefore, to obtain information on earthquakes that occurred in the Edo period and earlier, it is necessary to collect and transcribe historical texts and other written records, and then estimate the scale and damage of these earthquakes from their descriptions.
The Research Group for Historical Earthquakes started its activities after the 2011 Tōhoku earthquake and tsunami, and by 2017 had transcribed approximately 130,000 characters' worth of resources on earthquakes. However, the number of transcriptions that a small group of researchers like ours can handle makes only a small dent in the sheer number of written materials preserved in Japan. And so we began planning a way in which we could invite members of the public to participate via the internet, enabling us to transcribe a large number of materials all at once. Minna de Honkoku was developed for this purpose. In the field of digital humanities, this method is called “crowdsourced transcription”, and it has been used in many other projects, including University College London’s “Transcribe Bentham” and the US National Archives’ “Citizen Archivist”.
However, there were no examples of crowdsourced transcription of pre-modern Japanese materials before Minna de Honkoku. This is because the vast majority of documents written and published in Japan up to and including the Edo period were written in so-called Kuzushiji くずし字, a cursive writing style. Due to modernization after the Meiji era, Kuzushiji slowly disappeared from the realms of public education and publishing in Japan. As a result, less than 0.01% of the Japanese population today possesses the ability to comprehend Kuzushiji.
To respond to this challenge, we have designed the crowdsourced transcription system of Minna de Honkoku to also serve as a teaching tool. For example, transcriptions are shared to a common “timeline” (Figure 2) visible to all participants, which allows them to receive corrections from other participants. Minna de Honkoku is also linked to a mobile app for learning Kuzushiji. In addition, starting with the release of the July 2019 version, we now offer an AI system for Kuzushiji recognition support. This AI program, developed in collaboration with the Center for Open Data in the Humanities (CODH) and Toppan Printing Co., can identify the selected Kuzushiji within a document image, and provide its reading along with a confidence score (Figure 3). Transcribing Kuzushiji is a very difficult task, but with the support of AI, even a beginner can take on the challenge without encountering mental barriers.
Fortunately for us, Minna de Honkoku’s learning-based approach to transcription has worked unbelievably well. By June 2020, more than 6,000 people had participated in transcription work on Minna de Honkoku, resulting in the completion of over 1,200 pieces of transcribed materials. The total number of characters transcribed by the participants amounts to 9 million.
Dr. Moretti’s decision to use the Minna de Honkoku system for her project was made after she happened upon, online, a talk that I gave last year at a symposium on “Kuzushiji and AI”. As of now, the Tackling Pandemics project has deployed the Minna de Honkoku system on a server that is accessible only to participants and operates in a closed format. Once the transcriptions have been completed and the project is brought to a close, we will publish the resulting transcriptions on Minna de Honkoku’s main website.
At the time of writing this article, less than a month has passed since the Tackling Pandemics project began. Both the organizer, Dr. Moretti, and I, the technical director, are still working through the details by trial and error. The participants, however, have taken on their tasks enthusiastically, transcribing materials while discussing their work in both English and Japanese. Since I myself was first exposed to Kuzushiji in graduate school and still find it difficult to decipher handwritten materials, you can imagine my amazement at seeing participants from around the globe, for whom Japanese is a second language, accurately transcribe Kuzushiji characters. I’m very much looking forward to seeing how this project will develop as it goes on.
Suggested Readings
Transcription Project: Tackling Pandemics in Early Modern Japan, https://wakancambridge.com/project-2020/.
Hedges, Mark, and Stuart Dunn. Academic Crowdsourcing in the Humanities: Crowds, Communities and Co-production. Chandos Publishing, 2017.
Yuta Hashimoto is a researcher of Digital Humanities and an Assistant Professor at the National Museum of Japanese History, Japan. His study focuses on the applications of information technologies such as crowdsourcing and image analysis to historical research. He is also an iOS/Android developer and has published several apps for digital humanities research.
* * *
The Teach311 + COVID-19 Collective began in 2011 as a joint project of the Forum for the History of Science in Asia and the Society for the History of Technology Asia Network, and is currently being expanded in collaboration with the Max Planck Institute for the History of Science (Artifacts, Action, Knowledge) and Nanyang Technological University-Singapore.