The 1st International Workshop on Histoinformatics is held in conjunction with the 5th International Conference on Social Informatics. It aims to foster interaction between Computer Science and Historical Science. This interdisciplinary initiative is a response to the growing popularity of Digital Humanities and an increasing tendency to apply computational techniques to support and facilitate research in the Humanities. Nowadays, thanks to increasing efforts to digitize and open up historical sources, the Science of History can greatly benefit from advances in Computer and Information Sciences, which are concerned with processing, organizing and making sense of data and information. As such, new Computer Science techniques can be applied to verify and validate historical assumptions based on text reasoning, image interpretation or memory understanding. Our objective is to provide the two research communities with a place to meet, exchange ideas and facilitate discussion. We hope the workshop will result in a survey of current problems and potential solutions, with a particular focus on exploring opportunities for collaboration and interaction among researchers working on various subareas within Computer Science and Historical Science.

Themes and Topics

The main topics of the workshop are supporting historical research and analysis through the application of Computer Science theories or technologies, analyzing and making use of historical texts, recreating past courses of action, analyzing collective memories, visualizing historical data and providing efficient access to the large wealth of accumulated historical knowledge. Expected paper submissions cover, but are not limited to, the following topics:

  • Processing and text mining of historical documents
  • Analysis of longitudinal document collections
  • Search models in document archives and historical collections, associative search
  • Causal relationship discovery based on historical resources
  • Entity relationship extraction, detecting and resolving historical references in text
  • Computational linguistics for old texts
  • Digitizing and archiving
  • Modeling evolution of entities and relationships over time
  • Automatic multimedia document dating
  • Applications of artificial intelligence techniques to history
  • Simulating and recreating past courses of action, social relations, motivations, figurations
  • Analysis of language change over time
  • Handling uncertain and fragmentary text and image data
  • Finding analogical entities
  • Entity linking in historical collections
  • Named entity detection
  • Automatic biography generation
  • Mining Wikipedia for historical data
  • OCR and transcription of old texts
  • Effective interfaces for searching, browsing or visualizing historical data collections
  • Collective memory
  • Studying and modeling forgetting and remembering processes
  • Popularization of History through new media
  • Probing the limits of Histoinformatics
  • Epistemologies in the Humanities and Computer Science

Organizers:


  • Adam Jatowt (Kyoto University, Japan)
  • Gaël Dias (Normandie University, France)
  • Agostini-Ouafi Viviana (Normandie University, France)
  • Christian Gudehus (University of Flensburg, Germany)
  • Günter Mühlberger (University of Innsbruck, Austria)

PC Members:

  • Robert Allen (Yonsei University, South Korea)
  • Antal van den Bosch (Radboud University Nijmegen, The Netherlands)
  • Lindsey Dodd (University of Huddersfield, UK)
  • Antoine Doucet (Normandie University, France)
  • Alexis Drogoul (Institute of Research for Development, France)
  • Marten Düring (Centre virtuel de la connaissance sur l'Europe (CVCE), Luxembourg)
  • Frederick Gibbs (University of New Mexico, USA)
  • Pedro Rangel Henriques (Minho University, Portugal)
  • Nattiya Kanhabua (LS3 Research Center, Germany)
  • Tom Kenter (University of Amsterdam, The Netherlands)
  • Mike Kestemont (University of Antwerp, Belgium)
  • Alexander Korb (University of Leicester, UK)
  • Andrea Nanetti (Nanyang Technological University, Singapore)
  • Daan Odijk (University of Amsterdam, The Netherlands)
  • Denis Peschanski (Pantheon-Sorbonne University, France)
  • Malte Rehbein (University of Passau, Germany)
  • Marc Spaniol (Max Planck Institute for Informatics, Germany)
  • Shigeo Sugimoto (University of Tsukuba, Japan)
  • Nina Tahmasebi (Chalmers University of Technology, Sweden)
  • William Turkel (University of Western Ontario, Canada)

Accepted Papers

  • (Full) A Digital Humanities Approach to the History of Science: Eugenics revisited in hidden debates by means of semantic text mining
    Pim Huijnen, Fons Laan, Maarten de Rijke and Toine Pieters
  • (Full) Documenting Social Unrest: Detecting Strikes in Historical Daily Newspapers
    Kalliopi Zervanou, Marten During, Iris Hendrickx and Antal van den Bosch
  • (Full) Building the social graph of the History of European Integration. A pipeline for Humanist-Machine Interaction in the Digital Humanities
    Lars Wieneke, Marten During, Ghislain Sillaume, Carine Lallemand, Vincenzo Croce, Marilena Lazzaro, Chiara Pasini, Piero Fraternali, Marco Tagliasacchi, Mark Melenhorst, Erik Harloff, Isabel Micheel, Jasminko Novak, Javier Garcia Moron and Francesco Nucci
  • (Short) Frame-based Models of Communities and their History
    Robert Allen
  • (Short) Collective Memory in Poland: a Reflection in Street Names
    Radoslaw Nielek and Aleksander Wawer
  • (Short) From Diagram to Network: A Multi-Mode Network Approach to Analyze Historical Resources of Art History
    Yanan Sun

Keynote and Invited Talk

Text Analytics for Detecting Events and Motifs in Historical Texts

Antal van den Bosch
Center for Language Studies
Radboud University Nijmegen


Text mining or text analytics, an applied subfield of computational linguistics, offers a rich toolkit of algorithms and methods for extracting information and knowledge from texts. The most reliable tools are the ones operating on the word and sentence level. Unfortunately, this is not the level where most historical research questions live. Rather, a decade of work on historical texts has challenged text analytics developers to find new solutions for domain-specific entity extraction; time reference extraction; the detection of event descriptions; and the detection of topical structure and motifs, 'memes', or iconic narrative elements. Arguably, with topical and narrative structures and motifs we now begin to offer the means to investigate multiple perspectives and framing automatically and at a large scale. I sketch a framework for how this all may work together using MERIT, a concept project set up with Marten During, focusing on the period in WWII between Operation Market Garden and Operation Veritable (September 1944 - March 1945) in and around the Arnhem and Nijmegen area at the eastern border of the Netherlands.

About the speaker

Prof. Antal van den Bosch's aim in research is to bring fundamental computational work on memory-based natural language processing to real-world applications: machine translation, text analytics applied to historical texts and social media, and proofing tools. Prof. Van den Bosch held research positions at the experimental psychology labs of Tilburg University, the Netherlands, and the Université Libre de Bruxelles, Belgium (1993-1994), obtained his Ph.D. in computer science at the Universiteit Maastricht, the Netherlands (1994-1997), and held several positions at Tilburg University (1997-2011), where he was appointed full professor in computational linguistics and AI in 2008. In 2011 he took on a full professorship in language and speech technology at Radboud University Nijmegen, the Netherlands. He is a member of the Royal Netherlands Academy of Arts and Sciences.

Histoinformatics Invited Talk: Exploring Medieval Charters: the ChartEx Project

Roger Evans
Research leader
Natural Language Technology Group
University of Brighton


ChartEx is a collaboration between historians, archivists, and computer scientists to develop new ways of exploring the full text content of digital historical records. The project's focus is on medieval charters - records of legal transactions of property from the 12th to the 16th centuries. These survive in abundance and are one of the richest sources for studying the lives of people in the past, long before the establishment of censuses or birth registers. In particular ChartEx is exploring the descriptions of parcels of property (such as houses, workshops and fields) in charters and their associations with three key entity types: people, places and events. In recent years, historians have invested substantial effort to make charter records available in digital form. Building on this resource, ChartEx has three main technical components: the application of Natural Language Processing to individual charter documents to extract information from the text in symbolic form; the use of Data Mining to combine information from multiple documents and make new connections between them; and an interactive "virtual workbench" that allows historians, archivists and others to visualise and explore the information extracted. In this talk I will give a general overview of the ChartEx project, describe the interdisciplinary processes which contributed to the development of the ChartEx system, illustrate the virtual workbench interface, and provide a peek "under the hood" at the technical language processing and data mining innovations that support the workbench.
Notes: The ChartEx Project (http://www.chartex.org) is a collaboration between the University of York, the University of Brighton, Columbia University, the University of Washington, the University of Toronto and Universiteit Leiden. It is part of the Digging into Data Challenge (http://www.diggingintodata.org/), funded by Jisc, the Arts and Humanities Research Council and the Economic and Social Research Council in the UK; the Institute of Museum and Library Services, the National Endowment for the Humanities and the National Science Foundation in the US; the Netherlands Organisation for Scientific Research; and the Social Sciences and Humanities Research Council in Canada.

About the speaker

Dr. Roger Evans is a Reader in Computer Science in the School of Computing, Engineering and Mathematics and a research leader in the Natural Language Technology Group at the University of Brighton. His research explores applications of computer technology (knowledge representation, advanced algorithms, machine learning), particularly to problems that involve the use of natural (human) languages. He graduated from Warwick (Mathematics, 1st class) in 1980, holds a Certificate of Advanced Study in Mathematics from Cambridge (King's, 1981) and obtained a DPhil in Cognitive Studies at Sussex in 1987. He has 25 years of postdoctoral academic research experience, is a former SERC Advanced Fellow, a member of the EPSRC College, and a senior visiting research fellow at the University of Sussex.


