Welcome to the MISS Project:

Mining Social Structures from Genealogical Data


The starting point of this research project is the large collection of historical documents maintained by the Brabant Historical Information Center (BHIC). A document can be anything from scans of birth and death certificates, memories of succession, or tax declarations to social photographs or family pictures. The documents in this collection are currently tagged by source and subject. Researchers can use keyword-based search on these tags to find documents relevant to their research (either a scan or a pointer to a physical location). The database, however, is far from flawless: many names are duplicated, have several alternative spellings, or contain mistakes. Furthermore, important semantic links such as the parent-child relation are only implicitly available, which makes even simple tasks, such as finding out whether two given persons are related, very labor-intensive.
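To illustrate why explicit parent-child links matter, here is a minimal sketch of how a relatedness check could look once those links have been extracted. It is not the project's method: the records, names, and the functions `build_graph` and `are_related` are invented for illustration, and "related" is assumed here to mean "connected through a chain of parent-child links" (marriages are ignored).

```python
from collections import deque

# Hypothetical, invented (parent, child) records; not taken from the BHIC data.
PARENT_CHILD = [
    ("Jan Janssen", "Piet Janssen"),
    ("Jan Janssen", "Maria Janssen"),
    ("Maria Janssen", "Kees de Vries"),
    ("Willem Bakker", "Anna Bakker"),
]


def build_graph(links):
    """Build an undirected adjacency map from (parent, child) pairs."""
    graph = {}
    for parent, child in links:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def are_related(graph, a, b):
    """Return True if a chain of parent-child links connects a and b."""
    if a not in graph or b not in graph:
        return False
    seen = {a}
    queue = deque([a])
    # Breadth-first search over the family graph, starting from person a.
    while queue:
        person = queue.popleft()
        if person == b:
            return True
        for neighbour in graph[person] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return False


graph = build_graph(PARENT_CHILD)
print(are_related(graph, "Piet Janssen", "Kees de Vries"))  # True
print(are_related(graph, "Piet Janssen", "Anna Bakker"))    # False
```

With explicit links the check is a simple graph search. Without them, the same question requires manually reading and cross-referencing individual certificates, and the sketch also assumes that each person already has a single, unambiguous name, which the current database does not guarantee.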

Project Overview

This project addresses the problem of how to derive the identities of persons and social structures from large sets of genealogical data, available as text and photographs, with incomplete information. To do so, we want to investigate and deploy a combination of techniques from data mining, machine learning, and human computation. The project goals are (a) a semantically enriched and cleaned version of the current BHIC database; (b) advanced search tools to support historical research; and (c) automatic tools to support large-scale prosopographical research.

Research Team

Gerhard Weiss, Maastricht University
Toon Calders, Eindhoven University of Technology
Karl Tuyls, Maastricht University
Hossein Rahmani, Maastricht University
Julia Efremova, Eindhoven University of Technology
Bijan Ranjbar-Sahraei, Maastricht University
Frans Oliehoek, Maastricht University



Maastricht University
Eindhoven University of Technology
The Netherlands Organisation for Scientific Research
Brabant Historical Information Center