Here is an overview of some of my current projects.

  • Digital representation of the Konstanzer Jahrgeschichten

Here you can find a collection of transcriptions from manuscripts that contain the so-called Konstanzer Jahrgeschichten.
The number of entries, their length, and their dating differ from manuscript to manuscript; with the transcriptions, these differences can now be compared. The corresponding XML-TEI files can be found here.

  • The Quest for the Holy Entity: Using LLMs for Entity Recognition in Medieval Texts

Using medieval pilgrimage reports as a sample, I am building a pipeline for the recognition, extraction, and verification of entities such as persons, places, and organisations. Alongside the technical work, a paper that discusses the challenges and potential of this approach and includes a content-related analysis is in progress.

  • LLM benchmarking for humanities data

As part of my activity at RISE, I am contributing to benchmarks that assess the performance of large language models (LLMs) on humanities-related tasks.

The Fraktur benchmark evaluates a segmentation and transcription task using digitised images of an early modern newspaper.

The medieval manuscript benchmark assesses the performance of LLMs in transcribing medieval handwriting.