Maud Ehrmann
EPFL CDH DHI DHLAB
INN 116 (Bâtiment INN)
Station 14
1015 Lausanne
+41 21 693 19 31
Website: https://dhlab.epfl.ch
Expertise
Highlights:
impresso. Media Monitoring of the Past. How can newspaper archives help understand the past? How to explore them? This large-scale, impact-driven project aims to enable critical mining of newspaper archives by integrating robust content mining and innovative data visualisation and exploration into a powerful user interface that can support digital scholarship.
The HIPE Evaluation Campaigns. What is the ability of machines to recognise and disambiguate entities (e.g. people, places, organisations) in multilingual historical documents? The series of HIPE shared tasks aims to assess and advance the development of robust, adaptable and transferable approaches to named entity processing in historical documents to foster efficient semantic indexing of digitised cultural heritage collections. See the HIPE-2020 and HIPE-2022 websites, the HIPE-eval GitHub organisation, the HIPE-2022 dataset, and the DHLAB web page.
Education
PhD in Computational Linguistics
2008, Paris 7 Diderot University, LaTTICE laboratory
Master in Computational Linguistics
2004, University of Lorraine, France
Master in General Linguistics
2003, University of Lorraine, France
Bachelor in History
2002, University of Lorraine, France
Bachelor in Comparative Literature
2001, University of Lorraine, France
Selected publications
Named Entity Recognition and Classification in Historical Documents: A Survey
Maud Ehrmann, Ahmed Hamdi, Elvys Linhares Pontes, Matteo Romanello, Antoine Doucet.
Published in ACM Computing Surveys (accepted)
Extended Overview of HIPE-2022: Named Entity Recognition and Linking in Multilingual Historical Documents
Maud Ehrmann, Matteo Romanello, Sven Najem-Meyer, Antoine Doucet, Simon Clematide.
Published in CLEF 2022 proceedings
Extended Overview of CLEF HIPE 2020: Named Entity Processing on Historical Newspapers
Maud Ehrmann, Matteo Romanello, Alex Flückiger, Simon Clematide.
Published in CLEF 2020 proceedings
Language Resources for Historical Newspapers: the Impresso Collection
Maud Ehrmann, Matteo Romanello, Simon Clematide, Phillip Benjamin Ströbel, Raphaël Barman.
Published in LREC 2020
Teaching & PhD
Courses
Historical Document and Media Processing
DH-400
This course introduces historical document processing, focusing on concepts and methods that enable the transformation of digitised materials into searchable information. Grounded in machine learning and document processing, it also covers data curation and copyright considerations.