Demystifying Digitisation – A Hands-On Master Class in Text Digitisation
Antwerp Summer Academy in Digital Humanities organized by Digital Humanities Flanders (DHuF) and DARIAH-BE
Published on Monday, September 12, 2016
Abstract
The core of our programme consists of two half-day workshops on software packages that may help the researcher automate some aspects of the transcription process. The first will deal with ABBYY, still one of the best software packages around for OCRing digitised print materials. Focusing on the software’s possible advantages and pitfalls, this workshop will show the participants how to prepare their documents in order to achieve the best OCR (optical character recognition) results. The second workshop will introduce Transkribus, a software package that has recently made great advances in optically recognising characters in handwritten materials.
Announcement
Scope
This two-day workshop will take place from 29 to 30 September 2016 at the University of Antwerp, Belgium, preceding the DiXiT + ESTS 2016 conference hosted at the same location. It offers the perfect opportunity for the conference’s participants and other interested scholars to become better acquainted with some of the main concerns that need to be addressed at the outset of both mass- and ad hoc digitisation projects.
The core of our programme consists of two half-day workshops on software packages that may help the researcher automate some aspects of the transcription process. The first will deal with ABBYY, still one of the best software packages around for OCRing digitised print materials. Focusing on the software’s possible advantages and pitfalls, this workshop will show the participants how to prepare their documents in order to achieve the best OCR results.
The second workshop will introduce Transkribus, a software package that has recently made great advances in optically recognising characters in handwritten materials.
The programme will be completed by four (interactive) sessions on related topics, organised around these workshops.
Colleagues from the Ghent University Library will share their experience of taking part in the Google Books mass digitisation project to digitise their out-of-copyright books; Wout Dillen and Vincent Neyt (UAntwerpen) will introduce the Manuscript Desk, a Virtual Research Environment for transcribing textual documents into TEI-compliant XML, funded in the context of DARIAH-BE; Trudi Noordermeer (Antwerp University Library) will focus on issues related to developing a useful digitisation workflow; and Walter Scholger (Centre for Information Modelling - Austrian Centre for Digital Humanities, University of Graz) will tell us what we are allowed to do with our digitised documents by focusing on copyright issues and Intellectual Property Rights (IPR).
The entire programme is free of charge, but registration is required. The workshops are limited to 20 participants each, while the lectures are open to the wider public. Participants are asked to bring their own corpus to the workshops, consisting of scans of both printed and handwritten materials. The workshops do not require any prerequisite skills, but a basic knowledge of XML is considered a strong advantage. Places will be distributed on a first-come, first-served basis, but we will keep a number of places reserved for members of DHu.F. and DARIAH-BE. All sessions will be held in English. Since places are limited, early registration is highly recommended.
Applications
To participate in the workshop, please fill in the application form by Wednesday 21 September: https://docs.google.com/forms/d/e/1FAIpQLSf6hVasYsCxawi_U08wWOzuhVCeF_vNEWIFtS7UOITMhg7Pig/viewform
If you are admitted to the workshop, you will be notified by email.
Costs
The entire program will be free of charge.
Location
Universiteit Antwerpen / City Campus
Grote Kauwenberg 18, Building E, Room S.E. 201
2000 Antwerpen, BELGIUM
Organizing committee
- Sally Chambers, Digital Humanities Research Coordinator at Ghent University
- Wout Dillen, doctoral student, affiliated to the Centre for Manuscript Genetics
- Mike Kestemont, assistant professor in the department of literature at the University of Antwerp
- Trudi Noordermeer, Library Department at the University of Antwerp
- Dirk van Hulle, Professor of English Literature at the University of Antwerp
Schedule
Thursday 29 September
10:00 - 11:00
- Hendrik Defoort / Dries Moreels: Digitising books with Google: the Ghent University library experience
11:00 - 17:00
- Jesse de Does and Katrien Depuydt, Instituut voor Nederlandse Lexicologie (INL): ABBYY Workshop
17:00 - 18:00
- Wout Dillen and Vincent Neyt: Introducing the Manuscript Desk
Friday 30 September
09:00 - 11:00
- Trudi Noordermeer on designing digitization workflows
11:00 - 15:00
- Sebastian Colutto: Transkribus Workshop
15:00 - 17:00
- Walter Scholger on Copyright Issues and IPR
Subjects
- Epistemology and methodology (Main category)
- Mind and language > Epistemology and methodology > Digital humanities
Places
- Building E, Room S.E. 201 - Universiteit Antwerpen, City Campus, Grote Kauwenberg 18
Antwerp, Belgium (2000)
Date(s)
- Thursday, September 29, 2016
- Friday, September 30, 2016
Keywords
- dariah-be, dariah-eu, DHu.F, digitization, ocr, ocring, ABBYY, Transkribus
Contact(s)
- Sally Chambers
email: Sally [dot] Chambers [at] UGent [dot] be
Information source
- Mike Kestemont
email: mike [dot] kestemont [at] uantwerpen [dot] be
License
This announcement is licensed under the terms of Creative Commons CC0 1.0 Universal.
To cite this announcement
« Demystifying Digitisation – A Hands-On Master Class in Text Digitisation », Miscellaneous information, Calenda, Published on Monday, September 12, 2016, https://doi.org/10.58079/vpp