DARIAH Code Sprint 2019

Calenda - Le calendrier des lettres et sciences humaines et sociales

*  *  *

Published on Monday, July 1, 2019

Summary

You are invited to join the DARIAH Code Sprint 2019! It is an opportunity to bring together interested developers and DH-affiliated people from the wider DARIAH community and beyond. We cordially invite you to spend three days in Berlin working on topics related to bibliographical metadata.

Announcement

Presentation

You are invited to join the DARIAH Code Sprint 2019! It is an opportunity to bring together interested developers and DH-affiliated people from the wider DARIAH community and beyond. We cordially invite you to spend three days in Berlin working on topics related to bibliographical metadata. Registration is now open at: https://desircodesprint.sciencesconf.org/registration

Although this is already our second DARIAH code sprint, it is not addressed exclusively to participants of the first one. Everyone is welcome! Some background in coding in the Digital Humanities, or in technological discussions more generally, would nevertheless be helpful.

We will have three tracks approaching the wider topic of bibliographical metadata from three angles: extraction of data from PDFs (GROBID), import and processing of data using BibSonomy, and visualisation of this data.

The code sprint will take place from September 24th to September 26th 2019 in Berlin at Forum Factory in a relaxed and productive environment.

The code sprint is organised by the DESIR project (DARIAH ERIC Sustainability Refined), an offspring of DARIAH-EU. DESIR aims to bring together DH-affiliated developers, to spread competencies in the community, and to help participants enhance their own knowledge and learn about new approaches and technologies. In doing so, DESIR addresses the sustainability question for several kinds of activities, infrastructures and services originating from the DARIAH context. Rather than developing new resources or infrastructural components, DESIR explores opportunities to employ existing resources (independent of DARIAH) as a means to sustain certain infrastructure components and services.

Track descriptions

Track A: Extraction of bibliographical data and citations from PDF applying GROBID

As a result of the first Code Sprint, organised in 2018 by the DESIR project, this track built a tool with the following functionality:

  1. Citation extraction from PDF files using GROBID.
  2. Visualisation of the extracted information directly on the PDF files, highlighting important information in scientific articles (e.g., authors, title, tables, figures, keywords).
  3. Inclusion of additional information from external services (e.g., affiliation disambiguation, named entity recognition).
  4. Integration of all extracted data on the PDF files in usable viewers.

Browsing the tool's URL gives users an idea of how the tool works. First, users upload a scientific article in PDF format; then they click the service buttons as needed to see highlighted results showing:

  • bibliographical extraction results
  • affiliation processing results
  • named-entity recognition results

For the second code sprint, our focus will be on adding features and capabilities to the demonstrator. For example, article authors found by the GROBID extraction process could be linked to a digital researcher identifier (e.g., an ORCID identifier). Track A invites participants to contribute creative ideas and to become part of our project.

Track B: Automatic Import of Bibliographic Data into BibSonomy

In this track we aim to extend the tool for automatic import of bibliographic metadata into BibSonomy. The first version of the tool was created at the DESIR workshop in 2018. Currently, users can upload a PDF file and have its metadata extracted automatically using GROBID. In a further step, users can correct the metadata and save it to BibSonomy. We want to extend the tool with further features:

  • Metadata extraction from text files
  • Individual user login for BibSonomy
  • Improved user interface
  • API

Feel free to come up with your own ideas for improvement. We look forward to discussing all ideas at the beginning of the code sprint.

Track C: Visualisation of time dependent graphs of relations

One of the major outcomes of Track C of the previous DESIR Code Sprint was the novel, generic concept of time-dependent graphs of relations and their visual presentation. Examples of such graphs include co-authorship and citation graphs, genealogy trees, and character interaction graphs. From a visual perspective, both the structure and the time characteristics of such graphs play a significant analytical role. Our web-based tool, developed throughout the DESIR project, can now visualise bibliographical datasets (e.g., imported via the BibSonomy API or loaded from a file) on top of the generic data model. In this Code Sprint we will focus on extending the tool towards new data formats and use cases, as well as new visual forms. Participants will have the opportunity to work on mapping different data to the generic model of our graphs and/or on translating data formats into an intermediate RDF description (subject-predicate-object). Bringing your own data is encouraged. New visual forms will involve modifying the web application's user interface to include additional visualisations of metadata or aggregated information. Experience in Java and/or JavaScript programming is recommended.
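To make the idea of mapping bibliographical records to a time-dependent graph more concrete, here is a minimal sketch in Python. It is not the Track C tool or its data model: the record layout, the `TimedTriple` type, the `coAuthorOf` predicate and both helper functions are assumptions made purely for illustration. It turns publication records into subject-predicate-object triples annotated with a year, then groups the resulting edges by year.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical time-stamped triple (not the tool's actual model):
# subject, predicate, object, plus the year in which the relation holds.
TimedTriple = namedtuple("TimedTriple", ["subject", "predicate", "object", "year"])


def coauthorship_triples(publications):
    """Turn publication records into time-stamped co-authorship triples."""
    triples = []
    for pub in publications:
        # De-duplicate and sort so each author pair appears once, in a stable order.
        authors = sorted(set(pub["authors"]))
        for a, b in combinations(authors, 2):
            triples.append(TimedTriple(a, "coAuthorOf", b, pub["year"]))
    return triples


def edges_by_year(triples):
    """Group edges by year so a viewer could step through the graph over time."""
    grouped = {}
    for t in triples:
        grouped.setdefault(t.year, []).append((t.subject, t.object))
    return dict(sorted(grouped.items()))


# Invented sample records, standing in for data loaded from a file or an API.
publications = [
    {"title": "Paper one", "year": 2017, "authors": ["Ada", "Ben", "Cleo"]},
    {"title": "Paper two", "year": 2018, "authors": ["Ada", "Cleo"]},
]

triples = coauthorship_triples(publications)
timeline = edges_by_year(triples)
```

In this sketch the time dimension is carried as a fourth field on each triple; a real RDF description would need its own way of attaching time to a statement, which is left open here.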

Program

Tuesday, September 24, 2019  

13:00 - 14:00 Welcome and Registration - The location will be announced soon on this website  

14:00 - 14:30 Welcome and Agenda Setting - Agenda Setting for the Code Sprint           

14:30 - 16:00 Opening - N.N.  

16:00 - 18:00 Workshop - Parallel Track A: Extraction of bibliographical data and citations from PDF applying GROBID                  

16:00 - 18:00 Workshop - Parallel Track B: Automatic Import of Bibliographic Data into BibSonomy

16:00 - 18:00 Workshop - Parallel Track C: Visualisation of time dependent graphs of relations

Wednesday, September 25, 2019   

08:30 - 09:00 Welcome - and Coffee  

09:00 - 18:00 Workshop - Parallel Track A: Extraction of bibliographical data and citations from PDF applying GROBID                  

09:00 - 18:00 Workshop - Parallel Track B: Automatic Import of Bibliographic Data into BibSonomy 

09:00 - 18:00 Workshop - Parallel Track C: Visualisation of time dependent graphs of relations

Thursday, September 26, 2019 

09:00 - 12:00 Workshop - Parallel Track A: Extraction of bibliographical data and citations from PDF applying GROBID                  

09:00 - 12:00 Workshop - Parallel Track B: Automatic Import of Bibliographic Data into BibSonomy

09:00 - 12:00 Workshop - Parallel Track C: Visualisation of time dependent graphs of relations

12:00 - 13:00 Wrap up of the Code Sprint - Talk

Location

  • Forum Factory Hector Space - Charlottenstraße 2
    Berlin, Germany (10969)

Dates

  • Tuesday, September 24, 2019
  • Wednesday, September 25, 2019
  • Thursday, September 26, 2019

Keywords

  • DARIAH, DESIR, Code Sprint, Bibliographical Metadata, Digital Humanities, GROBID, BibSonomy

Contact

  • Stefan Buddenbohm
    email: buddenbohm [at] sub [dot] uni-goettingen [dot] de

Information source

  • Barthauer Raisa
    email: barthauer [at] sub [dot] uni-goettingen [dot] de

License

CC0-1.0 This announcement is licensed under the terms of Creative Commons CC0 1.0 Universal.

How to cite

« DARIAH Code Sprint 2019 », Symposium, Calenda, Published on Monday, July 1, 2019, https://doi.org/10.58079/133d
