Building Modern Research Corpora: the Evolution of Web Archiving and Analytics

Annual conference of the International Internet Preservation Consortium (IIPC)

*  *  *

Published on Friday, November 15, 2013 by Elsa Zotian

Summary

This conference aims to propose a forum where researchers, librarians, archivists and other digital humanists will exchange ideas, requirements, methods and tools that can be used to collaboratively build and exploit web archive corpora and datasets.

Announcement

Each year the IIPC holds a day-long public conference in conjunction with its General Assembly. The 2014 conference will be held at the Bibliothèque nationale de France in Paris on 19th May 2014.

This year’s theme is Building Modern Research Corpora: the Evolution of Web Archiving and Analytics.

Argument

Libraries, archives and other heritage or scientific organizations have been systematically collecting web archives for over 15 years. Early stages of web archiving projects were mainly focused on tackling the challenges of harvesting web content, trying to capture an interlinked set of documents, and to rebuild its different layers through time. Institutions, especially those on a national level, were also defining their legal and institutional mandates. Meanwhile, approaches to web studies developed and influenced researchers’ and academics’ use of web archives. New requirements have emerged. While the objective of building generic collections remains valid, web archiving institutions and researchers also need to collaborate in order to build specific corpora – from the live web or from web archives.

At the same time, “surfing the web the way it was” is no longer the only way of accessing archived web content. Methods developed to analyse large datasets – such as data or link mining – are applicable to web archives. Web archive collections can thus be a component of major humanities and social sciences projects and infrastructures. With relevant protocols and tools for analysis, they will provide invaluable knowledge of modern societies.

This conference aims to propose a forum where researchers, librarians, archivists and other digital humanists will exchange ideas, requirements, methods and tools that can be used to collaboratively build and exploit web archive corpora and datasets. Contributions are sought that will present:

  • models of collaboration between archiving institutions and researchers,
  • methods and tools to perform data analytics on web archives,
  • examples of studies performed on web archives,
  • alternative ways of archiving web content.

Attendance is free, but registration is required.

Submission guidelines

Abstracts (no longer than one page) should be sent to Peter Stirling (peter.stirling@bnf.fr) by 1st December 2013.

Notification of acceptance will be sent on 6th January 2014.

Abstracts should be submitted in English but speakers may present in English or French. Simultaneous translation (French/English) will be offered to the audience.

Final presentations will be published on the IIPC website, but no conference proceedings will be published. Those submitting a presentation proposal who also wish to propose a paper for publication are encouraged to send their abstract in parallel to Alexandria, The Journal of National and International Library and Information Issues, for its special issue on web archiving (abstracts due Friday 13th December 2013; more information at www.manchesteruniversitypress.co.uk/journals/alx or from the editor, Monica Blake, at info@blakeinformation.com).

A limited number of half-day time slots are also available for workshops or training on specific web archiving tools, concepts, or issues. Past workshops have included a legal issues discussion, hands-on Hadoop training, and a crowdsourcing exercise.

Workshops will be held on Thursday 22nd and Friday 23rd May 2014.

Proposals for workshops (no longer than 2 pages) should be sent to Peter Stirling (peter.stirling@bnf.fr) by 1st December 2013.

Notification of acceptance will be sent on 6th January 2014.

Evaluation

All proposals will be reviewed by a program committee made up of web archiving practitioners and researchers in the IIPC. The committee members are:

  • Abbie Grotke, Library of Congress
  • Gildas Illien, Bibliothèque nationale de France
  • Rosalie Lack, California Digital Library
  • Hansueli Locher, Swiss National Library
  • Leïla Medjkoune, Internet Memory Foundation
  • Claude Mussou, Institut national de l’Audiovisuel
  • Clément Oury, Bibliothèque nationale de France
  • Mary Pitt, communication and program officer of the IIPC

The International Internet Preservation Consortium is a membership organization dedicated to improving the tools, standards, and best practices of web archiving while promoting international collaboration and the broad access and use of web archives for research and cultural heritage.

Places

  • Bibliothèque nationale de France - Quai François Mauriac
    Paris, France (75013)

Date(s)

  • Sunday, December 01, 2013

Keywords

  • internet archiving, corpus analysis, digital humanities, web studies, corpus analytics, archiving

Contact(s)

  • Peter Stirling
    email: peter [dot] stirling [at] bnf [dot] fr
  • Clément Oury
    email: clement [dot] oury [at] bnf [dot] fr

Information source

  • Peter Stirling
    email: peter [dot] stirling [at] bnf [dot] fr

To cite this announcement

« Building Modern Research Corpora: the Evolution of Web Archiving and Analytics », Call for papers, Calenda, Published on Friday, November 15, 2013, https://calenda.org/263892