Wednesday, 7 October 2009

Notes from Sun PASIG - Wednesday morning session

Intro from Art Pasquinelli:

  • Libraries and IT departments will move to the side in digital archiving, and content will be at the centre
  • There is a problem with 'Long Tail Data' - we don't know what we want to keep
  • The cost of power to keep the disks spinning is a hot issue
  • How do you transfer a Petabyte?
  • There is an increasing amount of discussion in companies about how to build 100-year archives
  • Pharmachem industries will push the linking of research data and papers.
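To put the petabyte question in perspective, here is a rough back-of-envelope sketch. It is not from the talk: the link speeds, the decimal definition of a petabyte (10^15 bytes) and the hypothetical `efficiency` parameter are all my own assumptions, and real transfers will be slower once protocol overhead and disk throughput are taken into account.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 10**15  # assumption: decimal petabyte, in bytes


def transfer_days(size_bytes: float, link_bits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move size_bytes over a link of the given raw speed.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    fraction of the raw link speed that is actually achieved.
    """
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


# Illustrative link speeds only; none of these figures come from the session.
for label, rate in [("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)]:
    print(f"{label}: {transfer_days(PETABYTE, rate):.1f} days")
```

Even with a fully saturated 1 Gbit/s link this comes out at roughly three months, which is presumably why the question was raised.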

Mike Keller

  • Federating large repositories is starting to happen
  • Lots of people are talking about storage in the cloud but are worried about security and service levels (or the lack of them)
  • Discovery is a huge problem and current metadata standards (MARC, METS) do not solve it
  • Audit is an issue that more people are becoming aware of - can we prove that the stuff is there and will continue to be there? Should we be publishing our audits?
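One common way to "prove that the stuff is there" is a fixity audit: record a checksum for every object at ingest, then periodically recompute and compare. The sketch below is my own illustration and not something shown in the session. The manifest layout (relative path mapped to a SHA-256 hex digest) and the function names `sha256_of` and `audit` are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large objects do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare each stored file against the checksum recorded at ingest.

    manifest maps a path relative to root onto its expected SHA-256 digest
    (a hypothetical layout). Returns a report mapping each path to
    "ok", "missing" or "corrupt".
    """
    report = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) != expected:
            report[relative_path] = "corrupt"
        else:
            report[relative_path] = "ok"
    return report
```

A report like this, kept with a timestamp for each run, is the kind of thing that could be published if we decided our audits should be public.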

Thorny Staples - Duraspace

  • Fedora scalability - Sun have just tested a Fedora instance with 150 million objects and found that both ingest and access performance stayed flat over that volume
  • Plan to make Fedora more modular and attract new developers to the Fedora community
  • Improving Fedora docs
  • Duracloud - adds integrity checking and other archiving services to the basic cloud storage.
  • They are talking to a number of vendors of cloud storage to put Duraspace on top of their clouds
  • Opportunity for the DLS to be a 'cloud' for Duraspace

Islandora

  • Very impressive automated workflow for book digitisation including conversion from TIFF to JP2K, OCR, TEI (significant terms, people, orgs etc) extraction and an editor for correcting OCR
  • Virtual Research Environment including auto DRM
  • iPhone app for data collection - uses user id to present the correct data collection interface to the user
  • FeSL provides better rights management than XACML in Fedora

Biodiversity Heritage Library

  • Part of the Natural History Museum
  • Similar Archiving Architecture model to DLS. Fedora for metadata access, storage abstracted through Duracloud
  • Building a large datacentre off J16 of the M6 on an old airfield
  • 500 year business plan
  • Keen to collaborate with other long-term organisations

EPrints

  • 10 years old
  • Implemented OAI-ORE and migrated an entire archive from EPrints to Fedora and vice versa
  • Impressive demonstration showed links to a citation service so you could see the citations of articles in the archive
  • Offer support to enterprises on a commercial model

iRODS

  • Policy-based federated data archiving
  • Very impressive, but I suspect that deciding what your policies are and designing the workflows to implement them would be hard (i.e. it might take years)
