Part 1 – Modules
Understanding data models
Possibilities and limitations of RDF
Data profiling and cleaning
Decentralization and federation
Abby Wood, Smithsonian National Zoo
Red pandas are very cute animals, known to have a small home range.
Where do we stand as librarians? We have defined ourselves through
a substantial body of cataloging rules and formats, developed and fine-tuned
throughout the 20th century.
Librarians across the globe have been repeating the "MARC must die"
mantra for over a decade. In parallel, Linked Data has
been heralded as the silver bullet to embed library metadata as native first-class
citizens on the Web. However, when looking at
BIBFRAME, librarians often look like this anxious red panda…
The reaction is understandable. This fisher is holding up a Nile perch.
The fish was artificially introduced in
Lake Victoria and ended up eating many of the other fish species, wrecking the ecosystem.
Is Linked Data colonizing and corrupting our carefully curated library metadata?
The open-world assumption underlying Linked Data and its radically decentralized
architecture go very much against traditional views on library metadata.
However, can library metadata and Linked Data join hands in a happy marriage,
despite the apparent differences? These ten modules will help you
as a librarian develop a more nuanced perspective
on Linked Data by mastering some of the key methods and tools yourself!
Libraries and Linked Data…
…share the same agenda
Making information available in an open and structured manner
…have different world views
Libraries create boutique metadata: carefully
curated sets of cataloging records and controlled vocabularies
Linked Data is based on the
open-world assumption and embraces
the opportunities offered by big data
Integrating the different paradigms is challenging!
Goal of these modules
Librarians, beware of IT fashion!
Technologies come and go at a fast pace;
librarians should be able to look beyond the hype and understand what
fundamentally changes their profession.
Clearly identify the apples and the bananas.
Linked Data offers low-hanging fruit, bringing immediate
gains at a relatively small cost, but may also require heavy up-front
investments in your library’s information architecture.
Seeing beyond the hype
Gartner introduced the hype cycle,
which demonstrates how technologies get over-hyped when they are introduced. It sometimes takes
5-10 years before the real added value of the technology becomes apparent and leads to
operational advantages in the field. This Google Trends visualization helps
to understand how the popularity of terms such as the Semantic Web or Linked Data
peaks and declines again.
Why do we actually need a smarter Web?
Looking for information on Picasso?
Type Picasso into a search engine!
Looking for paintings created by artists who were influenced by Picasso?
Both humans and machines need access to structured and
semantically meaningful data on the Web.
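Why does the second question need structured data? A keyword search can match the word "Picasso", but the second question chains two relationships: who was influenced by Picasso, and what did those artists paint. The following is a minimal Python sketch of that idea, assuming facts are stored as simple (subject, predicate, object) statements. The artist and painting names, the predicate names and the helper functions are all hypothetical placeholders, not real catalog data or an existing library.

```python
# Toy data: every fact is a (subject, predicate, object) statement.
# All names below are illustrative placeholders, not real catalog data.
FACTS = [
    ("Artist A", "influenced_by", "Picasso"),
    ("Artist B", "influenced_by", "Picasso"),
    ("Artist C", "influenced_by", "Monet"),
    ("Painting 1", "created_by", "Artist A"),
    ("Painting 2", "created_by", "Artist B"),
    ("Painting 3", "created_by", "Artist C"),
]


def subjects_of(facts, predicate, obj):
    """Return every subject that has the given predicate and object."""
    return {s for (s, p, o) in facts if p == predicate and o == obj}


def paintings_by_artists_influenced_by(facts, name):
    """Two-step lookup: find the influenced artists, then their paintings."""
    artists = subjects_of(facts, "influenced_by", name)
    paintings = set()
    for artist in artists:
        paintings |= subjects_of(facts, "created_by", artist)
    return sorted(paintings)


result = paintings_by_artists_influenced_by(FACTS, "Picasso")
print(result)  # ['Painting 1', 'Painting 2']
```

A machine can only answer the question because each relationship is stated explicitly; a page of free text mentioning the same names would not allow this lookup.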
The Semantic Web is no fantasy; it is already a part of our daily Web
experience. Analyze, for example, the search results presented by Google
for the query “Drexel”.
To the right of the traditional search results, you can see structured and semantically
meaningful information. Notice here how we can
identify the basic structure of the RDF data model:
subjects (e.g. Drexel), predicates (e.g. acceptance rate)
and objects (e.g. 75%). This demonstrates how Linked Data is already
very much a part of our day-to-day information consumption.
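The subject, predicate and object of the Drexel example can be written down as a single statement. Below is a small Python sketch, assuming made-up URIs under example.org (only "Drexel", "acceptance rate" and "75%" come from the example above). The Triple class and the to_ntriples helper are illustrative names; the output follows the general shape of the N-Triples syntax, with simplified escaping.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                 # URI of the thing being described
    predicate: str               # URI of the property
    obj: str                     # a URI or a plain literal value
    obj_is_literal: bool = True  # literals are quoted, URIs are bracketed


# Hypothetical URIs: example.org is a placeholder namespace.
drexel_fact = Triple(
    subject="http://example.org/university/Drexel",
    predicate="http://example.org/property/acceptanceRate",
    obj="75%",
)


def to_ntriples(triple: Triple) -> str:
    """Write one triple as an N-Triples style line (simplified escaping)."""
    if triple.obj_is_literal:
        escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


line = to_ntriples(drexel_fact)
print(line)
```

Using URIs rather than bare strings for the subject and predicate is what lets statements from different sources refer to the same thing.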
As much as search engines already manage to embed Linked Data, libraries
are still struggling to come to terms with it. Despite the initial enthusiasm for the
Semantic Web and Linked Data from a research perspective, it is often hard to find practical examples that work.
This video was published on YouTube in 2010, but it is still relevant.
Despite promises from academics and consultants, we cannot say that the Semantic
Web or Linked Data have been a big success. One of the key purposes of these modules
is to help you develop a more pragmatic view of what to expect, and what not to expect, from Linked Data
from the perspective of a librarian.
Even if the road to integrate library catalogs within a
is bumpy and uncertain, it is not an option to just stick with our former practices and tools…
MARC has been an invaluable tool, but its card-based record-centric focus
has also effectively locked our data in a tower and only those close aligned
with the library profession have access to the key.
F. Tim Knight, 2011
Progress with Linked Data in libraries
Important work on both the cataloging rules and the format has taken place
over the last few years:
Resource Description and Access (RDA)
New cataloging rules replacing AACR2
BIBFRAME
New container format to replace MARC,
which allows embedding Linked Data within a cataloging record
BIBFRAME 2.0 was recently introduced and now features three levels of
abstraction, referred to as core classes: Work, Instance and Item.
However, it is still very much a work in progress, and the Library of Congress (LoC)
is working hard on providing more information and help on how to move forward
with the conversion from MARC to BIBFRAME.
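To make the three levels of abstraction concrete, here is a minimal Python sketch, assuming a simplified one-directional model in which an Item points to its Instance and an Instance points to its Work. The class fields, the describe helper and the sample values are illustrative assumptions; the actual BIBFRAME vocabulary defines many more classes and properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """Conceptual content of a resource, independent of any edition."""
    title: str


@dataclass(frozen=True)
class Instance:
    """One material embodiment of a Work, such as a published edition."""
    edition: str
    instance_of: Work


@dataclass(frozen=True)
class Item:
    """One actual copy of an Instance, such as a volume on a shelf."""
    call_number: str
    item_of: Instance


def describe(item: Item) -> str:
    """Walk from a physical copy up to the Work it ultimately represents."""
    instance = item.item_of
    work = instance.instance_of
    return f"{item.call_number}: {work.title} ({instance.edition})"


# Placeholder values, not taken from any real catalog.
work = Work(title="Example Title")
paperback = Instance(edition="Paperback edition", instance_of=work)
copy_1 = Item(call_number="SHELF-001", item_of=paperback)
copy_2 = Item(call_number="SHELF-002", item_of=paperback)

print(describe(copy_1))  # SHELF-001: Example Title (Paperback edition)
```

Both copies share one Instance and one Work, so a correction to the title is made once instead of once per record.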
Will Linked Data help you find a job?
Yes, if you acquire new skills!
The transition process underlines the importance of cleaning up
inconsistencies in legacy metadata
Reconciliation and enriching
Creating automated links with knowledge bases
Issuing identifiers
BIBFRAME requires the creation, for example, of IDs for Works,
which libraries often do not have yet
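Two of these skills, cleaning up inconsistencies and issuing identifiers, can be sketched in a few lines of Python. The example below groups spelling variants of a name by a normalized key (in the spirit of the key-collision clustering offered by data cleaning tools) and then derives one identifier per group. The BASE_URI namespace, the function names and the sample values are assumptions made for illustration, and a real workflow would need human review because this kind of key can also merge values that are genuinely different.

```python
import re
import unicodedata
import uuid
from collections import defaultdict

BASE_URI = "https://example.org/id/"  # hypothetical namespace


def fingerprint(value: str) -> str:
    """Normalize a string so that trivial variants produce the same key."""
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")  # drop diacritics
    text = re.sub(r"[^\w\s]", " ", text.lower())           # drop punctuation
    tokens = sorted(set(text.split()))                     # ignore word order
    return " ".join(tokens)


def cluster(values):
    """Group the original values by their fingerprint key."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return dict(groups)


def mint_uri(key: str) -> str:
    """Derive a stable identifier from a key (same key, same URI)."""
    return BASE_URI + str(uuid.uuid5(uuid.NAMESPACE_URL, BASE_URI + key))


# Illustrative name variants as they might appear in legacy records.
names = ["Picasso, Pablo", "Pablo Picasso", "PICASSO, PABLO.", "Braque, Georges"]
clusters = cluster(names)

for key, variants in clusters.items():
    print(mint_uri(key), variants)
```

Deriving the identifier from the key means that rerunning the script yields the same URIs, which matters once other datasets start linking to them.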
Structure of the learning materials:
two parts with five modules each.
Part 1 – Introduction and context
Overview of data models, limits and possibilities of RDF,
and a specific focus on data quality and cleaning
to underline some of the challenges of implementing Linked Data
in libraries
Part 2 – Advanced topics and future
Metadata reconciliation and enriching, overview of
architectural aspects of Linked Data with REST and the importance of
decentralisation and federation
Structure of each module
Overview and context of each module
Interactive HTML slides detailing the course content,
containing links to other relevant information sources
Self-assessment exercises
Each module contains self-assessment exercises, allowing you
to check your learning outcomes
Get in touch and reuse our materials!
Comments or questions regarding the content?
On each slide, you can tweet to us
and engage with other participants!
Reuse the content
The content of these modules can be freely downloaded from GitHub.
We would be happy to hear how you reuse the content in your
own classes or workshops.