The Semantic Web:
A Guided Tour

University of Wisconsin - Platteville

Web 2008



Presentation and documentation are online @

http://www.uwplatt.edu/web/presentations

Tim Berners-Lee on the Semantic Web

Tim Berners-Lee

What is the Semantic Web?

What do you think?

"Web of Data"

The Semantic Web is a web of data. There is lots of data we all use every day, and it's not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar?

Why not? Because we don't have a web of data. Because data is controlled by applications, and each application keeps it to itself.

rdf:sw="The Semantic Web"

The vision of the Semantic Web is to extend principles of the Web from documents to data. This extension will allow data to be shared effectively by wider communities, and to be processed automatically by tools as well as manually.

sw:benefits

It allows data to be surfaced in the form of real data, so that a program doesn’t have to strip the formatting and pictures and ads off a Web page and guess where the data on it is.

It allows people to write (or generate) files which explain—to a machine—the relationship between different sets of data.

sw:example

For example, a “semantic link” between a database with a “zip-code” column and a form with a “zip” field can state that the two actually mean the same thing: they are the same abstract concept. This allows machines to follow links and hence automatically integrate data from many different sources.
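A minimal Python sketch of the idea, using only the standard library. The field names "zip-code" and "zip" come from the example above; the concept URI, the sample records, and the `normalize` helper are hypothetical and exist only for illustration.

```python
# Hypothetical "semantic links": each local field name points at the
# shared concept it denotes. The concept URI is an illustrative assumption.
POSTAL_CODE = "http://example.org/terms/postalCode"

SEMANTIC_LINKS = {
    "zip-code": POSTAL_CODE,  # column name in the database
    "zip": POSTAL_CODE,       # field name in the form
}


def normalize(record):
    """Re-key a record by shared concept wherever a semantic link is known."""
    return {SEMANTIC_LINKS.get(key, key): value for key, value in record.items()}


# Made-up sample data from the two sources.
database_row = {"name": "Ada", "zip-code": "53818"}
form_submission = {"zip": "53818", "email": "ada@example.org"}

db = normalize(database_row)
form = normalize(form_submission)

# Both records now expose the same key, so a program can join on it
# without guessing that "zip-code" and "zip" mean the same thing.
same_place = db[POSTAL_CODE] == form[POSTAL_CODE]
print(same_place)
```

The link lives in data (`SEMANTIC_LINKS`), not in the program logic, so adding a third source only means adding one more entry.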

sw:construction

To achieve this, we must define and describe the relations among data (i.e., resources) on the Web. This is not unlike the usage of hyperlinks on the current Web that connect the current page with another one: the hyperlink defines a relationship between the current page and the target.

sw:relationships

Relationships can be established between any two resources; there is no notion of a “current” page. The relationship (i.e., the link) itself is named, whereas the links used by a human on the (traditional) Web are not, and their role is deduced by the human reader.

The definition of those relations allows for a better and automatic interchange of data. RDF, which is one of the fundamental building blocks of the Semantic Web, gives a formal definition for that interchange.

sw:tools

  • RDF - defines the relationship of data
  • SPARQL - query relationships
  • OWL - defines logical relationships
  • GRDDL - interchange with data from other sources
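To make the split between "defining" and "querying" relationships concrete, here is a rough Python sketch. RDF statements are held as plain tuples, and a tiny pattern matcher stands in for a single SPARQL triple pattern. The `match` helper and the sample data are assumptions for illustration; a real SPARQL engine does far more.

```python
# Made-up statements, each a (subject, predicate, object) tuple.
triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]


def match(pattern, data):
    """Yield every triple that fits the pattern; None acts as a wildcard.

    This loosely mirrors one SPARQL triple pattern such as
    `?s foaf:name ?o`. It is a hypothetical helper, not a real API.
    """
    for triple in data:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


# "Who has a name, and what is it?"
names = sorted(match((None, "foaf:name", None), triples))
print(names)
```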

sw:usage

Will I "see" the semantic web?

Not necessarily, at least not directly. The Semantic Web technologies may act behind the scenes, resulting in a better user experience, rather than directly influencing the “look” on the browser.

sw:extension

The Semantic Web is an extension of the current Web and not its replacement. Islands of RDF and possibly related ontologies can be developed incrementally. Major application areas (like Health Care and Life Sciences) may choose to “locally” adopt Semantic Web technologies, and this can then spread over the Web in general. In other words, one should not think in terms of “rebuilding” the Web.

sw:meta

The meta and link elements in HTML can be used to add metadata to an HTML page.

This is equivalent to the process of defining RDF relationships for that page as a “source”. Note, however, that these elements can be used to define relationships for the enclosing HTML file only, whereas the Semantic Web allows the definition of relationships on any resource on the Web.
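As a sketch of that equivalence, the Python below reads the meta and link elements of a page and turns each into a statement about the page itself. It uses the standard library's `html.parser`; the `MetaToTriples` class, the sample page, and its values are made up for illustration.

```python
from html.parser import HTMLParser


class MetaToTriples(HTMLParser):
    """Collect <meta name content> and <link rel href> as triples about one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and "name" in attributes and "content" in attributes:
            self.triples.append((self.page_url, attributes["name"], attributes["content"]))
        elif tag == "link" and "rel" in attributes and "href" in attributes:
            self.triples.append((self.page_url, attributes["rel"], attributes["href"]))


# A made-up page; the URL and values are illustrative only.
html = """
<html><head>
  <meta name="author" content="John Smith">
  <link rel="license" href="http://example.org/license">
</head><body>Hello</body></html>
"""

parser = MetaToTriples("http://www.example.org/index.html")
parser.feed(html)
for triple in parser.triples:
    print(triple)
```

Note that the subject is always the enclosing page, which is exactly the limitation described above.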

sw:folksonomy

Tagging has emerged as a popular method of categorizing content. Users are allowed to attach arbitrary strings to their data items (for example, blog entries and photographs). While tagging is easy and somewhat useful, it often destroys a lot of the semantics of the data.

A folksonomy tag is typically 2/3 of an RDF triple. The subject is known: e.g., the URL for the flickr image being tagged, or the URL being bookmarked in delicious. The object is known: e.g., http://flickr.com/photos/tags/cats or http://del.icio.us/tag/cats. But the predicate to connect them is often missing.
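The "2/3 of a triple" point can be sketched in a few lines of Python. The photo URL is a made-up placeholder, and `foaf:depicts` is just one plausible predicate chosen for illustration.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    subject: str
    predicate: Optional[str]  # None models the missing part of the triple
    obj: str


# A tag as a folksonomy records it: subject and object, but no predicate.
tag = Statement(
    subject="http://flickr.com/photos/someone/123",
    predicate=None,
    obj="http://flickr.com/photos/tags/cats",
)


def is_complete(statement):
    """A statement is a full triple only when all three parts are present."""
    return all(part is not None for part in statement)


# Supplying a predicate turns the tag into a full triple. "depicts" is
# one reading; "taken by" or "reminds me of" would be others, and the
# bare tag cannot tell them apart.
triple = tag._replace(predicate="http://xmlns.com/foaf/0.1/depicts")

print(is_complete(tag), is_complete(triple))
```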

sw:web dc:version ex:2.0

The development of active client-side applications also means that these applications use all kinds of data: data that is on the Web somewhere, or data that is embedded in the page though not necessarily visible on the screen.

In many cases, using RDF-based techniques makes the mashing up process easier, mainly when data collected by one application is reused by another one somewhere down the line. The general nature of RDF makes this “mashup chaining” straightforward, which is not always the case for simpler Web 2.0 applications.
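A small Python sketch of "mashup chaining", assuming each application exposes its data as a set of (subject, predicate, object) tuples. All identifiers and property names below are invented for illustration.

```python
# Data published by two independent applications (made-up values).
photo_app = {
    ("ex:photo42", "ex:takenOn", "2008-05-01"),
    ("ex:photo42", "ex:takenBy", "ex:alice"),
}
calendar_app = {
    ("ex:meeting7", "ex:date", "2008-05-01"),
    ("ex:meeting7", "ex:attendee", "ex:alice"),
}

# Merging two sets of statements is, to a first approximation, a set union:
# no schema negotiation is needed before the data can sit side by side.
mashup = photo_app | calendar_app

# A second application can reuse the merged data further down the chain,
# here to find photos taken on the day of a meeting.
meeting_dates = {o for s, p, o in mashup if p == "ex:date"}
photos_on_meeting_days = sorted(
    s for s, p, o in mashup if p == "ex:takenOn" and o in meeting_dates
)
print(photos_on_meeting_days)
```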

sw:microformats

Microformats are usually relatively small and simple sets of terms agreed upon by a community. Data models developed within the framework of the Semantic Web have the potential to be more expressive, rigorous, and formal (and are usually larger). Both can be used to express structured data within web pages. In some cases, microformats are appropriate because the extra features provided by Semantic Web technologies are not necessary. Other cases requiring more rigor will not be able to use microformats.

sw:microformat dc:contains ex:concerns

Each microformat addresses a specific problem area. One has to develop a program well-adapted to a particular microformat, to the way it uses, say, the class and property="dc:date" content attributes. It also becomes difficult (though possible) to combine different microformats. In contrast, RDF can represent any information, including that extracted from microformats present on the page. This is where microformats can benefit from RDF: the generality of the Semantic Web tools makes it easier to reuse existing tools (e.g., a query language), and combining statements from different origins belongs to the very essence of the Semantic Web.

sw:microformats dc:isReferencedBy "presentations"

Intro to the Semantic Web

RDF: Triples

SUBJECT has a PREDICATE whose value is OBJECT.

RDF: example

http://www.example.org/index.html has a creator whose value is John Smith

could be represented by an RDF statement having:

  • a subject http://www.example.org/index.html
  • a predicate http://purl.org/dc/elements/1.1/creator
  • and an object http://www.example.org/staffid/85740
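The same statement, sketched in Python. The three URIs are the ones listed above; the `Triple` class and its `to_ntriples` method are illustrative helpers, and the output follows the N-Triples convention of writing each URI in angle brackets and ending the line with a period.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self):
        """Serialize as one N-Triples line, treating all three parts as URIs."""
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


statement = Triple(
    subject="http://www.example.org/index.html",
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="http://www.example.org/staffid/85740",
)

print(statement.to_ntriples())
```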

Predicate: Dublin Core Metadata

The Dublin Core Metadata Initiative is an open organization engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include work on architecture and modeling, discussions and collaborative work in DCMI Communities and DCMI Task Groups, annual conferences and workshops, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.

Dublin Core Elements

Popular Schemas

RSS 1.0

  • Describes lightweight syndication channels
  • Properties: title, link, description
  • Classes: channel, item, image
  • http://purl.org/rss/1.0/

FOAF

  • Describes people and their social networks
  • Properties: name, homepage, knows, weblog, interest
  • Classes: Person, Document, Project, Group
  • http://xmlns.com/foaf/0.1/

FRBR

  • Describes bibliographic records
  • Properties: creator, part, embodiment, successor, subject
  • Classes: Work, Expression, Manifestation, Item
  • http://purl.org/vocab/frbr/core

Creative Commons

SKOS

Geo

RDF Means

  • Your data doesn't need to contain explicit types everywhere
  • Applications can look them up
  • "Duck Typing"
  • If it walks like a duck and quacks like a duck...
  • Classes are determined based on the properties of the thing
  • e.g. Authors are all things that have written something
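A rough Python sketch of this "duck typing": classes are derived from the properties a thing is used with, in the spirit of rdfs:domain. The property names, class names, and sample data are all hypothetical.

```python
# Hypothetical schema: each property names the class its subject belongs to
# (this plays the role of rdfs:domain).
DOMAIN = {
    "ex:wrote": "ex:Author",
    "ex:quacks": "ex:Duck",
}

# Made-up data with no explicit type statements at all.
triples = [
    ("ex:milne", "ex:wrote", "ex:winnie_the_pooh"),
    ("ex:donald", "ex:quacks", "loudly"),
]


def infer_types(data, domain):
    """Derive (subject, 'rdf:type', class) statements from property usage."""
    return {
        (subject, "rdf:type", domain[predicate])
        for subject, predicate, _ in data
        if predicate in domain
    }


inferred = infer_types(triples, DOMAIN)
for statement in sorted(inferred):
    print(statement)
```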

What is OWL?

OWL is a Web Ontology language. OWL provides a language which uses the linking provided by RDF to add the following capabilities to ontologies:

  • Ability to be distributed across many systems
  • Scalable to Web needs
  • Compatible with Web standards for accessibility and internationalization.
  • Open and extensible

What can Web Ontologies be used for?

  • Web Portals
    • Categorization rules used to enhance search
  • Multimedia Collections
    • Content-based searches for non-text media
  • Corporate Web Site Management
    • Automated Taxonomical Organization of data and documents
    • Mapping Between Corporate Sectors (mergers!)

What can Web Ontologies be used for?

  • Design Documentation
    • Explication of "derived" assemblies (e.g. the wing span of an aircraft)
    • Explicit Management of Constraints
  • Intelligent Agents
    • Expressing User Preferences and/or Interests
    • Content Mapping between Web sites
  • Web Services and Ubiquitous Computing
    • Web Service Discovery and Composition
    • Rights Management and Access Control

Are there OWL ontologies available already?

There are a large number of ontologies available on the Web in OWL. The DAML ontology library contains about 250 examples written in OWL or DAML+OIL (a converter from DAML+OIL to OWL is available on the web). In addition, several large ontologies have been released in OWL. These include a cancer ontology in OWL developed by the US National Cancer Institute's Center for Bioinformatics, which contains about 17,000 cancer related terms and their definitions, and an OWL version of the well-known GALEN medical ontology, developed at the University of Manchester.

For the semantic web to function...

..., computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning. Artificial-intelligence researchers have studied such systems since long before the Web was developed. Knowledge representation, as this technology is often called, is currently in a state comparable to that of hypertext before the advent of the Web: it is clearly a good idea, and some very nice demonstrations exist, but it has not yet changed the world. It contains the seeds of important applications, but to realize its full potential it must be linked into a single global system.

What does the acronym "OWL" stand for?

Actually, OWL is not a real acronym. The language started out as the "Web Ontology Language" but the Working Group disliked the acronym "WOL." We decided to call it OWL. The Working Group became more comfortable with this decision when one of the members pointed out the following justification for this decision from the noted ontologist A.A. Milne who, in his influential book "Winnie the Pooh" stated of the wise character OWL:

"He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn't Wednesday..."

A Riddle

  • Two sons and two fathers went to a pizza restaurant. They ordered three pizzas. When they arrived, everyone had a whole pizza. How can that be?
  • In OWL those individuals are not assumed to be distinct - so this riddle is easily satisfied
  • OWL has no Unique Name Assumption - unlike most humans
  • So what's the answer?
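One reading that satisfies the riddle, sketched in Python: four role descriptions that denote only three individuals. The `father_of` mapping is a made-up family used to show how counting changes once names are not assumed to be distinct.

```python
# Hypothetical family: the keys are people, the values are their fathers.
father_of = {
    "son": "father",
    "father": "grandfather",
}

# The role descriptions in the riddle.
sons = set(father_of.keys())       # everyone who has a father here
fathers = set(father_of.values())  # everyone who is a father here

# Under a unique name assumption, four descriptions mean four diners.
descriptions = len(sons) + len(fathers)

# Without it, descriptions that denote the same individual collapse.
diners = sons | fathers

print(descriptions, len(diners))
```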

Simple Inferencing

  • OWL enables simple inferencing
  • Drawing conclusions based on the restrictions and descriptions

OWL Inferencing 1

First of all, a robbery takes place. The robber drops his gun while fleeing. A report is filed by the investigating officers:

<RobberyEvent>
  <date>14th June 2005</date>
  <description>Armed robbery at Kwik-e-Mart</description>
  <evidence>
    <Gun>
      <serial>983GTE-H5TF</serial>
    </Gun>
  </evidence>
  <robber>
    <Person /> <!-- an unknown person -->
  </robber>
</RobberyEvent>

OWL Inferencing 2

<SpeedingOffence>
  <date>26 October 2005</date>
  <description>Car observed driving at speed along M42</description>
  <speeder>
    <Person>
      <name>John Doe</name>
      <driversLicenseNumber>7431224667</driversLicenseNumber>
    </Person>
  </speeder>
</SpeedingOffence>

OWL Inferencing 3

At police HQ, the computer analyses each report as it is filed. The following OWL description tells the computer that a driversLicenseNumber is unique to a Person:

<owl:InverseFunctionalProperty rdf:ID="driversLicenseNumber">
  <rdfs:domain rdf:resource="Person" />
  <rdfs:range rdf:resource="&rdfs;Literal" />
</owl:InverseFunctionalProperty>
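What a reasoner does with an inverse functional property can be approximated in Python: records that share a value for the property are taken to describe the same individual. This is a simplified sketch, not an OWL reasoner. The record layout, the "_:" identifiers, and the extra "Jane Roe" record are assumptions added for illustration.

```python
from collections import defaultdict

# Made-up records, one per report. Each report mints its own anonymous
# identifier for the person it mentions.
people = [
    {"id": "_:speeder", "name": "John Doe", "driversLicenseNumber": "7431224667"},
    {"id": "_:holder", "name": "John Doe", "driversLicenseNumber": "7431224667"},
    {"id": "_:witness", "name": "Jane Roe", "driversLicenseNumber": "1111111111"},
]


def same_individuals(records, inverse_functional_property):
    """Group record ids that share a value for an inverse functional property.

    If a property value identifies at most one individual, then two records
    carrying the same value must describe the same individual.
    """
    by_value = defaultdict(set)
    for record in records:
        value = record.get(inverse_functional_property)
        if value is not None:
            by_value[value].add(record["id"])
    return [ids for ids in by_value.values() if len(ids) > 1]


merged = same_individuals(people, "driversLicenseNumber")
print(merged)
```

Matching on the name alone would not be safe; it is the declared uniqueness of the license number that justifies the merge.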

OWL Inferencing 4

The computer uses this information to look up any other records it has about that person and finds a gun license:

<GunLicense>
  <registeredGun>
    <Gun>
      <serial>983GTE-H5TF</serial>
    </Gun>
  </registeredGun>
  <holder>
    <Person>
      <name>John Doe</name>
      <driversLicenseNumber>7431224667</driversLicenseNumber>
    </Person>
  </holder>
</GunLicense>

OWL Inferencing 5

The next OWL description tells the computer that the registeredGun property uniquely identifies a GunLicense, i.e., each gun is associated with only a single GunLicense.

<owl:InverseFunctionalProperty rdf:ID="registeredGun">
  <rdfs:domain rdf:resource="GunLicense" />
  <rdfs:range rdf:resource="Gun" />
</owl:InverseFunctionalProperty>

OWL Inferencing 6

The computer now knows that the person stopped for speeding owns a gun. The next description tells the computer that each gun is uniquely identified by its serial.

<owl:InverseFunctionalProperty rdf:ID="serial">
  <rdfs:domain rdf:resource="Gun" />
  <rdfs:range rdf:resource="&rdfs;Literal" />
</owl:InverseFunctionalProperty>

OWL Inferencing 7

The computer uses this to determine that the gun on the license is the same gun used in the robbery. This final description seals the speeder's fate. It tells the computer that each GunLicense applies to only one gun and one person, so there is no doubt that the speeder is the person who owns the gun:

OWL Inferencing 7: Code

<owl:Class rdf:ID="GunLicense">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Restriction>
      <owl:onProperty rdf:resource="#registeredGun"/>
      <owl:cardinality>1</owl:cardinality>
    </owl:Restriction>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#holder"/>
      <owl:cardinality>1</owl:cardinality>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
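A cardinality of 1 does more than validate data: since OWL does not assume that different names mean different things, two values for such a property are inferred to be the same individual. The Python sketch below imitates that one inference under simplifying assumptions. The statements, the "_:" identifiers, and the `implied_equalities` helper are made up for illustration.

```python
# (license, property, value) statements gathered from different reports.
# Identifiers are made up; "_:" marks an anonymous node.
statements = [
    ("_:license1", "registeredGun", "_:gunFromRobbery"),
    ("_:license1", "registeredGun", "_:gunOnLicense"),
    ("_:license1", "holder", "_:speeder"),
]

# Properties restricted to exactly one value per GunLicense.
CARDINALITY_ONE = {"registeredGun", "holder"}


def implied_equalities(data, exactly_one):
    """Pair up values that a cardinality-1 restriction forces to be identical."""
    seen = {}
    equal = []
    for subject, prop, value in data:
        if prop not in exactly_one:
            continue
        # Remember the first value seen for this (subject, property) pair.
        first = seen.setdefault((subject, prop), value)
        if first != value:
            equal.append((first, value))
    return equal


print(implied_equalities(statements, CARDINALITY_ONE))
```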

OWL Inferencing 8

The computer reports back to the traffic cop who duly arrests the speeder on suspicion of armed robbery.

arrest

TWINE

Final Thought

“I believe humans get a lot done, not because we’re smart, but because we have thumbs so we can make coffee.”

— Flash Rosenberg

???

Email: frommelt@uwplatt.edu

Copyright Information

Copyright Daniel M. Frommelt, 2008. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.