The Semantic Web is an attempt to make all sorts of data available in a structured way, so that it can easily be combined and the logical relations between data become automatically parsable.
Tim Berners-Lee originally expressed the vision of the Semantic Web as follows:
- I have a dream for the Web [in which computers] become capable of analysing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize. -- Tim Berners-Lee, 1999
It is useful to use the same keywords when describing metadata. For example, one person may say "author" while another says "creator", with the same meaning. To improve the machine-readability of metadata, common vocabularies are important. Note that these vocabularies are usually not ontologies, since they do not specify interrelations.
Every vocabulary involves a trade-off between very specific and general statements. A finer refinement is not always better. For example, a generalization like "People living in France speak French" has its merits, even though it is strictly speaking incorrect, because there are people in France who do not speak French. The merit lies in the fact that it is often useful for humans to make general statements to get the big picture.
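The "author" versus "creator" problem can be sketched as a small lookup table that renames ad-hoc keys to one shared term. The table contents, the choice of "creator" as the shared term, and the function name `normalize_metadata` are assumptions made for illustration, not part of any standard.

```python
# Hypothetical synonym table: maps ad-hoc metadata keys to one shared term.
# Picking "creator" as the shared term is an assumption for this sketch.
CANONICAL_TERMS = {
    "author": "creator",
    "creator": "creator",
    "written_by": "creator",
    "published": "date",
    "date": "date",
}


def normalize_metadata(record):
    """Return a copy of record with known keys renamed to the shared term.

    Keys are matched case-insensitively; unknown keys are kept unchanged.
    """
    normalized = {}
    for key, value in record.items():
        canonical = CANONICAL_TERMS.get(key.lower(), key)
        normalized[canonical] = value
    return normalized


doc_a = {"author": "John Doe"}
doc_b = {"Creator": "John Doe"}

# Both records now describe the same thing with the same keyword.
print(normalize_metadata(doc_a) == normalize_metadata(doc_b))  # True
```

A shared vocabulary makes this kind of table unnecessary: if both publishers had used the same term from the start, the records could be combined directly.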
Semantic Web standards
W3C has defined multiple standards to support the concept of the "Semantic Web". Together they allow the exchange of data in a structured way.
- XML provides a surface syntax for structured documents, but imposes no semantic constraints on the meaning of these documents.
- XML Schema is a language for restricting the structure of XML documents and also extends XML with datatypes.
- RDF is a datamodel for objects ("resources") and relations between them. It provides a simple semantics for this datamodel, which can be represented in an XML syntax.
- RDF Schema is a vocabulary for describing properties and classes of RDF resources, with a semantics for generalization-hierarchies of such properties and classes.
- OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
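The generalization hierarchies that RDF Schema describes can be illustrated with a rough sketch. The class names and the helper functions below are hypothetical, and the sketch gives each class at most one direct superclass, which is a simplification: RDF Schema itself allows several.

```python
# Hypothetical class hierarchy: each class maps to its direct superclass.
# Simplification: one superclass per class (RDF Schema allows several).
SUBCLASS_OF = {
    "Novel": "Book",
    "Book": "Document",
}


def superclasses(cls):
    """Return every class that cls generalizes to, nearest first."""
    result = []
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:  # stop if the table accidentally contains a cycle
            break
        seen.add(cls)
        result.append(cls)
    return result


def is_instance_of(resource_class, cls):
    """True if a resource of resource_class also counts as a cls."""
    return resource_class == cls or cls in superclasses(resource_class)


print(superclasses("Novel"))                # ['Book', 'Document']
print(is_instance_of("Novel", "Document"))  # True
print(is_instance_of("Book", "Novel"))      # False
```

The point is that a statement made about the general class ("Document") applies automatically to resources typed with a more specific class ("Novel"), without anyone stating it explicitly.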
The main vocabularies discussed at RDF schemas are:
- Dublin Core (DC)
- Describes properties of published documents, like publication date and creator. Based on RDF.
- XHTML Friends Network (XFN)
- Describes social networks, like friends and other relations. Based on HTML metadata.
- Friend-of-a-Friend (FOAF)
- Describes business card-like personal information, like name, homepage, interests and chat accounts.
Most vocabularies now use RDF, or are aimed at RDF applications. In short, RDF defines subject - property - object (or subject - relation - target) triples. For example: the author (property) of this document (subject) is "John Doe" (object).
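The subject - property - object idea can be sketched with plain Python tuples. This is not an RDF library: the identifiers such as "this-document", the example.org URL, and the `match` helper are made up for illustration, and real RDF would identify subjects and properties with URIs.

```python
# Each statement is a (subject, property, object) tuple.
# All identifiers here are hypothetical; real RDF uses URIs for these.
graph = {
    ("this-document", "author", "John Doe"),
    ("this-document", "language", "en"),
    ("John Doe", "homepage", "http://example.org/~jdoe"),
}


def match(graph, subject=None, prop=None, obj=None):
    """Return the statements that fit a pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and prop in (None, p) and obj in (None, o)
    )


# Who is the author of this document?
authors = [o for (_, _, o) in match(graph, subject="this-document", prop="author")]
print(authors)  # ['John Doe']

# Everything that is stated about this document.
for statement in match(graph, subject="this-document"):
    print(statement)
```

Because every statement has the same three-part shape, statements from different sources can be merged by simply taking the union of the sets.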
W3C documents on RDF:
- RDF Primer, a great starting point with concepts and examples.
- RDF Concepts and Abstract Syntax
- RDF/XML Syntax Specification [RDF-SYNTAX]
- RDF Vocabulary Description Language 1.0: RDF Schema
- RDF Semantics
- RDF Test Cases
OWL (Web Ontology Language) adds more logic to RDF statements. For example, it can describe relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, etc.
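To make disjointness and cardinality more concrete, here is a rough sketch that checks one resource against two such constraints. The constraints are written as plain Python data rather than OWL syntax, and the class names, the "title" property, and `find_violations` are all hypothetical. Note also that this is a closed-world validity check chosen for simplicity; an actual OWL reasoner works under the open-world assumption and draws inferences rather than rejecting incomplete data.

```python
# Hypothetical constraints, expressed as plain data instead of OWL syntax.
DISJOINT_CLASSES = [("Person", "Document")]  # nothing can be both
EXACTLY_ONE = {"Document": ["title"]}        # a Document has exactly one title


def find_violations(types, properties):
    """Check one resource against the constraints above.

    types      -- set of class names the resource belongs to
    properties -- dict mapping a property name to a list of its values
    """
    problems = []
    for first, second in DISJOINT_CLASSES:
        if first in types and second in types:
            problems.append(f"{first} and {second} are disjoint")
    for cls, required in EXACTLY_ONE.items():
        if cls not in types:
            continue
        for name in required:
            count = len(properties.get(name, []))
            if count != 1:
                problems.append(f"{cls} needs exactly one {name}, found {count}")
    return problems


print(find_violations({"Document"}, {"title": ["RDF Primer"]}))  # []
print(find_violations({"Document", "Person"}, {"title": []}))
```

The second call reports both a disjointness problem and a cardinality problem, which is the kind of statement plain RDF triples cannot express on their own.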
There are three flavours of OWL:
- OWL Lite
- OWL DL
- OWL Full
W3C documents on OWL:
- The OWL Overview gives a simple introduction to OWL by providing a language feature listing with very brief feature descriptions;
- The OWL Guide demonstrates the use of the OWL language by providing an extended example. It also provides a glossary of the terminology used in these documents;
- The OWL Reference gives a systematic and compact (but still informally stated) description of all the modelling primitives of OWL;
- The OWL Semantics and Abstract Syntax document is the final and formally stated normative definition of the language;
- The OWL Web Ontology Language Test Cases document contains a large set of test cases for the language;
- The OWL Use Cases and Requirements document contains a set of use cases for a web ontology language and compiles a set of requirements for OWL.
An early approach to describing metadata is covered at HTML Metadata.