Introduction to Semantic Web

January 21 2014
Reading Time: 9 minutes

The model behind the Web could be roughly summarized as a way to publish documents represented in a standard way (HTML), containing links to other documents and accessible through the Internet using standard protocols (TCP/IP and HTTP). The result could be seen as a worldwide, distributed file system of interconnected documents that humans can read, exchange and discuss.

In summary, the great advantage of Web 1.0 was that it abstracted away the physical storage and networking layers involved in information exchange between two machines. This breakthrough enabled documents to appear to be directly connected to one another. Click a link and you’re there—even if that link goes to a different document on a different machine on another network on another continent! In the same way that Web 1.0 abstracted away the network and physical layers, the Semantic Web abstracts away the document and application layers involved in the exchange of information.

The Semantic Web connects facts, so that rather than linking to a specific document or application, you can instead refer to a specific piece of information contained in that document or application. If that information is ever updated, you can automatically take advantage of the update. The word semantic itself implies meaning or understanding. As such, the fundamental difference between Semantic Web technologies and other technologies related to data (such as relational databases or the World Wide Web itself) is that the Semantic Web is concerned with the meaning and not the structure of data. This fundamental difference engenders a completely different outlook on how storing, querying, and displaying information might be approached. Some applications, such as those that refer to a large amount of data from many different sources, benefit enormously from this feature.  Others, such as the storage of high volumes of highly structured transactional data, do not.

What is meant by “semantic” in the Semantic Web is not that computers are going to understand the meaning of anything, but that the logical pieces of meaning can be mechanically manipulated by a machine to useful ends. So now imagine a new Web where the real content can be manipulated by computers. For now, picture it as a web of databases. One “semantic” website publishes a database about a product line, with products and descriptions, while another publishes a database of product reviews. A third site for a retailer publishes a database of products in stock. What standards would make it easier to write an application to mesh distributed databases together, so that a computer could use the three data sources together to help an end-user make better purchasing decisions?

The Semantic Web itself does not deal with unstructured content; instead, it is about representing not only structured data and links but also the meaning of the underlying concepts and relationships. There’s nothing stopping anyone from writing a program now to mesh data sources in this way, in just the same way that nothing stopped anyone from exchanging data before we had XML. But standards facilitate building applications, especially in a decentralized system.

From a technical point of view, the Semantic Web consists of:

The term semantic technologies represents a fairly diverse family of technologies that have been in existence for a long time and seek to help derive meaning from information. Some examples of semantic technologies include natural language processing (NLP), data mining, artificial intelligence (AI), category tagging, and semantic search. You might think of the goal of semantic technologies as separating signal from noise. Some examples of existing semantic technologies being used today include:

These technologies are worth knowing because they help assemble the building blocks of the Semantic Web. For example, NLP can be used to extract structured data from unstructured documents (flat files such as text documents). This data is then linked via Semantic Web technologies to other published data, which bridges the gap between documents (unstructured data) and structured data.

Linked Data

One of the most important movements in the Semantic Web community is Linked Data, which strives to expose and connect all of the world’s data in a readily queryable and consumable form. The goal of Linked Data is to publish structured data in such a way that it can be easily consumed and combined with other Linked Data.
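To make "consumed and combined" concrete, here is a minimal sketch in plain Python, not a definitive implementation. It assumes each publisher exposes its data as a set of (subject, predicate, object) triples and that all publishers use the same URI for the same product. Every URI, product, and function name below is hypothetical and chosen only for illustration.

```python
# Hypothetical namespaces under example.org; not a real vocabulary.
PRODUCT = "http://example.org/product/"
VOCAB = "http://example.org/vocab/"

# Three independent publishers, each describing the same products by URI.
manufacturer = {
    (PRODUCT + "42", VOCAB + "name", "Trail Tent"),
    (PRODUCT + "43", VOCAB + "name", "Camp Stove"),
}
reviews = {
    (PRODUCT + "42", VOCAB + "rating", 4),
    (PRODUCT + "42", VOCAB + "rating", 5),
    (PRODUCT + "43", VOCAB + "rating", 2),
}
retailer = {
    (PRODUCT + "42", VOCAB + "inStock", True),
    (PRODUCT + "43", VOCAB + "inStock", False),
}


def merge(*graphs):
    """Combine graphs by set union; shared URIs line the facts up."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def values(graph, subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def in_stock_with_ratings(graph):
    """Map the name of each in-stock product to its average rating."""
    result = {}
    for s, p, o in graph:
        if p == VOCAB + "inStock" and o is True:
            names = values(graph, s, VOCAB + "name")
            ratings = values(graph, s, VOCAB + "rating")
            if names and ratings:
                result[names[0]] = sum(ratings) / len(ratings)
    return result


graph = merge(manufacturer, reviews, retailer)
summary = in_stock_with_ratings(graph)
print(summary)
```

No join keys or schema mapping are needed here because the three sources agree on identifiers. That agreement is the assumption doing the work. Note also that a set drops identical triples, so two reviews with the same rating would collapse into one in this simplified model.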

 The Four Rules of Linked Data

So in a way, Linked Data is the Semantic Web realized via four best practice principles.

 The Four Rules Applied

  1. Instead of using application-specific identifiers—database keys, UUIDs, incremental numbers, etc.—you map them to a set of URIs. Each identifier must map to one single URI. For example, each row of those two tables is now uniquely identifiable using its URI.
  2. Make your URIs dereferenceable. This means, roughly, making them accessible via HTTP, as we do for every human-readable Web page. This is a key aspect of Linked Data: every single row of our tables is now fetchable and uniquely identifiable anywhere on the Web.
  3. Have your web server reply with structured data when a URI is requested. This is the “juicy” part of the Semantic Web. Model your data with RDF. Here is where you need to make the paradigm shift from a relational data model to a graph one.
  4. Once all the rows of our tables have been uniquely identified, made dereferenceable through HTTP, and described with RDF, the last step is providing links between different rows across different tables. The main aim here is to make explicit those links that were implicit before shifting to the Linked Data approach.
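The four steps above can be sketched roughly as follows. This is an illustrative example under stated assumptions, not a reference implementation: the two tables, the `http://example.org/` base, and the `vocab/` property names are all hypothetical, and the literal escaping covers only backslashes and double quotes. Serving the result over HTTP (step 2) is left out; the sketch only builds the document a server might return.

```python
BASE = "http://example.org/"  # hypothetical; a real deployment uses its own domain

# Two hypothetical relational tables keyed by application-specific ids.
authors = [{"id": 1, "name": "Ada"}]
books = [{"id": 10, "title": "Notes", "author_id": 1}]


def uri(table, key):
    """Steps 1 and 2: map each row key to exactly one HTTP URI."""
    return f"{BASE}{table}/{key}"


def literal(value):
    """Quote a plain string literal (simplified escaping, see lead-in)."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def row_triples():
    """Steps 3 and 4: describe rows as triples; foreign keys become links."""
    triples = []
    for a in authors:
        subject = f"<{uri('author', a['id'])}>"
        triples.append((subject, f"<{BASE}vocab/name>", literal(a["name"])))
    for b in books:
        subject = f"<{uri('book', b['id'])}>"
        triples.append((subject, f"<{BASE}vocab/title>", literal(b["title"])))
        # The implicit author_id foreign key becomes an explicit link.
        author = f"<{uri('author', b['author_id'])}>"
        triples.append((subject, f"<{BASE}vocab/author>", author))
    return triples


def to_ntriples(triples):
    """Serialize triples one per line in an N-Triples-like form."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


document = to_ntriples(row_triples())
print(document)
```

The design point is in the last `append`: the `author_id` column, which only had meaning inside one database, becomes a link to another URI that any client can follow.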
