Sunday, June 7, 2009

Book Review: Explorer's Guide to the Semantic Web by Thomas Passin


Explorer's Guide to the Semantic Web
Thomas Passin
2004 | 304 pages
ISBN: 1932394206
Manning





This is another book review - this time on the topic of the Semantic Web.

Main Concepts


This section covers the main ideas I took from the book. If you don't have time to read it, these are the key points:
  • The concept of the Semantic Web is to make much of the information on the internet not only human readable, but also available for machines to process efficiently and correctly. If you read this blog, you probably already knew that, so...
  • There is a layer cake of technologies that can be used in the solution. The book presents the layer cake diagram from Tim Berners-Lee et al.

  • RDF is clearly a central part of this. The book covers it in chapter 2. It is essentially a mechanism for declaring factual statements. It works by relating two resources to each other. For example: "Wrist partof Arm". Wrist is the Subject, partof is the Predicate, and Arm is the Object.
  • Once you have a collection of RDF statements, you can reason over that repository. For example, an insurance company can create a rule "Coverage for any partof Arm injuries is 93%". When a claim with a Wrist injury arrives, the RDF repository will be used to reason that it is partof an Arm, and therefore the 93% applies.
  • An interesting point about applying logic: you can infer much more in a closed fact repository than in an open one. If you ask the corporate database (a closed repo) "Is Joe Smith an employee at ABC Inc?", you can infer the answer is NO if no record exists for Joe Smith. In an open repository (the internet) the absence of a record means nothing. It can simply mean no one has posted it publicly.
  • The semantic web gets really nasty when you consider: 1) some sites will post incorrect information 2) sites will incorrectly annotate their information 3) facts will be posted in different languages. Technologies like RDF will solve some problems, but not all.
  • OWL is a language for describing an ontology - essentially a type structure for resources. OWL can also express constraints (for example, a car can only have 4 wheels).
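
To make the RDF and reasoning points above a bit more concrete, here is a minimal sketch in plain Python. It is not from the book and it does not use any real RDF library. The Hand and Finger facts, the idea that partof is transitive, the COVERAGE_RULES table, the employeeof predicate, and all function names are my own assumptions for illustration.

```python
# Each fact is a (subject, predicate, object) triple, as in "Wrist partof Arm".
# The Hand and Finger facts are made up for this example.
TRIPLES = {
    ("Wrist", "partof", "Arm"),
    ("Hand", "partof", "Arm"),
    ("Finger", "partof", "Hand"),
}

# Hypothetical rule table: injuries to any part of the key get this coverage rate.
COVERAGE_RULES = {"Arm": 0.93}


def wholes_of(part, triples):
    """Return everything `part` is part of, assuming partof is transitive."""
    found = set()
    frontier = [part]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "partof" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def coverage_for(injured_part, triples, rules):
    """Return the coverage rate for an injury, or None if no rule applies."""
    wholes = wholes_of(injured_part, triples)
    for whole, rate in rules.items():
        if whole in wholes:
            return rate
    return None


def ask(triple, triples, closed_world):
    """Answer a yes/no question about a single fact.

    Closed repository: a missing fact counts as False.
    Open repository: a missing fact is simply unknown (None).
    """
    if triple in triples:
        return True
    return False if closed_world else None


if __name__ == "__main__":
    print(coverage_for("Wrist", TRIPLES, COVERAGE_RULES))   # 0.93
    question = ("Joe Smith", "employeeof", "ABC Inc")
    print(ask(question, TRIPLES, closed_world=True))        # False
    print(ask(question, TRIPLES, closed_world=False))       # None
```

The last two calls show the closed versus open difference: the same missing record gives a definite NO in the closed case and no answer at all in the open case.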
Review

Summary: My copy was published in 2004. It is unfortunately out of date in many areas, so much so that I wouldn't recommend it. Otherwise it is well written, and covers the topic pretty well (as it stood in 2004).

Details: The first chapter motivates the Semantic Web, and the second chapter explains RDF. So far so good. But thereafter, I hit many places where the book just seemed outdated. For example:
  • The Annotations chapter discusses the need for users to be able to contribute semantics to web pages they don't control. Several solutions are presented, but tagging, the preferred solution to this problem today (del.icio.us being the obvious example), isn't mentioned at all.
  • Page 103 makes the statement "Some news sites...allow readers to comment on their stories." Yes, I remember when that was a novelty too, but in 2009 that is the norm.
  • The Search chapter, without calling out any specific sentence, seems to describe the world of search as it stood in 2003/4. Search has evolved quite a bit since the book was written.
I am surprised Manning hasn't published an update to this book - the Semantic Web seems to be a popular enough topic.
