Friday, August 02, 2013

Semantic web and linked data are a form of agile development

I am working on a book on building intelligent systems in JavaScript (1) and I just wrote part of the introduction to the section on semantic web technologies:

In order to understand and process data we must understand the context in which it was created and used. We looked at document oriented data storage in the last chapter. To a large degree documents are an easier source of knowledge to use because they contain some of their own context, especially if they use a specific data schema. In general data items are smaller than documents and some external context is required to make sense of different data sources. The semantic web and linked data technologies provide a way to specify a context for understanding what is in data sources and the associations between data in different sources.

The thing that I like best about semantic web technologies is the support for exploiting data that was originally developed by small teams for specific projects. I view the process as a bottom up process with no need to plan complex schemas and plans for future use of data. Projects can start by solving a specific problem and then the usefulness of data can increase with future reuse. Good software developers learn early in their careers that design and implementation need to be done incrementally. As we develop systems, we get a better understanding of how our systems will be used and how to build them. I believe that this agile software development philosophy can be extended to data science: semantic web and linked data technologies facilitate agile development, allowing us to learn and modify our plans when building data sets and systems that use them.

(1) I prefer other languages like Clojure, Ruby, and Common Lisp, but JavaScript is a practical language for both server and client side development. I use a small subset of JavaScript and have JSLint running in the background while I work: “good enough” with the advantages of ubiquity and good run time performance.

No comments: