Saturday, December 29, 2012

I am trying to improve my skills at design and web development

I built my first simple web page at SAIC in 1992 when my good friend Gregg Hanna set up a publicly accessible web server for my working group. Since then I have had a lot of people suggest that my web sites could look better but frankly I have always been more interested in content and developing cool web application functionality.

Recently I have been putting some effort into improving my design skills, and the best resource that I have found is "The Non-Designer's Design Book" by Robin Williams.

The author Robin Williams does a fantastic job at explaining four basic concepts of design: contrast, repetition, alignment, and proximity. She then provides good examples that show the reader how to recognize bad design and how to correct design errors.

I spent some time redesigning my main web site and really enjoyed the process. I started by determining the worst aspects of the old design based on Robin's advice and then tried to correct the design flaws using her examples.

I understand the technical aspects of using HTML5, CSS, and JavaScript but I was having some problems attempting to build effective web applications for both mobile devices and web browsers. As I have blogged about before, I have experimented and used the following tools on customer projects: plain old JSPs, Rails, Play Framework, and most recently Hiccup with Clojure web applications.

I have purchased several good books on CSS, HTML5, and JavaScript in the last few years, but the one that has helped me the most has been "Responsive Web Design" by Ethan Marcotte.

Ethan's short book on responsive design really helped me a lot because he efficiently covered what I needed to know about media queries and effective CSS and HTML5.
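To make that concrete, here is a minimal sketch of the kind of media query Ethan covers, generated from Clojure with hiccup (my own illustrative example, not code from the book; the namespace and class names are made up):

(ns example.responsive ; hypothetical namespace, for illustration only
  (:require [hiccup.page :refer [html5]]))

;; A two-column layout that collapses to a single column on narrow screens.
;; The viewport meta tag tells mobile browsers to report their real width.
(defn demo-page []
  (html5
    [:head
     [:meta {:name "viewport" :content "width=device-width, initial-scale=1.0"}]
     [:style "
       .col { float: left; width: 48%; margin: 1%; }
       @media screen and (max-width: 480px) {
         .col { float: none; width: 98%; }
       }"]]
    [:body
     [:div.col "Main content"]
     [:div.col "Sidebar"]]))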

I have been using Dojo Mobile and more recently Twitter Bootstrap, which did a lot of the heavy lifting for me in my first attempts at creating responsive multiple platform web applications. Reading Ethan's book helped me understand some of what Dojo and Bootstrap were doing for me "behind the scenes" and also gave me some confidence in writing one-page web applications from scratch without frameworks that might do more than I want and add unnecessary complexity.

Thursday, December 27, 2012

Technology tire kicking: trying Rails 4.0 beta

As I mentioned in my last blog article, I am (mostly) shutting down my consulting business in order to have time to work on writing projects and to try to develop three business ideas. All three involve web apps/services and I want to use Clojure for two of them and Rails for the third.

Rails 4 should be released early in 2013 but I thought I would get a leg up and start experimenting with Rails 4 now. Fairly easy to set up and try:

git clone https://github.com/rails/rails.git
cd rails
gem install sprockets
rake build ; rake install
And then version 4 beta is installed:
✗ rails -v
Rails 4.0.0.beta
✗ rails new testapp -d postgresql
I had to comment out the assets group in the generated Gemfile before bundle install but otherwise everything worked fine.

Monday, December 24, 2012

Happy Holidays and my future plans

I would like to wish you Happy Holidays! I hope you are with family and friends enjoying yourself over the holidays.

I wanted to share with you my plans for the future. Starting in January I am planning on mostly shutting down my consulting business. I have been consulting for about 14 years and consulting has provided me with a great lifestyle, but it is time for a change. Before consulting I worked at SAIC, Physical Dynamics, and Angel Studios. I will still provide some consulting services but will limit my time to helping customers directly on very small projects.

I plan to spend most of my "work" time writing and developing a few software as a service business ideas.

I have written books published by some great publishers (Springer-Verlag, McGraw-Hill, Morgan Kaufmann, Apress, Sybex, M&T Press, and J. Wiley) but because I prefer writing on niche subjects (things that are of special interest to me!) I will probably only write free books published as PDFs in the future. I don't really need the income generated by publishing books and I would prefer writing on "smaller topics" that likely would have a small market. My wife Carol is an excellent editor and she will help me, as will volunteers who read early versions of my books and provide technical feedback.

One book idea that I have been planning is titled "Single Page Web Applications in Clojure" - this is a niche topic that few people will be interested in but I have a personal interest in writing an open source framework and writing a short book around my software will hopefully make the whole project more useful.

I have had a lot of people help me in my working life. So much of what I have accomplished so far in my life has been made possible by other people mentoring and helping me! When my writing (or open source projects) helps other people I feel like I am paying back all of the people who have helped me.

I plan on making writing my main priority and activity but I also hope to spend a significant amount of time developing some ideas I have for software as a service products. I have always been a polyglot programmer to fit in with whatever languages my employers/customers use: Java, Ruby, Common Lisp, Clojure, Scheme, Scala, Prolog, C/C++, and Python. I will probably just use Clojure on my own projects, with some Ruby glue code for little utilities. I will write more about this when I have prototype web apps in place for people to try.

Happy Holidays and a Happy New Year!

Sunday, December 23, 2012

Home from our Amazon River vacation - here are some pictures

I took a ton of hi-def video and pictures. Here is a Google+ photo album of a few of the pictures (a small representative sample).

I have a Canon T2i camera with a nice 24-105mm L-lens. I mostly take hi-def video hand held. If we were in the same place enjoying a glass of wine together I would show you the hi-def video, but the pictures in the linked photo album are OK as a representation of the experiences Carol and I had.

Sunday, December 16, 2012

Problem fixed with Holland America: they offered a nice refund

Update: Holland America gave us a fair refund - I withdraw the complaints listed in this blog article.

My wife and I have been on 15 cruises, and the service provided by Holland America on the 23 day cruise we are on right now is so much worse than on the other 14 cruises that I feel motivated to write up our experiences.

Note: our cruise was up the Amazon River and we did have some memorable experiences which I will blog about in the next week or so and provide links to some of our pictures and videos.

In order of "worse things first":

  • We booked a tour to Santarem (large industrial city) and Alter do Chao (a pretty little town on the water with amazing beaches). The tour guide was from Santarem and spent all 4 3/4 hours of the tour in her home town and blew off taking us to the scheduled stop in Alter do Chao. She did give us lots of unwanted shopping experiences, and wasted time stopping the bus by a new highway project and going into lots of detail about that. Many other cases where she was just killing time. She did have the driver stop on a road for one minute above Alter do Chao so we could catch a glimpse of the beautiful town that we missed. I formally asked Holland America for a partial tour refund but I received a negative response from them to my written request: no partial refund.
  • Our toilet did not work for over seven hours. This was not quite a ship wide occurrence since we talked to a few people who were not affected. Just hearsay, but we heard people talking of having their stateroom toilets out of order from a few hours to two days. When I first asked the front desk about this I was told it was a ship wide occurrence and to be patient. After about 6 hours I asked again and was told that our toilet might be fixed that day. I got fed up, went back and told the lady at the front desk that this was a health violation and guess what: an engineer came and fixed our toilet within about 20 minutes. Sometimes complaining helps.
  • We booked an expensive 7 hour small boat tour and because of shallow water could not get to our destination. We were warned about this by the tour office the night before the tour and we decided to go anyway, so that was our own fault, but the tour should probably just have been cancelled.
  • We did not have any hot water in our stateroom for several days.
  • The ship had been in dry dock until the morning of the cruise and the exterior was filthy. The port side of the promenade deck was not fully reconstructed and was in particularly bad shape. However, within about 24 hours everything was cleaned up, so not such a big deal.
  • A nit pick: sometimes there were just paper napkins in the formal dining room, and they were not very good paper napkins.
Those were our own experiences. While drinking and having dinner with other passengers, we heard their complaints also; for example:
  • People's stateroom air-conditioning was not working, often for days. One couple took their pillows and sheets up to a public bar area and slept there because the bar had working air conditioning.
  • Complaints about the dirty state of the ship, toilets, and hot water issues.
One couple who we often met for drinks during happy hour are long time Holland America customers who have logged almost 300 days with this cruise line. They made it very clear that they will not be traveling again with Holland America. Another friend I made on board ship is also a long time customer and explained that this situation was caused by Holland America being bought by a larger company; they still charge a very high premium price for non-premium service.

I would like to say that we have had excellent service from our stateroom stewards and the dining room waiters. Also, the food has been very good.

Monday, November 26, 2012

I am going to be mostly off the Internet for 3 weeks

I moderate comments because occasionally someone leaves some SPAM as a comment - so, there is usually just a short delay between the time readers post comments and when I moderate/publish them.

I will be on vacation, and for most of the time I will not have an Internet connection, so any comments left on my blog may not get moderated until the end of December when I get back home.

Best regards,
Mark

Saturday, November 24, 2012

Deep Learning

I worked in the field of artificial intelligence during the "AI winter" (a backlash against overly optimistic predictions of achieving "real AI") and to this day I avoid getting too optimistic about huge short-term gains in our field. That said, in the last several months a few things are stirring up my old optimism!

I have been enjoying Geoffrey Hinton's Coursera course Neural Networks for Machine Learning and I was pleased to see a front page New York Times article this morning on deep learning. Another really nice reference for deep learning is a very large PDF/viewgraph presentation by Richard Socher, Yoshua Bengio and Chris Manning.

Another very good resource is the Deep Learning Tutorial that provides the theory, math, and working Python example code.

Deep neural networks have many hidden layers and have traditionally been difficult to train.

In addition to very fast processors (graphics chipsets), a very neat engineering trick is pre-training weights in deep networks by stacking Restricted Boltzmann Machines, etc. After pre-training, weights can be fine-tuned using backpropagation.
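As a rough sketch of the idea in standard notation (my summary, not material from Hinton's course): a Restricted Boltzmann Machine over visible units v and hidden units h defines an energy

E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j

and the contrastive divergence (CD-1) learning rule nudges each weight toward correlations seen in the data and away from correlations in the model's own reconstructions:

\Delta W_{ij} \approx \epsilon \left( \langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{recon}} \right)

Each trained RBM's hidden activities then become the "visible" data for the next RBM in the stack, which is what makes layer-by-layer pre-training of a deep network practical before the final backpropagation pass.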

I haven't been this optimistic about (relatively) short term progress in AI since the early 1980s. Hoorah!

Saturday, November 10, 2012

"ClojureScript: Up and Running" book

I bought the Kindle edition of the book "ClojureScript: Up and Running" by Stuart Sierra and Luke VanderHart a few days ago. It is well written and a good way to ramp up on using ClojureScript for web client programming in Clojure instead of Javascript.

I have experimented with ClojureScript before, and now that two of the three Coursera classes I have been taking this fall are done, getting up to speed on ClojureScript has moved to the top of my side-project to-do list. (BTW, as I have written before, one of the classes I have taken this fall is Functional Programming Principles in Scala, and that class is also of value to Clojure programmers who might not have a strong interest in Scala specifically, but have an interest in better understanding functional programming - the lectures for that class were especially enjoyable and useful.)

I have converted all three of my main web sites to Clojure + Noir in the last few months and this is a great combination (especially when used with a high level UI framework like Twitter Bootstrap or Dojo Mobile) for web applications. Since I am already happy with this web app development stack, why learn ClojureScript? Basically because I don't like to do a lot of coding in JavaScript. In the last few years I have enjoyed several projects done using either GWT or SmartGWT where both the web client side and server side are done in Java, and I anticipate using ClojureScript, but only for projects with complex client side functionality.

Monday, October 29, 2012

Will HTML5 be the most important technology of this decade?

I am a technology "junky" and I suspect that most people who read my blog regularly or read an occasional blog post from a web search are the same. It is not easy to predict which currently used technologies will end up having a huge impact on human society, but it is fun to make educated guesses.

As much as I have been enjoying programming in Scala and Clojure (and some Ruby and Java) I doubt that improvements in programming languages and development tools will profoundly impact society, the economy, and quality of life in general.

I also don't think that new "gadget technology" will profoundly affect society, with a possible exception being very low cost smart phones in developing countries.

As you can tell by the title of this article, my bet is that the semantic features of HTML5 will have a profound effect on society. I have several reasons for this bet:

  • Semantic tags in HTML5 are minimal but sufficient for web analysis software to detect different types of content, make it easier to separate out HTML for navigation vs. content, etc.
  • Motivation: web sites with legitimate content (i.e., not link farm sites, etc.) that properly use HTML5 semantic tags will have better search engine optimization. This motivation will probably be a "secret sauce" for HTML5 adoption.
  • It has been a long time since Tim Berners-Lee and others introduced the concepts for the semantic web (I have had embedded RDF on some of my web sites for 10+ years) and despite a lot of skepticism, interest in and utilization of linked data and semantic web technologies are gaining momentum.
Data specific tags for dates, locations, etc. will reduce errors in interpreting data embedded in web pages. Structural tags that allow, for example, a single web page to contain several article sections will make it easier for analysis software to split out different content for separate processing. This will increase the precision and recall of automated text analytics applications.
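As a small illustrative sketch (my own made-up content, expressed with Clojure's hiccup library since that is what I use to generate HTML), semantic and data-specific tags might look like this:

(ns example.semantic ; hypothetical namespace, for illustration only
  (:require [hiccup.page :refer [html5]]))

;; One page containing two independent articles, each with a machine-readable
;; publication date, plus navigation markup that analysis software can skip.
(defn news-page []
  (html5
    [:body
     [:nav [:a {:href "/"} "Home"] " " [:a {:href "/archive"} "Archive"]]
     [:article
      [:header
       [:h1 "First story"]
       [:time {:datetime "2012-10-29"} "October 29, 2012"]]
      [:section [:p "Body of the first story..."]]]
     [:article
      [:header
       [:h1 "Second story"]
       [:time {:datetime "2012-10-28"} "October 28, 2012"]]
      [:section [:p "Body of the second story..."]]]]))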

Of less importance than support for the semantic web is support for rich client applications that are reasonably portable. I believe that widespread use of HTML5 will reduce the costs of publishing rich content.

Sunday, October 21, 2012

Clojure vs. Scala smackdown

Just kidding with the title of this post :-)

I believe in using the best tools for any given task, but this is not always possible when working with teams where most developers already know one programming language and/or framework. Also, as a consultant I usually favor using whichever tools are already used in the customer's organization.

All that said, I find that alternative JVM languages like Clojure, Scala, and JRuby are so much more effective for the projects that I work on that I have a strong preference to not use Java.

I find the decision when to use JRuby to generally be easy, using it on projects requiring fast development, web services, and as glue code for existing Java software. Increasingly though, I am viewing Scala and Clojure to be almost as agile as JRuby, with much better runtime performance.

For me, the tough decision is between Scala and Clojure. Taking Martin Odersky's Functional Programming Principles in Scala class definitely affects my decision because I feel like I am learning best practices for Scala development and I am enthusiastic about using it for new projects. To balance that out, I have bought and read four books on Clojure, I have used Clojure a lot in my work over the last several years, and it is a pleasure to use.

Although I have conflicted and undecided views on when to use Scala or Clojure, I do believe without a doubt that Rich Hickey and Martin Odersky have designed, implemented, and maintain two languages that blow "plain Java" away in developer productivity. For our industry, this is a problem because there are so many Java developers who are already trained in Java and Java frameworks, and there is a lot of inertia against learning new languages, especially ones like Scala that have a steep learning curve.

Tuesday, October 16, 2012

A revolution in education

Today I am finishing up my course work for Andrew Ng's excellent Coursera course in Machine Learning. I am also taking two other classes that I will complete in about a month: Martin Odersky's Functional Programming Principles in Scala class and Geoffrey Hinton's Neural Networks for Machine Learning. Previously this year I also took Natural Language Processing and Software Engineering for SaaS. I have taken the first two or three weeks of several other classes just to get a general feel for their subjects.

This morning in one of the last lectures in Andrew Ng's class he showed a precise algorithm for a problem that a customer (a media company in China) and I tried to solve about 7 years ago. We were successful enough to meet my customer's requirements, but the next time I see a problem like that (involving collaborative filtering) I will nail the implementation. Every class that I have taken this year has provided many new insights, often on subjects that I thought I was already familiar with because of work experience.
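For reference, the collaborative filtering approach covered in the class is standard low-rank matrix factorization (my notation here, summarizing the general technique rather than the lecture slides): learn item feature vectors x^{(i)} and user parameter vectors \theta^{(j)} by minimizing

J = \frac{1}{2} \sum_{(i,j):\, r(i,j)=1} \left( (\theta^{(j)})^{\top} x^{(i)} - y^{(i,j)} \right)^{2} + \frac{\lambda}{2} \sum_{i} \lVert x^{(i)} \rVert^{2} + \frac{\lambda}{2} \sum_{j} \lVert \theta^{(j)} \rVert^{2}

where r(i,j) = 1 when user j has rated item i, and a prediction for an unrated item is simply (\theta^{(j)})^{\top} x^{(i)}.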

I like to look past the enormous benefit for myself from very high quality free online classes, and consider the enormous benefits to the world in general. Just taking Andrew's class as an example, ten years ago there were perhaps a few thousand people in the world who understood how to do machine learning and understood the craft of using the right methods for specific problems. In five years there might be close to a million people who have taken Andrew's class. This will affect productivity worldwide, and this example is just one class in one online university. Scale this to dozens of online universities with many thousands of classes in technology, health care, etc.

I don't think that it is an exaggeration to say that high quality online education will revolutionize the lives of knowledge workers and potentially help bring about revolutionary changes in the world economy.

Saturday, September 29, 2012

I tried using Twitter Bootstrap this morning. Really nice way to support mobile devices.

I rewrote my markwatson.com website recently, tossing the 15 year old PHP implementation and replacing it with a little Clojure + Noir web app. This morning I spent a short while refactoring it to use the noir-bootstrap project on github. Along the way, I read through the Bootstrap scaffolding documentation. Really cool stuff.

One thing that I had to add was

<meta name="viewport"
      content="width=device-width, initial-scale=1.0">
to the head section of my Noir page layout template. Before I did this my web site did not identify the media type when I used an iPad and a Samsung Galaxy S III phone for testing.
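For context, here is a minimal sketch of where that tag lives in a Noir layout (hypothetical namespace and file names, not my actual site code):

(ns markwatson.views.common ; hypothetical namespace, for illustration only
  (:use [noir.core :only [defpartial]]
        [hiccup.page :only [include-css html5]]))

;; Every page is rendered through this layout, so the viewport meta tag
;; only needs to be added in one place.
(defpartial layout [& content]
  (html5
    [:head
     [:meta {:name "viewport"
             :content "width=device-width, initial-scale=1.0"}]
     [:title "markwatson.com"]
     (include-css "/css/bootstrap.min.css")]
    [:body content]))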

I wrote earlier this year about using Noir and the Dojo Javascript library for my cooking/recipes web app. Dojo makes it easy to write mobile web apps but Bootstrap helps a non-expert Javascript developer like myself write something once and have it work across devices.

Sunday, September 23, 2012

Alternative JVM languages

I am enjoying Martin Odersky's course "Functional Programming Principles in Scala." I use JRuby and Clojure a lot in both consulting projects and my own projects but my main use of Scala in the past was writing some programming examples for my book "Practical Semantic Web and Linked Data Applications, Java, Scala, Clojure, and JRuby Edition" that is available here for free. I had purchased three Scala books and occasionally played with Scala but it never really "clicked" with me (same situation with Haskell: bought three books on Haskell, lots of experiments, never had the "clicked" experience).

Taking Martin's class is definitely helping me become more comfortable with Scala.

Having an interactive repl for Clojure and JRuby has always been a big win for me over Java development, even with great Java tooling (e.g., when I have to do GWT or SmartGWT development, having both the server and client side code in Java, all debuggable in IntelliJ, is nice, but still a very heavyweight development environment!)

Scala provides a good repl experience and has something else really neat that I learned about in Martin's course: the Eclipse+Scala plugin "worksheet" support. A "worksheet" is a separate edit window that can contain any Scala code; every time you save the worksheet file, the value of every expression in the worksheet is displayed in the right hand column. This is sort of like Light Table for Clojure and Javascript, but very polished and integrated nicely into the IDE. I am a huge fan of IntelliJ, but I have switched to the Eclipse+Scala plugin for Scala development for the class. BTW, if you want to experiment with the "worksheet" functionality, it is probably easiest to download the Scala IDE bundle from the Typesafe download site.

Tuesday, September 18, 2012

More work on my NLP web service

I have done a short sprint over the last three days on my NLP SaaS kbsportal.com, mostly working on classification/tagging.

When I have time in the next month I will also make some improvements in sentiment analysis and entity recognition. The text summarization code also needs work but it may be a while before I have more time to work on that.

I rewrote KBSportal in Clojure early this year. It used to be a pile of Common Lisp, Scheme, and Java code experiments written over a 10+ year period and it was great to pull just the parts I wanted into a new system. It might be heresy to Lisp programmers but I will (probably) eventually rewrite it again in plain old Java to save a bit on memory footprint and CPU time. Clojure is a very efficient language but not as efficient as Java.

Sunday, September 02, 2012

Effectively using Linux for work

I am going to write up a few of the things that make Ubuntu Linux a more comfortable environment for software development, writing, research, and having fun. I hope that readers of this blog add their own suggestions in comments (remember: I moderate comments to avoid publishing SPAM so it might take a short while before I see your comments and approve them).

I use Evernote and the Kindle reader a lot and there are no officially supported Linux clients. Evernote has an open source client, NixNote, that is OK but I prefer to simply use the web interface in the Chrome web browser. This is a little slower than a native client with local copies of everything but it is OK. I also use the Evernote Chrome plugin. For reading books I buy for the Kindle, the Chrome Kindle plugin works fine, especially since I own a Kindle device and my Samsung Galaxy S III phone (with 1280x720 screen resolution!!) is also good to read with. One serious problem is watching Netflix movies on Linux. I get by using our large TV with Google TV or my iTV (a gift from my stepson last year). Also, watching Netflix and Hulu+ on my Samsung Galaxy S III is fine if no one else is watching. I also have an iPad 2 I bought last year that works well for watching video; I am planning to swap this out for a Nexus tablet that has a smaller screen but higher resolution than my iPad 2.

Other tools I use every day work fine on Ubuntu Linux (sometimes with some adjustments): IntelliJ, RubyMine, LaTeX tools, Emacs, git, etc.

For many years I wrote copious work notes in a physical square deal style laboratory notebook. I switched 20 years ago to using plain text files for copious work notes on everything that I do. Now I organize notes differently using a combination of Google Docs (back them up often!) and RTF formatted text files, which are a little better for me than plain text because they make it easier to tag different kinds of content with styled text and different colors - this helps me find things faster. I like to use AbiWord to quickly open, edit, and view RTF files - faster and lighter weight than OpenOffice or LibreOffice.

For a lot of what I do there is little difference between using Linux, OS X, or Windows. Interacting with customers using github and Google shared documents is the same. To be clear, which desktop (or laptop :-) operating system people use is their own choice. For me, being able to apt-get install software and have the same environment on my laptops as on my servers makes using Linux a great advantage. For most casual computer users, Windows or OS X is obviously a better choice.

Saturday, September 01, 2012

Can we agree to stop buying Apple products?

Apple has gone too far in starting legal proceedings to stop sales of the Samsung Galaxy S III phones. Have you seen these? Carol and I both have them. It looks to me like Samsung bent over backwards on this product to not infringe.

I have a long happy history with Apple: I wrote the Chess playing program they gave away on the free demo tape for the early Apple II, I did quite well with a commercial AI product for the Mac in 1984, and I respect their technology.

That said, I think that the iPad 3 that I bought for my stepson a few months ago might be the last Apple product that I buy unless they have a turnaround in their business strategy and stop trying to be the world's largest patent troll. I have already decided to not buy one of the retina MBPs and to not renew my iTunes Match service subscription when it expires. Apple has the legal right to pursue any legal proceedings they want but as consumers we can vote with our wallets and stop buying Apple products! Obviously I want you to make your own decisions - I am just sharing my thoughts on how to affect a company whose actions I don't like, and a boycott of Apple products seems like the best strategy.

And, there are now some very sweet non-Apple laptops :-)

Thursday, August 30, 2012

Rewrote my main markwatson.com web site in Clojure + Noir

I have had the domain markwatson.com for almost 20 years and for about 15 years my site was a simple PHP application hosted on the awesome provider Hurricane Electric. There are a few things that I want to improve on my site so I decided to rewrite it using Clojure and Noir, a combination that I have been mostly using for writing web apps in the last year or so.

I usually use AWS (and sometimes Heroku) for hosting my customers' web apps and the web apps for my side projects, but for my primary domain (important for my business!) I decided to go back to using RimuHosting, a hosting provider I used to use for my own projects and whose service I have always liked.

Some of the new site is still HTML fragments from my old site implementation but I am in the process of converting everything to use hiccup and cleaning up the CSS.
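To give a flavor of that conversion (an illustrative snippet, not actual site content), a static HTML fragment such as a div containing a heading and a paragraph becomes plain Clojure data with hiccup:

;; hiccup represents HTML as Clojure vectors and maps, so page fragments
;; can be built and composed with ordinary functions.
(ns markwatson.views.fragments ; hypothetical namespace, for illustration only
  (:require [hiccup.core :refer [html]]))

(defn bio-section [title & paragraphs]
  [:div.bio
   [:h2 title]
   (for [p paragraphs] [:p p])])

;; (html (bio-section "Consulting" "Text mining and NLP work."))
;; => <div class="bio"><h2>Consulting</h2><p>Text mining and NLP work.</p></div>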

Friday, July 27, 2012

More Clojure Datomic experiments: decoupling data building and transactions, and adding text search

I wrote two days ago and yesterday about my experiments for getting up to speed on Datomic. In the Datomic news group, Rich Hickey suggested that I keep any helper code for data-building separate from helper code for managing database access/transactions. I have reworked my code and added code to experiment with text search.

Now that I have a few days of experimenting with Datomic I think I understand how I will structure an application for a side project: write a small helper library that is application independent, much like the code snippets in this article. Then write an application specific helper library, layered on top of the application independent library, that encapsulates all of the data store functionality I will need, and unit test it separately. Then I can write the application layer, which will probably not contain any Datomic specific code. After I finish the first version of my side project this morning I will help my customer on his Datomic project.

I removed the single core.clj file and replaced it with two sets of functions. First, databuilder.clj:

(ns datomic-test.databuilder
  (:use [datomic.api :as api]))

(defn attribute [id t c doc]  ; by Michael Nygard
  {:db/id (api/tempid :db.part/db)
   :db/ident id
   :db/valueType t
   :db/cardinality c
   :db/doc doc
   :db.install/_attribute :db.part/db})

(defn string-singleton-attribute [id doc]
   (attribute id :db.type/string :db.cardinality/one doc))

(defn string-singleton-attribute-searchable [id doc]
  (assoc
    (attribute id :db.type/string :db.cardinality/one doc)
    :db/fulltext true))

(defn string-multiple-attribute [id doc]
  (attribute id :db.type/string :db.cardinality/many doc))

(defn string-multiple-attribute-searchable [id doc]
  (assoc
    (attribute id :db.type/string :db.cardinality/many doc)
    :db/fulltext true))

(defn long-singleton-attribute [id doc]
   (attribute id :db.type/long :db.cardinality/one doc))

(defn long-multiple-attribute [id doc]
  (attribute id :db.type/long :db.cardinality/many doc))
and transactions.clj:
(ns datomic-test.transactions
  (:use [datomic.api :as api]))

(defn- do-tx-helper [conn partition data-seq]
  (let [data
        (for [data data-seq]
          (assoc data :db/id (api/tempid partition)))]
    @(api/transact conn data)))

(defn do-tx-db [conn data-seq]
  (do-tx-helper conn :db.part/db data-seq))

(defn do-tx-user [conn data-seq]
  (do-tx-helper conn :db.part/user data-seq))
Finally, a little bit of test code that also includes a query using text search:
(ns datomic-test.test.core
  (:use [datomic-test.transactions])
  (:use [datomic-test.databuilder])
  (:use [clojure.test]))

(use '[datomic.api :only [q db] :as api])
(use 'clojure.pprint)

;;(def uri "datomic:free://localhost:4334//news")
(def uri "datomic:mem://news")

(api/create-database uri)
(def conn (api/connect uri))

;; create two singleton string attributes and add them
;; to the :db.part/db partition:
(do-tx-db
  conn
  [(string-singleton-attribute-searchable
     :news/title "A news story's title")
   (string-singleton-attribute
     :news/url "A news story's URL")
   (long-singleton-attribute
     :news/reader-count "Number of readers")])

;; add some data to the :db.part/user partition:
(do-tx-user conn
  [{:news/title "Rain Today"
    :news/url "http://test.com/news1"
    :news/reader-count 11}
   {:news/title "Sunshine tomorrow"
    :news/url "http://test.com/news2"
    :news/reader-count 8}])

(def results
  (q '[:find ?n :where [?n :news/title]] (db conn)))

(doseq [result results]
  (let [id (first result)
        entity (-> conn db (api/entity id))]
    (println (:news/title entity)
             (:news/reader-count entity))))

(def search-results
  (q '[:find ?e ?n
         :where [(fulltext $ :news/title "rain")
                 [[?e ?n]]]]
    (db conn)))

(doseq [result search-results]
  (let [id (first result)
        entity (-> conn db (api/entity id))]
    (println (:news/title entity)
             (:news/reader-count entity)
             (:news/url entity))))

Thursday, July 26, 2012

A little Clojure wrapper for Datomic

I wrote yesterday about getting started with Datomic in a lein based project. Probably because I am not up to speed with Datomic idioms, a lot of the data boilerplate bugs me so I wrote a little wrapper to hide all of this from my view. Starting with some code by Michael Nygard that I saw on the Datomic newsgroup, I wrapped creating database attributes and adding data to the data store. I formatted the following code in a funky way to make it fit on this web page:

(ns datomic-test.core
  (:use [datomic.api :as api]))

(defn attribute [id t c doc]  ; by Michael Nygard
  {:db/id (api/tempid :db.part/db)
   :db/ident id
   :db/valueType t
   :db/cardinality c
   :db/doc doc
   :db.install/_attribute :db.part/db})

(defn string-singleton-attribute [conn id doc]
  @(api/transact conn
     [(attribute id
         :db.type/string :db.cardinality/one doc)]))

(defn string-multiple-attribute [conn id doc]
  @(api/transact conn
     [(attribute id
         :db.type/string :db.cardinality/many doc)]))


(defn long-singleton-attribute [conn id doc]
  @(api/transact conn
     [(attribute id
        :db.type/long :db.cardinality/one doc)]))

(defn long-multiple-attribute [conn id doc]
  @(api/transact conn
     [(attribute id
        :db.type/long :db.cardinality/many doc)]))

(defn do-tx-user [conn data-seq]
  (let [data
        (for [data data-seq]
          (assoc data :db/id (api/tempid :db.part/user)))]
     @(api/transact conn data)))
Michael's code wraps schema attribute definitions like the ones I showed in the file data/schema.dtm in yesterday's blog post. The function do-tx-user takes a seq of maps, adds the user database partition specification to each map, and runs a transaction. With this wrapper, I don't use a separate schema input data file anymore. Here is the example I showed yesterday using the wrapper:
(ns datomic-test.test.core
  (:use [datomic-test.core])
  (:use [clojure.test]))

(use '[datomic.api :only [q db] :as api])
(use 'clojure.pprint)

;;(def uri "datomic:free://localhost:4334//news")
(def uri "datomic:mem://news")

(api/create-database uri)
(def conn (api/connect uri))

;; create two singleton string attributes and a number
;; attribute and add them to the :db.part/db partition:
(string-singleton-attribute
  conn :news/title "A news story's title")
(string-singleton-attribute
  conn :news/url "A news story's URL")
(long-singleton-attribute
  conn :news/reader-count "Number of readers")

;; add some data to the :db.part/user partition:
(do-tx-user conn
  [{:news/title "Rain Today",
    :news/url "http://test.com/news1",
    :news/reader-count 11}
   {:news/title "Sunshine tomorrow",
    :news/url "http://test.com/news2",
    :news/reader-count 8}])


(def results
  (q '[:find ?n :where [?n :news/title]] (db conn)))
(println (count results))
(doseq [result results]
  (let [id (first result)
        entity (-> conn db (api/entity id))]
    (println (:news/title entity) (:news/reader-count entity))))
Since I use many different tools, I sometimes like to figure out the subset of APIs, etc. that I need and wrap them in a form that is easier for me to remember and use. This may be a bad habit because I can end up permanently using only a subset of a tool's functionality.

Wednesday, July 25, 2012

Using the Datomic free edition in a lein based project

Hopefully I can save a few people some time:

I flailed a bit trying to use the first released version yesterday, but after updating to a newer version (datomic-free-0.8.3343) life is good. Download a recent release, and do a local maven install:

mvn install:install-file -DgroupId=com.datomic -DartifactId=datomic-free -Dfile=datomic-free-0.8.3343.jar -DpomFile=pom.xml
Set up a lein project.clj file for this version:
(defproject datomic-test "1.0.0-SNAPSHOT"
  :description "Datomic test"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [com.datomic/datomic-free "0.8.3343"]])
Do a lein deps and then, in the datomic-free-0.8.3343 directory, start up a transactor using the embedded H2 database:
cp config/samples/free-transactor-template.properties mw.properties
bin/transactor mw.properties
When you run the transactor it prints out the pattern for a connection URI and I used this in a modified version of the Datomic developer's Clojure startup example code:
(ns datomic-test.test.core
  (:use [datomic-test.core])
  (:use [clojure.test]))

(use '[datomic.api :only [q db] :as d])
(use 'clojure.pprint)

(def uri "datomic:free://localhost:4334//news")
(d/create-database uri)
(def conn (d/connect uri))

(def schema-tx (read-string (slurp "data/schema.dtm")))
(println "schema-tx:")
(pprint schema-tx)

@(d/transact conn schema-tx)

;; add some data:
(def data-tx [{:news/title "Rain Today", :news/url "http://test.com/news1", :db/id #db/id[:db.part/user -1000001]}])

@(d/transact conn data-tx)

(def results (q '[:find ?n :where [?n :news/title]] (db conn)))
(println (count results))
(pprint results)
(pprint (first results))

(def id (ffirst results))
(def entity (-> conn db (d/entity id)))

;; display the entity map's keys
(pprint (keys entity))

;; display the value of the entity's community name
(println (:news/title entity))
That was easy. You need to define a schema before using Datomic; here is the simple schema in data/schema.dtm that I wrote patterned on their sample schema (but much simpler!):
[
 ;; news

 {:db/id #db/id[:db.part/db]
  :db/ident :news/title
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one
  :db/fulltext true
  :db/doc "A news story's title"
  :db.install/_attribute :db.part/db}

 {:db/id #db/id[:db.part/db]
  :db/ident :news/url
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one
  :db/doc "A news story's url"
  :db.install/_attribute :db.part/db}

 {:db/id #db/id[:db.part/db]
  :db/ident :news/summary
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one
  :db/doc "Automatically generated summary of a news story"
  :db.install/_attribute :db.part/db}

 {:db/id #db/id[:db.part/db]
  :db/ident :news/category
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/many
  :db/fulltext true
  :db/doc "Categories automatically set for a news story"
  :db.install/_attribute :db.part/db}

 ]

I hope this saves you some time :-)

Saturday, July 21, 2012

Using the new Bing Web Search API from Java and Clojure

I wrote a simple wrapper that is on github for calling the new API. The old API will not work starting in August 2012 so it was time to update. The README file on github has a simple example for using the JAR created by this project in a Clojure project (a pre-built JAR is also included in the github repository).

The wrapper is simple, but will save you a few minutes writing one yourself if you need to use Bing Web Search in Java, Clojure, JRuby, etc.

You get 100 free searches/day with the new API and there is a charge if you need more API calls per day.

Friday, July 20, 2012

New cellphone: Samsung Galaxy S III

My almost three year old Droid phone still works great, so deciding to get a new phone was not easy. What convinced me to upgrade was the Samsung Galaxy S III's screen resolution of 1280 x 720 pixels. Sure, upgrading from Android 2.x to 4.x is nice, but having a high density screen makes reading email, browsing the web, reading using the Kindle app, etc. all feel very natural, even on such a small device as a cellphone.

Also: the Netflix app is fantastic: unbelievably easy to watch video at this screen resolution, even on a tiny device. I have been an enthusiastic iPad owner, but I think that the Samsung Galaxy S III will replace using the iPad about half the time. I had thought about getting a Nexus 7 tablet for occasions when the iPad was bigger (and heavier!!) than what I needed. At least for now, a larger cell phone with a high density screen seems to fill that niche better.

One thing that surprised me after using my old phone for almost three years: I didn't manually transfer any files from my old phone to my new phone. My old Droid automatically uploaded all the pictures I took to G+ and they were available almost instantly on my new phone. Same for my contacts, and I just about run my business on GMail and Google Calendar and those assets were immediately available - a no hassle upgrade.

My simple hack for using local JAR files in my Clojure lein projects

Maybe I shouldn't share bad habits with people but occasionally I see articles for setting up local maven repositories, etc. for using local JAR files in Clojure lein based projects. I have a kludge for doing this simply.

Now, I have to admit that I tend to use a lot of Java code and 3rd party JARs in my Clojure projects - nice if libraries are in Clojars, but if not I create a directory local_jars in a project directory, put any local JARs there, and instead of using lein deps and lein clean I use a Makefile like:

deps:
 lein deps
 cp local_jars/*.jar lib/

clean:
 rm -f -r lib/*
Really simple. A side benefit is that I use lein1 in some projects and lein2 in others. Using Makefile targets insulates me from mistakes using the incorrect version of lein.

Anyone have a better way of doing this? Please let me know.

Monday, July 16, 2012

Secrets of a polyglot programmer

I often read opinions about using the best tool for the job in reference to choosing programming languages, frameworks, and libraries.

I am a polyglot programmer but I am one mostly because I use languages and other software tools that my customers request. Seriously, as a consultant I serve my customers' best interests and that process usually does not include trying to get their teams to pivot to using my favorite tools.

For my own side projects I am for the most part happy enough to use any of the languages that I feel most comfortable with, my favorites being Ruby and Clojure, but I also really like to use Java and Common Lisp. I usually choose languages for my own projects based on available frameworks and libraries that I can build my code on, and more rarely because of specific language features. Yes, I feel a little heretical saying that!

My secret for being comfortable with several languages is that I try to make the development experience similar across programming languages. For example, from the mid 1980s to several years ago I mostly lived inside Emacs. Certainly some languages like Smalltalk carry their own IDEs around with them, but for the most part I used Emacs. Things are different now; I use various JetBrains IDEs because the editors, settings, etc. are all mostly similar:

  • Clojure - I use IntelliJ with the La Clojure and Leiningen plugins
  • Java - IntelliJ all the way
  • Ruby - I use RubyMine most of the time with some quick work done in GEdit or TextMate
  • Python - this is a tough language for me because I don't use Python much unless a customer is a Python shop. I find that PyCharm, with code completion and instant syntax error highlighting, really helps me a lot.
  • Common Lisp - yes, I still use Emacs
Other things that help are common tools like git, svn, crontab, PostgreSQL, MongoDB, and bash-like shells that are fairly constant in my work.

Saturday, July 14, 2012

I have been loving Ubuntu 12.04

12.04 was released a few months ago but I just got around to upgrading my Toshiba U505 yesterday. I must say that I don't understand some of the negative comments about Unity that I have read. Unity feels intuitive and so far has just worked for everything I have tried. A minute of googling found directions for setting up desktop files in .local/share/applications/. As an example, here is my .local/share/applications/rubymine.desktop file:

[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Exec=/home/rubymine/bin/rubymine.sh
Name=RubyMine
Icon=/home/rubymine/bin/RMlogo.svg
Obviously change file names and paths as required for your apps and your system. Then whenever I need to run RubyMine I tap the ALT key and the Unity search bar appears; I start typing the first few letters of RubyMine and when the icon pops up I just click it. I also set up IntelliJ this way for Clojure and Java development. I prefer this to creating task bar icons. Very similar to using Command-space on OS X.

The only disappointment (so far) is the lack of an officially supported Evernote client for Linux. I appreciate the work the developers put into the open source client NixNote, but it lacks polish. Still, I am glad to have it!

I was very pleasantly surprised at how well the Dropbox Linux client works. Once it synchronized my files the service uses little memory and CPU time and everything is well integrated with Unity.

I have been pretty much using OS X Mountain Lion (dev preview, now the gold master) for my work this year but despite the lack of general polish in Desktop Linux there are some real wins in using Linux, at least for my workflow because I can match my development environment on my laptop with my servers, and some small time sinks of dealing with OS X go away. Since my tools like IntelliJ (for Clojure and Java), RubyMine (Ruby), Emacs, Dropbox, and lots of git repositories all work the same, it is really pretty painless for me to enjoy using Linux for a few days, alternate back to OS X, and choose the dev environment based on what I am working on.

Monday, July 09, 2012

I just released some NLP code for Pharo Smalltalk

After writing about Pharo Smalltalk the other day I started looking at some of my old projects. I created a new github repo for my Pharo code and I'll add code as I have time to re-test it and clean it up. There is not much there right now, just a part of speech (POS) tagger.

Sunday, July 08, 2012

Nice: OpenCyc version 4.0 has been released

The last release 2.0 was made available almost three years ago, so I was very happy to see that version 4 is available.

OpenCyc is a knowledge base and ontology containing about 270K terms and 2 million triples. There are now many more links to outside (of OpenCyc) knowledge resources like DBPedia, UMBEL, Wordnet, etc. Check out the new features.

I downloaded the Linux version (requires Java 6, 3 GB of memory, and is only for 64-bit OSs and Java). I had no problems at all running it on 64-bit Ubuntu and OS X Mountain Lion (developer preview 4) - it only took a few minutes to install and run. Just follow the instructions in the README.txt file and try the web interface at http://localhost:3602/cgi-bin/cg?cb-start - it is nice to see the free version of Cyc getting support and development effort.

I would be very interested in hearing from people who have done projects with either OpenCyc or the OpenCyc OWL/RDF data.

9-9-2012 update: Alyona Medelyan provided a link for downloading data to extend Cyc with terms from Wikipedia.

Using Dojo Mobile in Clojure Noir web apps

I made a first cut at wrapping Dojo Mobile for use in Clojure Noir web app projects. If you want to try what I wrote this morning here is the github repo. The code is really crude. For example, I just embedded Javascript to test AJAX right in a Noir Clojure view file and I don't make full use of hiccup. If I have time in the future, I would like to support a wide range of Dojo control elements and perhaps even use ClojureScript. Pull requests welcomed :-)

Here is what it looks like on an iOS device:

Controls work and look different on different devices. This is part of the magic that Dojo Mobile does for us. On the Android platform a selection list behaves like an Android selection list:

Friday, July 06, 2012

A shoutout and thanks to the Pharo Smalltalk developers

Pharo is a fork of the open-source Squeak Smalltalk and provides an incredibly rich development environment. As a consultant people pay me to design and write code in Ruby, Clojure, Common Lisp, and Java. That said, for non-work related experiments, Pharo is a lot of fun to use: a modern and free Smalltalk environment. I just wanted to say thanks to the Pharo team: great work! I recently downloaded the 2.0 development build - exciting to see new features. One thing in particular that strikes me as awesome about Pharo is that it is very light weight, using little memory and CPU resources. I wrote a blog post 5 years ago about deploying Squeak to Linux servers. I am a little surprised that Pharo is not more widely used for rich web applications but with so many great languages and frameworks (Rails, Sinatra, Clojure Noir, Java Play Framework, GWT, etc., etc.) there is a lot of competition for developer mindshare.

My personal interest in Smalltalk started when I got a Xerox 1108 Lisp Machine in 1982 and the Xerox SIS salesman gave me a one month license to try Smalltalk.

7-9-2012 update: I just posted some Pharo code to a new github repo.

Code examples for a Dojo Mobile one page application. Backend in Ruby/Sinatra

I wrote a few days ago that I am excited about how easy it is to make simple one page web apps using Dojo Mobile that look good and work fine on portable devices (Android and iOS) and regular web browsers. I don't do a lot of UI development for my work but when I do write web apps it is great to also be able to support mobile devices with a small amount of additional simple code. In this post I will show you hopefully useful code snippets for cookingspace.com that may save you some time if you want to write the same type of apps.

Last weekend I decided that I wanted a new mobile web interface to my old Cooking Space web site. I wanted to be able to quickly look up a recipe on my cellphone, see the ingredient list, and be able to specify how many people need to be served. I also want to see the nutrition data for the recipe. A top level requirement is that once the web page renders, everything is updated with AJAX. There are only two user interactions:

  • Enter a few search terms to locate recipes and show them in a format compatible with both small devices and a web browser on a laptop.
  • A control for changing the number of people a recipe will serve - this adjusts the amounts in the ingredients list using AJAX.
This is a much simpler UI than that of the original web site but it captures just what I personally want to use it for. I use the admin functionality on the old web site for adding and editing recipes, which is a complex task because I use the USDA nutrition database and, instead of for example simply adding "chicken" to a recipe, I need to figure out which of the roughly 200 chicken entries in the nutrient database to use. Same for all ingredients. This admin functionality is not user facing code; five years ago I implemented it in Rails with a nice AJAX enabled UI.

There is a lot of backend Ruby model code that I won't show. I'll show the controller code, the Javascript, and the HTML5 and Dojo application code I used.

The Sinatra controller code is very simple since it only has to serve up the web page and then handle AJAX calls using the model code. I reformatted the Ruby code to make it a little more verbose and easier to understand:

enable :sessions

get '/' do
  session["n"] ||= "4"
  erb :index
end

get '/getr' do  # handles AJAX calls from Javascript
  ret = "No recipes found"
  q = params['q']
  n = params['n']
  session["n"] = n.to_s   if n
  if q && q.length > 0
    ids = search q # calls model code to do search
    if ids && ids.size > 0
      num_served = session["n"].to_i
      ret = recipes_to_html num_served,ids.collect {|x| x[0]}
    end
  end
  ((!q || q.length == 0) && n) ? "*" : ret
end
The handler for the route '/getr' uses two parameters for search terms and number of people to be served. The web page has two sections: the top section contains the search input form and a control to set the number of people served. The bottom section just contains an HTML element that receives the AJAX response. I usually put Javascript in a separate file but the Javascript code for this app is so short and simple I just included it in the HTML/ERB template file. Much of the following Dojo boilerplate comes from the excellent Dojo Mobile Documentation:
<!DOCTYPE html>
<html>
  <head>
    <meta name="viewport"
      content="width=device-width,initial-scale=1,maximum-scale=1,minimum-scale=1,user-scalable=no"/>
    <meta name="apple-mobile-web-app-capable" content="yes"/>
    <title>CookingSpace</title>
  <script src="http://ajax.googleapis.com/ajax/libs/dojo/1.7.1/dojo/dojo.js"
         data-dojo-config="async:false"></script>
 <script>
    require(["dojox/mobile/parser",
             "dojox/mobile/deviceTheme",
             "dojox/mobile/compat",
             "dojox/mobile",
             "dojox/mobile/TextBox"],
        function(parser, deviceTheme) {
            parser.parse();
        }
     );

     function ajax_helper() {
       dojo.xhrGet({
        url: "/getr?q=" + dojo.byId("q").value +
                     "&n=" + dojo.byId("served").value,
        handleAs:"text",
        timeout: 8000, // 8 seconds
        load: function(data, args) {
          if (data != "*") {
            dojo.byId("rcontent").innerHTML = data;
          }
        },
        error: function(err, args){
          dojo.byId("rcontent").innerHTML =
           "An unexpected error occurred: " + err;
        }
     });
    }
    dojo.connect(dojo.byId("q"), 'onchange', ajax_helper);
   </script>
  </head>
  <body>
    <div id="settings"
         data-dojo-type="dojox.mobile.View"
         data-dojo-props="selected: true">
    <h3 data-dojo-type="dojox.mobile.Heading">
      CookingSpace Mobile Recipe Lookup
    </h3>
    <div data-dojo-type="dojox.mobile.RoundRect"
         data-dojo-props='shadow:true'>
    <input id="q" name="q" size="40"
           data-dojo-type="dojox.mobile.TextBox"
           placeHolder="Search for recipes" />

    Number served:
    <select name="served" id="served">
      <option value="1">1 person</option>
      <option value="2" selected="selected">2 people</option>
      <option value="3">3 people</option>
      <option value="4">4 people</option>
      <option value="5">5 people</option>
      <option value="6">6 people</option>
    </select>
   </div>

 <div id="rcontent" data-dojo-type="dojox.mobile.RoundRect">
 ...
 </div>
    </div>
  </body>
</html>
 
Please note that I set asynchronous loading of Dojo resources to false to make sure that the Dojo parser had a chance to parse the HTML on this page, marking elements with Dojo types, before processing my Javascript. This simple approach is OK for a one page web app that only gets loaded once.

This is a simple example and hopefully will give you a good start creating one page mobile HTML5 web apps using Dojo Mobile.

Wednesday, July 04, 2012

Experimenting with Dojo Mobile

In my work I specialize in Natural Language Processing (NLP), text mining, and general AI development. That said, I find myself writing a lot of web apps, and what I really want is an easy to use web client stack with rich controls that facilitates writing one web app that looks and works OK in web browsers, Android phones and tablets, iPhones, and iPads.

Five years ago I wrote a very simple web app, cookingspace.com, that let me look up the nutrients (using the USDA nutrition database) for recipes we frequently make. I have just spent a few hours rewriting the front end using Dojo Mobile and removing a lot of features that I don't need anymore; the new version is hosted at mobile.cookingspace.com. The new app lets me quickly check nutrients, and on portable devices I can use it while grocery shopping to make sure I get the ingredients I need for making dinner. Both apps are deployed at Heroku (thanks Heroku!).

Dojo is really a nice web client toolkit. I suggest you take a look if you haven't tried it. I have used Dojo for years with server side code written in Common Lisp, Ruby, and more recently Clojure.

Tuesday, June 19, 2012

Importance of testing

I just finished the excellent Coursera/Berkeley Software as a Service (SaaS) class yesterday. There were two major themes in the class: engaging customers in creating user stories to make sure that you build the right thing, and using BDD and TDD, which rely on continuous testing. In particular I enjoyed learning how to use Cucumber, which has two parts: an English (or other natural language) style DSL that customers and developers can work on together to create user stories with expectations of what the system should do, and an underlying set of steps that map the customer facing DSL descriptions to real testing code.
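As a tiny made-up example of the customer-facing side (a Cucumber feature file I am sketching here, not material from the class), the natural language DSL looks like this:

# illustrative feature, not from the class
Feature: Recipe search
  As a site visitor
  I want to search for recipes by keyword
  So that I can see ingredient amounts scaled to the number of people served

  Scenario: Searching for a recipe
    Given I am on the home page
    When I search for "chicken soup"
    Then I should see a list of matching recipes

Each Given/When/Then line is matched by a step definition (a small block of Ruby, or Clojure when using lein-cucumber) that drives the application and checks the expected results.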

Cucumber is very popular in the Rails world but I had never tried it before. I now like using Cucumber and as I get more used to it for Ruby and also for Rails I look forward to integrating it into my development process for other programming languages.

A few nights ago I spent some time with the lein-cucumber project for Clojure. It is still a little rough around the edges compared to the Ruby version but seems to be very workable.

This morning over coffee I have been looking at Cucumber-JVM, the Java version that lein-cucumber is based on. I use IntelliJ and there is a Cucumber plugin. I couldn't get my tests to run inside IntelliJ this morning, but the IntelliJ editor understands the syntax of feature files, does syntax highlighting, etc., so it looks promising for the future. For now I think that I will just add the Maven Cucumber-JVM dependency and run the tests from the command line. For Ruby and Rails development, running RSpec and Cucumber tests inside the IDE is very convenient, so I look forward to figuring out how to get in-IDE tests running in IntelliJ.

Monday, June 18, 2012

More on PaaS: new dotCloud pricing and services

I have had a dotCloud account for a while, but so far only to experiment with. dotCloud has announced new pricing models that look very developer friendly with free lower performance sandbox support and "live" support for production. They may have hit the sweet spot for supporting free development while nudging developers to not deploy small demo apps using their free sandbox model. I would love to see Heroku's and dotCloud's customer stats on free versus paid hosted web apps.

The basic idea is that as you add services you pay by the amount of memory that those services use. I like their handling of horizontal and vertical scaling. I have not tried it yet, but from the documentation it looks like when you horizontally scale a persistent service like PostgreSQL, MongoDB, MySQL, etc., they automatically set up master/slave, replica sets, etc. as appropriate. You can also vertically scale any service by adding memory.

Another interesting PaaS that I have experimented with is Cloud Foundry, which, like Redhat's OpenShift, is an open source stack that you can also install on your company's private servers.

Managing servers is a form of technical debt. Using PaaS costs a lot more for raw resources, but "set and forget" web apps deployed to a PaaS can end up being a lot cheaper to run over the long term.

Sunday, June 17, 2012

A better tool for private social and working networks? Using Open Source Apache Wave (used to be Google Wave)

OK, maybe now we should just call it Wave.

As I talked about a few years ago in my blog, I really enjoyed using Wave for interacting with family and friends, and also experimenting with writing Wave Robots hosted on AppEngine. Those good times have ended :-(

Fortunately, the Apache Wave incubator project provides the code and directions for running Wave on your own servers.

If you are lucky, installing Wave is as simple as:

git clone git://git.apache.org/wave.git wave
cd wave
ant compile-gwt dist-server
ant -f server-config.xml
./run-server.sh
The default data store is the file system but you can use MongoDB instead. After you have started the server, you can use two different web browsers (e.g., Firefox and Chrome) to create two different accounts using the Register a new account link on the right side of the welcome page. After logging in to both accounts in the two browsers, create a new note under one of your Wave accounts and invite the other account to that note. As you type on one account you see the characters echoed on the other.

Wave has a beautiful UI, cleaner than the original Google Wave, and it is a nice example of a GWT web application. The code base is moderately large, with about 1800 Java source files and 450 Java test files, and there are 26 GWT modules. So, lots of code, but it is well organized and well written, and so not too scary: it is certainly possible to make custom modifications for different business uses.

Apache Wave in its current form seems stable (but I like to back everything up before doing a git pull, and I don't git pull updates very often) and with a MongoDB back end, Wave looks like it would be a nice tool for private workgroups, families, groups of friends, etc.

Anyone considering an open source private Facebook-like system might want to take a long look at Apache Wave. While I wish the Diaspora project (and other open source Facebook alternatives) well, I can't help but think that Apache Wave is a better place for people to combine their energies so we all have a good tool to set up private social and working networks.

Redhat OpenShift is another interesting PaaS

My friend Alex Ott left a comment on my blog yesterday asking me if I had looked at OpenShift.

I had created an account a while ago but never did anything with it. This morning I briefly tried getting Clojure (not yet supported) running but promptly gave up and switched to Ruby, which along with Java, Python, Node.js, and Perl is supported. It only seemed fair to test Redhat's PaaS with a supported stack.

By default Ruby 1.8.7 is supported with a lot of gems pre-installed. I preferred to use RVM and Ruby 1.9.3, so here is what I did:

  • I created a new empty "Do-it-yourself" Cartridge using the web console. I was prompted to add a public key, etc. Easy setup.
  • A new empty cartridge runs a trivial Ruby web app.
  • I followed Mark's instructions to install RVM and Ruby 1.9.3.
  • After that, a git commit and a git push redeploys your app.
  • You should take a careful look at the scripts in .openshift/action_hooks/*, which are the places where you can customize builds and deployments.
Using Mark's instructions, I temporarily modified .openshift/action_hooks/pre_build to look like this:
cd $OPENSHIFT_DATA_DIR
curl -L https://raw.github.com/xiy/rvm-openshift/master/binscripts/install-rvm-openshift.sh | bash -s
After that, I re-edited it to look like:
#cd $OPENSHIFT_DATA_DIR
#curl -L https://raw.github.com/xiy/rvm-openshift/master/binscripts/install-rvm-openshift.sh | bash -s

source $OPENSHIFT_DATA_DIR/.rvm/scripts/rvm
rvm use [email protected]

gem install sinatra rails mongo mongo_ext
After installing the gems, I commented out everything in this file but kept it for future reference. I then edited .openshift/action_hooks/start to look like this:
source $OPENSHIFT_DATA_DIR/.rvm/scripts/rvm
rvm use [email protected]

gem list
ruby --version
nohup $OPENSHIFT_REPO_DIR/diy/testrubyserver.rb $OPENSHIFT_INTERNAL_IP $OPENSHIFT_REPO_DIR/diy > $OPENSHIFT_LOG_DIR/server.log 2>&1 &
The last line was originally in the generated file. Once this is all working, you can replace the placeholder mini-app with a Sinatra or Rails app.
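
As an example of what that replacement might look like, here is a minimal Sinatra sketch. This is my own placeholder, not code from the OpenShift documentation: the file name, the port, and binding to $OPENSHIFT_INTERNAL_IP are assumptions to check against the cartridge docs:
# app.rb - a placeholder Sinatra app for the DIY cartridge (sketch only)
require 'sinatra'

# Bind to the internal IP that OpenShift provides; fall back to localhost
# when running outside of OpenShift. The port number is an assumption.
set :bind, ENV['OPENSHIFT_INTERNAL_IP'] || '127.0.0.1'
set :port, 8080

get '/' do
  "Hello from Ruby #{RUBY_VERSION} on OpenShift"
end
The start hook would then launch this file (with nohup, as in the generated script) instead of testrubyserver.rb.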

There have been no announced costs for using OpenShift. For now it is in a free beta period. It is reasonable for Redhat to want a lot of test users so they can measure what it costs for them to provide the various services and then announce costs.

Saturday, June 16, 2012

Deciding between two premium hosting options for a new side project

Over the course of a year I spend about 2/3 of my working time consulting for customers and the other 1/3 doing my own projects. I view personal projects as a form of continuing education. My last major personal project was my natural language processing (NLP) web service KBSportal.com, and at the same time I spent a day writing SleepyBird.us for my wife. The project before those was an experiment in a live note taking and scheduling system for consultants, ConsultingWith.me, which I still use but for which I have turned off the ability for people to create accounts. I never implemented the sister project ConsultingWith.us. Six years ago my side project was CookingSpace.com, which I still use. For me side projects are a great way to try out new ideas, and it is well worth it to me to pass on less interesting consulting gigs to free up enough time for my own stuff.

I am generally interested in the combination of knowledge management and NLP, and my new side project combines these interests with a tool that I want for myself: a system for registering web resources, notes, and tasks and getting periodic summaries. One thing I want to experiment with is three front ends (web app, mobile web app, and email) accepting input and displaying once-a-day and once-a-week status summaries to help put my time in context - to stimulate reflection on what I have been doing. The other thing that I want to experiment with is a web worker and backend worker architecture.

In the past, I combined web front end code and a background work thread into a single JVM process: basically I wrote Java web apps that had a specific worker servlet whose init method started a work thread. That always worked really well but for my new project I want to have a better architectural fit with platform as a service (PaaS).

Which brings me to the main topic of this article: I view both Google's AppEngine and Heroku as premium services. Premium because you get only a fraction of the raw resources that you would get from a physical server from a provider like hetzner.de, or even from AWS. What you get in the tradeoff is savings on admin work and easier scaling.

AppEngine got a lot of bad press for its price increases, but the result is a service that Google is unlikely to stop providing. One of the implementation options I am considering is a front end web worker written in Python and a backend worker written in Clojure. For this experiment I doubt that I will have to scale past that, but at least the path to scaling out would be straightforward. The other implementation option I am thinking about is using Heroku with a front end Rails app and a background worker written in either Ruby or Clojure.
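
For the Heroku option, the background worker would just be a separate process declared in the Procfile alongside the web process. Here is a minimal sketch assuming a Ruby worker; the file name and the summarize_pending_items stub are placeholders of mine, not real project code:
# worker.rb - a minimal background worker sketch for a Heroku worker dyno.
# The Procfile would contain a line like:  worker: bundle exec ruby worker.rb
def summarize_pending_items
  # placeholder: pull queued notes/tasks and build a summary
  puts "checking for items to summarize at #{Time.now}"
end

loop do
  begin
    summarize_pending_items
  rescue => e
    STDERR.puts "worker error: #{e.message}"
  end
  sleep 60   # poll once a minute; a real worker would probably use a queue
end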

Monday, June 04, 2012

I am a living advertisement for JetBrains products :-)

Full disclosure: for years JetBrains comped me with free products, but for a few years now I have been buying my own licenses.

I have been working since 5am this morning on three customer projects in three languages:

  • Python - using PyCharm
  • Ruby - using RubyMine
  • Java - using IntelliJ
Switching languages and supporting platforms and frameworks sometimes causes a little cognitive dissonance, taking a few minutes to get my head into a different toolset.

With the same (almost) user interface, similar code completion, etc., switching between three languages and projects today was easy with no stress.

I will also thank other companies that make it easier for me to do my work:

  • Amazon AWS - Amazon is my favorite technology company. AWS has changed my working life. Since I live in a remote mountain area, I also appreciate getting stuff from Amazon. The Kindle platform, with synced reading across all of my devices, is also great.
  • Apple - my MacBook Air and iPad are great devices that I don't even have to think about while I am working and playing (and the iTV is pretty cool also)
  • Heroku - I am a big fan of PaaS and Heroku is my favorite
  • Google - for GMail, Docs and Calendar
  • github - really makes remote customer interaction simple and productive
  • Evernote - my own personal memory box
  • Dropbox
  • Netflix - covers most of my non-sports and non-reading recreation
I work and live in a digital world and these great companies deserve my business for making my life simpler and more fun.

Wednesday, May 30, 2012

The one computer to rule them all

No, I am not talking about a supercomputer, at least not in the conventional sense.

I saw a blurb in this month's Technology Review magazine about the Padfone: a cellphone that slips inside a Pad device when you want a larger screen. A good idea as far as it goes, but why not add instant connectivity to any compatible keyboard, monitor, and trackpad in the area?

I find it a slight hassle switching between my MacBook Air, iPad, and Droid during the day. Having one tiny device with a copy of everything is a possible alternative to Apple's iCloud, Dropbox, and other distributed "store all of my stuff" ideas.

Sunday, May 13, 2012

I am adjusting to a mobile digital life: the acceleration of convenience

The first time I used a computer was in 1962 when I was 11 years old. It was not convenient: my Dad had to take me to his office to use one of the early timesharing systems via a local teletype machine with punched tape for saving and reloading stuff. A few years later I took an extension course at a local UC campus: FORTRAN with punched cards and nice keypunch facilities, definitely a large improvement!

In the 1980s life really got good: my own Xerox 1108 Lisp Machine and an Internet connection.

In the 1990s and 2000s there were continual improvements, but progress was gradual: better Internet connections, the development of the Web, and improvements in software tools like code repositories, IDEs, etc.

Skipping ahead to the present time, I am trying to adjust my digital life, including my work and writing flows, to a more mobile lifestyle.

Probably the biggest change for me is that I used to keep just about everything that I worked with and touched in a repository (first cvs, then svn, and now git). I no longer need incremental backups of all of my work because for low priority stuff I can always dig back in time using simple Time Machine backups of my MacBook Air. I do still use a lot of git repositories for high value assets like:

  • Software development projects for customers
  • My open source projects at github
  • My own high value projects such as software and other assets for my web properties and commercial software products
This is a major change for me. I keep other assets that don't really need versioning on Dropbox:
  • All assets for my writing projects: books I have written in the past, my current writing project, etc.
  • A million and one small code snippets and small bits of program code, organized by language (Common Lisp, Java, Ruby, Scheme, Haskell, Python, Prolog, etc.); whenever I figure out a new coding technique or how to do something generally useful, I like to save a little code snippet to refer to later
  • Favorite pictures taken in my lifetime (old ones are now digitized)
  • Favorite videos taken in my lifetime (lots of digitally converted 8mm videos from my childhood through recent vacation videos: everything!)
  • All of my personal notes organized by travel logs, personal writing, etc.
  • A large fraction of the huge amount of music that I have purchased, organized by artist (and sometimes also by album)
  • Most of the eBooks that I have ever purchased if they are not in the Kindle format.
For Kindle books not purchased through Amazon, I email them to my Kindle/Amazon email address and let Amazon manage keeping everything in sync on all of my devices. The Kindle platform's syncing of the current reading location across devices is a huge convenience since I routinely read on my Kindle, iPad, and MacBook Air.

I am careful not to keep anything that is highly proprietary on Dropbox (i.e., assets for customer projects). I used to also keep well organized directories of useful reading material found on the web (e.g., PDFs on AI, machine learning, NLP, server deployments, etc.). I have very recently taken the rather large step of throwing all of this material into Evernote, backing that up, and deleting the original directories.

The really big win in relying on Dropbox, Evernote, and the Kindle platform is being able to switch between computers easily and also to have most of my stuff available on my iPad and Droid cellphone. I use several computers and it is a slight nuisance doing a git pull every time I switch computers. I like to use mobile devices for reading and general thinking time, and this is a lot easier now. I have been running Apple's beta OS X Mountain Lion, which has iCloud integration. There is a chance that, if iCloud is very well implemented, I may slowly transition away from Dropbox to iCloud and switch to using an iPhone. However, Dropbox is very well implemented, so Apple would need to make iCloud's implementation across all Apple devices very, very good for me to make the switch.

Saving the biggest win for last: a mobile digital life promotes more of what Clojure creator (and general programming mentor) Rich Hickey calls "hammock time": the time you spend away from the computer thinking. I still use a pad of paper and a pen for away-from-the-computer thinking time, but I find mobile devices augment this activity nicely.

Friday, May 11, 2012

Since Bing search and spelling APIs are no longer free, I did a quick survey of other services and signed up for Yahoo BOSS search.

I signed up for Google's search APIs around 2002 and used them until Google stopped the service. I later switched to Microsoft's Bing APIs: the cost was great, free! Bing's new entry level cost of $40/month for 20,000 queries does not match my use case. I wish they had a $5/month tier for 2000 queries.

I do a low volume of searches for personal research projects and prototypes. Yahoo BOSS's service has an attractive price, but I worry about the service being terminated because of Yahoo's deal with Microsoft. Google offers custom site search, but I didn't see any option for a general paid web search API.

So, I signed up with Yahoo BOSS for search and if the service is ever terminated, I will look elsewhere. The price is good, $0.80 per 1000 queries.

2012-05-19 edit: Bing search now has a free tier for up to 5000 search requests per month. Cool! For research purposes, this is good enough.

Monday, April 23, 2012

Back from Vacation, catching up on work, and a Java retrospective

Carol and I got home from visiting family in Rhode Island late last night and I have been catching up on customer work since 4am this morning. Here is a picture of my birthday dinner.
The kids got two-pound lobsters for everyone - I am holding mine and a plate of grilled veggies.

Java: I used to do research programming in Common Lisp or Scheme. For ideas that worked I often re-coded them in Java for better deployment options, for better alignment with customer preferences, etc. I fell out of that habit six or seven years ago because I had a long term Common Lisp development job and also really got into Ruby and Clojure development. Both Ruby and Clojure are great for research programming and for some types of applications the deployment options are good.

That said, even Clojure, which is fairly efficient (about 1/3 the speed of Java while using about 3 times the memory), wastes computing resources, which hits both the environment and the pocketbook. Ruby is much less efficient than Clojure.

I started thinking about efficiency after talking with an Intel engineer who was sitting next to me on my flight home. He was reading through a book on DSP with tons of differential equations. He said that he used C sometimes but felt better about projects that were mostly done in assembly language because it was a shame to waste processing cycles. We ended up talking about the efficiency of programming languages.

I still believe that it makes sense to prototype systems in whatever language and platform make sense for getting working code in place quickly. The decision is whether to recode for more efficient deployments and if so when to do it.

Sunday, March 18, 2012

Using Wolfram Alpha from Clojure

I have been blown away in the last year by Wolfram Alpha but I haven't done much with the developer's APIs. To make it easier to experiment with Wolfram Alpha, I wrote a simple Clojure wrapper for the Java APIs. You can get a copy at github.

In case you don't want to grab the github repo, here is most of the code:

(ns wolfram)

(def appid (System/getenv "WOLFRAM_APP_ID"))
(def engine (com.wolfram.alpha.WAEngine.))
(.setAppID engine appid)
(.addFormat engine "plaintext")

(defn query [input]
  (let [query (.createQuery engine)]
    (.setInput query input)
    (let [result (.performQuery engine query)]
      {:pods
       (for [pod (.getPods result)]
         {:title (.getTitle pod)
          :sub-pods
          (for [sub-pod (.getSubpods pod)]
            (for [contents (.getContents sub-pod)]
              (.getText contents)))})})))
Notice that you need to set the API key for your application in an environment variable. You get 2000 free API calls a month. Here is some sample output (with some output removed for brevity):
test=> (query "distance between San Diego and San Francisco")
{:pods ({:title "Input interpretation", :sub-pods (("distance | from | San Diego, California\nto | San Francisco, California"))} {:title "Result", :sub-pods (("453.7 miles"))} {:title "Unit conversions", :sub-pods (("730.2 km  (kilometers)") ("730194 meters") ("7.302?10^7 cm  (centimeters)") ("394.3 nmi  (nautical miles)"))} {:title "Direct travel times", :sub-pods (("aircraft  (550 mph) | 49 minutes 30 seconds\nsound | 36 minutes\nlight in fiber | 3.41 ms  (milliseconds)\nlight in vacuum | 2.44 ms  (milliseconds)\n(assuming constant-speed great-circle path)"))} {:title "Map", :sub-pods ((""))})}
user=> (query "pi")
{:pods ({:title "Input", :sub-pods (("pi"))} {:title "Decimal approximation", :sub-pods (("3.1415926535897932384626433832795028841971693993751058..."))} {:title "Property", :sub-pods (("pi is a transcendental number"))} {:title "Number line", :sub-pods ((""))} {:title "Continued fraction", :sub-pods (("[3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, 1, 15, ...]"))} {:title "Alternative representations", :sub-pods (("pi = 180 ?") ("pi = -i log(-1)") ("pi = cos^(-1)(-1)"))} {:title "Series representations", :sub-pods (("pi = 4 sum_(k=0)^infinity(-1)^k/(2 k+1)") ("pi = -2+2 sum_(k=1)^infinity2^k/(binomial(2 k, k))") ("pi = sum_(k=0)^infinity(50 k-6)/(2^k binomial(3 k, k))"))} {:title "Integral representations", :sub-pods (("pi = 2 integral_0^infinity1/(t^2+1) dt") ("pi = 4 integral_0^1 sqrt(1-t^2) dt") ("pi = 2 integral_0^infinity(sin(t))/t dt"))})}

Saturday, March 03, 2012

A bright future, with some potential problems

Even though the news media portrays a dire world situation, I disagree. In the last few decades the world has become a safer place and fundamental shifts in technology keep driving down the cost of computing resources, networks, and storage that enable greatly increased global productivity. For much of the world globalization is a rising tide that floats most people's boats.

The problem is that not everyone benefits from new paradigms of constant lifelong learning, the diminishing advantages of organizations that hold on to old mega-scale production and business models, and a free flow of information. The book The Power of Pull is a good reference for ideas on how to take advantage of the transitions that the world is going through, whether you like them or not!

The losers in this new world are people and organizations who cannot (or don't want to) adapt and learn and who expect material rewards that are out of touch with their productivity. The biggest potential problem that concerns me is that some of these "losers" have tremendous political and economic clout and will struggle to hang on to old advantages instead of engaging in more forward thinking and productive activities. You don't have to look further than businesses that are "too big to fail" to understand the real dangers of powerful incumbents to our future prosperity and security.

On a personal level, I do believe that for the most part we have control of our lives and that both our happiness and our sadness in life are mostly internal processes in our own minds, fairly independent of the world at large. Certainly, some people are born into, or live in, very harsh situations, but for most people there is at least the opportunity for material success and personal happiness. A cliche, but true: people who live in the past tend to be depressed, those who live in the future are anxious, and those who live in the moment are usually happy and content. The more we can focus our attention on what we are doing in the moment, the happier and more productive we can be.

I leave it up to you how you want to manage your life, but I will mention a few things that work for me:

  • I don't waste much time exposing myself to the negatively toned corporate-slanted news media. It is necessary to understand what is happening in the world, and why, but a few minutes a day reading news stories from multiple sources around the world suffices.
  • Every day I enjoy the time I set aside for learning new technologies, practicing a musical instrument, trying new recipes, hiking with friends, and generally enjoying my family. Without fun time, it is difficult to be productive while working.
  • I spend time and resources helping and mentoring people, and working extra time each week to support three very worthwhile charities. I am convinced that a quality life requires the certain knowledge that we are personally helping to make the world a better place.
  • Time is probably our most precious resource. In addition to saving the time not wasted on corporate news, I try to evaluate how I spend my time, realizing that watching TV, watching too many movies, and other mindless time sinks all have tremendous opportunity costs: how much more can we accomplish and how much more can we enjoy our lives if we apply critical thinking to how we spend our time?

Sunday, February 19, 2012

Using pjax with Clojure and Noir: minimize client side Javascript code while maintaining fast page load times

I don't like doing a lot of client side Javascript (or Coffeescript) development. pjax is a way to minimize client side Javascript while maintaining fast page load times.

I became interested in pjax after reading an article on the development of Basecamp Next. DHH indicated that they looked at pjax but then rolled their own similar system.

Here is a github repo with a Clojure and Noir example web app using pjax that I wrote this morning. There were a few non-obvious aspects to using pjax with Noir so hopefully this will save you some time.

If you don't want to grab the github repo, here are a few interesting code snippets. First, we need to run a little Javascript that processes links, setting them up to make AJAX calls with an "X-PJAX" header:

$(function(){
    // Activate PJAX test links
    // Response will be loaded into #wrapper element
    $('a').pjax('#wrapper')
})
I put this code in resources/public/js/application.js, which is loaded in the common Clojure page wrapping code (in common.clj):
(ns noir-pjax-example.views.common
  (:use [noir.core :only [defpartial]]
        hiccup.core
        hiccup.page-helpers))

(defpartial layout [& content]
  (println "\n**** layout\n")
  (html5
    [:head [:title "noir-pjax-example"]
     (include-css "/css/reset.css")
     (include-js "/js/jquery.js")
     (include-js "/js/jquery.pjax.js")
     (include-js "/js/application.js")]
    [:body "<h1><b>Demo of pjax with Clojure and Noir</b></h1><br/><br/>"
     "<div id=\"wrapper\">"
     content
     "</div>"]))
The trick is that this wrapper code only gets executed one time, so the browser only needs to set up the full page once. Only the div with id "wrapper" gets replaced by the standard pjax Javascript file jquery.pjax.js.

Sure, there is still Javascript using pjax, but you don't have to write much at all. In this case, I am "pjax-ifying" all HTML links with the small Javascript snippet in application.js; in a real application, you would be more selective and perhaps also set up multiple page elements that pjax updates. The following code snippet shows the file welcome.clj:

(ns noir-pjax-example.views.welcome
  (:require [noir-pjax-example.views.common :as common]
            [noir.content.getting-started])
  (:require noir.request)
  (:use [noir.core :only [defpage]]
        [hiccup form-helpers page-helpers]
        hiccup.core
        hiccup.page-helpers))

(defpage "/" []
  (if (nil? (((noir.request/ring-request) :headers)
             "x-pjax"))
    (common/layout
      [:p "Welcome to noir-pjax-example /"]
      [:a {:href "/"} "foo 1"])
    (str
      "<p>Welcome to noir-pjax-example /</p>"
      "<a class=foo href=\"/foo\">foo 2</a>")))

(defpage "/foo" []
  (if (nil? (((noir.request/ring-request) :headers)
             "x-pjax"))
    (common/layout
      [:p "Welcome to noir-pjax-example /foo"]
      [:a {:href "/"} "home 4"])
    (str
      "<p>Welcome to noir-pjax-example /foo</p>"
      "<a class=foo href=\"/\">home 3</a>")))
There is not much of a trick here: I check to see if there is an "x-pjax" header and, if there is, I don't call the common layout page wrapper function.

Friday, February 17, 2012

Nice discovery: PJAX and Rails

Well this has been a discovery for me :-)

I finished my work early today and started my afternoon working through a simple iOS 5 tutorial. As recreation, I went over to Hacker News and read a linked article by David Heinemeier Hansson on making the response time fast on Basecamp Next while still doing mostly server side processing. The article and the comments were great.

This is of some interest to me because I have recently spent a lot of time writing client side Javascript for a Dojo + Rails app: straightforward but time consuming. DHH, in the article and the HN comments, was making the point that for his company it was a better developer experience doing more with server side Rails and less custom rich client code in Coffeescript or Javascript. I agree. He and other people in the comments mentioned pjax as a library for sending requests to the server that are marked with an HTTP header 'X-PJAX' when the page layout should not be returned. This makes it relatively easy to still write mostly server side code but keep page load times small when most or all of the page layout is not changed. Here is a simple Rails demo program by Edison Machado.
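
I have not written this into a real Rails app yet, but the server side idea would look something like the following sketch (the controller and model names are placeholders of mine, not from DHH's article or the demo): skip the full layout whenever the request carries the X-PJAX header, so only the inner content is returned and swapped into the page:
# app/controllers/projects_controller.rb - a sketch, not production code
class ProjectsController < ApplicationController
  # Render without the application layout for pjax requests.
  layout proc { |controller| controller.request.headers['X-PJAX'] ? false : 'application' }

  def index
    @projects = Project.all   # Project is a placeholder model
  end
end
The header check is what lets the server decide between returning the whole page and returning just the fragment.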

I need a few days to digest this, but this is likely going to change how I write Rails applications.

Thursday, February 16, 2012

I feel a bit like a traitor to the open source movement: I just re-signed up as a Mac OS X and iOS developer

I am a Linux enthusiast (downloaded my first distro over a 2400 baud modem, a long time ago!) and I really like the Android platform.

That said, I have really been enjoying the integration between my iPad and my Apple iTV (that my stepson gave me for Christmas), and the Mountain Lion OS X information released today makes me feel fairly certain that the "Apple experience" is what I want when I am not earning money doing server side Java, AI, and text mining consulting gigs, etc. For the work I do to make money (i.e., consulting) it doesn't matter what computer and operating system I use. I am even planning on trading in my Droid cellphone for an iPhone this year.

I also have a long history with Apple. I prepaid for an Apple II and received serial number 71. I wrote the simple little chess program that Apple gave away on a cassette tape for a while. When the Mac shipped in 1984 I bought one right away and wrote a commercial app that generated a lot of revenue. So, Apple and I are old friends :-)

One of the reasons I paid Apple today to (re)join their developers program is that I want to play around with the early developers release of Mountain Lion. I would also like to experiment (play!) with the intersection of iOS and OS X rich clients for web services, etc.

Edit: I installed Mountain Lion. So far I like it and the only disappointment is that AirPlay is not working (yet) with my iTV. As expected, not all Apple apps are updated to use iCloud storage, etc.

Edit #2: AirPlay Mirroring doesn't work yet on an Intel Core 2 Duo processor.

Edit #3: after 1 week: I have spent a few hours working through Apple's iOS 5 and OS X Xcode tutorials - fun stuff, but I am going to spend less time on this in the near future because I am very busy with work projects.

Saturday, February 11, 2012

github repo for 4th edition of my Java AI book

As of right now, this new github repo mostly contains the code from the 3rd edition of my book but as I re-write the book, I'll also be updating my code. Some of this Java code really needs a rewrite: many of the examples from the first edition were written in 1998 - a long time ago! I have reworked the code with each new edition. The code examples are licensed under the LGPL but I am considering dual licensing them under Apache 2 also.

Any suggestions for code improvements, pull requests, etc. will be appreciated.

Saturday, January 21, 2012

Citrusleaf: an interesting (non open source) NoSQL data store

I have been using Citrusleaf for a customer (SiteScout) task. Interesting technology. Maybe it is because I am excessively frugal, but I almost always favor open source tools (Ruby, Clojure, Java, PostgreSQL, MongoDB, Emacs, Rails, GWT, etc.) that I base my businesses on. That said, I also rely on paid-for software and services (IntelliJ, RubyMine, Heroku, AWS services, etc.), and Citrusleaf looks like a worthy tool because of its speed and scalability (which it gets from Paxos, using lots of memory, efficient multicast when possible for communication between nodes in a cluster, etc.).

Wednesday, January 18, 2012

Yes, the DynamoDB managed data service is a very big deal

Just announced today: DynamoDB solves several problems for developers:

  • No administration except for creating database tables (including some decisions like using simple lookup keys or keys with range indices and whether reads should be consistent or not)
  • Fast and predictable performance at any scale (but see comment below on the requirement for provisioning)
  • Fault tolerance
  • Efficient atomic counters
The probable hassle for developers that I see is in knowing how to provision tables for reasonable numbers of allowed reads and writes per second. When you create a table, one option is to get warning emails when you hit 80% of provisioned capacity; I interpret this to mean that you really had better not go over the capacity that you have provisioned. Amazon needs to know how much capacity you need in order to allocate enough computing nodes for your tables. The capacity that you pay for can be raised and lowered to avoid getting runtime exceptions when you go over your provisioned number of reads and/or writes per second.

The latest AWS Java SDK handles DynamoDB. For Ruby, the latest aws-sdk gem (gem install aws-sdk) supports DynamoDB. I signed up for DynamoDB, looked at the Java example, and wrote a little bit of working Ruby code using the documentation; I had to slightly change the example code to get it to work for me:

require 'aws-sdk'

dynamo_db = AWS::DynamoDB.new(
    :access_key_id => ENV['AMAZON_ACCESS_KEY_ID'],
    :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY'])
table = dynamo_db.tables.create('my-table', 10, 5)

begin
  sleep 3
  puts "Waiting on status change #{table.status}"
end while table.status == :creating

# add an item
item = table.items.create('id' => '12345', 'foo' => 'bar')

# add attributes to an item
item.attributes.add 'category' => %w(demo), 'tags' => %w(sample item)
p item

# update an item with mixed add, delete, update
item.attributes.update do |u|
  u.add 'colors' => %w(red)
  u.set 'category' => 'demo-category'
  u.delete 'foo'
end
p item.attributes.to_h

# delete attributes
item.attributes.delete 'colors', 'category'

# get attributes
p item.attributes.to_h

# delete an item and all of its attributes
item.delete
I then used the AWS web console to delete the test table to avoid charges.
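
One more note on the provisioning point above: I believe the capacity of an existing table can also be raised or lowered from Ruby instead of from the web console. This is only a sketch based on my reading of the aws-sdk documentation, so treat the provision_throughput call and its option names as untested assumptions; it reuses the table object from the example, before the table is deleted:
# Sketch: adjust the provisioned capacity of the table created above.
table.provision_throughput(
  :read_capacity_units  => 20,
  :write_capacity_units => 10)

# The table reports an :updating status while the capacity change is applied.
begin
  sleep 3
  puts "Waiting on status change #{table.status}"
end while table.status == :updating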

DynamoDB is a big deal because while it is easy enough to horizontally scale out web applications and back end business applications, it is a real pain to scale out data storage for session handling and application data. Aside from the cost of the service, Amazon is trying to remove these hassles for developers.

I think that in addition to deployments to EC2s, DynamoDB will be a very big deal for Heroku users because it gives them another data store option in addition to Heroku's excellent managed PostgreSQL service, MongoHQ, Cloudant, and other 3rd party data service providers.

Wednesday, January 11, 2012

Web 3.0 and the Semantic Web, a slight return

After talking with a friend and a friend of his about the Semantic Web and healthcare yesterday, I re-watched a great video on Web 3.0 by Kate Ray that I bookmarked and blogged about a couple of years ago. I like this video because it frames the problems that the Semantic Web is trying to solve. My last published book (for Apress) had Web 3.0 in the title, a term that did not really catch on :-)

At least a little bit of my enthusiasm for Semantic Web technologies has diminished over the last ten years because of problems that I have had on customer projects trying to collect linked data from disparate sources and merge it into something useful. There are (apparently) no silver bullets, and any data collection and exploitation activity involves a lot of difficult work.

I would not be surprised if this problem of merging different data sources is solved not by ontologies and webs of linked data sites, but rather by vendors curating data in narrow domains and selling interfaces to this curated data.

In a world of too much information, curation can have a very high value, and that value and the market price for curation services will determine how many resources are invested in combinations of automated and manual curation of information.

Tuesday, January 10, 2012

sleepybird.us site is online

Yesterday I wrote about two web portals I have been working on in Clojure. One of them is online: our stock photo web site. This is a simple web app written with Clojure and Noir. I use the excellent stripe.com system for accepting orders for JPEGs (and soon hi-def video clips). In my tests it seems easy enough to buy JPEG files: you just check the ones you want, go to the purchase page, and in a few seconds you are downloading a ZIP file with the JPEGs you purchased. A simple little web app but I think that my wife and I will have fun with it: we are avid photographers.

Monday, January 09, 2012

My two new projects: both web portals written in Clojure

I have three web portal projects that I have wanted to develop for quite some time. I am close to releasing two of them (a text analytics web service and a stock photo and video clip store). My wife and I are avid photographers and we have been wanting to travel more and do more photography; I started putting together the photo site yesterday morning and hope to have it fully online in the next day or two - it is simple to implement. The text analytics web service will be publicly available within a month or so (right now, just the demo page is active - I short-circuited the new account login for now). My third project is a web portal for a single consultant to manage multiple customers. Last year I prototyped this for my own use using Java + GWT + AppEngine and then ported it off of AppEngine, using MongoDB for the data store. I have had such a fun and productive time using Clojure and Noir for my two recent projects that I am considering porting this third project to Clojure. I might leave it as-is, except that I have already done most of the design for a more complex web app to manage multiple consultants with multiple customers, and I know that the development would go much faster using Lisp.