Saturday, January 21, 2012

Citrusleaf: an interesting (non open source) NoSQL data store

I have been using Citrusleaf for a customer (SiteScout) task. Interesting technology. Maybe because I am excessively frugal, but I almost always favor open source tools (Ruby, Clojure, Java, PostgreSQL, MongoDB, Emacs, Rails, GWT, etc., etc. that I base my businesses on). That said, I also rely on paid for software and services (IntelliJ, Rubymine, Heroku, AWS services, etc.) and it looks like Citrusleaf is a worthy tool because of its speed and scalability (which it gets from Paxos, using lots of memory, efficient multicast when possible for communication between nodes in a cluster, etc.)

Wednesday, January 18, 2012

Yes, the DynamoDB managed data service is a very big deal

Just announced today: DynamoDB solves several problems for developers:

  • No administration except for creating database tables (including some decisions like using simple lookup keys or keys with range indices and whether reads should be consistent or not)
  • Fast and predictable performance at any scale (but see comment below on the requirement for provisioning)
  • Fault tolerance
  • Efficient atomic counters
The probable hassle for developers that I see is in knowing how to provision tables for reasonable numbers of allowed reads and writes per second. When you create tables one option is to get warning emails when you hit 80% of provisioning capacity; I interpret this to mean that you really had better not go over the capacity that you have provisioned. Amazon needs to know how much capacity you need in order to allocate enough computing nodes for your tables. The capacity that you pay for can be raised and lowered to avoid getting runtime exceptions when you go over your provisioned number of reads and/or write per second.

The lastest AWS Java SDK handles DynamoDB. For Ruby, the latest aws-sdk (gem install aws-sdk) supports DynamoDB. I signed up for DynamoDB, looked at the Java example, and wrote a little bit of working Ruby code using documentation - I had to slightly change the example code to get it to work for me:

require 'aws-sdk'

dynamo_db = AWS::DynamoDB.new(
    :access_key_id => ENV['AMAZON_ACCESS_KEY_ID'],
    :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY'])
table = dynamo_db.tables.create('my-table', 10, 5)

begin
  sleep 3
  puts "Waiting on status change #{table.status}"
end while table.status == :creating

# add an item
item = table.items.create('id' => '12345', 'foo' => 'bar')

# add attributes to an item
item.attributes.add 'category' => %w(demo), 'tags' => %w(sample item)
p item

# update an item with mixed add, delete, update
item.attributes.update do |u|
  u.add 'colors' => %w(red)
  u.set 'category' => 'demo-category'
  u.delete 'foo'
end
p item.attributes.to_h

# delete attributes
item.attributes.delete 'colors', 'category'

# get attributes
p item.attributes.to_h

# delete an item and all of its attributes
item.delete
I used the AWS web console to then delete the test table to avoid charges.

DynamoDB is a big deal because while it is easy enough to horizontally scale out web applications and back end business applications, it is a real pain to scale out data storage for session handling and application data. Except for paying for the service, Amazon is trying to remove these hassles for developers.

I think that in addition to deployments to EC2s, DynamoDB will be a very big deal for Heroku users because it gives them another data store option in addition to Heroku's excellent managed PostgreSQL service, MongoHQ, Cloudant, and other 3rd party data service providers.

Wednesday, January 11, 2012

Web 3.0 and the Semantic Web, a slight return

After talking with a friend and a friend of his about the Semantic Web and healthcare yesterday, I re-watched a great video on Web 3.0 by Kate Ray that I bookmarked and blogged about a couple of years ago. I like this video because it frames the problems that the Semantic Web is trying to solve. My last published book (for APress) had Web 3.0 in the title, a term that did not really catch on :-)

At least a little bit of my enthusiasm for Semantic Web technologies has diminished over the last ten years because of problems that I have had on customer projects trying to collect linked data from disparite sources and merge it into something useful. There are (apparently) no silver bullets and any data collection and exploitation activities involve a lot of difficult work.

I would not be surprised if this problem of merging different data sources is not solved by using Ontologies and webs of linked data sites, but rather, by vendors curating data in narrow domains and selling interfaces to this curated data.

In a world of too much information the activity of curation can have a very high value and this value and the market price for these services will determine the amount of resources invested in combinations of automated and manual curation of information.

Tuesday, January 10, 2012

sleepybird.us site is online

Yesterday I wrote about two web portals I have been working on in Clojure. One of them is online: our stock photo web site. This is a simple web app written with Clojure and Noir. I use the excellent stripe.com system for accepting orders for JPEGs (and soon hi-def video clips). In my tests it seems easy enough to buy JPEG files: you just check the ones you want, go to the purchase page, and in a few seconds you are downloading a ZIP file with the JPEGs you purchased. A simple little web app but I think that my wife and I will have fun with it: we are avid photographers.

Monday, January 09, 2012

My two new projects: both web portals written in Clojure

I have three web portal projects that I have wanted to develop for quite some time. I am close to releasing two of them (a text analytics web service and a stock photos and video clip store. My wife and I are avid photographers and we have been wanting to travel more and do more photography; I started putting together the photo site yesterday morning and hope to have it fully on line in the next day of two - simple to implement. The text analytics web service will be publicly available within a month or so (right now, just the demo page is active - I short circuited the new account login for now). My third project is a web portal for a single consultant to manage multiple customers. Last year I prototyped this for my own use using Java + GWT + AppEngine and then ported it off of AppEngine, using MongoDB for the data store. I have had such a fun and productive time using Clojure and Noir for my two recent projects that I am considering porting this third project to Clojure. I might leave it as-is except that I have already done most of the design for a more complex web app to manage multiple consultants with multiple customers. I know that the development would go much faster using Lisp.

Monday, January 02, 2012

Using Emacs and org-mode in OS X

I recently ran across David O'Toole's org-mode tutorial and I have been experimenting with using org-mode with Emacs instead of the little utility web app I wrote for my own use a few years ago. I decided that I like org-mode better after learning the basic commands even though I can no longer access my to-do lists from ay web browser. Org-mode is useful for more than simply managing to-do lists and tasks but that is what I am mostly using it for.

To make org-mode always easily available I added this to my ~/.profile:

alias orgmode='echo -e "\033]0;org-mode\007";Emacs -nw ~/Documents/org-mode/.'
This will open org-mode in a the current term window tab and change the tab title to "org-mode." I keep all of my org-files in /Users/markw/Documents/org-mode/ so change that bit of bash script to reflect where you want to keep your org data files. You might also want to substitute emacs for Emacs -nw which is what I use for command line Emacs because I prefer the latest version 2.4.x of Emacs.