Saturday, November 26, 2011

Closer to the metal: Clojure, Noir, and plain old Javascript

Over the next five to six weeks I am wrapping up a long-term engagement that uses Java EE 6 on the backend and SmartGWT (like GWT, but with very nice commercially supported components) on the client. As I have time, I am starting up some new work that uses Clojure and Noir, and it is like a breath of fresh air:

I keep a repl open on the lein project and separately run the web app, so any file changes (including the Javascript in the project) are immediately reflected in the running app. It is such a nice development environment that I don't even think about it while I am working, and maybe that is the point!

As I have mentioned in previous blog posts, I really like the Clojure Noir web framework, which builds on several other excellent projects. Developing in Noir is a lot like using the Ruby Sinatra framework: it handles routing and offers template options, but it is largely a roll-your-own environment.
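
For a flavor of that Sinatra-like feel, here is a minimal, untested sketch of Noir routing (the Noir 1.2-era API as I know it; the namespace and routes are made up):

(ns example.server
  (:require [noir.server :as server])
  (:use noir.core))

;; a route is a URL pattern plus a handler body
(defpage "/" []
  "Hello from Noir!")

;; route parameters are destructured from the params map
(defpage "/greet/:name" {:keys [name]}
  (str "Hello, " name))

(defn -main [& args]
  ;; starts an embedded Jetty server
  (server/start 8080))

Running -main starts Jetty on port 8080 with the defpage routes live, much like Sinatra's get blocks.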

Monday, November 21, 2011

Ruby Sinatra web apps with background work threads

In Java-land, I have often used the pattern of writing a servlet with an init() method that starts up one or more background work threads. Then, while my web application is handling HTTP requests, the background threads can be doing work like fetching RSS feeds for display in the web app, performing periodic maintenance like flushing old data from a database, and so on. This is a simple, robust pattern that is easy to implement with a few extra lines of Java code and an extra servlet definition in a web.xml file.
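
A minimal sketch of that servlet pattern (the class name is hypothetical, and the servlet needs a load-on-startup entry in web.xml so init() runs at deploy time):

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;

public class BackgroundWorkServlet extends HttpServlet {
  private volatile boolean running = true;
  private Thread worker;

  @Override
  public void init() throws ServletException {
    // start one background work thread at deploy time
    worker = new Thread(new Runnable() {
      public void run() {
        while (running) {
          // fetch RSS feeds, flush old data, etc.
          try {
            Thread.sleep(60 * 1000L);
          } catch (InterruptedException e) {
            return;
          }
        }
      }
    });
    worker.setDaemon(true);
    worker.start();
  }

  @Override
  public void destroy() {
    // stop the worker when the web app is undeployed
    running = false;
    worker.interrupt();
  }
}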

In Ruby-land this pattern is even simpler to implement:

require 'rubygems'
require 'sinatra'

$sum = 0

Thread.new do # trivial example work thread
  while true do
    sleep 0.12
    $sum += 1
  end
end

get '/' do
  "Testing background work thread: sum is #{$sum}"
end

While the main thread waits for HTTP requests, the background thread can do any other work. This works fine with Ruby 1.8.7 or any 1.9.*, but for a long-running production app I would run it under JRuby, since JRuby threads are backed by the Java Thread class.

Using the Stardog RDF datastore from JRuby

I was playing with the latest Stardog release during lunch - the quickest way to get going with the included Java examples is to create a project (I use IntelliJ, but use your favorite Java IDE), include all of the JAR files under lib/ (including all nested directories), and add the source under examples/src.

6/21/2012 note: I just tried these code snippets with the released version 1.0 of Stardog and the APIs have changed.


I took the first Java example class ConnectionAPIExample and converted the RDF loading and query part to JRuby (strange formatting to get it to fit the page width):
require 'java'
Dir.glob("lib/**.jar").each do |fname|
  require fname
end

com.clarkparsia.stardog.security.SecurityUtil.
      setupSingletonSecurityManager()
com.clarkparsia.stardog.StardogDBMS.get().
      createMemory("test")

CONN = com.clarkparsia.stardog.api.
        ConnectionConfiguration.to("test").connect()
CONN.begin()
CONN.add().io().format(org.openrdf.rio.RDFFormat::N3).
  stream(java.io.FileInputStream.new(
            "examples/data/sp2b_10k.n3"))

QUERY = CONN.query("select * where {?s ?p ?o}")
QUERY.limit(10)
RESULTS = QUERY.executeSelect()

while RESULTS.hasNext() do
  result = RESULTS.next()
  result.getBindingNames().toArray().each do |obj|
    puts "#{obj}: #{result.getBinding(obj).getValue().stringValue()}"
  end
  puts
end

This is mostly just a straight conversion from Java to Ruby. The first few lines enumerate all of the JAR files and require them. The last part, interpreting the results, took a few minutes to figure out: I used IntelliJ to explore the result values of class MapBindingSet, looking for methods that return the binding names of the variables in my SPARQL query and the values (as strings) of those variables for each returned result.

Output will look like:
s: http://localhost/vocabulary/bench/Journal
p: http://www.w3.org/2000/01/rdf-schema#subClassOf
o: http://xmlns.com/foaf/0.1/Document

s: http://localhost/vocabulary/bench/Proceedings
p: http://www.w3.org/2000/01/rdf-schema#subClassOf
o: http://xmlns.com/foaf/0.1/Document
...

If you want to run this bit of code, put it in a file test.rb in the top-level Stardog distribution directory and run:

jruby test.rb
I wanted to be able to use Stardog from both JRuby and Clojure. My lunch time hacking today is just a first step.
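
The Clojure side should be plain Java interop; here is an untested sketch that just mirrors the JRuby code above (same pre-1.0 Stardog API, with the JARs from lib/ on the classpath):

(import '(com.clarkparsia.stardog.security SecurityUtil)
        '(com.clarkparsia.stardog StardogDBMS)
        '(com.clarkparsia.stardog.api ConnectionConfiguration)
        '(org.openrdf.rio RDFFormat)
        '(java.io FileInputStream))

(SecurityUtil/setupSingletonSecurityManager)
(.createMemory (StardogDBMS/get) "test")

(def conn (.connect (ConnectionConfiguration/to "test")))
(.begin conn)
(-> (.add conn) .io (.format RDFFormat/N3)
    (.stream (FileInputStream. "examples/data/sp2b_10k.n3")))

(def query (.query conn "select * where {?s ?p ?o}"))
(.limit query 10)
(def results (.executeSelect query))

(while (.hasNext results)
  (let [result (.next results)]
    (doseq [obj (.getBindingNames result)]
      (println (str obj ": "
                    (.. result (getBinding obj) getValue stringValue))))
    (println)))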

Tuesday, November 15, 2011

Experimenting with Google Cloud SQL

I received a beta invite to Google Cloud SQL today and had some time tonight to read the documentation and start experimenting with it.

First, the best thing about Google Cloud SQL: when you create a database instance you can authorize more than one AppEngine application to use it. This should give developers a lot of flexibility for coordinating multiple deployed applications in an application family. I think that this is a big deal!

Another interesting thing is that you are allowed some access to the database from outside the AppEngine infrastructure. You are limited to 5 external queries per second, but that allows some coordination with applications hosted on other platforms or hosting providers.

Their cloud SQL service is free during beta. It will be interesting to see what the cost will be for different SQL instance types.

Getting the example Java app built and deployed was very simple. I created a separate SQL instance (these are separate from deployed AppEngine application instances), made a new IntelliJ AppEngine project, pasted in the example code, and it all worked.
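
From memory, the server side of that example boiled down to something like the following; the AppEngineDriver class and jdbc:google:rdbms URL scheme are as I recall them from the beta docs, and the instance, database, and table names are placeholders:

import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GreetingsServlet extends HttpServlet {
  @Override
  public void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    PrintWriter out = resp.getWriter();
    try {
      // the AppEngine JDBC bridge driver from the beta documentation
      Class.forName("com.google.appengine.api.rdbms.AppEngineDriver");
      Connection conn = DriverManager.getConnection(
          "jdbc:google:rdbms://my_instance/my_database");
      ResultSet rs = conn.createStatement()
          .executeQuery("select content from greetings");
      while (rs.next()) {
        out.println(rs.getString("content"));
      }
      conn.close();
    } catch (Exception ex) {
      throw new IOException(ex);
    }
  }
}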

Perception of quality is often influenced by price. Since developers now have to pay more for using AppEngine, I find myself looking more at AppEngine as a premium service, which it is. Despite my dislike for MySQL (I use PostgreSQL when given a choice), Google's hosted and managed MySQL cloud data service looks good and provides developers with more options. Their SQL service is synchronously replicated between data centers automatically for you.

It has been a few years now since I had to set up either a physical server or a leased raw server for any deployment. I like that! Thank you, Platform as a Service (PaaS) providers like Heroku (built on AWS) and AppEngine - they are the future. I still do a lot of work on "plain AWS", but even that is much more agile than provisioning my own servers.

Saturday, November 12, 2011

The quality of a new programming language is apparent from looking at the projects that use it

The community growing around the Clojure language is great. While the Clojure platform is still evolving (quickly!), browsing through the available libraries, frameworks, and complete projects is amazing.

My "latest" favorite Clojure project is Noir that simply provides a composable mechanism for building web applications (using defpartial). I get to use Noir on two customer web app projects (and some work with HBase + Clojure) over the next month or two, and I am looking forward to that. The simpler of the two web apps is an admin console exposing some APIs on a private LAN and the Try Clojure web app is a great starting point, as well as an example of a nicely laid out Noir application.

Since Clojure is such a concise language I find it easy to read through, understand, evaluate, and use projects. Because I am still learning Clojure (I have only done about 6 months of paid Clojure work over the last couple of years), time spent reading available code to find useful stuff is very well spent: reading good code with an open repl is a great way to learn new idioms.

Monday, November 07, 2011

Writing a simple SQL data source for the free LGPL version of SmartGWT

While travelling back from a vacation I cleaned up some old experimental code for a fairly generic SmartGWT data source, along with the required server side support code. The commercial versions of SmartGWT include support for connecting client side grids and other components to server side databases; for the free version of SmartGWT you have to roll your own. In this post I'll show you a simple way to do this that should get you started. Copy the sample web app that is included in the free LGPL version of SmartGWT and make the modifications listed below.

I also set up a Github project that contains everything ready to run in IntelliJ.

The goal is to support defining client side grids connected to a database, using a simple SQL statement to fetch the required data through a custom class SqlDS. I had to format the following code snippets strangely to get them to fit the content width of my blog:

    ListGrid listGrid = new ListGrid();
    listGrid.setDataSource(
      new SqlDS(
         "select title, content, uri from news where " +
         "content like '%Congress%'"));
    listGrid.setAutoFetchData(true);

The following data source looks for the column names (i.e., "title", "content", and "uri") in the SQL query and creates fields with those column names in the constructed SqlDS instance. The setDataURL call at the bottom of the constructor assumes that a servlet is mapped to handle the HTTP GET fetch:

package com.markwatson.client;

import com.smartgwt.client.data.DataSource;
import com.smartgwt.client.data.DataSourceField;
import com.smartgwt.client.types.DSDataFormat;
import com.smartgwt.client.types.FieldType;

import java.util.Arrays;
import java.util.List;

public class SqlDS extends DataSource {
  public SqlDS(String sql) {
    // derive a unique-ish data source ID from the query text
    setID("sql_" + Math.abs(sql.hashCode()));
    setDataFormat(DSDataFormat.JSON);

    List<String> tokens =
        Arrays.asList(sql.toLowerCase()
           .replaceAll(",", " ").split(" "));
    int index1 = tokens.indexOf("select");
    int index2 = tokens.indexOf("from");
    for (int i = index1 + 1; i < index2; i++) {
      if (tokens.get(i).length() > 0) {
         addField(new DataSourceField(tokens.get(i),
               FieldType.TEXT, tokens.get(i)));
      }
    }
    // should really URL-encode the SQL instead of this hack:
    setDataURL("/news?query=" + sql.replaceAll(" ", "%20"));
  }
}

The only thing left to do is to write a servlet that processes web service requests like /news?query=... and returns JSON data, with fields from the SQL query for each returned row, for display in the list grid:

package com.markwatson.server;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class DbRestServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req,
         HttpServletResponse resp) throws IOException {
      PrintWriter out = resp.getWriter();
      try {
          // remove "query="
          String sql = req.getQueryString().substring(6); 
          int index = sql.indexOf("&");
          sql = sql.substring(0, index);
          out.println(
             DbUtils.doQuery(sql.replaceAll("20%", " ")));
      } catch (Exception ex) {
        ex.printStackTrace(System.err);
        out.println("[]");
      }
    }
}
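
For completeness, SqlDS's data URL assumes this servlet is mapped at /news; the web.xml entries would look something like this (names taken from the code above):

<servlet>
  <servlet-name>DbRestServlet</servlet-name>
  <servlet-class>com.markwatson.server.DbRestServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>DbRestServlet</servlet-name>
  <url-pattern>/news</url-pattern>
</servlet-mapping>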

The utility class DbUtils returns JSON data, which is what the client side SqlDS DataSource class expects from the server:

package com.markwatson.server;

import org.codehaus.jackson.map.ObjectMapper;

import java.io.StringWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DbUtils {
  static String dbURL;
  static Connection dbCon;

  static {
    try {
      Class.forName("org.postgresql.Driver");
      // Define the data source for the driver
      dbURL = "jdbc:postgresql://localhost/test_database";
      dbCon = DriverManager.getConnection(
                     dbURL, "postgres", "password");
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static String doQuery(String sql)
                               throws Exception {
    ObjectMapper mapper =
       new ObjectMapper(); // should cache and reuse this!
    List<Map<String, String>> ret =
        new ArrayList<Map<String, String>>();
    Statement statement = dbCon.createStatement();
    // the servlet already decoded "%20"; doing it again is harmless
    ResultSet rs = statement.executeQuery(
                            sql.replaceAll("%20", " "));
    java.sql.ResultSetMetaData meta = rs.getMetaData();
    int size = meta.getColumnCount();
    while (rs.next()) {
      Map<String, String> row =
         new HashMap<String, String>();
      for (int i = 1; i <= size; i++) {
        String column = meta.getColumnName(i);
        Object obj = rs.getObject(i);
        row.put(column, "" + obj);
      }
      ret.add(row);
    }
    StringWriter sw = new StringWriter();
    mapper.writeValue(sw, ret);
    return sw.toString();
  }
}

I had to add three JAR files to the SmartGWT sample project:

jackson-core-lgpl-1.8.1.jar
jackson-mapper-lgpl-1.8.1.jar
postgresql-9.0-801.jdbc4.jar

SmartGWT's DataSource abstraction is a real improvement over how I connect to databases in GWT apps, where I tend to write a lot of small RPC services to fetch and save data as required. My simple DataSource subclass SqlDS does not support writing data back to the database from the client; it can either be extended, or you can use an RPC service call to save edited data.

Sunday, November 06, 2011

Annoyed by anti-MongoDB post on HN

I am not going to link to this article - no point in giving it more attention. The anonymous post claimed data loss and basic disaster using MongoDB. I call bullshit on this anonymous rant. Why was it posted anonymously?

I am sitting in an airport waiting to fly home right now: just finished extending a Java+MongoDB+GWT app and I am starting to do more work on a project using Clojure+Noir+MongoDB.

I do have a short checklist for using MongoDB:
  • For each write operation I decide whether I can use the default fire-and-forget option or slightly slow the write down by checking CommandResult cr = db.getLastError(); - every write operation can be tuned based on the cost of losing that data. I usually give up a little performance for data robustness unless the data can be lost with minimal business cost (see the sketch after this list).
  • I usually use the journalling option.
  • Use replica sets or a slave.
  • I favor using MongoDB for rapid prototyping and research.
  • I use the right tool for each job. PostgreSQL, various RDF data stores, and sometimes Neo4J are also favorite data store tools.
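
As a sketch of the first checklist item, here is roughly what the safe-write check looks like with the Java driver of that era (the database, collection, and field names are made up):

import com.mongodb.BasicDBObject;
import com.mongodb.CommandResult;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.Mongo;

public class WriteCheckExample {
  public static void main(String[] args) throws Exception {
    Mongo mongo = new Mongo("localhost");
    DB db = mongo.getDB("test_database");
    DBCollection events = db.getCollection("events");

    // fire-and-forget: fastest, but errors go unnoticed
    events.insert(new BasicDBObject("message", "cheap event"));

    // safer: ask the server how the last write went
    events.insert(new BasicDBObject("message", "important event"));
    CommandResult cr = db.getLastError();
    if (cr.get("err") != null) {
      System.err.println("write failed: " + cr);
    }
  }
}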

Friday, November 04, 2011

Notes on converting a GWT + AppEngine web app using Objectify to a plain GWT + MongoDB web app

There has been a lot of noise in blog-space criticizing Google for the re-pricing of AppEngine services. I don't really agree with most of the complaints because it seems fair for Google to charge enough to make AppEngine a viable long-term business.

That said, I have never done any customer work targeting the AppEngine platform because no one has requested it. (Although I have enthusiastically used AppEngine for some of my own projects and I have written several AppEngine and Wave specific articles.) I still host KnowledgeBooks.com on AppEngine.

I wrote a GWT + AppEngine app for my own use about a year ago, and since I always have at least one EC2 instance running for my own experiments and development work, I decided to move my app. It turns out that converting it was fairly easy using these steps:
  • Copy my IntelliJ project, renaming it and removing the AppEngine facets and libraries.
  • Add the required MongoDB Java JARs.
  • Convert my Objectify datastore operations, which were all in a single utility class on the server side, to use MongoDB (see the sketch after this list).
Sure, a complex application would take a while, but my app only has 6 model classes (all POJOs), so the whole process took less than 90 minutes.
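
To give the flavor of that last step, here is a hedged sketch of one save operation before and after the port; Note is a hypothetical stand-in for one of my model classes, and the Objectify lines reflect the 2011-era API as best I remember it:

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.Mongo;

public class NoteStore {
  // hypothetical POJO model class standing in for one of my six
  public static class Note {
    public String title;
    public String content;
  }

  // before, with Objectify on AppEngine, the save was roughly:
  //   ObjectifyService.register(Note.class);
  //   ObjectifyService.begin().put(note);

  // after, with MongoDB: map the POJO's fields into a document
  public static void save(Note note) throws Exception {
    DB db = new Mongo("localhost").getDB("myapp");
    DBCollection notes = db.getCollection("notes");
    BasicDBObject doc = new BasicDBObject();
    doc.put("title", note.title);
    doc.put("content", note.content);
    notes.insert(doc);
  }
}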

Recent evaluations of web frameworks while on vacation

My wife Carol and I have been visiting family in Rhode Island this week and since our grandkids are in school on weekdays, I have had a lot of time to spend writing the fourth edition of my Java AI book and also catching up on reevaluating web frameworks.

Although my main skill sets are in data/text mining, general artificial intelligence work, and Java server side development, I find myself also spending a lot of time writing web applications. In the last few years I have done a lot of work with Rails (and some Sinatra), GWT, and most recently SmartGWT, because one of my customers really liked SmartGWT's widgets. (Note: if you are in the San Jose area and want to work on a SmartGWT project with me, please email me!)

Because I have strong Java and Ruby skills, the combination of Rails, GWT, and SmartGWT works very well for me when I need to write a web app.

That said, I have spent time this week playing with Google's Closure Javascript tools, and less time with ClojureScript, which uses Google's Closure. Frankly, both Closure and ClojureScript look fantastic, but I have a personal bias against making Javascript development a career. Although ClojureScript works around this issue by compiling a nice subset of Clojure to Javascript, I am concerned that the market for developing with ClojureScript is probably small. If you want to write Lisp code on both the server and client side, definitely spend a few evenings playing with ClojureScript because it may be a good fit for you. I have also recently had a good experience with Clojure and the Noir web framework.

Tuesday, November 01, 2011

Anyone know any SmartGWT and Java developers looking for a job?

A call out for some help: one of my favorite customers is looking for a SmartGWT and Java developer in the San Jose area - anyone know anyone good + available?