Tuesday, June 30, 2009

My new Apress book was released today: "Scripting Intelligence: Web 3.0 Information Gathering and Processing"

Apress web page for my book.

And an Amazon link.

There is a lot of challenging material in this book but I am hoping to save my readers some setup effort because they can use a pre-configured Amazon machine image (AMI) that I created with the book examples and required infrastructure software. Of course, you can also build the examples on your laptop or servers.

I am setting up a book support web page with errata information. I plan on making entries for most questions from my readers that I answer via email.

Monday, June 29, 2009

USA: return to 'robustness'

With all of the problems that my country is facing, I am still optimistic, if:

Parents do their job and turn off the TV after dinner. When I was in high school I did 2 to 3 hours a night of homework on week nights - that should, I think, be the norm for the new young generation.

Young people do their job and squeeze every bit of value from the educational opportunities that they have at their disposal.

Adults do their job and realize that education and job skills are something that they need to develop continually throughout their working lives. Be productive and prosper.

Congress and our president: suck it up, stop being bought off, and do what is right. Look out for your karma, try not to be total assholes.

Financial elite: realize that no matter how much money you accrue, your children and grandchildren need to live in this world so you should not ruin the world that they will need to live in. Suck it up and try doing the right thing for a change. Look out for your karma, try not to be total assholes.

As a country, we need to work together, and everyone needs to do their part.

Tough choice in the USA

Here in the USA, we face a tough choice: to survive with any kind of lifestyle and robustness, we need to cut government spending in three areas:
  • Huge government subsidies to the insurance companies, the beef industry, etc. Subsidies may also take the form of not collecting a fair share of taxes from, and of "under-regulating", corporations that strongly act against the public interest.
  • Defense spending, which needs drastic curtailment
  • The large amounts of money we give/loan to other countries to buy weapon systems from companies in the USA
Why are these tough choices? Well, because Congress (and the executive branch under Bush and Obama) mostly looks out for corporate interests and not citizens' interests - that is just the way it is. They are also tough choices because almost everyone is simply too lazy to spend the effort to personally lobby their elected representatives.

Perhaps people get the government that they deserve.

Saturday, June 27, 2009

My Java AppEngine article published; my wife's video; more good experiences with Heroku

I wrote an article for DevX on Java AppEngine that was just published. The example application implements simple document storage and search. I still very much like AppEngine, but most of my work right now involves Rails development. I am still waiting for a customer who needs AppEngine development :-)

My wife has been helping a local non-profit organization (Connections) that uses equine (horse) therapy to help both children and adults with disabilities - good work. I put her latest video on YouTube if you are interested.

I continue to be very happy with Heroku - for a relatively small cost, my customer gets a good deployment platform that takes almost none of my time to use, thus saving them money. When it comes to deploying software, I do like control -- so, it seems strange that I am so happy with Heroku and AppEngine where you have to give up control in return for saving time and money.

Friday, June 19, 2009

ClioPatria semantic search web-server

From 2003 to 2005 I often used SWI-Prolog for semantic web experiments before I more or less settled on using Sesame (and occasionally AllegroGraph and Redland).

I just saw a link to the ClioPatria semantic search web-server project. Assuming that you have a fairly recent copy of SWI-Prolog installed, trying ClioPatria only takes a few minutes:
git clone git://eculture.cs.vu.nl/home/git/eculture/ClioPatria.git
cd ClioPatria/project-iswc
./configure
./run.pl
Then, point your browser to http://localhost:3020/, load, for example, an OWL (XML format) data file, and try some queries. The default query language is SeRQL, which I don't use, so I set the query language to SPARQL and all seems to work fine.
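
If you would rather script a query than type it into the browser, here is a minimal sketch of how the request URL could be built. The SPARQL protocol passes the query text in a "query" parameter on a GET request; the "/sparql/" endpoint path below is my assumption, not something I have checked against ClioPatria, and the helper name is made up for illustration. The sketch only builds and prints the URL, it does not contact the server.

```ruby
require 'uri'

# ASSUMPTION: the endpoint path is a guess; check your ClioPatria
# install for the real one. Host and port match the local server above.
ENDPOINT = "http://localhost:3020/sparql/"

# Hypothetical helper: form-encode the SPARQL text into a GET URL.
def sparql_query_url(endpoint, query)
  "#{endpoint}?#{URI.encode_www_form(query: query)}"
end

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
puts sparql_query_url(ENDPOINT, QUERY)
```

The resulting URL can be pasted into a browser or fetched with any HTTP client.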

One good thing about SWI-Prolog and the bundled SemWeb library is that loading RDF data and performing queries seems very quick compared to Sesame, which is what I usually use. As a result, the ClioPatria web application is also very quick.

Thursday, June 18, 2009

Opera Unite is an interesting idea

Unite is an interesting idea, letting non-techies easily share materials with friends and relatives. Opera provides an intermediary service: they map your account to whatever temporary IP address you have, deal with NATs, and act as a proxy between the web server running on your laptop and the browsers of your friends and family members.

While I have been getting very interested in cloud computing in the last 6 months (using Amazon's and Google's offerings, and I spent an hour with someone at Sun yesterday giving them usability feedback on their soon-to-be-released cloud services web interface), I am also interested in peer-to-peer systems. With some intermediation, Unite is sort-of peer to peer (except for the reliance on an intermediate proxy service, although in some circumstances UPnP is used and no proxy is required). It is fine that Opera provides the back-end services for Unite intermediation. I would also like to see open source implementations (could be as simple as a little proxy server written in a scripting language and a Firefox plugin - if there is not already something like this, I expect to see some projects soon). For example, I might want to run my own intermediary service for friends and family. Anyway, ignoring the hype, I think that Unite is a good and also interesting idea.
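
To make the "map your account to whatever temporary IP address you have" part concrete, here is a minimal sketch of the bookkeeping a home-grown intermediary might do. This is not how Opera implements Unite, which I have not seen; the class name, method names, and the five-minute expiry are all hypothetical, and the forwarding proxy itself is left out.

```ruby
# Hypothetical in-memory registry for a home-grown intermediary service:
# each account name maps to the address a laptop most recently reported.
class AddressRegistry
  Entry = Struct.new(:host, :port, :updated_at)

  def initialize
    @entries = {}
  end

  # Called by a laptop whenever its temporary address changes.
  def register(account, host, port, now = Time.now)
    @entries[account] = Entry.new(host, port, now)
  end

  # Called by the proxy to find where to forward a visitor's request.
  # Returns nil when the account is unknown or its entry is older
  # than max_age seconds (an assumed expiry policy).
  def lookup(account, max_age = 300, now = Time.now)
    entry = @entries[account]
    return nil if entry.nil? || now - entry.updated_at > max_age
    entry
  end
end

registry = AddressRegistry.new
registry.register("mark", "10.0.0.5", 8080)
entry = registry.lookup("mark")
puts "#{entry.host}:#{entry.port}"
```

A real service would also need authentication and NAT traversal, which is where most of the work is.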

Saturday, June 13, 2009

Heroku: Rails hosting done right

I just did a test deployment for a customer on Heroku.com this morning. Lots of non-standard things in the web app, but it still deployed nicely after a short remote SMTP setup. I have been reading about Heroku's architecture, implementation on EC2, etc. for a long while, so getting to use Heroku was fun!

For general development and flexibility I still like a semi-managed VPS (RimuHosting.com is the best I have found so far) because I can run a mixture of Java, Rails, Squeak+Seaside, etc. and have a home for master git and svn repositories. That said, for deployment, a custom deployment architecture on EC2, or an abstract scalable platform like AppEngine or Heroku, really does make a lot of sense. I enjoyed talking to two principals of Engine Yard (Ezra Zygmuntowicz and Yehuda Katz) at Merb Camp last year but I have not yet had an opportunity to use their platform on a customer job.

Tuesday, June 09, 2009

Google Translator Toolkit: wow!

My good friend Tom Munnecke (he took the great picture on my web site of my wife and me in front of the Taj Mahal) recorded an interesting interview last month with Peter Norvig, who talked a lot about Google's translation services and how they work (Tom has not posted his video interview yet - I will add a link when one is available).

Anyway, it is a real pleasure this morning to actually get to experiment with the Translator Toolkit - very impressive. I have fairly good reading knowledge of French so I am experimenting with translating English to French. My Spanish is very rusty (and my FORTRAN is rustier still :-) but I will try that also.

Monday, June 08, 2009

Ruby client for search and spelling correction using Bing's APIs

I noticed that Microsoft allows free use of their search and spelling correction APIs. I just played with the APIs for a few minutes. Here is a Ruby code snippet that I just wrote:
require 'rubygems' # needed for Ruby 1.8.x
require 'cgi'      # provides CGI.escape
require 'simple_http'
require 'json'

API_KEY = ENV['BING_API_KEY'] # read the key from the environment

# Return up to four web results for the query.
def search query
  uri = "http://api.search.live.net/json.aspx?AppId=#{API_KEY}&Market=en-US&Query=#{CGI.escape(query)}&Sources=web+spell&Web.Count=4"
  JSON.parse(SimpleHttp.get(uri))["SearchResponse"]["Web"]["Results"]
end

# Return the first spelling suggestion for the text.
def correct_spelling text
  uri = "http://api.search.live.net/json.aspx?AppId=#{API_KEY}&Market=en-US&Query=#{CGI.escape(text)}&Sources=web+spell&Web.Count=1"
  JSON.parse(SimpleHttp.get(uri))["SearchResponse"]["Spell"]["Results"][0]["Value"]
end
You need a free Bing API key - notice that I set the key value in my environment. If you get a key, then try:
search "semantic web java ruby lisp"
correct_spelling "semaantic web jaava ruby lisp"
The first method does spelling correction before search.
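
Both methods index straight into the parsed JSON, so a response that lacks one of the sections would raise an error. Here is a minimal sketch of a more defensive extraction step, split out from the HTTP call so that it can be tried offline. The sample response is made up for illustration: only the "SearchResponse", "Web", "Spell", "Results", and "Value" keys come from the snippet above, while the "Title" and "Url" fields and the helper names are my assumptions. It also assumes Ruby 2.3 or newer for Hash#dig.

```ruby
require 'json'

# Hypothetical helper: return the web results, or an empty list
# when the response has no "Web" section.
def web_results(response)
  response.dig("SearchResponse", "Web", "Results") || []
end

# Hypothetical helper: return the first spelling suggestion, or nil
# when the response has no "Spell" section.
def spelling_suggestion(response)
  first = (response.dig("SearchResponse", "Spell", "Results") || []).first
  first && first["Value"]
end

# Made-up sample response, NOT captured from the real service.
SAMPLE_JSON = <<~JSON
  {"SearchResponse": {
     "Web":   {"Results": [{"Title": "Example page", "Url": "http://example.com/"}]},
     "Spell": {"Results": [{"Value": "semantic web java ruby lisp"}]}}}
JSON

sample = JSON.parse(SAMPLE_JSON)
puts web_results(sample).length
puts spelling_suggestion(sample)
puts spelling_suggestion({ "SearchResponse" => {} }).inspect
```

In the methods above, the JSON.parse(...) result would be passed to these helpers in place of the chained lookups.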

Sunday, June 07, 2009

I avoid installing software with sudo

As a Linux user since the early 1990s (and a longtime OS X user), it was easy for me to get in the "./configure; make; sudo make install" habit, but I don't think that this is such a good idea for two reasons:
  • Security: have you really read the source code to see what might be executed during "sudo make install"? I am constantly installing Ruby gems, infrastructure software, etc. and I often read code as an educational experience, but not for security. It is best to not run other people's code as root.
  • It is much easier for me to rebuild systems from backups when I "./configure --prefix=/home/mark/bin" (or wherever, but in my home directory).
I used to like to keep my home directory fairly small so backups take up less space, but now the costs of external disks, remote storage like S3, etc. are so small that it makes more sense for my home directory to be more self-contained.

I also like to develop customer projects under a single master directory. It is nice to have everything in one place: my application code, nginx, PostgreSQL (with data), Ruby, gems, Java, Tomcat, Sesame, Erlang, CouchDB, etc. - whatever a project requires to run. A top level shell script can set up the environment for each different project. This also makes cloning a customer's system to one of their alternative servers just a quick rsync away...
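
Here is a minimal sketch of the per-project environment setup, written in Ruby rather than shell to match the other snippet on this blog. The directory layout (bin/ and data/pgdata under the project root), the PROJECT_ROOT variable, and the example path are hypothetical; PGDATA is the standard variable PostgreSQL uses to locate its data directory.

```ruby
# Build the environment for one project kept entirely under a single root.
# ASSUMPTION: the bin/ and data/pgdata layout is made up for illustration.
def project_env(root, base_env = ENV.to_h)
  env = base_env.dup
  env["PROJECT_ROOT"] = root
  # Put the project's own binaries ahead of anything installed system-wide.
  env["PATH"] = [File.join(root, "bin"), base_env["PATH"]].compact.join(File::PATH_SEPARATOR)
  env["PGDATA"] = File.join(root, "data", "pgdata")
  env
end

env = project_env("/home/mark/customer_a", { "PATH" => "/usr/bin:/bin" })
puts env["PATH"]
puts env["PGDATA"]
# The hash can be handed to Kernel#system to run a command in that
# environment, for example: system(env, "ruby", "--version")
```

Because nothing outside the project root is referenced, the same hash works unchanged after the directory is copied to another server at the same path.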

Friday, June 05, 2009

Chrome browser betas for OS X and Linux - very fast!

Very nice - even the betas that were just released are very fast. Interesting to see how fast the final releases will be...

I had not tried Chrome before because of WAS (Windows Avoidance Syndrome). I like the minimalist Chrome UI - very nice. When I boot OS X, I also like to use the new Safari 4 beta.

I have been very busy this year - not too much free time to try new things. Now that I am done working on my new book, I hope to have time to experiment with some technologies that I have not tried before like writing a web application that optionally uses Google Gears for local storage.

Tuesday, June 02, 2009

Open source, the gift economy, and the new world order

I just made a small donation to Canonical (good shepherd for Ubuntu Linux) while I was installing some security updates. A good investment.

As a few very large corporations continue to control resources and major infrastructure, I expect to see a trend towards small agile enterprises covering rapidly changing technology and business niches. I expect to see a three-way synergy between mega-size corporations, small agile businesses, and a mobile, highly educated work force: all three sides win big. The losers in this new world order are the poor and the poorly educated workers who cannot adapt to changing situations.

I think that open source software, other key infrastructure supported by the users of the infrastructure, and a general gift economy will continue to reduce the cost of doing business to a minimum. Again, the winners are either people who are well educated and prepared on a global scale to move quickly to take advantage of new situations, or people who prepare themselves for work in high-value local jobs like health care, critical government services (fire control, police, etc.), and support of local physical infrastructure.

I believe that both my country (USA) and most of the world are going through a historic transformation created mostly by a new higher level of transparency. Throughout history, the ultra rich and powerful have worked behind the scenes to amass more power and wealth by starting wars, etc. The same things still happen, but now society better understands what is really happening: corporate ownership of governments, who benefits financially from planned wars, strife, and the manipulation of the world's financial infrastructure.

I believe that the "information genie" is out of the bottle, and is not going back in.

Predicting the future is tricky, and I will not try. Still, it will be interesting to see how the general quality of world-scale governance improves or degrades in a future that blends meritocracy (those who get great educations and are major producers will compete with conventional multi-generational dynasties of controllers), greater transparency mostly due to the Internet, and near absolute corporate control and manipulation of governments and news media.

Ideally, in the new world order governments become "infrastructure" in the sense that governments compete to provide:
  • Highest quality of physical infrastructure services for the lowest tax base
  • Highest quality of resources for education (which needs to cover people throughout their entire lives)
  • Physical security for people living in and businesses in their jurisdictions, at the lowest cost
  • etc.
We all live in interesting times :-)