Monday, November 24, 2008

Something fun: new book project on the Semantic Web using AllegroGraph

The book is about 15% done (about 50 pages so far) and a rough draft PDF file is available. I realize that the market for this book will be small because AllegroGraph is a commercial product. However, Franz does make a non-commercial use version available for free, so my expectation is that when the book is done (between 2 and 6 months, depending on how busy my consulting schedule is) a fair number of people will enjoy the book with the non-commercial version of AllegroGraph. The finished book will be available for free as a PDF file and as a print book from lulu.com.

This book is fairly easy for me to write because I have existing coding experiments for just about all the Semantic Web application examples in the book. Also, since there are so many good Semantic Web references on the web and in existing books, I am only covering the SW technology that is used in the book examples. I want the book to be self-contained: just enough tutorial and reference material covering AllegroGraph and other SW technologies so readers can completely understand the application examples.

Saturday, November 15, 2008

Giving up on just using one IDE

I have tried hard in the last year to standardize all of my work on one IDE (experimenting with Eclipse+Mylar and NetBeans). I have given up: lately it seems like I need NetBeans for Java-TV (Blu-ray Java) and JavaME tasks, IntelliJ for most other Java tasks and playing with Scala, TextMate (mostly) for Ruby, and Emacs for the Lisp examples for my new book project (on Franz's AllegroGraph). Anyway, I have given up on just being able to use one IDE: a nice thought but not realistic.

Monday, November 10, 2008

My new book "Practical Artificial Intelligence Programming in Java" is available in print and as a free PDF download

My book "Practical Artificial Intelligence Programming in Java, third edition"
is available in print and PDF download versions:
Support independent publishing: buy this book on Lulu.

A free download of the PDF version is available on my Open Content web page.

This book uses several excellent open source and public domain libraries and this code is distributed in the ZIP file of book example code. Please read the third party software licenses in the directory licenses-for-3rd-party-libraries. For the book example code that I have written you can use the Commercial Use License if you have purchased either the for-fee PDF version or the print version of this book. If you have downloaded the free PDF version from this web page then you may use my book example code under the Non-commercial Use License.

Sunday, November 09, 2008

Don't repeat yourself: for code, sure, but how about for data?

In individual applications we want to make sure that we don't have replicated code that is identical or very similar. What about replication across all projects, both active and in "freeze mode"?

I periodically like to consolidate source code: keep a single latest svn trunk version on my system and organize the code that I have written and frequently reuse into libraries. I am in the process of packaging up most of my Ruby code in local gems.

I also have issues with many copies of textual data files. For data used in Java libraries and applications the solution is simple: I keep data with the code that needs it in JAR files that are kept in a single library directory on my development system. I have been doing this for over 10 years and this is a really nice way to keep data assets and code together.

Sometimes I simply link data statically into compiled applications that I use (e.g., in the last year I have reimplemented many of my statistical NLP tools in Gambit-C Scheme and I generate a single command line utility program with all the required data statically linked).

For data assets used in programs developed in multiple programming languages, a "separation of concerns" between code and data assets makes more sense.

I need to better organize other data assets like tagged training data, raw text organized into a hierarchy of categories, data that I have culled from the web and stored in XML files, etc. I am starting the process of putting the most up-to-date versions into a single directory and tweaking my code to check the DATA environment variable value and then load data assets as needed. I will probably not import this data directory into svn or git: most of the data seldom changes and some of the assets are huge.

Friday, November 07, 2008

Good Ruby support in IntelliJ 8.0

IntelliJ 8.0 was released yesterday and after installing the Ruby + Rails plugin, IntelliJ is very competitive with NetBeans for Rails development.

One feature that I particularly like is the jump links in the editor that let you jump from a controller method to the corresponding view template. There are also links from a method to the superclass method that is being overridden (if any). There is currently a small bug in the plugin: multiple identical jump links are shown, although they all work the same.

In some ways it is nice to have Java and Ruby support in one IDE, but there are "Java only" menus shown while working on Rails projects - that is one advantage of the new RubyMine IDE: basically IntelliJ with all Java support removed. At the current time, Rails support for IntelliJ 8.0 seems to be more stable than the prerelease version of RubyMine but it will be interesting to compare the two next year when RubyMine is released as a product.

Tuesday, November 04, 2008

Bad news: I did not get to read "The Reasoned Schemer" this morning. Good news: no lines at my polling place today!

I have not read through "The Reasoned Schemer" in a long while, and since it is such a light (to carry) book I thought that it would be perfect for standing in line reading :-)

BTW, I do not know anyone who is not voting in this election. Whoever you prefer, McCain or Obama, vote!

Monday, November 03, 2008

Cool: JetBrains' new RubyMine

I have started using JetBrains' IntelliJ 8.x milestone releases for experimenting with Scala and doing Java development so I was very pleased to see JetBrains' new Ruby development IDE. I'll write more about it after using it on a test project. This is a public preview, with a release planned for next year.

Saturday, November 01, 2008

Complaints about Ruby memory use: false?

Please feel free to add a comment if you disagree with this, but I just don't see problems with excessive memory use in long-running Ruby apps like Rails and Merb. I saw a blog entry this morning from someone running a small web app with Passenger and Apache: about a dozen processes with a very large combined memory footprint.

I just checked two long-running deployments of mine (one Rails and one Merb) and the Rails application processes totaled about 80 MB VSIZE and the Merb combined processes were about 70 MB VSIZE. In my setup, add in about 7.5 MB for nginx and 10 MB for memcached. Nginx and memcached are shared by all web apps, long-running and experiments, running on this particular server.
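If you want to total up VSIZE for your own deployment, here is a rough Python sketch of one way to do it. This is an assumption about method, not how I got the numbers above: it relies on a POSIX-style ps whose vsz column reports kilobytes, and the helper names total_vsize_mb and current_ps_output are made up for this example.

```python
import subprocess


def total_vsize_mb(ps_output, name):
    """Sum the VSZ column (kilobytes) for commands containing name, in MB."""
    total_kb = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and anything that does not start with a number.
        if len(parts) == 2 and parts[0].isdigit() and name in parts[1]:
            total_kb += int(parts[0])
    return total_kb / 1024.0


def current_ps_output():
    """Run ps for all processes, printing virtual size and the command line."""
    result = subprocess.run(
        ["ps", "-A", "-o", "vsz,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, total_vsize_mb(current_ps_output(), "nginx") would give the combined virtual size of all processes whose command line mentions nginx. Keep in mind that VSIZE counts shared pages in every process, so a sum like this overstates real memory use.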

Have you seen excessive memory use in your deployments? If so, what are your setups? On the other hand, I would also like to hear about low-memory deployment schemes. I have not tried Phusion Passenger but I heard good things at MerbCamp 2008 about reduced memory use with Passenger.