Saturday, March 31, 2007

Less is more: advantages of compact programming languages

Compact languages and libraries have the advantage that one person can understand most of an implementation. Two of the most elegant programming languages, Squeak Smalltalk and Ruby, have implementations compact enough to understand, given some effort. And this effort is worthwhile - a lesson I learned in the 1970s when I worked as a systems programmer: I kept listings of key parts of the PRIMOS (Multics based) operating system and of the Lisp and FORTRAN implementations handy at home and in my office. Time spent reading through the code paid huge dividends whenever I had to do any systems level programming. Reading the code helps create a mental map of whatever software you are using.

The implementation of Ruby and the standard Ruby libraries is compact, basically divided into the C implementation of Ruby itself and the Ruby code for the standard libraries. If you are going to be working much with Ruby then consider creating at least a project (using Eclipse, IntelliJ, TextMate, etc.) with the library source code. Good for library reference, looking up APIs, and for just reading code.

I would make the same suggestion to developers who use Squeak Smalltalk (or other implementations): you have the source code to the standard libraries: read it!

I also enjoy and am very productive using the Java and Common Lisp frameworks, but these languages with their standard libraries are so huge that I would personally never attempt to dive in and understand their implementations. Not having a mental road map of the implementations of Common Lisp and Java reduces the effectiveness of these platforms.

Saturday, March 24, 2007

GWT and Seaside: steps in the right direction for increasing developer productivity

Although most of my work is in AI development and general web services, I greatly enjoy writing user interfaces - this interest started in 1982 when SAIC bought me a Xerox 1108 Lisp Machine - the windowing system and general development tools were so good that they make most recent software development environments look weak.

Google Web Toolkit (GWT) and Squeak Smalltalk based Seaside are pleasant exceptions. Just like using Rails, GWT and Seaside provide that pleasant "everything just works" experience and great interactive development environments.

Last night I used GWT and IntelliJ to prototype some web UI ideas. If you have not tried working with the GWT Development Shell, give it a try: changes to client side resources (style sheets and HTML) and to Java client side source code (automatically compiled to JavaScript during development) are immediately reflected in the test web app. I don't have any current tasks dealing with Java web applications, but the next time I do, GWT looks like a very good framework choice.

Seaside also supports the same fluid interactive development style: changes made in a Squeak Smalltalk source browser are immediately reflected in the running web application. Seaside uses continuations to provide a more linear programming model. Blocks of code are used instead of URLs; inside a code block, you can call another code block - after a "return", you continue where you left off in the first code block. There is a performance penalty for using continuations, but it looks straightforward to run multiple Squeak processes behind Apache, so it looks like Seaside scales the same way that Rails scales. DabbleDB is written using Seaside - a great proof of concept for scaling.
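To make the "continue where you left off" idea concrete, here is a minimal sketch that uses Python generators as a stand-in for continuations. This is not Seaside code and not how Seaside is implemented; the names ask, checkout_flow, and run_flow are made up for illustration. It only shows the shape of the programming model: a multi-page interaction written as straight-line code, where calling a sub-block suspends the flow until the user's answer arrives.

```python
# Sketch only: Python generators standing in for Seaside's continuations.
# All names here (ask, checkout_flow, run_flow) are hypothetical.

def ask(prompt):
    """Sub-flow: show one page and hand back the user's answer."""
    answer = yield prompt          # suspend until the next request arrives
    return answer


def checkout_flow():
    """A multi-page interaction written as straight-line code."""
    name = yield from ask("What is your name?")
    item = yield from ask(f"Hello {name}, what would you like to order?")
    yield f"Thanks {name}, your {item} is on its way."


def run_flow(flow, answers):
    """Drive a flow with canned answers, returning every page shown."""
    pages = [next(flow)]           # run up to the first page
    for answer in answers:
        try:
            pages.append(flow.send(answer))   # resume where the flow stopped
        except StopIteration:
            break
    return pages


if __name__ == "__main__":
    for page in run_flow(checkout_flow(), ["Mark", "book"]):
        print(page)
```

One important difference: a generator can only be resumed once from each suspension point, while a real continuation can be re-entered (which is what makes the browser back button work in a continuation based framework). Treat this as an analogy for the linear style, nothing more.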

Wednesday, March 21, 2007

Balancing the use of Open Source (especially GPL) and proprietary software

The GPLv3 is likely to have stronger requirements for sharing back code used in web applications. While I personally think that this is fair, it will be "interesting" for companies using GPLed components in proprietary web applications.

As a consultant, it is often frustrating when customers do not want to simply use a license like the GPL because they worry about protecting their intellectual property. I believe that in almost all cases, any proprietary code in a system should be associated with custom data handling. I am not a lawyer, but I believe that the GPL (even v3) allows GPL systems to share data with proprietary systems via a relational database (or some other type of persistent storage).

Why bother dealing with the GPL, assuming that you don't buy into the social and philosophical ideas of the FSF.org? Because it will end up saving you money! For example, using a GPLed content management system and donating improvements back to the project will save you money and effort every time there is a new release that already has your improvements incorporated.

What about maintaining a competitive advantage over competitors? Any advantage will probably be in both how well you serve your customers and in your proprietary data and data processing software.

Monday, March 19, 2007

Apollo alpha released to developers: offline enabled web applications

I have spent a lot of time in the last year studying and using AJAX technologies: Dojo (when I want a rich set of components), Prototype (when I want simple, lean AJAX support for forms), OpenLaszlo, some time spent with GWT, less time spent with Flex 2. For web applications that never have to be used "off line", these technologies meet my needs.

The Apollo developer pre-release looks compelling for applications that need to continue running when the user does not have an Internet connection. That said, I think that Adobe will get real competition from the open source OpenLaszlo system. Reason? OpenLaszlo is open source, has great funding, and has lots of Fortune 100 customers using it. I would like to see Adobe compete by releasing the Flex 2 SDK as open source (it is now free as in beer).

Friday, March 16, 2007

Factor

I just read about Factor on Dave Roberts' Finding Lisp blog. While I don't expect to do anything serious with Factor, I must say: it is very well done - at least the Mac OS X version that I tried. Add creative to well done. Read what Dave has to say about Factor (he also has good things to say about Erlang, for which I have been investing some learning time in the last month) and install Factor to see an interesting light weight environment. Factor reminds me slightly of using Oberon (a long time ago): a different way to work.

metaweb.com and freebase.com

I am always on the lookout for freely available sources of data in useful formats. Metaweb was founded by Danny Hillis and their first public system is at www.freebase.com.

"Freebase is a vast, free, open online database of structured knowledge" - from their web site.

One interesting thing, besides the technology for storing and querying structured data (the user can both define her own categories and use system wide categories), is that the hosted content is freely licensed under Creative Commons, the GNU Free Documentation License, or is in the public domain.

You need to request an invitation, and then the documentation provides information on accessing Freebase. I experimented during lunch time with their Python client APIs - cool stuff.

Thursday, March 15, 2007

Trip down memory lane: I installed Minix 3 tonight

I was an early Linux user (downloaded Slackware over a slow modem in early 1994) but I installed Minix before that - probably in 1992 or 1993.

While I was reading tonight, I installed Minix 3 with Parallels on my MacBook. The system worked OK, but ran really slowly (over 5 seconds to start Emacs). I have seen a few blogs lately on Minix 3 and I decided to install it, and it was fun to kick the tires for 20 minutes. I like the idea of Minix (an operating system simple enough for a hobbyist to understand) but the slow performance convinced me to delete the Parallels disk image for it. I don't know why it ran so slowly since Windows 2000 and Ubuntu Linux run very well under Parallels.

Wednesday, March 14, 2007

I have released some NLP (natural language processing) tools with a LGPL license

Here is the download link. These tools come in a few 'flavors': Java, Ruby, C++, and C#. I expect to add two larger NLP projects in the next month.

BTW, I consider the LGPL to be "business friendly". You are allowed to mix my LGPL software with your own commercial products without open sourcing your products. You may also mix my LGPL software with open source projects that use Apache, BSD, MIT, or Mozilla style licenses. If you have any questions, ask me. If the LGPL license prevents you from using this material, please let me know about it.

Integrating PHP and Ruby with Java server side deployments

I dislike re-writing code that works well without a very good reason. Sometimes there is a temptation to rewrite code in order to move to a new language and platform. The Quercus PHP run time WAR file from Caucho looks like a good way to mix and match existing server side Java and PHP projects. Several years ago, a customer asked me to loosely integrate (just share a common login) SugarCRM and the JSP+POJO+Prevayler based web app that I had written for them to manage their research papers and deliverables. Not difficult to do, but my integration solution was a hack. Compiling PHP to byte code for the Java platform and making easier (non-web service) calls between PHP and Java are obvious wins for developers. When JRuby's performance gets better (and it will :-) I look forward to the same type of integration with both Ruby and Rails.

Tuesday, March 13, 2007

Future of programming and IT jobs

This is a prediction made from a gut feeling and my own experiences: I think that in time, most jobs are going to involve building custom systems using mostly open source projects. In many cases off the shelf products are a poor fit to a business's work flow. On the other hand, building proprietary systems from the ground up is expensive and requires long development times.

Whenever a customer talks about a new system they need, the second step (after understanding the problem they need to solve) is always to identify quality open source projects to use. I believe that the effective developer of the future will:
  • Need to know about and perhaps have experience with many open source projects
  • Be skilled at reading code and understanding APIs
  • Have strong architecture skills for creating new systems with custom code and existing projects
  • Know when it makes business sense to contribute improvements back to open source developers, even if the license does not require this
  • Be a mentor to management on the trade offs of different forms of software licensing and different open source strategies
While there will always be jobs for computer scientists who break new ground and develop fundamentally new paradigms and technologies, most development work is in customized applications for organizations.

Monday, March 12, 2007

Google Guice: probable effects on my development methodology

As a developer, I have spent about equal time the last few years doing paid-for work in three programming languages: Common Lisp, Ruby, and Java. As you would expect, I try to apply the same tools and ideas for all three languages. This helps prevent the small "mental overhead" when switching languages. After reading the Guice User's Guide while having lunch today, I believe that I am going to use Guice for one or two of my "personal time" projects in order to try it out and add it to my toolkit.

The question that interests me is: how to use Guice patterns in dynamic languages like Common Lisp and Ruby? For Common Lisp, I recently started studying the project Closer - a wrapper for smaller projects for AOP and uniform interfaces (and enhancements) for MOP and context oriented programming. Good stuff!

For Ruby, there are a few projects for dependency injection (e.g., Needle).

I am not sure how well spreading this pattern over three programming languages will work for me, but it makes sense to put some effort into consistency.
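As a rough illustration of what a Guice style pattern can look like in a dynamic language, here is a minimal constructor injection sketch in Python (not Guice, Closer, or Needle). Everything in it is hypothetical: the Injector class, its bind and get methods, and the Mailer, FakeMailer, and SignupService example classes are invented for this sketch. It assumes that dependencies are declared as type annotations on constructor parameters.

```python
import inspect


class Injector:
    """Tiny constructor injector: maps a requested type to a concrete class."""

    def __init__(self):
        self._bindings = {}

    def bind(self, interface, implementation):
        """Record which concrete class to build when `interface` is requested."""
        self._bindings[interface] = implementation

    def get(self, cls):
        """Build an instance, recursively supplying annotated constructor args."""
        impl = self._bindings.get(cls, cls)
        params = inspect.signature(impl.__init__).parameters
        kwargs = {}
        for name, param in params.items():
            # Skip `self` and anything without a type annotation.
            if name == "self" or param.annotation is inspect.Parameter.empty:
                continue
            kwargs[name] = self.get(param.annotation)
        return impl(**kwargs)


# Hypothetical example classes, only here to exercise the injector.
class Mailer:
    def send(self, to, body):
        raise NotImplementedError


class FakeMailer(Mailer):
    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class SignupService:
    def __init__(self, mailer: Mailer):
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "Welcome!")


if __name__ == "__main__":
    injector = Injector()
    injector.bind(Mailer, FakeMailer)
    service = injector.get(SignupService)
    service.register("someone@example.com")
    print(service.mailer.sent)
```

The service never names a concrete mailer; swapping FakeMailer for a real one is a one line change to the binding. In a dynamic language much of this can also be done by simply passing objects around, so whether a container is worth it is a judgment call.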

Being a good web citizen

Ten years ago I had a long email conversation with someone who was blind. He had bought one of my books and was using one of the example programs to write a better screen scraper for web sites than the one he was currently using. This made me realize how important it is to test our web sites so that everyone has access to them. I periodically use the text only web browser Lynx to test my sites. BTW, if you are in a hurry looking for real information on the web, you might try Lynx: sometimes plain text is better and faster, if sites are well designed: keyboard-only web browsing :-)

Spend a little time getting used to Lynx, and you may like it as a fast alternative for web browsing.

Ideally, XHTML or HTML is used for web content, with visual styles added with CSS. JavaScript should be optional. It is very cool that using good web writing style also helps make sites more accessible for people with impaired vision. By the way, note that Blogger.com does a very good job of generating HTML that is easily viewed and navigated with a text only browser like Lynx. Try visiting this blog using Lynx.
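If you want a quick scripted check in addition to browsing with Lynx, the following sketch uses Python's standard html.parser module to list the text that remains once styles and scripts are ignored. It does not emulate Lynx or a screen reader; the TextOnlyView class and text_only function are names made up for this example. If the important content of a page does not show up in this list, that is a hint that the page leans too heavily on JavaScript or styling.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collects the text left over when styles and scripts are ignored."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is outside <script> and <style>.
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def text_only(html):
    """Return the visible text fragments of an HTML string, in order."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red}</style>"
        "<script>var x = 1;</script></head>"
        "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    )
    print(text_only(sample))
```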

Sunday, March 11, 2007

New version of my free web book "The Software Development Book"

I have not done too much work on my free web book The Software Development Book in the last two years, but I did work on it quite a bit this weekend. This is a high level book, really my best advice in a nutshell. I hope to finish it up in the next month. I have been working on a "Ruby AI" free web book, but I noticed last Friday how many people were downloading the older version of "The Software Development Book", which motivated me to finish it first.

As I mentioned in an earlier blog entry, starting in March 2007, I am taking a new approach to distributing my web books:
  • During development (writing), my web books are accessible as HTML linked from http://markwatson.com/opencontent
  • When the books are complete, the HTML links will point to the latest version of the book, and additional links will point to a PDF version and to a purchase page for a physical book at lulu.com
I am in the process of streamlining my writing production. I work using Latex (TexShop) and OmniGraffle for technical diagrams. A simple automated build process “publishes” media for free viewing on my web site, and generates media for purchases at lulu.com. My web books will always be available for free reading on my web site. I expect to have all of my projects configured in my new writing system in the next few weeks. After that, I hope to spend about one day a week writing. The great thing (for me!) is that I will get to spend my writing time actually writing and not dealing with production issues.

As always, I am grateful to receive reports on misspellings, etc., and even more grateful to receive suggestions.

Wednesday, March 07, 2007

I took another look at Grails

Grails has a new release so I spent a little time with it last night and during lunch break today. The new release looks much improved over what I looked at last year, but really, Groovy is not as flexible as Ruby, and Grails lacks the quality of Rails at this point. That said, Grails is a technology that I am hopeful for, much like I hope that JRuby gets fast and well supported as a platform for running Ruby code on. When I write JSP+POJO+Hibernate web apps (my preferred light weight Java stack), I like to use custom tag libraries and Grails' nifty way of writing tag libraries looks very cool indeed.

Tuesday, March 06, 2007

Very best tools to work with

I wrote about how I spent much of Sunday streamlining my writing setup (Latex, TexShop, my own shell and Ruby scripting). I continued this process by upgrading to IntelliJ version 6. I was using a trial update and found the new Ruby support and other small changes worth the update price from version 5. While TextMate seems best for working on small Ruby projects, the new IntelliJ Ruby and Rails plugin works very well for me on larger projects. For Mac users, TextMate is a "must have" tool. Even when I am developing Common Lisp applications (I use Franz Lisp + Emacs + Franz ELI), I keep the entire project folder open in TextMate - choosing to only keep Emacs buffers open for the files I am making serious changes to and using TextMate for browsing and short edits.

Before spending the money for an IntelliJ upgrade I revisited Eclipse for Java and Ruby development and the latest code drop for NetBeans 6 (testing) that has good Ruby and Rails support. In the end, I went with a supported commercial product. BTW, I find it interesting that the three most popular Java IDEs now support both Ruby and Ruby on Rails development - that tells you something about the momentum that Ruby has. I enjoy Ruby so much that I have all but stopped using Python.

Lunch with an opponent from the 1978 US Chess Open

I am not a particularly strong chess player: I started by winning two tournaments (no losses!) in San Diego after I graduated from college, then went to the US Open and had my clock cleaned, and more or less gave up the game except for casual games. (I really prefer Backgammon and Go.) I had the pleasure of having lunch yesterday with one of my opponents from 1978. I was playing a casual game with Marty in 1999 (when we first moved to Sedona) and recognized him as an old competitor. It is fun to catch up with old acquaintances.

Sunday, March 04, 2007

Automating the technical writing process

It pays off in the long run to write some custom code and tweak your own working environment. How many programmers spend time writing code to improve other people's productivity but don't invest the time to automate their own work flow? Probably too many.

I have been working fairly hard in my free time on two "free web books" for my main web site. In the past I have simply created PDF files and made them available on my site. Although I write my "free web books" using a Creative Commons license, I decided that I wanted to make at least a small revenue stream so that I can spend more time on these projects. My plan has been to have all of my web books readable for free online with a small Google Adsense advertisement at the beginning of each chapter, and to offer through lulu.com the ability to buy PDFs with no advertising for a few dollars or get a printed book for about $12.

I would prefer spending my time writing rather than preparing HTML, PDF, and printer ready PDF. I think that I now have this process automated about as fully as I can, using custom Ruby code, shell scripting, and a "do everything" Makefile. I had experimented with using OpenOffice.org and writing some utilities to modify the generated HTML. I also experimented with the very cool SiSU system.

In the end, I went back to automating my writing setup using Latex, htlatex, pdflatex, custom Ruby code and shell scripts, and the OS X Latex editor TexShop. It feels good to get things working "just right".
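For readers who want to try something similar, here is a small sketch of the general idea, written in Python rather than the Ruby, shell, and Makefile combination described above. It is not my actual build script: the function names build_commands and build and the example file name are invented for illustration. It assumes pdflatex and htlatex are installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file):
    """Return the commands that produce PDF and HTML from one LaTeX source."""
    name = Path(tex_file).name
    return [
        # pdflatex is run twice so cross references and the table of
        # contents are resolved on the second pass.
        ["pdflatex", "-interaction=nonstopmode", name],
        ["pdflatex", "-interaction=nonstopmode", name],
        # htlatex (part of TeX4ht) generates the HTML version.
        ["htlatex", name],
    ]


def build(tex_file):
    """Run every build step in the source file's directory, stopping on error."""
    workdir = Path(tex_file).resolve().parent
    for command in build_commands(tex_file):
        subprocess.run(command, cwd=workdir, check=True)


if __name__ == "__main__":
    # Hypothetical file name; nothing is executed here, only listed.
    for command in build_commands("books/example_book.tex"):
        print(" ".join(command))
```

Keeping the list of commands separate from the code that runs them makes it easy to print a "dry run" first, which is handy while the setup is still being tweaked.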

Saturday, March 03, 2007

Source code for FastTag released - free for non-commercial use

I have started to release my commercial products with a free for non-commercial license, with source code. I repackaged the FastTag part of speech tagger this morning and it is now available. A separate version of FastTag uses the MEDPOST medical term lexicon.