Monday, January 28, 2013

A little weird, but interesting: using IntelliJ LeClojure's debugger with Clojure code

I have been working on a web app using Compojure, Noir, and Hiccup. I usually run both lein repl and lein run concurrently and use either IntelliJ or Aquamacs (Mac Emacs) as a code editor. If I am working on a small bit of new code and experimenting a lot, then Aquamacs + nrepl works well for me. If I have my MacBook Air plugged into a huge monitor sometimes I run lein run and a LeClojure repl inside IntelliJ and detach a few edit panes as separate windows - so many nice choices for development!

I don't really like debuggers because getting into the habit of using them can end up wasting a lot of time that could be spent writing unit tests, etc. That said, I was looking at some Ruby code I wrote for a customer but have not deployed and it occurred to me to try the RubyMine debugger which worked very well and generally didn't get in my way or waste too much time manually stepping through some code.

So, I decided to spend a little time trying IntelliJ LeClojure's debugger on my Clojure project:

I wrote a little stub file to run as the debug target "script":

(ns cookingspace)
(use 'cookingspace)
(-main)
I placed this file in the top level project directory so the relative path for loading required data files is set up OK. Also, I had to change the default run/debug option to not make the project before running - this is necessary since I have top level expressions that initialize data from reading data files, and the Clojure compiler is not run from the project directory so I would get build errors. You probably will not have that issue with your projects.

One thing that is cool about running the app in the debugger is that the LeClojure plugin also starts a repl (see the second tab at the bottom to the screenshot) that shows all debug println print outs and provides a repl without having to start a separate process.

I don't think that I will use the debugger very often (again, using a debugger to understand and debug code is usually not the best use of time) but it is cool to have another tool. That said, I might find that always running in the debugger and only setting breakpoints on the rare occasions that I need to better understand what some code is doing might work well. I will try this mode of development for a few hours a week as an experiment.

Friday, January 18, 2013

Cooking functionally

I am not actually using functional programming in the kitchen :-)

I am re-writing my old cookingspace.com web app in Clojure with a new twist: the new web app will be an AI agent for planning recipes based on the food you have on-hand and based on a user's history, preferences, and the general mood they are in (food-wise). I have written some tools to convert my old relational database code for (more or less) static data (USDA nutrition information, recipes) to Clojure literal data.

Based on the type of food a user feels like cooking and what food they have on-hand, I need to transform recipes in an intelligent way to (hopefully!) also taste great after ingredients have substituted or morphed because a user would prefer a different type of sauce over a base recipe, etc., etc.

Anyway, working in Lisp and experimenting in a repl is making the entire process easier.

My wife and I are both very good cooks and we can make up super-tasty recipes on the fly with whatever ingredients we have on hand. I am betting on two things: I can automate our expertise and that people will enjoy using the system when it is finished.

Wednesday, January 16, 2013

Faceted search: one take on Facebook's new Graph Search vs. other search services

Faceted search is search where the domain being search is filtered by categories or some taxonomy. Individuals become first class objects in Facebook's new Graph Search and (apparently) search is relative to a node in their social graph that represents the Facebook user, other users they are connected with, and data for connected users.

I don't yet have access to Facebook's new Graph Search but I have no reason to doubt that as it evolves both Facebook users and Facebook's customers (i.e., advertisers and other organizations that make money from user data) should be happy with the service.

Google's Knowledge Graph and their search in general are also personalized per user. Once again this is made possible by collecting data on users and monetizing by providing this information to their customers (i.e., once more, advertisers, etc.)

Pardon a plug for the Evernote service (I am a happy paying customer): Evernote serves as private search. Throw everything relavant to my digital life into Evernote, and I can later search "just my stuff." I don't doubt that Evernote somehow also makes money by aggregating user information.

I assume that any 3rd party web service I use is somehow monetizing on data about me. I decide to use 3rd services more for the value they provide since my cynical self assumes the worse about privacy.

Dealing with faceted search and graph databases at Facebook and Google scale is an engineering art form. Fortunately for the rest of us, frameworks/libraries like Solr (faceted search) and Neo4J (a very easy to use graph database) make it straight forward to experiment with and use the same technologies, but admittedly without the advantage of very large data stores.

Sunday, January 13, 2013

The Three Things

Since I (mostly) shut down my consulting business (which I have enjoyed for 15 years) in favor of developing my own business ideas I find that I am working more hours a week now. When consulting the breakdown of my time spent per week was roughly 15 hours consulting, 8 hours writing, 5 hours reading technical material (blogs, books, and papers), 5 hours programming on side projects, and many hours of "me time" (hiking, cooking, reading fiction, meditation, and hanging out with my wife and our friends).

Wondering what "The Three Things" are? Hang on, I will get to that.

I now find myself spending 40+ hours a week on business development. Funny thing is, I still am spending a lot of time on the same activities like hiking, hanging out, reading, and side projects. I feel like a magician who has created extra time. Now for The Three Things:

I have always known this but things have become more clear to me:

  1. Spend time on what energizes you. If work drags you down and robs you of energy then you are probably doing the wrong thing. Whenever you can, engage in activities you love doing rather than maximizing how much money you make.
  2. Continually try to improve yourself by choosing activities for both your personal growth and that have a positive effect on other people.
  3. Even in the face of disappointments in life try to live in a constant state of gratitude for loved ones, skills, opportunities, and the adventure of not really knowing what comes next in life.

Friday, January 11, 2013

Ray Kurzweil On Future AI Project At Google

Here is a good 11 minute interview with Ray Kurzweil.

In the past Google has been fairly open with publishing details of how their infrastructure works (e.g., map reduce, google file system, etc.) so I am hopeful that the work of Ray Kurzweil, Peter Norvig, and their colleagues will be published, sooner rather than later.

Kurzweil talks in the video about how the neocortex builds hierarchical models of the world through experience and he pioneered the use of Hierarchical hidden Markov Models. It is beyond my own ability to judge if HHMMs are better than the type of hierarchical models formed in deep neural networks, as discussed a lot by Geoffrey Hinton in his class "Neural Networks for Machine Learning." In this video and in Kurzweil's recent Authors at Google talk he also discusses IBM's Watson project and how it is capable of capturing semantic information from articles it reads; humans do a better job at getting information from a single article, but as Kurzweil says, IBM Watson can read every Wikipedia article - something that we can not do.

As an old Lisp Hacker it fascinates me that Google does not use Lisp languages for AI since languages like Common Lisp and Clojure are my go-to languages for coding "difficult problems" (otherwise just use Java <grin>). I first met Peter Norvig at the Lisp Users & Vendors Conference (LUV) in San Diego in 1992. His fantastic book "Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp" had just been published in 1991, as had my much less good Springer-Verlag book "Common LISP Modules: Artificial Intelligence in the Era of Neural Networks and Chaos Theory." Anyway, it is not for me to tell companies what programming languages to use :-)

I thought that one of the most interesting parts of the linked video was Kurzweil's mention of how he sees real AI (i.e., being able to understand natural language) will fit into Google's products.

While individuals like myself and small companies don't have the infrastructure and data resources that Google has, if you are interested in "real AI", deep neural networks, etc. I believe that it is still possible to perform useful (or at least interesting) experiments with smaller data sets. I usually use a dump of all Wikipedia articles, without the comments and edit history. Last year I processed Wikipedia with both my KBSPortal NLP software (which, incidentally, I am hoping to ship a major new release in about two months) and the excellent OpenCalais web services. These experiments only took some patience and a leased Hetzner quad i7 32GB server. As I have time, I would also like to experiment with a deep neural network language model as discussed by Geoffrey Hinton in his class.