Tuesday, January 26, 2016

Great talk on Spark

I just listened to an ACM-sponsored talk, Making Big Data Processing Simple With Spark, by Matei Zaharia. You may need to be an ACM member to watch the webinar. I first joined ACM in the mid 1970s - recommended.

For handling huge datasets Spark is evolutionary or revolutionary depending on your point of view. A bit of personal history before I talk specifically about Spark:

In the late 1980s I was an architect and developer on a multinational project to use seismic data from 38 data collection stations to detect atomic bomb tests. All of our data handling software was custom; if we had Spark, or even Hadoop, we would have saved a ton of effort. Similarly, in the 1990s I was tech lead on a fraud detection system that used massive real time telephone records data sets. Modern infrastructure would have saved a lot of time and money.

My first serious use of map reduce was processing large Twitter data sets at Compass Labs. We used Hadoop on Amazon Elastic MapReduce. Later when I worked as a contractor at Google, in addition to using map reduce, I was introduced to realtime interactive tools like Dremel that made it easy to interactively use large data sets.

With Spark, everyone gets to interactively work with massive datasets! I think that Spark is evolutionary in that it builds on and plugs into existing work like the Hadoop File System and supports familiar map reduce style operations. I think that it is revolutionary in its memory based distributed architecture and application programming model. Spark was designed to address the limitations of map reduce systems like Hadoop that, while providing easy to use programming models, have inefficiencies in data access. With Spark, you have an easy to use programming model, more efficiency, and built in interactivity. I have examples of using Spark in my last book Power Java. You can experiment with Spark on your laptop and only worry about accessing a cluster when you need to scale.
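
To make "map reduce style operations" concrete, here is a minimal sketch of the map/reduce shape as a word count. This is plain Haskell, not Spark code (Spark's own APIs are for Scala, Java, and Python); the function names mapLine, reduceCounts, and wordCounts are made up for this illustration.

```haskell
import qualified Data.Map.Strict as Map
import Data.Char (isAlpha, toLower)

-- "Map" step: turn one line of text into (word, 1) pairs.
-- Non-letters are treated as separators and words are lowercased.
mapLine :: String -> [(String, Int)]
mapLine line = [(map toLower w, 1) | w <- words (map keepAlpha line)]
  where keepAlpha c = if isAlpha c then c else ' '

-- "Reduce" step: sum the counts for each distinct word.
reduceCounts :: [(String, Int)] -> Map.Map String Int
reduceCounts = Map.fromListWith (+)

-- The whole job: map every line, then reduce all of the pairs.
wordCounts :: [String] -> Map.Map String Int
wordCounts = reduceCounts . concatMap mapLine

main :: IO ()
main = mapM_ print (Map.toList (wordCounts ["Spark builds on Hadoop", "Hadoop and Spark"]))
```

In a real cluster system the map step runs in parallel over partitions of the data and the reduce step combines the partial results; the sketch only shows the programming model on a single machine.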


Saturday, January 23, 2016

Simple Haskell: using a sqlite3 database

I have been using Lisp languages professionally since the early 1980s. While I now use Java, Ruby, and Clojure for much of my work, I have slowly been getting up to speed using Haskell over the last 5 years. Almost 100% of my difficulties using Haskell occur when I need to write impure Haskell code. This occasional discomfort is made up for by the fun and productivity of writing pure Haskell code. Using haskell-mode in Emacs I get the same happy feeling writing pure Haskell code that I used to get using Common Lisp, Scheme, and Clojure - and with the advantages of a strongly typed language!

I like to mock up test data and write the pure code first and then write impure code that needs to access the web, RDF data stores, relational databases, file IO, etc. For me, as a student of Haskell, this is the easiest way to write Haskell programs.

About 15 years ago, in one of my Java artificial intelligence books I wrote an example program that provides a natural language processing (NLP) interface to relational databases. I have decided that I would like to do the same, but in Haskell, and take advantage of what I have learned in the last 15 years. Writing the code to convert natural language queries into SQL queries is pure Haskell code (given mockup data for database metadata and sample table data, and test NLP queries) and I am enjoying working on that. Eventually I will need to write some impure code that accesses the popular databases. To make the initial development as easy as possible (a good idea since I may never totally finish this side project) I have decided that I will use sqlite and the sqlite-simple library. For the first proof of concept/prototype, I don't expect to need much impure code. A good thing!
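
As a rough illustration of why this part can stay pure, here is a minimal sketch of mapping a query to SQL using only mock metadata. It is not the code from my project: the Schema type, mockSchema, the toSql function, and the "customers" table are all hypothetical names invented for this example, and the keyword matching is deliberately naive.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Mock database metadata: table names paired with their column names.
type Schema = [(String, [String])]

mockSchema :: Schema
mockSchema = [("test", ["id", "str"]), ("customers", ["id", "name", "city"])]

-- Pure conversion: pick the first table named in the question, select any
-- of its columns that are also named, and fall back to * otherwise.
toSql :: Schema -> String -> Maybe String
toSql schema question =
  case [(t, cols) | (t, cols) <- schema, t `elem` ws] of
    []              -> Nothing
    ((t, cols) : _) ->
      let wanted   = filter (`elem` ws) cols
          selected = if null wanted then "*" else intercalate ", " wanted
      in Just ("SELECT " ++ selected ++ " FROM " ++ t)
  where
    ws = words (map toLower question)

main :: IO ()
main = do
  print (toSql mockSchema "show name and city for all customers")
  print (toSql mockSchema "list everything in test")
  print (toSql mockSchema "what is the weather")
```

Because toSql takes the schema as an ordinary argument, it can be tested with mock data now and later be handed metadata read from a real database by a small amount of impure code.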

This reminds me of a comment Erik Meijer made when he was teaching the edX functional programming class. He said that as developers we can think of pure Haskell code as being islands and impure code that has to maintain state and interact with the world as the ocean containing the islands. I like this metaphor!

I write little code snippets (or sometimes mini-projects) to experiment with impure Haskell code. The following listing, derived from the sqlite-simple library example, contains small experiments with the functionality that I need for now. I thought it was worth sharing in case this saves anyone else some time:


{-# LANGUAGE OverloadedStrings #-}
import Database.SQLite.Simple

{-
   Create sqlite database:
     sqlite3 test.db "create table test (id integer primary key, str text);"

   This is derived from the example at github.com/nurpax/sqlite-simple
-}

main :: IO ()
main = do
  conn <- open "test.db"
  -- start by getting table names in database:
  do
    r <- query_ conn "SELECT name FROM sqlite_master WHERE type='table'" :: IO [(Only String)]
    print "Table names in database test.db:"
    mapM_ (print . fromOnly) r
  
  -- get the metadata for table test in test.db:
  do
    r <- query_ conn "SELECT sql FROM sqlite_master WHERE type='table' and name='test'" :: IO [(Only String)]
    print "SQL to create table 'test' in database test.db:"
    mapM_ (print . fromOnly) r
  
  -- add a row to table 'test' and print out the rows in table 'test':
  do
    execute conn "INSERT INTO test (str) VALUES (?)"
      (Only ("test string 2" :: String))
    r2 <- query_ conn "SELECT * from test" :: IO [(Int, String)]
    print "number of rows in table 'test':"
    print (length r2)
    print "rows in table 'test':"
    mapM_ print  r2
    
  close conn
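
A possible variation on the listing above, offered as a sketch rather than as part of my original experiment: sqlite-simple also provides withConnection, which closes the connection for you, and a FromRow class for reading rows into your own type. The TestRow type is a name invented for this example, and the code assumes the same test.db file and table as above.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Database.SQLite.Simple
import Database.SQLite.Simple.FromRow (field)

-- A typed view of one row in table 'test' (hypothetical type for this sketch).
data TestRow = TestRow Int String deriving Show

instance FromRow TestRow where
  fromRow = TestRow <$> field <*> field

main :: IO ()
main =
  -- withConnection opens test.db and closes it again when the action
  -- finishes, even if an exception is thrown.
  withConnection "test.db" $ \conn -> do
    -- The ? placeholder is filled in from the parameter tuple.
    rows <- query conn "SELECT id, str FROM test WHERE str = ?"
              (Only ("test string 2" :: String)) :: IO [TestRow]
    mapM_ print rows
```

Using query with a ? placeholder instead of building the SQL string by hand keeps parameter values out of the SQL text.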

Just to make this example complete, here is my stack.yaml file:

resolver: lts-4.0
packages:
- '.'
extra-deps: []
flags: {}

And here is my sqlite.cabal file:

name:                sqlite
version:             0.1.0.0
synopsis:            Experiment with sqlite-simple
description:         Derived from example in github.com/nurpax/sqlite-simple
homepage:            https://github.com/mark-watson?tab=repositories
license:             Apache2
license-file:        LICENSE
author:              Mark Watson
maintainer:          [email protected]
copyright:           2016 Mark Watson
category:            Web
build-type:          Simple
-- extra-source-files:
cabal-version:       >=1.10

executable test1
  hs-source-dirs:      .
  main-is:             test1.hs
  ghc-options:         -threaded -rtsopts -with-rtsopts=-N
  build-depends:       base
                     , sqlite-simple
  default-language:    Haskell2010

Here is a build and sample run (assuming that the sqlite database test.db has been created as per the comments in the first source listing):

✗ stack build
✗ stack exec test1
"Table names in database test.db:"
"test"
"SQL to create table 'test' in database test.db:"
"CREATE TABLE test (id INTEGER PRIMARY KEY, str text)"
"number of rows in table 'test':"
3
"rows in table 'test':"
(1,"test string 2")
(2,"test string 2")
(3,"test string 2")

I would like to thank Janne Hellsten for maintaining the sqlite-simple library and I would also like to thank the developers of stack. Using stack has solved most of my build issues with Haskell. Thanks!

Thursday, January 14, 2016

I will not vote for Hillary Clinton. I reject the "lesser of two evils" argument.

I believe that Hillary Clinton is in the pocket of Wall Street, a lackey by any definition. I also believe that she is, as Ralph Nader says, a poster child for the military industrial complex. I also don't like her close ties to agribusiness giant Monsanto and her advocacy for the industry's genetically modified crops.

I believe that our two party system is broken, almost never giving us a choice that matches the preferences of the electorate. Corporate news organizations favor Clinton over Bernie Sanders in subtle and unfair ways, basing so much of their slanted (as directed by the financial interests of the network owners) discussion on the assumption that Hillary Clinton will be the Democratic candidate and pushing the false narrative that Bernie Sanders has no chance of winning the general election.

Some of my friends who are Democrats believe that it is a mistake to not vote for whatever Democratic toady the establishment runs. What if a Republican wins? Oh NOoos! The sky will fall.

I believe that the sky will fall on our representative democracy if people don't stand up to the political establishment and the corporations that their preferred candidates represent.

Thursday, December 31, 2015

Happy New Year

Hello everyone. I want to wish everyone a happy new year and say a few things about what I expect for the new year.

I believe that one of the most important issues facing "first world" countries like the USA and England is Internet security and privacy. The umbrage of US congress people at this morning's news that the NSA is monitoring their communications with people in the Israeli government is laughable. Let us be clear about this: these people don't care about the privacy of US citizens but they do care about their own privacy and the privacy of leaders of another country. This stinks, and badly.

While privacy is important I believe that a bigger issue is security. I would like to see my government (USA) conduct a multi-year "going to the moon" type project for strengthening our information infrastructures to the benefit of people, companies, and governments. This means that there can be no encryption back doors installed in any software and hardware systems. If governments have universal decryption keys then eventually these keys will leak to organized crime, terrorists, and other governments. Imagine the scenario of everyone waking up some morning to emptied bank accounts - that is a possible scenario if 'back doors' are installed in public infrastructure.

On a happier note, the will of the people in my country regarding labeling of GMO foods has won out, at least for now. I believe that 90% of people in the USA poll in favor of accurate labeling of foods. When you consider the power of the food corporation lobbying bloc, with their paid for politicians (Hillary Clinton and most of the Republican presidential candidates, as well as most of Congress), this is a surprising but good victory. Yay!

Despite environmental and political corruption problems I remain very optimistic about the future.

I expect scientific advances in clean energy, artificial intelligence and robotics, and medical breakthroughs to continue at a rapid pace and yield benefits for most people on earth.

In my field (artificial intelligence) we have seen enormous progress in development of useful systems based on deep learning. That said, I don't believe that deep learning is the path to general artificial intelligence. Deep learning is an elegant hack (for training many layer neural networks) but we need a formal model for true AI.

On personal technology: I have spent an enormous amount of time (very enjoyable time) studying and using Haskell, Clojure, Scala, and other languages on projects. While I will always allocate time for learning and practicing with new languages and technologies, for 2016 I have made a New Year's resolution to "just use Java 8" and "get stuff done." I will continue to mostly use Ruby when I need a scripting language. My decision is to spend more time on artificial intelligence research and projects and Java 8 is usually a practical enough language for this work.

Thursday, December 10, 2015

Raspberry Pi and education

I may be late to the Raspberry Pi party - I just bought my first one this week. The Raspberry Pi is everything that I would hope for in an educational computer: cheap enough for all children to own and based on open source software (Debian Linux, LibreOffice, lots of games, and programming languages like Python, Ruby, Java, Scratch, etc. preinstalled).

The open nature of the Raspberry Pi encourages kids to experiment. RPs might not be as practical as other systems like Chromebooks that have more distributed infrastructure behind them but I think that open systems provide a better environment for experimenting with computers.

I reformatted a 32GB memory card and installed a fresh Debian Linux image provided by the Raspberry Pi project and when hooked up to a large monitor the Raspberry Pi 2 is quite capable. I installed the RubyMine IDE and git cloned a few of my Ruby projects and loaded the manuscript for my current writing project. I find the system is surprisingly fast with its 4 core ARM processor. For fun I have used it for my work for the last day. Of course I am writing this blog article on my RP setup.

Our future lies in how well our educational system works. In the modern world people should never stop learning new things both for the fun of it and to enhance their careers and their contributions to society. Very inexpensive devices like the RP (the latest model costs $5) that can be experimented with provide children with a good model for a lifelong process of experimenting and learning.

Saturday, December 05, 2015

Digital Life: a modicum of privacy

This post contains my advice for maintaining a reasonable amount of privacy without reducing the utility and entertainment we get from the Internet. It is no news that governments are pushing back against our right of privacy and we should also be concerned by tracking by both corporations and organized crime. Privacy is a basic human right and once rights are lost or reduced in scope they can be very difficult to get back.

To start with I believe that everyone should have the privacy enhanced Tor web browser installed. The onion routing technology behind Tor was originally developed at the US Naval Research Laboratory, and Tor is now used by journalists and other people living in countries with oppressive regimes. I strongly recommend using Tor for the following reasons:

  • Research any medical conditions that you have.
  • You are interested in buying a product and you don't want advertisers to put ads on web sites you visit because you would rather make independent unbiased purchasing decisions.
  • Visit any sites for any reason that you would not like a future employer to know that you visited. We all look at odd information on the web out of curiosity, research, or for whatever reasons.
  • The availability of privacy enhancing tools is important and at least occasional use by the general public of tools like Tor helps to legitimize these tools.

I don't think for a minute that privacy enhancement tools prevent major government actors like the NSA and GCHQ from accessing our private data. They slow them down a little, which I argue is a good thing, but do not stop them. For the general public the real benefits come from stopping (or slowing down) access to your data by corporations and organized crime. I think that it would be naive to think that organized crime does not have the interest and the ability to collect private data.

Private cloud storage: I use SpiderOak but there are several other good safe storage options.

When I was a kid I enjoyed writing in a diary. I sort of do the same thing as an adult, writing many short categorized notes about things I want to do, personal philosophy and spirituality, ideas for writing projects, travel notes, etc. I think that if it is worthwhile seriously thinking about something then it is worthwhile making notes. I now use the simple text markdown format for these notes - writing notes helps organize our thoughts and later quickly find old ideas we took the time to journal. For years I used cloud services like Google Docs + Keep and Microsoft OneNote. I am mostly transitioning to using secure and private cloud storage and as it turns out, well organized notes in markdown are as convenient as storing my ideas and notes in other less secure cloud services.

Online banking: I prefer to use (relatively) locked down devices like an iPad or a Chromebook for online banking. I think it is less likely that these devices are compromised than Windows, Linux, or Mac laptops. And don't forget to use a private mode window in your browser when doing online banking, accessing sensitive government web sites like Affordable Health Care, etc.

What about using social media? I enjoy social media, especially Google+ and Twitter, and I use all social media to shamelessly plug the books that I write. I use a simple trick for using social media and using Google search: I use the Chrome web browser for these tasks and use either Firefox or Safari for all other web browsing. As far as tracking activities go, this helps prevent information leakage. It is a bit of a nuisance: when I see a web link on social media I would like to look at, instead of clicking the link I right-click the link to copy the URI and use a keyboard shortcut to switch to Firefox or Safari and paste in the link. Yes, this takes about 4 or 5 seconds and is a little inconvenient.

Governments and corporations use strong encryption and so should you. Encryption drives safe information flows and is vital to all of the world economies. Encryption can not have "back doors" because of the threat to the global economy and that of companies and individuals if organized crime (when I talk about organized crime I am also including organizations that others might call terrorists) gained access to back door encryption keys. The damage this would cause is unimaginable. Fortunately many consumer computing devices support encrypted file storage out of the box: modern Android phones, iPhones, iPads, Mac OS X, Ubuntu Linux, and the professional versions of Windows 10. Use encryption - it is well worth the effort.

Wednesday, November 25, 2015

Ruby SVM text classifier

There are several useful Ruby gems/libraries for using Support Vector Machines (SVM) and another to convert text into SVM style feature vectors. I recently packaged up what I needed with a Ruby script to fold the data for testing, etc.

Here is the github repository.

It took me a short while to get everything working together so hopefully this will save you several minutes of extra effort if you want to use SVM for text classification.

Friday, November 13, 2015

My new book "Power Java" is released today

I recommend that you look through the github repo for my book to see what I cover and if it looks interesting please consider buying my new book on leanpub.

I cover a wide range of topics including machine learning, linked data, network programming techniques for IoT, and some ideas for knowledge management using cloud data.

Thursday, October 01, 2015

My Cognition Technology blog and website

I created a new blog yesterday http://blog.cognition.tech for news and my personal programming experiments involving machine learning and deep learning. There is a companion website www.cognition.tech where I offer consulting, mentoring, and turnkey development.

I will continue using this blog for personal posts, programming languages (mostly Haskell, Java, Clojure, and Ruby) and general discussions on technology.

Monday, September 28, 2015

I need some sympathy: spending most of my time coding in Java and Python

As the universe unfolds, I have been spending most of my time recently working with machine learning and for the foreseeable future that will not change. Let's face it, many of the really great ML libraries are written in Java and Python.

I still love development with Clojure and Ruby, and I am still on my long term quest to become passably proficient with Haskell. That said, it is crazy to not simply use languages that have the best library and framework support for any task.

Sunday, August 02, 2015

We are getting closer to the dream of the 1980s and 1990s: software reuse

I worked (mostly) at SAIC in the 1980s and 1990s. In the groups I worked in we developed large software systems (sometimes with hardware components) for customers. Software reuse was a dream back then that was largely unfulfilled. Our procedure for reuse was mostly cut and paste from old projects, with some effort to write reusable libraries. There was also a movement to use commercial off the shelf (COTS) software.

It occurred to me recently that we are now much closer to the dream of widespread software reuse. What has changed is a healthy open source (and libre) software ecosystem of trusted and vetted libraries, frameworks, and complete applications. I tend to trust software from FSF and the Apache Foundation, for example. Organizations and individuals are motivated to release software for a variety of reasons: for help in development and bug detection, for good publicity and self promotion, and sometimes for ego. All good reasons!

My process when starting a new project is to first identify existing open source software that I can build on. My choice of programming language is often dictated by the language used in the open source software projects that would be most beneficial to my project. It is a thrill to build a new project using mostly existing software. This greatly reduces the cost of projects and I think also greatly increases developer satisfaction. Who wants to spend 6 months writing a project mostly from scratch when it can be done much more quickly building on other people's work?

For me the magic that makes this all happen is public repositories like github and bitbucket. The cost of evaluating an open source project for reuse can be very low: a git clone, build, run the tests, look at the tests, and read through the documentation and source code.

So I believe that we have made incredible progress in software reuse in the last 30 years.

Wednesday, July 29, 2015

I tried Windows 10 the first day of the rollout (today!)

Installing Windows 10 on my 5 month old HP Stream 11 was easy. I have no comments on that process.

Visually two things stand out: windows are all white except for a very thin aqua blue margin and my slow laptop seems to run the UI faster. I don't know how much of the speed bump is making the code more efficient and how much is doing away with some animation effects.

The desktop now seems like a mixture between Windows 7 and 8.1. The start menu is back and the bottom icon navigation bar is always visible along the bottom of the screen unless you put an application in full screen mode. Clicking the windows start menu in the lower left corner of the screen brings up a combo: classic looking menu on the left and the metro large icon interface like Windows 8.1 on the right side of the popup - but this popup only covers about 30% of the screen. It all seems a bit odd to me but I really like it - after using it for ten minutes (it took a little while to adjust to the new interface). I think the new interface in general is very well done. I haven't spent enough time with the new Edge web browser to have a firm opinion yet, but it seems functional and the reading view does a good job of reformatting web pages without advertisements for comfortable reading.

Windows 10 Cortana, similar to Google Now and Apple Siri, is always available just to the right of the Windows start button. Just like the search utility on Windows 8.1 this is the way to quickly find stuff in the system. Need to change the PATH? Just type 'environment variables' and instantly the environment edit utility is shown. I think this actually works a little better than Spotlight on OS X and quite a bit better than hot key search on Ubuntu. I tried using Cortana for searching for things in my community. It did OK, but will hopefully get better as it gets access to more of my life context.

I will spend time locking down some of the privacy settings. I already deleted the Skype application because it leaks a little too much personal information for my taste. Cortana is configurable for setting which types of information are collected. It pays to take the time to check the available privacy settings for products and services from Microsoft, Google, Facebook, Apple, etc. I tend to keep my Linux laptops more locked down, privacy wise, than my Windows and Apple laptops. I try to strike a balance between having some privacy and also enjoying available products and services.

Saturday, July 25, 2015

Comparing Clojure + Clojurescript with Scala + Scala.js

Even though I mostly use Java, Ruby, and Haskell, I have also been getting my head back into using Scala in my spare time. I took Martin Odersky's Functional Programming Principles in Scala class three years ago, and although I really enjoyed the class (he is a great lecturer!), I didn't much care for the tooling for Scala at that time. I ended up mostly using Clojure (with a little Haskell) for my day to day most used programming language.

I experimented with Scala.js a while back and thought that it compared well with Clojurescript. Sweet to write client code in either Clojurescript or Scala.js but I think that sometimes it is faster to not have the extra complexity and the need to transpile and just use plain old Javascript. I took a class in Typescript this year and really liked it but with ES6 quickly becoming a standard some of the benefits of Typescript go away.

This morning I was looking for an interesting template project using Scala for the backend and Scala.js for client code and ran across the impressive Widok project which supports a reactive web framework for the JVM and Scala.js. I have been experimenting this morning with the Widok skeleton client server starting project which provides niceties like automatic asset (SCSS to CSS, etc.) handling, auto re-compilation, etc. I don't have a need for using Widok right now but it is on my personal "watch list." Widok is really nicely done and the client server skeleton project is easy and fun to hack on.

After three years I still find the tooling for Scala to be a bit heavy. I have a much faster laptop on order and that will help reduce the 5 or 10 second delay in processing changed assets and code for my work and generally experimenting with Scala, playing with Widok, etc.

Tooling is where Clojure and Clojurescript development really shine in my opinion. My Clojure setup allows me to edit server side code, Clojurescript client code, change hiccup template code, etc. and basically immediately see the results of code changes. I like that my hiccup templates are also Clojure code. It is fun to work on projects where everything is simply Clojure. Nice. I also like that I am not using any heavy weight frameworks, just a few libraries. This contrasts with richer development kits like Rails and Meteor.js which are fantastic when your application fits their application models. I like just using libraries and keeping the technology stack simple unless there is a particularly good fit for using something more complex.

Monday, July 20, 2015

Ubuntu Linux on my Chromebook without X11

I have been very happy with using Chrome OS on my Toshiba Chromebook 2 but after often reading how easy it is to install Ubuntu I decided to give it a try. I used crouton (which runs Ubuntu alongside Chrome OS) and decided to not install X11 to save space - important with just 16GB internal storage. I used:

sudo sh ~/Downloads/crouton -t cl-extra

After installing Ubuntu I had about 10.5 GB of free local storage (remember, Chrome OS is also installed). After installing the gnu command line development tools, Ruby 2.1, Java 8, lein for Clojure development and several of my Ruby and Clojure projects I still have 7.5 GB of storage. I still want to install Haskell so I will have less space to work with.

When developing web apps, you can test using the Chrome web browser since crouton runs Ubuntu right alongside Chrome OS. For example, I was running a Sinatra app using port 4567 and could just hit the URL http://localhost:4567 on the Chrome browser in Chrome OS. Easy. I thought that I might have to start an ssh tunnel between the two Linux environments, but that was not required.

I open several terminal windows using ctrl-alt t and then start a shell in each with:


shell
sudo enter-chroot

I really like Chrome OS but also having Ubuntu available for local development is great! My Toshiba Chromebook 2 has 4GB of RAM and the processor is fast enough. If you like console development using emacs or vi, then this is a good setup. My Chromebook has a 1080p resolution screen so there is a lot of screen real estate for multiple console windows and a web browser.

Saturday, July 04, 2015

Experiences with my new Toshiba Chromebook 2

When I worked as a contractor at Google in 2013 I noticed that a lot of people were using Chromebooks. On my first orientation day I received a retina MacBook Pro, which was very nice, and I didn't immediately understand the preference of some people to use a Chromebook. Later I understood that a large amount of work performed at Google could be done in a Chrome web browser. They even had a very nice browser based IDE called Cider that was great to use because it handled all programming languages and interfaced with Perforce source code control.

My curiosity about Chromebooks has persisted. I am going to start teaching free classes at my local library in about two months on Internet security and privacy and I used this as an excuse to buy a Toshiba Chromebook 2. I had already bought earlier this year a little HP Stream 11 Windows 8.1 laptop using the same excuse :-)

I will start out this "review" with a list of the good and not so good things about the Chromebook 2 (all just my personal opinions). The good:

  • The number 1 reason: security. Nothing (except for a few SSH keys) is stored locally and I think that if a Chromebook gets compromised it is automatically reset with a factory image.
  • The screen is fairly high resolution 1080p and looks really nice. Some of the user comments on Amazon complained about reflections in the screen, and this would be a problem if it was used outdoors.
  • 4 GB of main memory which helps the performance. Most Chromebooks seem to only have 2GB of memory.
  • The keyboard and trackpad work well - no complaints.
  • Contrary to some reviews, the battery has been lasting about 8 hours - not bad at all.
  • This laptop is inexpensive, given the nice display.

The not so good:

  • The laptop case is plastic. It looks OK, but I expect it to get dinged up easily. I also bought an iPearl hard shell cover that should help prevent some case damage.
  • Only 16 GB of storage. How much would it have added to manufacturing costs to make this 32 GB? I did try using a 32 GB SD Card for a while which worked fine except for the card sticking slightly out of the case. I think that I will use the external memory card only for movies while flying or other cases where I temporarily need a lot of local space.
  • Privacy: I find myself using Google, Dropbox, and Microsoft web services a lot on the Chromebook. There are issues using these services that I have blogged about before: FSF and practical Internet privacy and security.

Using the Toshiba Chromebook 2:

Writing: As you might expect, I find this laptop great for web browsing and watching Netflix. It is also pretty good for writing. I use leanpub for writing. My markdown manuscript files are stored in a book specific Dropbox folder, and I use a web interface to generate preview PDF, Mobi, and ePub files (which get put back into Dropbox). I use the Chrome StackEdit app to edit the markdown files. This is a simple and effective workflow that lets me concentrate on writing.

Programming is a different issue. I do Haskell, Common Lisp, and Scheme programming in Emacs so using the Chrome shell app is fine for keeping multiple SSH shell windows open to whatever server I am using. I spent 20 minutes this morning fine tuning my emacs Haskell setup and I can honestly say that the Chromebook is > 95% as effective for Haskell development as my MacBook or Linux laptops. The advantages of using remote servers for Haskell development are that my servers are usually faster than my laptops and long running computations like cabal sandbox/init/build don't heat up my laptop. You would think that Clojure development would be fine on a Chromebook except that my Clojure development workflow uses IntelliJ, not Emacs. It is not worth changing my Clojure development style, so I'll just use my other laptops for Clojure work (and for Java and Ruby, which also use IntelliJ in my workflow). I have tried using nitrous.io (web based IDE) for Javascript and Ruby/Sinatra development and that works well on my Chromebook.

I am very happy that I bought the Chromebook - no buyer's remorse :-)

July 5, 2015 (next day) edits:

For web based Haskell development, I left fpcomplete.com off of my list - an excellent web based IDE and Haskell learning center.

Also, I spent some time this morning working on a Clojure project using Emacs + cider instead of IntelliJ. To be honest, since I usually use IntelliJ, I forgot how great cider has become. Beautiful integration with lein, nice auto-complete, etc.


Tuesday, May 19, 2015

using nitrous.io

I recently signed up for the pro version of nitrous.io since they are phasing out the old version that I used for free. So far I am very pleased. Now you get isolated containers to work in with root access. I use the $15/month version that gives you 1 GB of RAM and 20 GB of disk space for a total of two containers. Since I only use nitrous.io for development, I found it easier to just use one container (using the entire 1 GB of RAM) and clone the git projects that I am working on.

Some of the things that I particularly like are:

  • The editor built into the web based IDE handles just about all programming languages and file types.
  • There is a separate app that can do two things: sync files between your laptop and your container and also forward ports so you can test run apps in the container and use your local web browser. If you don't want to forward ports, you can use their preview option that opens a new browser tab and lets you test in your browser without any port forwarding.
  • The batteries on my laptops last longer if I do builds and test runs in the container. No (or less) fan noise :-)
  • It is easy to spin up a new container for experiments and just delete it.
  • When I am travelling, I don't have to load up a particular laptop with all of my projects - everything is in my cloud container so all I really need is a web browser.
  • Plays nicely with Heroku. I haven't tried it, but nitrous.io is also set up to play nicely with Google Cloud Services and Microsoft Azure.
  • With a little setup, you can SSH into your container from your phone, tablet, and laptops.
  • The nitrous.io web IDE reminds me of the Cider web IDE that I used internally at Google.
  • In general I think I save a fair amount of time having a container always available with my setup.
  • I tend to use my Mac and Windows 8.1 laptops more often than my Linux laptop, and even though I have SSH, git, IntelliJ, etc. set up on all of my laptops, always having a Linux development environment no matter what device I am on gives me a more consistent development experience.

Tuesday, May 12, 2015

More infrastructure changes: Heroku

I am pleased that Heroku has introduced a new low volume pricing tier. I never felt very comfortable freeloading on their free tier, and free tier web apps also timed out, leading to longer request loading times. Now for $7/month per app Heroku supports a "hobbyist" mode for lower traffic sites that never get timed out or swapped out. So far I have redeployed three of my low traffic sites to Heroku's new plan, moving them from a dedicated server. They also reduced their free tier so that a site can only be active 18 hours a day - in other words: great for testing deployments but not good for hosting sites for free 24/7. I think this is a good move on their part although I did hear some complaints on Hacker News about this. Under this new pricing tier my three low volume sites cost about $21/month to host. Under the old paid plan the cost would have been about $105/month.
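
The cost comparison above is simple arithmetic, sketched here in Python for illustration only. The $7 per app and $105 total figures come from this post; the per-app price of the old plan is merely what those figures imply for three sites, not a quoted Heroku price.

```python
# Monthly hosting cost comparison, using only the figures quoted in the post.
HOBBY_PRICE_PER_APP = 7   # dollars per app per month (new "hobbyist" tier)
OLD_PLAN_TOTAL = 105      # dollars per month for all three sites (old paid plan)
NUM_SITES = 3


def monthly_cost(price_per_app, num_apps):
    """Total monthly cost for num_apps apps at a flat per-app price."""
    return price_per_app * num_apps


new_total = monthly_cost(HOBBY_PRICE_PER_APP, NUM_SITES)

# Implied by the post's numbers; an assumption, not a published price.
old_price_per_app = OLD_PLAN_TOTAL / NUM_SITES

savings = OLD_PLAN_TOTAL - new_total
savings_percent = 100 * savings / OLD_PLAN_TOTAL

print(f"new plan total: ${new_total}/month")
print(f"implied old price per app: ${old_price_per_app:.2f}/month")
print(f"savings: ${savings}/month ({savings_percent:.0f}%)")
```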

BTW, I would like to thank everyone who took the survey (or emailed suggestions directly to me) about topics for my new book project Power Java. When this book is released, hopefully by August 2015, I will then continue work on and finish what is the same book, but using Clojure for the example programs (Power Clojure). Thanks! I appreciated the input.

Tuesday, April 14, 2015

Some infrastructure changes

In addition to using several programming languages, I also like to experiment with different web infrastructures.

A few weeks ago I switched from using gmail as my primary email service to using fastmail.com. I still use gmail as a backup email and for my Google identity but I decided that I liked Fastmail a bit better and the yearly cost is not much.

The other change I have made is switching my www.markwatson.com web site from a Ruby + Sinatra web app on my own server to a PHP app running on Google's AppEngine. My absolute favorite feature of AppEngine is the rolling system logs that can be checked easily from the AppEngine console. When I worked at Google in 2013, I loved the internal development environment (Borg, online system logs, the Cider IDE, and much more). Using AppEngine is, in a small way, reminiscent of Google's internal environment - at least enough so to make me nostalgic :-)

I have never been a huge fan of PHP although I have used it over the years for occasional tasks for customers and (rarely) for my own web properties. Without using third party libraries, PHP with HTML, CSS, and a little JavaScript seems pleasantly low level and easy to hack.

As much as I like devops and in general configuring and running Linux servers (often VPSs on Azure, Digital Ocean, and AWS), I sometimes feel a little guilty spending time on operations: perhaps my time could be better spent. My preferred PaaS providers are AppEngine and Heroku, with Cloud Foundry technology services (like IBM's Bluemix) also fairly nice. One thing that has always bothered me about PaaS, however, is the "free" usage tiers. For one thing I like being a paying customer with SLAs, support, etc. Also, free tiers have to affect pricing for paid users to some degree.

Wednesday, April 08, 2015

My two Clojure projects, life in Sedona Arizona, and my new book project

Two Clojure projects?
Well, actually, I had just one Clojure project until today. I refer to my project as KB2 (KnowledgeBooks.com 2) and it is basically a kitchen sink for everything that I thought that I wanted in a personal (and perhaps small group) research and content management system:
  • A personal version of Evernote: allows me to collect eBooks, web page snippets and notes in a personal repository that is searchable. I use a Firefox add-on I wrote to capture multiple selections on web pages and send them to the web app.
  • Uses NLP to identify entities in eBooks, web pages and notes and add an information icon that provides DBPedia (Wikipedia) information on the fly.
  • Uses the Bing search API to find information on what my NLP analysis code considers to be the main topic of eBooks, web pages and notes.

I enjoy meditation (I have also practiced Yoga since about 1975 and Qigong for about two years) and after my early morning meditation this morning I had one of those ah-ha moments:
In using KB2 myself, the automatic Bing searches showing results on the side and DBPedia entity lookups started to get in my way after the novelty of these features "wore off." In other words, the Evernote team knew what they were doing when they designed their (rather good) product! This morning I cloned KB2 into KB3, removing everything but the "personal version of Evernote" functionality. KB2 has a lot of useful Clojure code in it, and if I am not too lazy I might open source it all. KB2 has had so many rewrites that the Clojure and ClojureScript (and a little Java and JavaScript) code really needs some cleanup love.
Life in Sedona Arizona
My wife Carol and I have been enjoying the early spring time in the mountains of Central Arizona. My friend Bill Bohan (one of the authors of the book Great Sedona Hikes) took this picture of me while we were hiking on Bear Mountain last week:


I have also been enjoying gardening. Several friends and I volunteer to keep a 1 mile historic irrigation ditch functioning to provide water to a historic farm and Crescent Moon Red Rock Crossing Park. The following three pictures (the first two taken by Don Fyffe) show me and some friends unloading some wood chips someone gave us for the farm. The third picture shows me holding up a bok choy I grew with the community garden in the background:

My new book project: "Power Java"
I was 'scientific' in my approach to choosing the material for this book: I had 11 topics that I wanted to write about and I used a Google survey (it is located here) to get feedback from people who follow me on social media. Both the survey results and some great suggestions emailed to me by Alex Ott really helped me narrow down the topics for the book. Thanks to everyone who helped! (You can still add to the survey if you want.)
It was a difficult decision choosing Java for this book project. Most of my development in the last year has been in Clojure and Haskell (with a little Ruby and Java) but I decided that the book would have a wider audience written in Java.
I might provide an appendix and additional sample code showing the use of some of the examples with Clojure and JRuby wrappers but it is so easy for Clojure and JRuby developers to reuse Java code that I am not sure if it is worthwhile adding this material to the book. (Feedback on this will be appreciated, BTW.)


Monday, March 02, 2015

Net Neutrality (Yeah!), a new distributed wiki, and my current Clojure project

It is, IMHO, a very good thing that net neutrality now has the force of law behind it. The Internet is the most important artifact for sharing information ever and is worth protecting as a neutral platform. Even though there is a lot of value in walled gardens created by Google, Facebook, and Apple I still hope to see many more systems developed that support individual control of our own data and better support for privacy.

I think that Ward Cunningham's new Federated Wiki (on github: github.com/fedwiki) is a very interesting idea for combining local storage with federated sharing of content. Something to keep our eyes on!

I have been working on a combined document repository and general research tool that is tailored to my own needs but I will also sell it as a low cost commercial product (or I might make a compiled version free with the source code available with a commercial license for inclusion in other products - I am not sure yet). I want a system where I can store PDF (and perhaps other formats) files from eBooks I have purchased, papers published on the web, etc. with the usual search and annotation functionality. I am writing browser plugins for Firefox and Chrome to let me clip material from the web (stored with source URL and text from multiple selections) as JSON data for ingestion into my system. The final layer of functionality is support for research notebooks that can be used to collect references to local and web sources, along with my own notes and writing. My system is mostly written in Clojure and ClojureScript (with some Java and JavaScript). I still need to design and implement sharing by exporting research notebooks as PDF files and an easy to reuse JSON dump format, and a mechanism for making parts of a personal knowledge repository public via a read-only web interface.
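
As a rough illustration of the web clipping step described above, here is a minimal sketch of what a clipped record could look like as JSON. The real system is written in Clojure and ClojureScript; Python is used here only for illustration, and the `WebClip` type and the `source_url` and `selections` field names are hypothetical, not taken from the actual code.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class WebClip:
    """One clip from a web page: where it came from and what was selected.

    Hypothetical structure for illustration; the real schema is not shown
    in this post.
    """
    source_url: str
    selections: List[str] = field(default_factory=list)


def clip_to_json(clip: WebClip) -> str:
    """Serialize a clip to a JSON string suitable for sending to a server."""
    return json.dumps(asdict(clip))


def clip_from_json(text: str) -> WebClip:
    """Rebuild a clip from the JSON produced by clip_to_json."""
    data = json.loads(text)
    return WebClip(source_url=data["source_url"],
                   selections=list(data["selections"]))


# Example: two separate text selections captured from the same page.
clip = WebClip(
    source_url="https://example.com/article",
    selections=["first selected passage", "second selected passage"],
)
payload = clip_to_json(clip)
restored = clip_from_json(payload)
print(payload)
```

Keeping the source URL alongside the list of selections means each ingested record can later be traced back to the page it came from.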

The first version of my product will be for individual use (simple install on a personal server, or run locally on a laptop) but I am interested in evolving it into a federated system for use by small teams. A federated system is also useful for the single user use case: for example, sync content from a laptop to a server.