This page has moved to https://markwatson.com/blog/

Friday, May 18, 2018

I have removed blog comments to support GDPR compliance

One of the great things about writing a blog is interacting with readers through the comments they leave.

At least for now, I have disabled comments on my blog. Sorry about that!

Saturday, April 28, 2018

Is too much emphasis in Artificial Intelligence given to Deep Learning?

Deep learning has revolutionized the way we do Natural Language Processing, image recognition, translation, and other very specific tasks. I have used deep learning most days at work for about four years. Currently I do no work in image recognition, but I still use convolutional networks for NLP, and in the last year I have mostly used RNNs and GANs.

While I agree that deep learning is important in many practical use cases, a lot of data science still revolves around simpler models. I feel that other important techniques like probabilistic graphical models (PGMs) and discrete optimization models (the MiniZinc language is a good place to start) don't get the attention in universities and industry that they deserve.

On a scale of 1 to 10, I estimate the hype level of deep learning to be approximately 15.

I started working in the AI field in 1982 (back then, mostly "symbolic AI" and neural networks) and to me artificial intelligence has always meant a very long term goal of building flexible intelligent systems that can learn on their own, be full partners with human knowledge workers, and reliably take over job functions that can be safely automated.

I don't know what will get us to my (and many other people's) view of truly flexible and general purpose intelligent systems. My gut feeling is that a solution will require many techniques and technologies, perhaps including:
  • Efficiency and scalability of differentiable models running on platforms like TensorFlow
  • Formal coding of "common sense knowledge." There have been good attempts like ConceptNet and Cyc/OpenCyc but these are first steps. A more modern approach is Knowledge Graphs, as used at Google (which I used when I was a contractor at Google), Facebook, and a few other organizations that can afford to build and maintain them.
  • A better understanding of human consciousness and how our brains work. The type of flexible intelligence that is our goal does not have to be engineered to be anything like us, but brain-inspired models like Hierarchical Temporal Memory are useful in narrow application domains, and we have to keep working on new model architectures and new theories of consciousness.
  • Faster "conventional" hardware like many-core CPUs and GPU-like devices. We need to continue to lower hardware prices and energy costs, and to increase memory bandwidth.
  • New hardware solutions like quantum systems and other possibilities likely no one has even imagined yet.
  • Many new ideas, many new theories, most not leading to our goal.

Given my personal goals for flexible and general AI, you can understand why I am not pleased with so much emphasis on deep learning models at the expense of optimizing our engineering efforts for longer-term goals.


Sunday, April 01, 2018

Java JDK 10

Java 10 was released for general availability a week ago. I just installed it on macOS from http://jdk.java.net/10/. I un-tar'ed the distribution, set JAVA_HOME to the Home directory in the distribution, and put Home/bin first on my PATH. I used Java 10 with the latest version of IntelliJ with no problems: I opened an existing Java project and switched the JDK to the newly installed JDK 10.

There are several nice features but the only ones I am likely to use are:

  1. Local variable type inference
  2. Better Docker container awareness in Linux
  3. Improved REPL (jshell) support

I have mixed feelings about the rapid new 6 month release cycles, but I understand the desire to compete with other JVM languages like Kotlin, Scala, Clojure, etc.

I have updates to both of my Java books (Power Java and Practical Artificial Intelligence Programming With Java) on my schedule for the next 18 months. Java 11 is slated for release in September 2018 and I will probably use Java 11 (whatever it will be!) for the updated books since Java 11 will be a long term support release.

Sunday, February 11, 2018

Trying out the Integrated Gradients library to explain predictions made by a deep learning classification model

"Explainable AI" for deep learning is required for some applications, at least explainable enough to get some intuition for how a model works.

The recent paper "Axiomatic Attribution for Deep Networks" describes how to determine which input features have the most effect on a specific prediction made by a deep learning classification model. I used the IntegratedGradients library, which works with Keras; another version is available for TensorFlow.
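The core idea from the paper is simple enough to sketch from scratch: attributions are the per-feature input difference from a baseline, scaled by the average gradient of the model output along the straight-line path from the baseline to the input. Here is a minimal NumPy sketch on a toy model (this is my own illustration of the technique, not the IntegratedGradients library's API; the toy model, weights, and numerical gradients are all assumptions for demonstration):

```python
import numpy as np

def model(x):
    # Toy "network": a fixed linear layer followed by a sigmoid.
    # Stands in for any differentiable classifier.
    w = np.array([0.5, -1.0, 2.0])
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def grad(x, eps=1e-5):
    # Numerical gradient of the model output w.r.t. each input feature
    # (a real implementation would use backprop instead).
    g = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (model(x + d) - model(x - d)) / (2 * eps)
    return g

def integrated_gradients(x, baseline=None, steps=50):
    if baseline is None:
        baseline = np.zeros_like(x)
    # Average the gradient along the straight path from baseline to x
    # (midpoint rule), then scale by the input difference.
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean(
        [grad(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad

x = np.array([1.0, 0.5, -0.2])
attributions = integrated_gradients(x)
print(attributions)
```

A useful sanity check from the paper is the completeness axiom: the attributions should sum (approximately) to `model(x) - model(baseline)`, so you can verify an implementation by comparing `attributions.sum()` against that difference.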

Today I modified my two-year-old example model that uses the University of Wisconsin cancer data set. If you want to experiment with the ideas in this paper and the IntegratedGradients library, then using my modified example might save you some time and effort.