Saturday, November 27, 2004

I really appreciate Google Adsense ads

I just got another check from Google for having Adsense ads on my web sites. Sweet - pays for all of my server and internet costs, and then some... I use up a fair amount of bandwidth because I sometimes have several hundred copies of my free web books downloaded in a day. I also get a surprising number of contributions from people via PayPal for my free web books. I am not sure, but if I stopped consulting and wrote full time, I could probably meet my living expenses by writing free web books(*) and accepting donations. (But I enjoy consulting too much: I love solving problems.) Anyway, I appreciate Google Adsense ads - give them a try yourself.

(*) Speaking of which, I have been working such long hours consulting over the last 7 or 8 months that I have not had much time to write. I am now taking a 4 or 5 month break and only consulting about half time (for a pleasant change), so I will hopefully get some more open content material written. (But I am also trying to wrap up work on a commercial product.) Paul Graham gave me permission earlier this year to use the algorithm from his "A Plan for Spam" paper for extra chapters in my Java AI and Common Lisp Programming web books - that is interesting stuff, and will not take too long to write up.

Thursday, November 25, 2004

Authoring plain web pages vs. Semantic web pages

Very few people add RDF (etc.) to their web sites, preferring to spend effort creating "stylish" looking web sites rather than web sites useful for both software agents and human readers. I am a little guilty of this myself: I have some RDF data on my main site, but it does not reflect the real semantics of my site.

The problem is that when adding new information to a web site, ideally there would be some "standard" OWL ontology that perfectly reflected the subject matter for the new material and an authoring tool would both help "fill in the blanks" to create an instance and provide choices for XSLT to render this instance to HTML. Perhaps an extension to the Protégé system could do this. The problem is still that few people would want to think in terms of what is really knowledge engineering just to add material to web sites.

What is needed is a set of tools that cover all of the bases here: authoring tools that allow fast selection of an appropriate ontology and then instance creation. Then, there would have to be a lightweight framework (perhaps server side PHP) to render HTML from an OWL instance and the author's preferred XSLT (assuming that each OWL class also had several XSLT styles available in the authoring tool -- I am really wishing here!).

Anyway, none of this will happen soon, but I wish that it would. Authors will only want to enter information one time so I am skeptical of systems that have a content creator write content for human readers, then separately try to play nice by supplying RDF, RDFS, OWL, etc.
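
A minimal sketch of the "enter it once" idea, under loud assumptions: it swaps in plain Python string formatting for XSLT and a flat dictionary with three Dublin Core fields for an OWL instance, and the URL and field values are made up. The point is only that one authored entry can feed both the page for human readers and the triples for software agents.

```python
from html import escape

# Real Dublin Core element namespace; everything else here is illustrative.
DC = "http://purl.org/dc/elements/1.1/"
FIELDS = ("title", "creator", "description")


def to_html(entry):
    """Render the entry for human readers."""
    return (
        "<article>"
        f"<h1>{escape(entry['title'])}</h1>"
        f"<p class='byline'>{escape(entry['creator'])}</p>"
        f"<p>{escape(entry['description'])}</p>"
        "</article>"
    )


def to_ntriples(entry):
    """Render the same entry as N-Triples for software agents."""
    lines = []
    for field in FIELDS:
        # Escape backslashes and double quotes inside the literal.
        literal = entry[field].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{entry["url"]}> <{DC}{field}> "{literal}" .')
    return "\n".join(lines)


# The author fills this in exactly once (hypothetical values).
entry = {
    "url": "http://example.org/posts/daisy-install",
    "title": "Installing Daisy",
    "creator": "A. Author",
    "description": "Notes on a 25 minute install & first impressions.",
}

print(to_html(entry))
print(to_ntriples(entry))
```

A real tool would pick the field list from the chosen ontology class instead of hard-coding it, which is where most of the authoring-tool work would be.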

A "holy grail" of AI research would be to solve the hugely difficult problem of knowledge acquisition directly from text - let's not hold our collective breaths waiting for that to happen: I have been doing some personal research trying to write highly tailored semantic extraction code for a very simple OWL ontology. What a feeling of deja vu: reminds me of my efforts in the 1980s to write conceptual dependency (CD) parsers.

Saturday, November 20, 2004

Weird: so now Alan Greenspan and I are in agreement?

I have been very critical of Greenspan, the Bush administration, etc. for not giving the straight poop to the American public on the danger of widening trade and budget deficits. Now that the election is over, we are finally seeing some honesty.

A good quote from this article:
"What he's [i.e., Greenspan] saying is that the nation's balance sheet is out of whack, and the family balance sheet is out of whack," Meese said.
What I believe the real tragedy is: with a little more saving, some real reduction in spending (on a personal level, buying less crap; at the federal level, fewer expensive wars and social programs) everything would be OK for (probably) a long time.

Friday, November 19, 2004

Apache Daisy CMS system

I installed the latest release of Daisy this evening. It was a tough install (about 25 minutes to get the MySQL tables created, configure OpenJMS which is required, edit lots of config files, etc.). Anyway, the effort seems to be worth it! Like the Magnolia CMS system, when using a Firefox web browser, you can create/edit rich text HTML documents. However, Daisy also supports "attachment documents" that are simply containers for OpenOffice.org, Word, etc. documents.

Daisy uses the Lucene search engine and both Word documents and plain HTML documents that I created were almost instantly searchable.

I like it!

PS. I have been planning on integrating my next release of KBtextmaster with Plone. I think that I will also supply a plugin for Daisy. KBtextmaster is a standalone service with an XML-RPC interface (and written in Common Lisp, BTW). I am tentatively planning on making any patches to Plone and Daisy (for accessing KBtextmaster via XML-RPC) public domain. I have not really made up my mind yet - I still have about 150 hours of work left to get KBtextmaster ready, so I will make final decisions on how to market it later. My consulting work load is fairly high right now, so I will be lucky to have a new version ready for sale by next spring.
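
To make the plugin idea concrete, here is a small sketch of a CMS-side call to a standalone text service over XML-RPC. It does not show KBtextmaster's actual interface: the `top_terms` method, its parameters, and the toy word-counting logic are all hypothetical stand-ins, and the "service" is a Python server on the loopback interface rather than a Common Lisp process.

```python
import threading
import xmlrpc.client
from collections import Counter
from xmlrpc.server import SimpleXMLRPCServer


def top_terms(text, limit=3):
    """Hypothetical service method: the most frequent words longer than 3 letters."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    counts = Counter(w for w in words if len(w) > 3)
    # Sort by descending count, then alphabetically, so results are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


def run_demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(top_terms, "top_terms")
    port = server.server_address[1]
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        # This is the part a CMS plugin would contain: one remote call.
        with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            return proxy.top_terms(
                "Daisy indexes documents; Plone indexes documents too.", 2
            )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Because the plugin only needs an XML-RPC client, the same service could sit behind both Plone (Python) and Daisy (Java) without either one knowing what language it is written in.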

Magnolia v2 Java-based Content Management System

I had a short while to download and install Magnolia Content Management System. It is designed to allow non-technical people to easily build and maintain web sites. So, this is a web site content management system, not a full blown content management system that handles many document types, etc.

You can upload documents (e.g., OpenOffice.org documents) to Magnolia, but they appear as a named link on a web page.

The install was very easy - by default, two Tomcat instances are run: one for the authoring system and one for the "public" system.

Like the Apache Daisy content management, Magnolia seems like a Plone "want to be". For Java developers, the advantage of Magnolia and Daisy is that they are Java web applications (Plone is written in Python). Daisy has the advantage of supporting many file types (OpenOffice.org, Word, PDF, etc.) for indexing, search, etc.

Wednesday, November 17, 2004

Distributed knowledge workers

Knowledge workers are software engineers, writers, business development people, architects, etc. - anyone who deals with organizing ideas and creates intellectual property (hopefully licensed under a Creative Commons copyright, open source software, etc. - but also proprietary intellectual property (*)).

Stephen Covey, author of The 7 Habits of Highly Effective People says (in Business 2.0, December 2004 issue):
The industrial age was about control, and the information age, or knowledge-worker age, is about release.
Covey gets this right: in the new economy and new way of doing things, trust and flexibility are key.

Think of the revolution in manufacturing processes through "just in time" delivery of material and sub-systems. The new knowledge economy, even acknowledging the dot-com meltdown, is also about flexibility in getting work done on schedule by pulling together human resources just as they are needed. There is no doubt in my mind that being able to tap into talent on a global scale facilitates this type of flexibility: make time zone differences work as an advantage, form a trust network of people from many countries who can work together, etc.

Networks of trust and cheap communications infrastructure are what make this happen.

A similar thought, also in the Business 2.0, December 2004 issue: a write-up of technology innovators who chose to set up their new companies in rural areas (that had great broadband support) in order to drastically cut the cost of doing business. This resonated with me: I live in a fairly remote area in the mountains of Northern Arizona - but I spend a lot of my time designing and writing code for a company in India. I also take advantage of the relatively low rural cost of living to be able to work on commercial products that are definitely "niche market". A (slightly) lower return on investment works for me while I am living in an inexpensive rural area - if I were still living in Solana Beach, California, the lower return on investment would make it impossible to work on the types of products that most interest me on a technical level.

In a sense, work networks are scale free networks like the internet: the "hubs" are large cities where closeness to customers and business partners is worth the higher cost of doing business. However, just as smaller "non hub" internet sites add tremendous value to the internet, distributed smaller businesses (especially high technology businesses) in much less expensive rural areas add to the value of the economy.

(*) My take is that the most efficient way to do business and build industry is a mix of open source/open content and proprietary intellectual property - I am not arguing for one over the other.

Monday, November 15, 2004

Dumping on our kids and grandchildren: raising the national debt limit

Remember: Tax cuts are really tax deferrals to future generations. Too many people in our society lack the moral backbone to live within their earnings.

Some people say "but, running a deficit is a good thing". Well, of course running a moderate deficit is probably a good thing. Our current deficit limits, however, are way out of whack with what is in the best interests of our country.

Now, when I say best interests of our country, I am talking about people - I no longer count most corporations because most large corporations have moved offshore to avoid their fair share of taxes. They are no longer American.

I believe that in a free society people should for the most part be free to make their own decisions and live their lives the way that they want to. However, this deeply immoral passing on of economic desperation to future generations is sick and selfish behavior - not the legacy that I want to leave to my children and grandchildren. Anyone who is not part of the solution (e.g., repeatedly writing to your elected representatives demanding fiscal responsibility, fair taxes for all people and corporations, etc.) is part of the problem. For the average person: do you really need that brand new car bought on credit? For owners of large corporations: do you really need that extra and excess $$$ so much that you are willing to contribute to, and cause, severe economic problems for future generations?

Time for everyone to take a deep breath, and decide that dumping on future generations is a sick and immoral act - and that we should do everything in our power to say no to greed and selfishness. Whatever happened to the basic old-fashioned American values that I would summarize as "take what you need and leave something for other people"? While it is in our nature to constantly strive for better lives for ourselves and our families, I question this new addiction to materialism.

Saturday, November 13, 2004

Semantic Web: replace RDF storage with relational database?

OK, this might not be a good idea for several reasons, but I am going to toss out this idea anyway: for Semantic Web applications, instead of modeling and defining application classes (e.g., with OWL, RDFS, etc.), design appropriate database tables and store meta-data, etc. directly in a relational database instead of in RDF.

There are good Java tools for handling RDF, RDFS, OWL, etc. (e.g., Jena, Sesame).

Still, the idea of storing meta-data for online documents, relationships between documents, etc. directly in a database solves a few problems that seem to be holding back progress with Semantic Web technologies: learning curve for RDF/RDFS/OWL, acquisition and learning curve for tools for storing RDF, learning curve for one of the several RDF query languages, etc.

The next release of my KBtextmaster product will include an option for automatic RDF generation from documents and web pages. I think that also supporting extraction of meta-data to a relational database would be useful. In the first case, I will only attempt to extract data for limited schemas (e.g., part of Dublin Core, my own OWL ontology for news stories, etc.) - every schema needs hand-crafted extraction code so I need to limit what types of information I try to extract. If I also support use of a relational database, I would need to design equivalent tables (sort of an OO to relational mapping problem). Anyway, the semantic extraction problem is most of the hard work and I thought that some potential product users might have an easier time if extracted data were placed directly into a database so that they would not have to learn any new technologies if they did not want to.
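
As a rough illustration of what "equivalent tables" could look like, the sketch below stores a few Dublin Core style fields and typed links between documents in SQLite. The table names, column names, link type, and sample rows are assumptions made up for this example, not a schema from KBtextmaster.

```python
import sqlite3

# Hypothetical schema: one row per document, one row per typed link.
SCHEMA = """
CREATE TABLE document (
    id      INTEGER PRIMARY KEY,
    url     TEXT NOT NULL UNIQUE,
    title   TEXT,
    creator TEXT,
    created TEXT
);
CREATE TABLE document_link (
    source_id INTEGER NOT NULL REFERENCES document(id),
    target_id INTEGER NOT NULL REFERENCES document(id),
    link_type TEXT NOT NULL
);
"""


def open_store():
    """Create an in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, url, title, creator, created):
    cur = conn.execute(
        "INSERT INTO document (url, title, creator, created) VALUES (?, ?, ?, ?)",
        (url, title, creator, created),
    )
    return cur.lastrowid


def add_link(conn, source_id, target_id, link_type):
    conn.execute(
        "INSERT INTO document_link (source_id, target_id, link_type) VALUES (?, ?, ?)",
        (source_id, target_id, link_type),
    )


def linked_titles(conn, url, link_type):
    """Titles of documents that the document at `url` points to via `link_type`."""
    rows = conn.execute(
        """
        SELECT target.title
        FROM document AS source
        JOIN document_link AS l ON l.source_id = source.id
        JOIN document AS target ON target.id = l.target_id
        WHERE source.url = ? AND l.link_type = ?
        ORDER BY target.title
        """,
        (url, link_type),
    )
    return [title for (title,) in rows]


conn = open_store()
story = add_document(conn, "http://example.org/news/1", "Deficit widens", "Staff", "2004-11-20")
followup = add_document(conn, "http://example.org/news/2", "Greenspan responds", "Staff", "2004-11-21")
add_link(conn, followup, story, "follows_up")
print(linked_titles(conn, "http://example.org/news/2", "follows_up"))
```

What would be an RDF graph query becomes an ordinary SQL join here, which is the appeal for users who already know SQL; the cost is that every new schema needs new tables as well as new extraction code.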

Friday, November 12, 2004

My free advice to Steve Jobs: movie equivalent for iTunes music store

I am pleased with Apple for many reasons (*), so, Steve, this advice is free! Current online movie services are lame for a few reasons:
  • They don't run on Macs!
  • DRM limits movie watching to 24 hours
So, following up on the deals with the music industry that made Apple's music store possible (which I really enjoy, BTW), negotiate with the movie industry and support a movie store site that:
  • Works with Macs!
  • Uses DRM that allows watching a downloaded movie for a whole week
  • Supports two pricing structures: 99 cents for a low resolution movie (perhaps 450x250 pixels) and $2.50 for a higher resolution movie (perhaps 800x500 pixels), with a full screen mode that uses interpolation.

(*) I wrote the free chess program distributed with very early Apple IIs and received fun notoriety for that. I am a slightly wealthier man today because I wrote a commercial Mac application in 1984. The last batch of Apple stock that I bought has gone from $15 to $55 per share. Thanks Apple!

Simplified EJB 3.0 specification: still not sure how much I will use it

When I must use EJBs, I use XDoclet comments, ant build targets, etc. to take most of the hassle out of using EJBs. The new EJB 3.0 specification that uses new features of JDK 1.5 (or 5) looks like a definite improvement.

Still, like a lot of server side Java developers I usually do not need EJBs for my projects. If I can avoid even using a relational database, the use of Prevayler makes server side back-end development fast and easy - if not, Hibernate is my tool of choice.

I use IntelliJ for about 90% of my Java development. Future versions of IntelliJ will probably simplify using the new EJB 3.0 specification just as the current version greatly facilitates using JSPs, EJBs, etc. So, I might start using EJBs again - wait and see.

Tuesday, November 09, 2004

Using Firefox client, server side Java for a standard platform

I do get tired of supporting multiple browsers when writing interactive web applications. Life would be simpler, especially when trying to write "sort of fat client applications" (e.g., Javascript+HTML like GMail), with a single platform for client and server: only support Mozilla Firefox on the client side and tailor a development environment with appropriate tag libraries, etc. for dealing with Firefox (e.g., dynamic tree displays, standard Javascript utilities for verifying form input, etc.).

Now, while I consider myself to be an excellent Java, Common Lisp, Python, and Prolog programmer, I am admittedly a little weak using Javascript. For Javascript gurus, my desire for a single standard web browser client probably seems a little lazy :-)

There is a lot of push-back in industry against not supporting the Microsoft Internet Explorer browser, but I would argue that installing Mozilla Firefox is a fairly lightweight requirement for using a specific web application.

I often look for commercially available software components to reduce development costs. Most of the systems that I write for customers run on a single server and have a modest number of users so reasonable licensing costs can be a lot cheaper than extra labor costs for development. I have a wish-list for Firefox extensions for form input, inline rich text editors, spreadsheet components, etc. If developers of useful Firefox extensions had a larger market (i.e., lots of web sites and web applications standardized on Firefox) then the cost per component would go down, and everyone but Microsoft wins.

Excellent summary of Bush economic policy

Now that the election is over we can all get back to other important things like work, studying the economy (whoever does not keep a sharp eye on the economy is a fool!), etc.

The New Republic magazine has a very useful summary of the Bush economic policy:
There is a simple way to understand economic policy-making under George W. Bush: Whichever pressure group has the strongest and most direct stake in an issue gets its way. Wealthy individuals and business owners have received large tax cuts; farmers have gotten lavish assistance; and insurance and drug companies won enormous subsidies in the Medicare prescription-drug bill. When steel firms lobbied for tariffs, Bush granted them. When automakers and other manufacturers later lobbied Bush to reverse course, complaining that those tariffs had raised the cost of the steel they buy, he began to back down. If there's a single prominent case where Bush offended a powerful corporate interest--except to benefit an even more powerful corporate interest--we have not come across it.
I think that understanding our president's economic policy is very important for planning investments, trying to figure out which markets may be hotter than others, etc. Trying to figure out how the world really works (in addition to being really interesting) helps one's personal finances.

Monday, November 08, 2004

A great software demo; tools for creating demo movies

I enjoyed watching the Ruby on Rails demo, a 10 minute walk-through of building a web application using Ruby on Rails. This demo was made by capturing a video as the demo creator smoothly went through the steps to build a web app.

I have looked at a few Mac OS X products that record demos as QuickTime movies and I think that I will make my selection soon. I am motivated to do this for a few reasons. One of my customers is interested in ways to demo the web application that I developed for his company and creating a demo like the Ruby on Rails demo seems like a good idea. As bandwidth gets cheaper, I have also thought of augmenting the free web books (on my main web site) with free instructional videos.

Sunday, November 07, 2004

Hypertext and Knowledge Management: personal history

I was thinking this morning that this is about my 20th anniversary of experimenting with hypertext technologies (which I view as any chunks of information with typed links): using the Grapher (DAGs) on my Xerox Lisp Machine in the early 1980s to visualize typed links between data; a commercial Smalltalk class library to do the same thing a few years later; Owl Guide hypertext authoring tool on the Mac; ordering and reading Ted Nelson's Xanadu manifesto document; using a text-based browser to explore physics documents at CERN in the very early 1990s (my first exposure to HTTP protocol and HTML - thanks to Karl Weibe for that!), etc.

In earlier history, some wealthy people had their own personal libraries to store and organize knowledge. Now the tools are better and usually support close to free access to information.

As good as graphical user interfaces are for browsing the web, examining links between data sources, etc., I still think that the "killer application" for knowledge management will have a simple (mostly) text interface that uses natural language understanding and (in a semantic web sort of way) be able to search on content based on meaning.

As an aside: my top priority research project that I am scheduling about half my time for in December and early next year is to add "RDF generation from plain text" to my KBtextmaster system. I will post a link when I have a white paper ready on how I plan on doing this.

Saturday, November 06, 2004

Positive Media Workshop

My friend Tom Munnecke (*) organized a Positive Media Workshop to take place in NYC next week. Tom believes in (and puts his effort into) transforming the world in positive ways by sharing good ideas that work.

Please do look at the linked web page - interesting and worth a look. While I agree that the news media business wallows in bad news to increase profits (and I would argue the obvious point of pushing political agendas highly beneficial to corporate owners), I am personally not so optimistic that the mega-size corporation owned news media can be transformed in a positive way. However, I also hold out some hope that "person to person" and group related communication technologies like web blogs, wikis, communities built on interlinked web sites, etc. will continue to allow (**) a minority of people to connect in positive ways. My pessimism, however, is based on a belief that most people are happy enough with 15 second 'sound bite' news that is highly filtered: like putting blinders on a horse -- but, to each their own: some people are satisfied with the world as it is and others want something better.

(*) who took the great picture of my wife and me in front of the Taj Mahal on my main web site.

(**) although I believe that here in the US we may eventually see censorship on the Internet as we see today in China. There are too many powerful business and political interests that might see alternative more personalized communication as a threat to their interests.