I updated my new KBdocs.com web portal to support search and to let a user export all of their documents as a single ZIP file.
I only had an hour or so free today to work on this. I had intended to export the rich text documents that a user keeps on the web portal as OpenDocument formatted files for OpenOffice.org, AbiWord, etc. I often read OpenDocument files programmatically, and that is easy: an OpenDocument file is a ZIP archive, so you just grab the content, styles, or whatever else you need as ZIP file entries.
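As a rough illustration of reading an OpenDocument file as a ZIP archive, here is a minimal Python sketch. It assumes the document text lives in the `content.xml` entry as `text:p` paragraph elements, which is the usual layout for OpenDocument text files. The function names and the tiny in-memory sample document are made up for this example and are not part of the portal's code.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used for text elements in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# A stripped-down content.xml, just enough for the demonstration.
SAMPLE_CONTENT = (
    "<office:document-content "
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    f'xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Hello</text:p><text:p>World</text:p>"
    "</office:text></office:body></office:document-content>"
)


def make_sample_odf():
    """Build a tiny OpenDocument-like ZIP in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", SAMPLE_CONTENT)
    buffer.seek(0)
    return buffer


def read_odf_paragraphs(odf_file):
    """Return the text of each text:p paragraph found in content.xml."""
    with zipfile.ZipFile(odf_file) as archive:
        content = archive.read("content.xml")
    root = ET.fromstring(content)
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    print(read_odf_paragraphs(make_sample_odf()))
```

Passing a path to a real `.odt` file instead of the in-memory sample should work the same way, since `zipfile.ZipFile` accepts either a file name or a file-like object.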
The problem was that it looked like a multi-hour task to take the internal rich text format used on the web portal (which is really just HTML snippets) and generate equivalent-looking OpenDocument files. So I got lazy and just bundled all of a user’s documents, as they are, into one ZIP file for download. It turns out that OpenOffice.org imports these ill-formed HTML files just fine, so I am happy that I did not spend the time generating OpenDocument files right now.
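The bundling step can be sketched in a few lines of Python. This is only an illustration, not the portal's actual code: the `bundle_documents` name is hypothetical, documents are assumed to arrive as a mapping from title to HTML snippet, and titles are assumed to be safe to use as file names.

```python
import io
import zipfile


def bundle_documents(documents):
    """Pack a {title: html_snippet} mapping into one ZIP and return its bytes.

    Assumption: each title is already safe to use as a file name.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for title, html in documents.items():
            # The snippets are written unchanged, with no wrapping <html> element.
            archive.writestr(f"{title}.html", html)
    return buffer.getvalue()


if __name__ == "__main__":
    data = bundle_documents({"notes": "<p>Hi</p>", "todo": "<b>buy milk</b>"})
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(sorted(archive.namelist()))
```

Building the archive in memory keeps the sketch self-contained; a web application would typically send the returned bytes as the download response.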
I usually use Lucene for full text search. I love it because it is so easy to work with and customize (see my GPLed KBtextmaster project for utilities to search Word, PDF, OpenOffice.org, AbiWord, etc. files with Lucene). For now, I am just using a database text search for the web portal, which is good enough for the short term.
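For comparison, a plain database text search can be as simple as a substring match. The sketch below is an assumption-heavy illustration: it uses SQLite and a `LIKE` query, and the `documents` table, its columns, and the function names are all invented for the example. Nothing here says which database or query the portal really uses.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (user_id INTEGER, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [
            (1, "lucene notes", "<p>Lucene is easy to customize</p>"),
            (1, "shopping", "<p>milk and eggs</p>"),
            (2, "other user", "<p>Lucene again</p>"),
        ],
    )
    return conn


def search_documents(conn, user_id, query):
    """Return titles of one user's documents whose body contains the query."""
    pattern = f"%{query}%"  # substring match; no ranking or stemming
    rows = conn.execute(
        "SELECT title FROM documents "
        "WHERE user_id = ? AND body LIKE ? ORDER BY title",
        (user_id, pattern),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    print(search_documents(make_demo_db(), 1, "lucene"))
```

A query like this scans every row and offers no relevance ranking, which is the main thing a Lucene index would add later.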