Sunday, July 13, 2008

I am evaluating Google's Protocol Buffers for my knowledgebooks.com KB_bundle product

I am working on a new Java version of my knowledgebooks.com KB_bundle product (see home page for an overview) that implements an all-in-one toolbox for Natural Language Processing (NLP), entity extraction from text, text summarization, text clustering, knowledge extraction to RDF/RDFS, support for document management (file management, index/search), and SPARQL queries of either embedded or external RDF data stores. KB_bundle will be free for non-commercial use and evaluation, and available for a fee for commercial use.

While I designed KB_bundle as an embedded Java library, I have always planned for both RESTful and SOAP web service support. I have been looking at Google's Protocol Buffers documentation and examples this weekend, and I think that I will also supply a third wrapper for Protocol Buffers RPC support.

Earlier this year, a project that I was working on had performance problems due to the overhead of serializing data to XML and then parsing it in a REST-based system. When the project started, relatively little data was transferred between the back end processes and a front end Rails application, so the overhead of using XML was acceptable. As the project requirements changed, we passed much more data encoded in XML, and that overhead became a problem. I am looking at Protocol Buffers in general as a way to avoid performance problems like this in the future.
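
As a rough illustration of where the XML overhead comes from, here is a minimal sketch that encodes the same small record two ways and compares the byte counts. Everything in it is an assumption made for illustration: the record (an entity name plus a count), the class name EncodingSizeSketch, and the method names toXml and toBinary are hypothetical, and the binary form is a hand-rolled java.io.DataOutputStream layout, not the actual Protocol Buffers wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EncodingSizeSketch {

    // Hypothetical record: an entity name plus an occurrence count.
    // The name is not XML-escaped; this is only meant for size comparison.
    static byte[] toXml(String name, int count) {
        String xml = "<entity><name>" + name + "</name><count>" + count + "</count></entity>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Hand-rolled binary layout (NOT the Protocol Buffers wire format):
    // a length-prefixed string followed by a fixed-width integer.
    static byte[] toBinary(String name, int count) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(name);   // 2-byte length prefix, then modified UTF-8 bytes
            out.writeInt(count);  // 4 bytes, big-endian
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            // Writing to an in-memory buffer should not fail in practice.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = toXml("Google", 42);
        byte[] bin = toBinary("Google", 42);
        System.out.println("XML bytes:    " + xml.length);
        System.out.println("binary bytes: " + bin.length);
    }
}
```

The markup in the XML version repeats every field name twice per record, so the gap grows with the number of records sent. This only measures message size; parsing cost is a separate factor and would need to be benchmarked on real data.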

1 comment:

dontcare said...

Be careful about Protocol Buffers' performance claims. This article can explain a bit more: http://soa.sys-con.com/node/250512