Java and Clojure examples for reading the new WARC Common Crawl files

Originally published January 26, 2014

I just added a Clojure example to my Common Crawl repo. It assumes that you have already copied a crawl segment file to your laptop. In the next week I will add another Clojure example that pulls segment files from S3.

The repo also has two Java examples: one reads local segment files and the other reads segment files from S3.
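
To give a rough idea of what reading a local segment file involves, here is a minimal Java sketch that uses only the JDK. It is not the code from the repo. The class name `WarcSketch` and its `Record` and `readRecord` names are hypothetical, and the sketch makes several assumptions: the file is a gzipped WARC file, every record carries a `Content-Length` header spelled in exactly that case, and each record fits in memory.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

class WarcSketch {

    /** One WARC record: its named header fields plus the raw content block. */
    static final class Record {
        final Map<String, String> headers = new LinkedHashMap<>();
        byte[] content = new byte[0];
    }

    // Reads one line terminated by LF (an optional CR is dropped); returns null at end of stream.
    private static String readLine(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) return null;
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (b != -1 && b != '\n') {
            if (b != '\r') buf.write(b);
            b = in.read();
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads the next record from the stream, or returns null when no records remain. */
    static Record readRecord(InputStream in) throws IOException {
        String line;
        // Skip the blank lines between records until a "WARC/x.y" version line appears.
        do {
            line = readLine(in);
            if (line == null) return null;
        } while (!line.startsWith("WARC/"));

        Record rec = new Record();
        // Header fields are "Name: value" lines, ended by an empty line.
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                rec.headers.put(line.substring(0, colon).trim(),
                                line.substring(colon + 1).trim());
            }
        }
        // Assumption: Content-Length is present and small enough for a byte array.
        int length = Integer.parseInt(rec.headers.getOrDefault("Content-Length", "0"));
        rec.content = new byte[length];
        new DataInputStream(in).readFully(rec.content);
        return rec;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a locally downloaded .warc.gz segment file (hypothetical usage).
        try (InputStream in = new BufferedInputStream(
                new GZIPInputStream(new FileInputStream(args[0])))) {
            Record rec;
            while ((rec = readRecord(in)) != null) {
                System.out.println(rec.headers.get("WARC-Type") + " "
                        + rec.headers.get("WARC-Target-URI") + " "
                        + rec.content.length + " bytes");
            }
        }
    }
}
```

Records are read one at a time rather than collected into a list, since a full segment file is far too large to hold in memory. For real work a dedicated WARC library is likely a better choice than hand-rolled parsing like this.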

