To use Storm Crawler once it has the features we want, we need to bring its output up to parity with Nutch:

- request records
- write robotstxt and crawldiagnostics WARCs
- metadata records?
- other?