Tim Hennekey
I was trying to use it in conjunction with the [HBase recrawl modules](https://github.com/internetarchive/heritrix3/tree/master/contrib/src/main/java/org/archive/modules/recrawl/hbase), but everything besides these classes seems to use `A_FETCH_BEGAN_TIME`, and so they're writing to the wrong attribute...
@anjackson Sorry about that, I've updated the link to the code. Any explanation about how you make use of `A_TIMESTAMP` would be helpful so I can do the right thing.
Yeah, I did reference that comment. It occurs to me now that perhaps semantically `A_TIMESTAMP` is meant to be the time the history map entry was created for that CrawlURI...
Your use of this constant also makes it clear that it's a more dangerous change than I thought. It is `public` after all, so there may be other users. Perhaps the...
Hey @mutlurasit, can you elaborate a little on what you need help trying to do? The cxml file is essentially the configuration for a "type" of crawl. It provides information...
@mutlurasit Are you code-savvy? You can implement custom processors to replace the ones you see in the cxml file to achieve custom behavior, like creating separate WARC files per domain....
This is where the code finds the writer (and consequently the file) to use to persist data to a WARC: https://github.com/internetarchive/heritrix3/blob/master/modules/src/main/java/org/archive/modules/writer/WARCWriterProcessor.java#L155
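To make the per-domain idea a bit more concrete, here is a rough, standalone sketch of the selection logic only. It does not use any Heritrix classes: `PerDomainWriterPool`, `hostOf`, `fileFor`, and the file naming scheme are all hypothetical names I made up for illustration, and a real processor would hand back an actual WARC writer rather than a file name. It also keys on the URI host, not the registered domain, which may or may not be what you want.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Heritrix: maps each URI host to its own
// WARC file name so records for different hosts end up in different files.
final class PerDomainWriterPool {
    // One entry per host seen so far; the value stands in for a real writer.
    private final Map<String, String> fileByHost = new HashMap<>();
    private final String prefix;

    PerDomainWriterPool(String prefix) {
        this.prefix = prefix;
    }

    // Extract a lower-cased host; URIs without one (e.g. dns:) share a bucket.
    static String hostOf(String uri) {
        String host = URI.create(uri).getHost();
        return host == null ? "unknown" : host.toLowerCase(Locale.ROOT);
    }

    // Return the file name for this URI, creating the entry on first use.
    String fileFor(String uri) {
        return fileByHost.computeIfAbsent(
                hostOf(uri), h -> prefix + "-" + h + ".warc.gz");
    }

    // Number of distinct per-host files handed out so far.
    int openFileCount() {
        return fileByHost.size();
    }
}
```

In a custom processor you would do the equivalent lookup at the point linked above, where the writer is chosen, and keep in mind that one open file per host can add up quickly on a broad crawl.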