Thomas Arildsen

44 comments by Thomas Arildsen

@khinsen I will have to think more about the practicalities. In particular, I imagine it is something that should ideally work from the point of submission and I am not...

This could be helpful https://lord.io/blog/2014/travis-multiple-subdirs/

I am not sure how "big" such contributions must be, but maybe the Journal of Open Research Software (http://openresearchsoftware.metajnl.com/) could be an option for this as well? I have some...

I am http://orcid.org/0000-0003-3254-3790

I don't know, actually. I will have to investigate. Does warcio have a way to tell up front how many records a WARC file contains? I have only used the...

I had to put this project on the back burner over Christmas and New Year. Now I can hopefully get back to working on it again. @ikreymer I do not...

I have got access to my data again now and I can see that my script using an `ArchiveIterator` to iterate through the WARC for images and `warcio index` both...

@wumpus so far I can at least say that this does not seem to happen at the last record in the file. The file is approximately 53 GB and `warcio index`...

I have identified the problematic record in my WARC now. I can open an `ArchiveIterator` on the file positioned at the preceding record and then get the next item: In[76]:...

I am trying to do that now. What is the difference between `arch_iter.get_record_length()` and `record.length` above? I assume they are talking about the same record, but `arch_iter.get_record_length()` includes some metadata...