Janek Bevendorff

317 comments by Janek Bevendorff

Yes. It's a web archive processing library and the wrapper is one new module. Here's the repository: https://github.com/chatnoir-eu/chatnoir-resiliparse and here are the docs (stable version, no wrapper docs yet): https://resiliparse.chatnoir.eu/en/stable/...

I would avoid anything that requires tree traversal (up or down). All I am interested in is whether the pointer that I currently hold is still valid. That could be...

I don't think you really need something fancy if you hold a weak smart pointer reference to a node in your wrapper. If the pointer is != NULL, you know...
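To illustrate the weak-reference idea in the comment above, here is a minimal Python sketch. It is an analogy only: `Node` and `NodeWrapper` are hypothetical names, not part of any library discussed here, and Python's `weakref` stands in for a weak smart pointer. The wrapper never walks the tree; it only asks whether its own reference is still alive.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a node owned by a tree."""

    def __init__(self, tag):
        self.tag = tag


class NodeWrapper:
    """Hypothetical wrapper holding only a weak reference to a node."""

    def __init__(self, node):
        self._ref = weakref.ref(node)

    def is_valid(self):
        # The weak reference returns None once the node has been destroyed.
        return self._ref() is not None

    @property
    def tag(self):
        node = self._ref()
        if node is None:
            raise ReferenceError("underlying node was destroyed")
        return node.tag


# The list plays the role of the tree: it holds the only strong reference.
owner = [Node("div")]
wrapper = NodeWrapper(owner[0])
valid_before = wrapper.is_valid()

owner.clear()   # the "tree" drops the node
gc.collect()    # not needed on CPython, added for other interpreters
valid_after = wrapper.is_valid()
```

The validity check is a single dereference, so no upward or downward traversal is required.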

I just noticed that I also cannot destroy unparented nodes in a `DOMNode` wrapper object due to this bug/missing feature. Let's take this example: ```python tree = HTMLTree.parse("...") new_element =...

@ryanthompson591 @tvalentyn I updated the PR. The venvs are now using random names and are bound to the workers, which is the only way to make this safe. I also...
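As a rough sketch of what "random names bound to the workers" could look like, the snippet below creates a uniquely named directory per worker and removes it when the worker is done. This is not the code from the PR: `worker_venv_dir` and the `venv-<worker_id>-` prefix are assumptions made for illustration, and the actual virtual environment creation is left out.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def worker_venv_dir(worker_id, base_dir=None):
    """Yield a uniquely named directory tied to one worker's lifetime.

    Hypothetical helper: the random suffix comes from tempfile.mkdtemp,
    so two workers (or two runs of the same worker) never share a path.
    """
    path = tempfile.mkdtemp(prefix=f"venv-{worker_id}-", dir=base_dir)
    try:
        # A real implementation would create the virtual environment here.
        yield path
    finally:
        # Remove the directory when the worker goes away.
        shutil.rmtree(path, ignore_errors=True)


with worker_venv_dir("worker-1") as a, worker_venv_dir("worker-1") as b:
    distinct = a != b
    existed = os.path.isdir(a) and os.path.isdir(b)

cleaned_up = not os.path.exists(a) and not os.path.exists(b)
```

Tying cleanup to the worker's lifetime means one worker cannot delete or overwrite an environment that another worker is still using.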

@tvalentyn @ryanthompson591 All right. I think I have it at a point where it's ready for a final review. It runs robustly on a 130 node Flink cluster, all processes are...

I added one more change: If a worker exits with a non-zero exit code, the boot binary also exits with a non-zero code. That makes it easier to debug things...
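The behaviour described above can be sketched in Python as follows. This is an illustration of exit-code propagation in general, not the boot binary's actual code; `run_worker` is a hypothetical helper, and mapping a signal death to `128 + N` is the usual shell convention rather than something stated in the comment.

```python
import subprocess
import sys


def run_worker(cmd):
    """Run a worker command and return the code the parent should exit with."""
    proc = subprocess.run(cmd)
    if proc.returncode < 0:
        # On POSIX, a negative return code -N means the child was killed
        # by signal N. Encode it as 128 + N (assumed convention).
        return 128 + (-proc.returncode)
    return proc.returncode


# Two child processes that exit with known codes.
ok = run_worker([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_worker([sys.executable, "-c", "raise SystemExit(3)"])

# A real launcher would end with: sys.exit(run_worker(worker_cmd))
```

Note that a worker stopped by SIGTERM also produces a non-zero code under this scheme, so the result is only informative if the runner does not send SIGTERM on normal shutdown.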

I reverted the return code thing, because it turns out Flink likes to send SIGTERM also on successful completion, so it's of no use.