Sanskar Modi
@wangshengjie123 Is there a doc or ticket explaining this approach, and also the sort-based approach that you mentioned?
From my understanding, in this PR we're diverging from the vanilla Spark approach based on mapIndex and just dividing the full partition into multiple sub-partitions based on some heuristics. I'm new...
@pan3793 This does not become a problem if we maintain the concept of mapIndex ranges, as Spark will always read deterministic output for each sub-partition. As vanilla Spark always read...
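To make the mapIndex range idea concrete, here is a minimal sketch, not the PR's or Spark's actual code. It assumes each sub-partition is defined as a contiguous `[start, end)` range of map indices, chosen greedily against a target size; `split_map_index_ranges`, `map_output_sizes`, and `target_size` are hypothetical names used only for illustration. Because each range is fixed by map index, a re-read of a sub-partition covers the same mapper outputs.

```python
from typing import List, Tuple


def split_map_index_ranges(
    map_output_sizes: List[int], target_size: int
) -> List[Tuple[int, int]]:
    """Greedily group consecutive map indices into [start, end) ranges.

    Hypothetical helper: map_output_sizes[i] is the number of bytes that
    mapper i wrote for one reduce partition.
    """
    ranges = []
    start = 0    # first map index of the range being built
    current = 0  # bytes accumulated in the range being built
    for map_index, size in enumerate(map_output_sizes):
        # Close the current range before it would exceed the target.
        if current > 0 and current + size > target_size:
            ranges.append((start, map_index))
            start = map_index
            current = 0
        current += size
    # Emit the trailing range, if any mappers remain.
    if start < len(map_output_sizes):
        ranges.append((start, len(map_output_sizes)))
    return ranges


# Each tuple is one sub-partition: the reader fetches only the outputs
# of mappers start..end-1 for this reduce partition.
sub_partitions = split_map_index_ranges([10, 20, 30, 40, 5], target_size=50)
```

The point of the sketch is only that the split depends on map indices and sizes, not on the order in which data arrived.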
Also, I think this issue would not be limited to ResultStage; it can happen with ShuffleMapStage as well in some complex cases. Consider another scenario – `ShuffleMapStage1 -----> ShuffleMapStage2...
Thanks a lot @waitinfuture for the sort-based approach description. > Is it possible to force make it as indeterministic? IMO it would be very difficult to do this from...
> a) If recomputation happens, we should fail the stage and not allow retries - this will prevent data loss. > b) We should recommend enabling replication to leverage this...
This error generally occurs when the given data is not properly `utf-8` encoded, so check that the data you are providing is valid `utf-8`. I think...
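One way to run that check is sketched below, assuming the data is available as raw `bytes`. `find_invalid_utf8` is a hypothetical helper name; it relies only on the standard `bytes.decode` call and the `start` and `reason` attributes of `UnicodeDecodeError`.

```python
from typing import Optional, Tuple


def find_invalid_utf8(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (byte offset, reason) for the first invalid UTF-8 sequence.

    Returns None when the whole input decodes cleanly as UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return err.start, err.reason
    return None


# Valid UTF-8 input: no problem is reported.
ok = find_invalid_utf8("héllo".encode("utf-8"))

# Latin-1 bytes are a common source of this error; the 0xE9 byte at
# offset 3 is not a complete UTF-8 sequence.
bad = find_invalid_utf8("café".encode("latin-1"))
```

If the offset points at a real character, the data was probably written in another encoding (for example Latin-1), and decoding it with that encoding may be the actual fix.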
Hey @achilleas-k, I was trying to run `tag = blk.create_tag('TestTag', 'Test', position=[10])` from the code above, but it gives me this error. ```python ArgumentError Traceback (most recent call last) in ()...
Isn't it odd that line 24 of [overview.html](https://github.com/G-Node/nixpy/blob/master/docs/source/overview.rst#L24) says `import nixio as nix`, but the [API documentation](http://g-node.github.io/nixpy/overview.html) shows `import nix`?
cc: @hiboyang for viz