Logan Adams
Hi @sirus20x6 - this issue looks to be similar to this one: https://github.com/microsoft/DeepSpeed/issues/5597 Could you share the output of `hostname --help` and `hostname -V`?
Thanks, @sirus20x6 - we are also looking at switching to just using `socket.gethostname()` and `socket.gethostbyname_ex()` to work around this entirely. Do you think that would work for your needs?
If you want, you could test with `pip install git+https://github.com/microsoft/deepspeed.git@loadams/update-hostname-I`
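As a rough illustration of the `socket.gethostname()` / `socket.gethostbyname_ex()` approach mentioned above, here is a minimal sketch that avoids shelling out to the `hostname` binary. The helper name `resolve_local_host` and the empty-list fallback are assumptions made for this example; this is not the code from the branch or the PR.

```python
import socket


def resolve_local_host():
    """Return (hostname, addresses) using only the standard library.

    Hypothetical helper for illustration; not DeepSpeed's implementation.
    """
    hostname = socket.gethostname()
    try:
        # gethostbyname_ex returns (canonical_name, alias_list, ip_address_list)
        _canonical, _aliases, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # socket.gaierror and socket.herror both subclass OSError;
        # assumed fallback: report no addresses if the name does not resolve
        addresses = []
    return hostname, addresses


if __name__ == "__main__":
    name, addrs = resolve_local_host()
    print(name, addrs)
```

Because this goes through the resolver rather than the `hostname` command, it does not depend on which `hostname` implementation or flags are available on the system.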
This should be resolved with this PR: #6990. Let me know if you are still having issues with this.
@boqiny - do you have a full repro script? And can you share the device type and info about the system you're on?
> Hi @arashb, @duli2012, @awan-10, @eltonzheng, > > I hope you're doing well. When you have a moment, could you kindly take a look at this PR? It has already...
@LalchandPandia - could you update the title to reflect your issue?
@hongpeng-guo - I was not able to repro this on a GPU node:
```
annotated-types 0.7.0
deepspeed 0.16.4
einops 0.8.1
filelock 3.13.1
fsspec 2024.6.1
hjson 3.1.0
Jinja2 3.1.3
MarkupSafe 2.1.5...
```
Hi @gayatripadmani - you're running out of memory on your device. Can you share what model you are using? Or can you try with a smaller model or with more...
Hi @gayatripadmani - I'm going to close this for being stale. Apologies for the slow reply. Please comment if you need us to re-open this.