
Yarn support

Open batizty opened this issue 7 years ago • 8 comments

Hi @rjagerman,

Could you please review this request?

All of the code has been tested online in my cluster environment.

Any questions are welcome, and I appreciate your previous work.

Thanks

batizty avatar Jun 07 '17 14:06 batizty

@rjagerman Could you please review the change? Thanks.

batizty avatar Jun 09 '17 08:06 batizty

Hi @batizty,

Thanks! This looks really nice! I haven't had the time yet to review it due to several projects and deadlines at work. I hope to review it some time next week.

rjagerman avatar Jun 13 '17 07:06 rjagerman

Hi @rjagerman,

Understood.

The YARN support feature is in use at weibo.com (you may or may not have heard of this site; it is a top-5 website in China, similar to Twitter but with more users in China), and it works well.

I have also developed some other features for Glint, including additional operations such as Save and Load, which can be used to quickly store and read models in HDFS. I believe they would be useful for most Glint users who work on machine learning with large vectors and matrices.

If possible, I would like to become a contributor to Glint, because it is very simple and stable for large-scale machine learning.

Thank you for your work on Glint.

batizty avatar Jun 14 '17 02:06 batizty

Still haven't found the time to do it, too many deadlines unfortunately :-( I'll let you know when I get around to it.

rjagerman avatar Jun 27 '17 11:06 rjagerman

Got it.

Later I will send out another patch for Glint, which lets the nodes store all parameters into HDFS independently. In my earlier tests, pulling an entire weight vector/matrix whose size is over 100m took more than 30 minutes. I then added a 'Save' operation to store the weights held in the parameter nodes, and fortunately it took less than 1 minute. I believe it will be useful for others who work on huge models.
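
To illustrate the idea of nodes persisting their own shards instead of funnelling every weight through one client, here is a minimal Python sketch. It is not Glint's API and not the patch itself: `ParameterNode`, `save_all`, and `load_all` are hypothetical names, and a local directory stands in for HDFS.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


class ParameterNode:
    """Hypothetical stand-in for one parameter server holding a shard of the weights."""

    def __init__(self, node_id, weights):
        self.node_id = node_id
        self.weights = weights  # this node's slice of the full vector

    def save(self, directory):
        # Each node writes only its own shard; nothing travels to the client.
        path = os.path.join(directory, f"part-{self.node_id:05d}")
        with open(path, "w") as f:
            f.write("\n".join(repr(w) for w in self.weights))
        return path


def save_all(nodes, directory):
    # Ask every node to persist its shard in parallel.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        return list(pool.map(lambda n: n.save(directory), nodes))


def load_all(directory):
    # Read the shards back in node order and concatenate them.
    weights = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as f:
            weights.extend(float(line) for line in f.read().splitlines())
    return weights


if __name__ == "__main__":
    nodes = [ParameterNode(i, [float(i), i + 0.5]) for i in range(3)]
    with tempfile.TemporaryDirectory() as d:
        save_all(nodes, d)
        print(load_all(d))
```

In this sketch the writes run in parallel and each one is limited to a single shard, so no single process has to hold or transfer the whole model, which is the property the 'Save' operation described above relies on.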

Thanks.

batizty avatar Jun 28 '17 01:06 batizty

Hi @batizty, I want to use Glint to store weights for machine learning algorithms, but it's too difficult to save weights to a local file or an HDFS file. Fortunately, I found that you had run into this problem and solved it. Could you please share your branch? Thanks.

baukloze avatar Dec 26 '17 10:12 baukloze

Hi @baukloze, sorry, I forgot about this issue.

Could you please wait a day or two? I will send out my modifications ASAP. I hope you like them.

By the way, @rjagerman, my colleagues and I have implemented basic ML algorithms based on Glint, but it is not stable enough yet. When our data size reached 1000B and the matrix/vector width reached 500B, the heavy traffic load caused some of the Akka nodes to enter the Quarantined state. Do you have any suggestions for fixing this problem?

batizty avatar Dec 26 '17 10:12 batizty

@batizty ok, thanks.

baukloze avatar Dec 26 '17 10:12 baukloze