Amey Agrawal

Results 25 comments of Amey Agrawal

@webvictim this would be a great feature to have. Can we merge this if possible?

My initial conclusion was wrong. I had been running different configurations on g2.8xlarge and p2.8xlarge so that the model could fit on the smaller K520 cards. But strangely it...

The one with batch norm works; the other doesn't. Digging further, it seems that only the first batch normalisation layer is important for the network to work. I tried...

I guess [this](https://drive.google.com/a/bits-pilani.ac.in/file/d/0BwyWFeou7mr2cTdlcF9zV253NGs/view?usp=sharing) should sort out the problem. Also, the logo now weighs 5 KB instead of 44 KB.

Sorry for the inconvenience, I have fixed the link.

@VictorSanh can you please review this PR?

I am creating a PR for the same.

https://github.com/huggingface/knockknock/pull/62

@simon-mo the Sarathi fork also has an extensive metric logging framework, if that is of interest: https://microsoft-research.wandb.io/msri-ai-infrastructure/llm-simulator-v2/reports/Sarathi-Benchmark-Suite-Demo--VmlldzoyNDMx?accessToken=d81jj8r843ntfhjle51uac1y57jvm80urmizil5rxt9jcafqnd1eib5swevpfejx

Let me know if you want us to create a PR for this.