VatsaDev

Results: 88 comments by VatsaDev

What's your environment? Do you have the config file? Did you add anything to the CLI? Is every command formatted correctly?

Just set it in the train config with --block_size. The vocab size is the tokenizer's vocabulary size; it depends on whether you use BPE or char-level tokenization, and isn't something you set yourself. > "vocab_size" needs to be larger than...
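To illustrate (a minimal sketch, not from the original thread): nanoGPT's prepare scripts use tiktoken's GPT-2 BPE, whose vocabulary size is fixed, while a char-level dataset's vocabulary is just its set of unique characters. The `input.txt` path is an assumption.

```python
# sketch: vocab_size comes from the tokenizer, not a free hyperparameter
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # the GPT-2 BPE tokenizer nanoGPT uses
print(enc.n_vocab)                   # 50257, fixed by the tokenizer

text = open("input.txt").read()      # a char-level dataset (path assumed)
print(len(set(text)))                # char-level vocab = number of unique characters
```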

That ChatGPT response has legit ripped the words from https://saturncloud.io/blog/how-to-clear-gpu-memory-after-pytorch-model-training-without-restarting-kernel/, but you should make this a PR. Also, the del keyword doesn't need gc; it should clear memory immediately.
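For reference, a minimal sketch of the usual cleanup sequence (the model here is a stand-in, not from the thread; requires a CUDA device):

```python
import gc
import torch

model = torch.nn.Linear(10, 10).cuda()  # stand-in for a trained model
del model                 # drop the last reference; CPython frees it immediately
gc.collect()              # only needed if reference cycles keep tensors alive
torch.cuda.empty_cache()  # return cached CUDA blocks to the driver
```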

The best I could do was generate, then string-partition the output. This works fine for short inference, but is terrible for long ones.
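Roughly what I mean (a sketch; the "###" stop sequence is an assumption taken from the prompt format quoted later in these comments):

```python
def truncate_at_stop(generated: str, stop: str = "###") -> str:
    """Keep only the text before the first stop sequence."""
    head, _, _ = generated.partition(stop)
    return head.strip()

print(truncate_at_stop("Negative\n###\nMessage: next example leaks in..."))
# -> "Negative"
```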

If you mean GPT-3 level, you're several billion parameters short. If you mean ChatGPT, then you need RLHF and finetuning on conversational data. The best I could get was using...

I finetuned the GPT-2 model following the walkthrough in the README, and have forked it and run it hundreds of times; it definitely works.

@vladimirlitvinyuk You are talking about a model with anywhere between 124 million and 1.5 billion parameters. It doesn't make duplicates very often, unless you give it some very narrow input....

@Yusuf-YENICERI

```
Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has...
```
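If it helps, a sketch of assembling that few-shot prompt programmatically (the last message is truncated in the original comment, so it's a placeholder here):

```python
# assemble a "###"-separated few-shot sentiment prompt
examples = [
    ("Support has been terrible for 2 weeks...", "Negative"),
    ("I love your API, it is simple and so fast!", "Positive"),
]
query = "GPT-J has..."  # truncated in the original comment

blocks = [f"Message: {m}\nSentiment: {s}" for m, s in examples]
blocks.append(f"Message: {query}\nSentiment:")  # leave the label for the model
prompt = "\n###\n".join(blocks)
print(prompt)
```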

You don't need a GPU on your machine to use nanoGPT; an environment like Google Colab can easily support up to gpt2-medium, and maybe more, depending on how you change...
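A back-of-envelope check of why that works (the numbers are rough assumptions, not measurements):

```python
# rough estimate: gpt2-medium's fp32 weights vs a free-tier Colab GPU
params = 350e6            # gpt2-medium is roughly 350M parameters
bytes_per_param = 4       # fp32
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.1f} GB of weights")  # ~1.4 GB, well under a 16 GB T4
```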

@nashcaps2255 You are not under a false impression; you probably can do the above task, but it depends a lot on your model, hyperparameters, and dataset size. I have a...