
Text generation with GPT2, ORT .NET & BlingFire

Ask a Question

I am trying to get the GPT2-LM-head model to run in Unity (C#).

I'm able to tokenize my string using BlingFire and feed that to the inference session ( https://github.com/myl1ne/unity-geepeetee/blob/main/Assets/Scripts/NeuralNetTest.cs#L119 )
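
For reference, here is a minimal sketch of that step (not the repo's exact code; the input name `input1` and the `[1, 1, seq_len]` input layout are assumptions based on the model-zoo GPT-2 example, and `gpt2.bin` is BlingFire's GPT-2 BPE model file):

```csharp
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using BlingFire;

// Tokenize with BlingFire's GPT-2 BPE model, then hand the ids to ORT.
ulong tokenizer = BlingFireUtils.LoadModel("gpt2.bin");

var ids = new int[1024];
int count = BlingFireUtils.TextToIds(tokenizer, "Hello world. How are", ids, ids.Length, 0);

// The model expects int64 ids; keep the sequence on the last axis ([1, 1, seq_len]).
var inputData = ids.Take(count).Select(i => (long)i).ToArray();
var inputTensor = new DenseTensor<long>(inputData, new[] { 1, 1, count });

using var session = new InferenceSession("gpt2-lm-head-10.onnx");
using var results = session.Run(new[] { NamedOnnxValue.CreateFromTensor("input1", inputTensor) });
```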

I have trouble understanding the output I'm getting. It is a tensor with dimensions [1, input_sequence_length, 1, 50257]. I am assuming that I can slice it along the second dimension to get, at each position of the input sequence, the model's scores over the 50257-token vocabulary, so the best-scoring token in the last slice should be the most likely word to follow the input.
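
If that interpretation is right, the greedy pick for the next token would look roughly like this (a sketch continuing from the session run above, not the repo's exact code):

```csharp
// Read logits back out of the [1, seq_len, 1, 50257] output and take the
// arg-max at the last sequence position: the greedy next-token prediction.
var logits = results.First().AsTensor<float>();
int seqLen = logits.Dimensions[1];
int vocab = logits.Dimensions[3];

int bestToken = 0;
float bestScore = float.NegativeInfinity;
for (int v = 0; v < vocab; v++)
{
    float score = logits[0, seqLen - 1, 0, v];
    if (score > bestScore) { bestScore = score; bestToken = v; }
}
// bestToken should be the id of the most likely word following the input.
```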

That all seems fine in theory, but when I display the top tokens and their probabilities, I get mostly garbage. For example:

Input: Hello world. How are

Tokens: [15496,995,13,1374,389]

Model output (0):
 	11 (,) => 0.09602461
	13 (.) => 0.07864164
	198 (\n) => 0.04264725
	12 (-) => 0.02264417
	25 (:) => 0.02175451

Model output (1):
 	13 (.) => 0.07689274
	11 (,) => 0.06940337
	286 (of) => 0.03741954
	290 (and) => 0.0328411
	284 (to) => 0.02558311

Model output (2):
 	198 (\n) => 0.1820033
	383 (The) => 0.01936875
	314 (I) => 0.01474143
	366 (") => 0.01393547
	357 (() => 0.01144133

Model output (3):
 	284 (to) => 0.04022994
	262 (the) => 0.03436097
	11 (,) => 0.02293221
	198 (\n) => 0.02028034
	13 (.) => 0.01647755

Model output (4):
 	262 (the) => 0.0442411
	257 (a) => 0.03345602
	284 (to) => 0.02408092
	287 (in) => 0.02084269
	11 (,) => 0.01694705
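
(For context, probability lists like the ones above can be produced by applying a softmax over the 50257 logits at a given position and sorting; a minimal sketch follows, though the repo's actual code may differ.)

```csharp
using System;
using System.Linq;
using System.Collections.Generic;

// Turn the raw logits at one sequence position into (token, probability)
// pairs via a softmax, and keep the k most likely tokens.
static IEnumerable<(int Token, float Prob)> TopK(float[] logits, int k)
{
    float max = logits.Max();  // subtract the max for numerical stability
    var exps = logits.Select(l => (float)Math.Exp(l - max)).ToArray();
    float sum = exps.Sum();
    return exps.Select((e, i) => (Token: i, Prob: e / sum))
               .OrderByDescending(p => p.Prob)
               .Take(k);
}
```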

Question

Any idea what I am doing wrong? I understand that I'd still have to implement beam search & co. to get nicer results, but I would have expected reasonably sensible words even at this level.
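
For what it's worth, even plain greedy decoding, i.e. appending the arg-max token and re-running the model, should already produce readable text. A sketch, where `RunModel` and `NextToken` are hypothetical helpers wrapping the ORT call and the arg-max from the earlier snippets:

```csharp
using System.Collections.Generic;

// Greedy decoding loop: extend the sequence one token at a time.
var tokens = new List<long> { 15496, 995, 13, 1374, 389 };  // "Hello world. How are"
for (int step = 0; step < 20; step++)
{
    long next = NextToken(RunModel(session, tokens));       // hypothetical helpers
    if (next == 50256) break;                               // 50256 = <|endoftext|>
    tokens.Add(next);
}
// Map the ids back to text with BlingFire's matching ids-to-text model to read the result.
```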

Further information

Feel free to clone this repo if you want to test the project: https://github.com/myl1ne/unity-geepeetee

Is this issue related to a specific model?
https://github.com/onnx/models/blob/master/text/machine_comprehension/gpt-2/model/gpt2-lm-head-10.onnx


myl1ne avatar Oct 06 '21 09:10 myl1ne

Check this model for an example of how to decode the output: https://github.com/onnx/models/tree/master/text/machine_comprehension/gpt2-bs
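
Roughly, driving that model from C# could look like the sketch below. The tensor names (`input_text`, the single output) and the extensions library filename are assumptions, so inspect the model (e.g. with Netron) for the real ones.

```csharp
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// The gpt2-bs model bundles the tokenizer and beam search as custom ops from
// onnxruntime-extensions, so it maps strings directly to strings.
var opts = new SessionOptions();
opts.RegisterCustomOpLibrary("ortextensions.dll");  // native onnxruntime-extensions library

using var session = new InferenceSession("gpt2-bs.onnx", opts);
var input = new DenseTensor<string>(new[] { "Hello world. How are" }, new[] { 1 });
using var results = session.Run(new[] { NamedOnnxValue.CreateFromTensor("input_text", input) });
string generated = results.First().AsTensor<string>()[0];
```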

wenbingl avatar Oct 14 '21 18:10 wenbingl