How to make gpt-2-simple generate custom text based on structured data?

Open shaktisd opened this issue 4 years ago • 2 comments

Hi, I am trying to work on a practical application of GPT-2: generating text commentary from structured data. For example, in the following weather data, each row of numbers has an associated text that describes the weather in simple English, such as "Showers late. Mostly cloudy." and "Isolated tstorms late. Morning clouds." Let's assume this text can be derived completely from the given numerical data. How can I fine-tune GPT-2 so that, given an input like 30 / 22 °C | 30 °C | 17 km/h | ↑ | 54% | 57% | 0.9 mm | -2147483648 (Low) | 5:52 | 18:44, it outputs something like => Maximum temperature is 30 degrees, with partial clouds.

Sample Data

Day | Date | Temperature | Feels Like | Wind | Humidity | Chance | Amount | UV | Sunrise | Sunset | Weather Details
Mon | 8-Jun | 30 / 22 °C | 30 °C | 17 km/h | 54% | 57% | 0.9 mm | -2147483648 (Low) | 5:52 | 18:44 |
Tue | 9-Jun | 30 / 21 °C | 30 °C | 16 km/h | 53% | 57% | 1.5 mm | -2147483648 (Low) | 5:52 | 18:45 |
Wed | 10-Jun | 30 / 22 °C | 30 °C | 16 km/h | 52% | 57% | 2.2 mm | -2147483648 (Low) | 5:53 | 18:45 |
Thu | 11-Jun | 28 / 22 °C | 28 °C | 22 km/h | 63% | 62% | 5.2 mm | -2147483648 (Low) | 5:53 | 18:45 |
Fri | 12-Jun | 26 / 22 °C | 26 °C | 22 km/h | 68% | 98% | 3.4 mm | -2147483648 (Low) | 5:53 | 18:46 |
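
One way to get from a row of this data to the pipe-delimited input line in the question is a small serializer. This is only a sketch: the `WeatherRow` class and its field names are hypothetical, chosen to mirror the column headers, and the wind direction arrow is taken from the example input line in the question.

```python
from dataclasses import dataclass, astuple


@dataclass
class WeatherRow:
    # Hypothetical field names that mirror the sample data's columns.
    temperature: str
    feels_like: str
    wind: str
    wind_dir: str
    humidity: str
    chance: str
    amount: str
    uv: str
    sunrise: str
    sunset: str


def row_to_line(row: WeatherRow) -> str:
    """Join the fields in a fixed order so every sample has the same layout."""
    return " | ".join(astuple(row))


# Monday's values, as given in the example input line of the question.
monday = WeatherRow(
    "30 / 22 °C", "30 °C", "17 km/h", "↑", "54%", "57%",
    "0.9 mm", "-2147483648 (Low)", "5:52", "18:44",
)

print(row_to_line(monday))
```

Keeping the field order fixed matters more than the exact separator, since the model can only pick up the mapping if every sample is laid out the same way.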

shaktisd • Jun 08 '20 12:06

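This advice is anecdotal and not technical, but in my experiments I've learned that simply giving GPT-2 enough data with consistent formatting in a plain old .txt file is enough to make it learn to produce the desired results. You can do this with inline tags: separate the ordered data and the result samples with [DATA] and [RESULTS], and GPT-2 will do a pretty good job of keeping everything in place, and even of correlating the data with the result. Because GPT-2 is forward-predictive (it only predicts what comes next from the text before it), take that into account when generating: put the data before the result in every sample, prompt with the data followed by the [RESULTS] tag so that the model completes the description, and expect it to carry on and invent new data after the result unless you cut the output off.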
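
A rough sketch of what that could look like in Python. The helper names here are made up for illustration, and using `<|endoftext|>` as the sample separator is my assumption, not something confirmed in this thread. The `finetune_and_generate` function uses the gpt-2-simple calls `download_gpt2`, `start_tf_sess`, `finetune` and `generate`; the model size, step count, length and temperature are placeholder values, not recommendations.

```python
DATA_TAG = "[DATA]"
RESULTS_TAG = "[RESULTS]"
END = "<|endoftext|>"  # assumed separator between samples


def make_sample(data_line: str, result_text: str) -> str:
    """One training sample: data first, result second, then the separator."""
    return f"{DATA_TAG} {data_line}\n{RESULTS_TAG} {result_text}\n{END}\n"


def make_prompt(data_line: str) -> str:
    """Prompt that stops right after the results tag so the model fills it in.

    No trailing space: in a training sample the space belongs to the start of
    the result text, so the prompt should end exactly at the tag.
    """
    return f"{DATA_TAG} {data_line}\n{RESULTS_TAG}"


def write_training_file(pairs, path: str) -> None:
    """Write (data_line, result_text) pairs to a plain .txt file."""
    with open(path, "w", encoding="utf-8") as f:
        for data_line, result_text in pairs:
            f.write(make_sample(data_line, result_text))


def finetune_and_generate(train_path: str, data_line: str, steps: int = 500) -> str:
    """Fine-tune on the file, then complete one prompt (needs gpt-2-simple)."""
    import gpt_2_simple as gpt2  # imported lazily; requires TensorFlow

    gpt2.download_gpt2(model_name="124M")
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset=train_path, model_name="124M", steps=steps)
    texts = gpt2.generate(
        sess,
        prefix=make_prompt(data_line),
        truncate=END,          # cut the output off at the separator
        include_prefix=False,  # return only the generated result text
        length=60,
        temperature=0.7,
        return_as_list=True,
    )
    return texts[0]


# The input line and target sentence from the question, as one sample.
line = "30 / 22 °C | 30 °C | 17 km/h | ↑ | 54% | 57% | 0.9 mm | -2147483648 (Low) | 5:52 | 18:44"
print(make_sample(line, "Maximum temperature is 30 degrees, with partial clouds."))
```

The `truncate` argument is what deals with the forward-predictive behaviour: without it the model would likely keep going after the result and start a new [DATA] block of its own.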

scripples • Jul 16 '20 18:07

Has any more headway been made on the question above? This article takes a very elaborate approach to the problem, but I am looking for something closer to what was asked above: https://medium.com/analytics-vidhya/natural-language-generation-from-structured-data-9d70f3f224af

joshwpeters • Nov 29 '20 02:11