Dataset: 100k c++ function annotations

Open casey-martin opened this issue 2 years ago • 2 comments

Brendan Dolan-Gavitt shared a dataset of 100K C++ functions, with annotations and tags generated by GPT-3.5 Turbo.

https://twitter.com/moyix/status/1644355889602654210

For example:

int gg_set_focus_pos(gg_widget_t *widget, int x, int y) {
	return 1;
}

Q. What language is the above code written in?
A. C/C++.
Q. What is the purpose of the above code?
A.
This code defines a function called `gg_set_focus_pos` that takes a pointer to a `gg_widget_t` structure and two integer 
arguments `x` and `y`. The function returns an integer value of 1. The purpose of this function is not clear from this code 
snippet alone, as it does not contain any implementation details. However, based on the function name and arguments, it is 
likely that this function sets the focus position of a graphical user interface (GUI) widget to the specified `x` and `y` 
coordinates.

Tags: GUI, widget, focus, position, function.

I've generated an INSTRUCTION/RESPONSE dataset from it with some prompt variety and uploaded it to Hugging Face. Hopefully this will improve OA's code understanding and generation.

The processed dataset can be found here: https://huggingface.co/datasets/casey-martin/oa_cpp_annotate
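
For illustration, here is a minimal sketch of how one annotated function could be turned into an INSTRUCTION/RESPONSE record with some prompt variety. It is not the script used to build the dataset: the prompt templates, the `make_pair` helper, and the `code`/`summary`/`tags` field names are all assumptions made up for this example, and the actual column layout on Hugging Face may differ.

```python
import random

# Hypothetical prompt templates (assumption): the wording used for the
# real dataset is not given in this issue.
PROMPT_TEMPLATES = [
    "What is the purpose of the following code?\n\n{code}",
    "Explain what this function does:\n\n{code}",
    "Describe the code below and suggest a few tags for it.\n\n{code}",
]


def make_pair(code, summary, tags, rng):
    """Turn one annotated function into an INSTRUCTION/RESPONSE record."""
    # Pick a template at random so the prompts are not all identical.
    template = rng.choice(PROMPT_TEMPLATES)
    instruction = template.format(code=code.strip())
    response = summary.strip()
    if tags:
        response += "\n\nTags: " + ", ".join(tags)
    return {"INSTRUCTION": instruction, "RESPONSE": response}


# The annotated function quoted earlier in this issue, as a hypothetical record.
example = {
    "code": "int gg_set_focus_pos(gg_widget_t *widget, int x, int y) {\n\treturn 1;\n}",
    "summary": (
        "This code defines a function called `gg_set_focus_pos` that takes a "
        "pointer to a `gg_widget_t` structure and two integer arguments `x` "
        "and `y`. The function returns an integer value of 1."
    ),
    "tags": ["GUI", "widget", "focus", "position", "function"],
}

rng = random.Random(0)  # fixed seed so the template choice is reproducible
pair = make_pair(example["code"], example["summary"], example["tags"], rng)
print(pair["INSTRUCTION"])
print(pair["RESPONSE"])
```

Seeding the random generator keeps the template assignment reproducible, which matters if the code to regenerate the dataset is submitted alongside it.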

casey-martin avatar Apr 21 '23 03:04 casey-martin

Great dataset. You could open a PR that adds it to the datasets folder and to init. Datasets added to that folder usually include the code to reproduce them.

CheckMC avatar Apr 21 '23 03:04 CheckMC

Sounds good. I'll start cleaning up my scripts and submit a PR.

casey-martin avatar Apr 21 '23 03:04 casey-martin