CodeTF
`defect` and `refine` tasks for `codet5p-16b`
Hi, nice project!
Do you have the checkpoint and evaluation results for the `defect` and `refine` tasks for `codet5p-16b`?
I tested the CodeT5 `defect` and `refine` task examples in CodeTF and found that the effect is not very good.
Hi, thanks for the question. May I know what you mean by the "effect"? Note that the project is not finished and has not been officially released yet, so there are still things to improve.
- For `Salesforce/codet5-base-codexglue-defect`:
from codetf.models import load_model_pipeline

# C snippet passed to the defect model as a plain string
code_snippets = """
#include <stdio.h>
int main()
{
    int *pointer_var;
    printf("Address of the given pointer variable: %d", pointer_var);
    printf("Value of pointer variable is : %d", * pointer_var);
    return 0;
}
"""
defect_model = load_model_pipeline(model_name="codet5", task="defect",
                                   model_type="base", is_eval=True,
                                   load_in_8bit=True, weight_sharding=False)
defects = defect_model.predict([code_snippets])
print(defects)
`code_snippets`: a C/C++ demo of an invalid pointer dereference (`pointer_var` is never initialized).
But `Salesforce/codet5-base-codexglue-defect` does not treat it as a defect and outputs `['false']`.
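A single snippet says little about the model overall, so one way to make the "effect" concrete is to score predictions over a small labeled set. The following is a minimal sketch, not part of CodeTF: `score_defect_predictions` and `always_false` are hypothetical helpers, and it assumes the pipeline returns the strings `'true'` / `'false'`, as in the output above. To try it with the real model, `defect_model.predict` could be passed in place of the stand-in.

```python
from typing import Callable, List, Tuple


def score_defect_predictions(
    predict: Callable[[List[str]], List[str]],
    labeled_snippets: List[Tuple[str, bool]],
) -> float:
    """Return the fraction of snippets whose predicted label matches the expected one."""
    snippets = [code for code, _ in labeled_snippets]
    predictions = predict(snippets)
    correct = 0
    for (_, expected), predicted in zip(labeled_snippets, predictions):
        # Assumption: predictions are the strings 'true' / 'false'.
        if (predicted.strip().lower() == "true") == expected:
            correct += 1
    return correct / len(labeled_snippets)


def always_false(snippets: List[str]) -> List[str]:
    # Hypothetical stand-in for defect_model.predict, so the sketch runs without CodeTF.
    return ["false" for _ in snippets]


examples = [
    ("int main() { int *p; return *p; }", True),    # dereferences an uninitialized pointer
    ("int main() { int x = 0; return x; }", False),  # no defect
]
accuracy = score_defect_predictions(always_false, examples)
print(accuracy)
```

A predictor that always answers `'false'` still gets half of this toy set right, which is why a balanced labeled set is more informative than one example.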
- For `Salesforce/codet5-base-codexglue-refine-medium`:
from codetf.models import load_model_pipeline

# Java snippet passed to the refine model as a plain string
code_snippets = """
class GFG
{
    public void test ()
    {
        // Initializing String variable with null value
        String ptr = null;
        // This line of code throws NullPointerException
        // because ptr is null
        if (ptr.equals("gfg"))
            System.out.print("Same");
        else
            System.out.print("Not Same");
    }
}
"""
refine_model = load_model_pipeline(model_name="codet5", task="refine",
                                   model_type="base", is_eval=True,
                                   load_in_8bit=True, weight_sharding=False)
refines = refine_model.predict([code_snippets])
print(refines)
`code_snippets`: a Java demo of a null pointer error.
But `Salesforce/codet5-base-codexglue-refine-medium` outputs a "fixed" version that has more defects:
public void test ( ) {
// Initializing String variable with null value
String ptr = null ;
// This line of code throws NullPointerException because ptr is null
if ( ptr. equals ( "gfg" ) )
java.lang.System.out.print ( "Same" ) ;
else
java.lang.System.out.print ( "Not Same" ) ;
}
- defect 1: `ptr.equals` still throws a NullPointerException, so the original bug is not fixed.
- defect 2: `System.out.print` was rewritten as `java.lang.System.out.print`.
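Both observations can be checked mechanically by comparing the input and the model output after stripping formatting differences. This is a rough sketch under stated assumptions: `normalize` is a hypothetical helper that crudely drops `//` comments and all whitespace (including whitespace inside string literals), and the two snippets are just the `if`/`else` part of the code above.

```python
import re


def normalize(code: str) -> str:
    """Drop // comments and all whitespace so that formatting differences are ignored."""
    without_comments = re.sub(r"//[^\n]*", "", code)
    return re.sub(r"\s+", "", without_comments)


original = '''
if (ptr.equals("gfg"))
    System.out.print("Same");
else
    System.out.print("Not Same");
'''
refined = '''
if ( ptr. equals ( "gfg" ) )
java.lang.System.out.print ( "Same" ) ;
else
java.lang.System.out.print ( "Not Same" ) ;
'''

norm_refined = normalize(refined)
# Defect 1: the call on the null reference is still present in the output.
unsafe_call_kept = 'ptr.equals("gfg")' in norm_refined
# Defect 2: apart from the fully qualified class name, nothing else changed.
only_qualified_names = norm_refined.replace("java.lang.System", "System") == normalize(original)
print(unsafe_call_kept, only_qualified_names)
```

If both flags are true, the model only changed tokenization and name qualification and left the null dereference in place.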
I see, thanks for this. The CodeT5 defect model is fine-tuned on the defect detection data from Microsoft CodeXGLUE: https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/Defect-detection. This dataset is mostly used for academic research, so applying the model to real-world code can produce unsatisfactory results in many cases.
You can find more on the results in the CodeT5 paper: https://aclanthology.org/2021.emnlp-main.685.pdf.
Thanks.
Do you have any other datasets for the defect and refine tasks to recommend?