coming repair search space

Marker issue for @sedflix's work

May 21 '19 06:05 monperrus

Some thoughts on the module designs. Note: this is just to get a discussion started.

Proposed design:
- a new mode type (name still to be decided)
- a new CLI argument called "repair-tool-name"
There are two cases:
- repair-tool-name is specified: every revision is classified as yes or no for that tool
- repair-tool-name is not specified: every revision is classified with multiclass labels
The module
- runs a mineinstance job for each repair tool, using the pattern files we have written manually for that tool.
- passes the output of the mineinstance job to a filter specific to that repair tool to get the final output. The filter checks conditions which can't be checked using patterns.
The output could be similar to that of mineinstance, with an extra field called probable-repair-tools.
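
To make the two cases concrete, here is a minimal sketch of the labelling step only. It is written in Python for brevity and is not Coming's code: the function name `classify` and the input shape (a map from revision id to the set of tools whose patterns matched) are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Union


def classify(
    matches: Dict[str, Set[str]],
    repair_tool_name: Optional[str] = None,
) -> Dict[str, Union[str, List[str]]]:
    """Label revisions from pattern matches (hypothetical helper).

    matches maps a revision id to the set of repair tools whose
    patterns matched that revision.
    """
    if repair_tool_name is not None:
        # Binary case: one yes/no answer per revision for the given tool.
        return {
            revision: "yes" if repair_tool_name in tools else "no"
            for revision, tools in matches.items()
        }
    # Multiclass case: every matching tool is reported for each revision.
    return {revision: sorted(tools) for revision, tools in matches.items()}
```

In the multiclass case the per-revision list plays the role of the probable-repair-tools field; an empty list means no tool's patterns matched.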

What do you think? @martinezmatias @monperrus

Jun 04 '19 06:06 sedflix

LGTM.

The name of the module / CLI can be repairability-analysis or analyze-repairability

Jun 05 '19 06:06 monperrus

Hi @sedflix Perfect.

runs a mineinstance job for each repair-tool with multiple pattern files we have made manually for each repair-tool. The output of the mineinstance job is passed to another filter for that specific repair-tool to get the final output. The filter will be used to check conditions which can't be checked using patterns

It makes sense. The challenge, as you mention, is to determine which "checks" must be included in this new filter and which ones can be incorporated into the "mine instance" analyzer by improving our current pattern specification.

Jun 05 '19 08:06 martinezmatias

Cool. I will try to avoid using "checks"; if they turn out to be required, I will most probably open an issue here to discuss including such features in the pattern specification itself.

Jun 05 '19 11:06 sedflix

#74 represents the ongoing work.

The current flow of the module is as follows:

apply FineGrainDifftAnalyzer to the input
extract all the patterns that need to be mined using fr.inria.coming.repairability.RepairTools

Right now, I'm using the JSonPatternInstanceOutput. The pattern name of the instance specifies its label.

Jun 11 '19 12:06 sedflix

Does an output like this make sense and is it okay? repairability is an array so that a single revision can be classified into multiple tools.

{
  "instances": [
    {
      "revision": "patch1-Chart-26-jMutRepair",
      "repairability": [
        {
          "tool-name": "JMutRepair",
          "pattern-name": "JMutRepair:unary",
          "instance_detail": [
            {
              "pattern_action": "ANY",
              "pattern_entity": {
                "entity_type": "UnaryOperator",
                "entity_new value": "*",
                "entity_role": "*",
                "entity_parent": "null"
              },
              "concrete_change": {
                "operator": "INS",
                "src_type": "UnaryOperator",
                "dst_type": "null",
                "src": "(!b1)",
                "dst": "null",
                "src_parent_type": "BinaryOperator",
                "dst_parent_type": "null",
                "src_parent": "(!b1) || b2",
                "dst_parent": "null"
              },
              "file": "/test",
              "line": 2538
            }
          ]
        }
      ]
    },
    {
      "revision": "patch1-Chart-7-jMutRepair",
      "repairability": [
        {
          "tool-name": "JMutRepair",
          "pattern-name": "JMutRepair:binary",
          "instance_detail": [
            {
              "pattern_action": "UPD",
              "pattern_entity": {
                "entity_type": "BinaryOperator",
                "entity_new value": "*",
                "entity_role": "*",
                "entity_parent": "null"
              },
              "concrete_change": {
                "operator": "UPD",
                "src_type": "BinaryOperator",
                "dst_type": "BinaryOperator",
                "src": "dataset != null",
                "dst": "dataset == null",
                "src_parent_type": "If",
                "dst_parent_type": "If",
                "src_parent": "if (dataset != null) {\n    return result;\n}",
                "dst_parent": "if (dataset == null) {\n    return result;\n}"
              },
              "file": "/test",
              "line": 2370
            }
          ]
        }
      ]
    }
  ]
}
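
For anyone consuming this output, a minimal sketch of reading it back, in Python. Only the field names ("instances", "revision", "repairability", "tool-name") come from the sample above; the function name `tools_per_revision` and the trimmed `SAMPLE` string are made up for illustration.

```python
import json
from typing import Dict, Set

# Trimmed copy of the output above, keeping only the fields read below.
SAMPLE = """
{
  "instances": [
    {
      "revision": "patch1-Chart-26-jMutRepair",
      "repairability": [
        {"tool-name": "JMutRepair", "pattern-name": "JMutRepair:unary"}
      ]
    },
    {
      "revision": "patch1-Chart-7-jMutRepair",
      "repairability": [
        {"tool-name": "JMutRepair", "pattern-name": "JMutRepair:binary"}
      ]
    }
  ]
}
"""


def tools_per_revision(output_text: str) -> Dict[str, Set[str]]:
    """Map each revision to the set of tools listed under "repairability"."""
    data = json.loads(output_text)
    tools: Dict[str, Set[str]] = {}
    for instance in data.get("instances", []):
        # setdefault keeps revisions with an empty "repairability" array.
        names = tools.setdefault(instance["revision"], set())
        for entry in instance.get("repairability", []):
            names.add(entry["tool-name"])
    return tools


if __name__ == "__main__":
    for revision, names in tools_per_revision(SAMPLE).items():
        print(revision, sorted(names))
```

Because repairability is an array, a revision matched by several tools simply ends up with several names in its set.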

Jun 11 '19 12:06 sedflix

Hi @sedflix I would say that's okay: you added to the instance detection the information that the module needs: 1) the repair tool ("tool-name": "JMutRepair") and 2) the repair applied ("pattern-name": "JMutRepair:binary").

Jun 11 '19 13:06 martinezmatias

Hi @sedflix FYI: I am implementing a change to avoid the hardcoded "file": "/test". I am changing Gt-Spoon and Coming. PR soon.

Jun 11 '19 14:06 martinezmatias

Hi @martinezmatias, If I'm correct IntermediateResultProcessorCallback is called after execution of all the analyzers and before the execution of output processors? Therefore, IntermediateResultProcessorCallback will be an appropriate way to implement the filter as discussed above. What do you think?

Jun 12 '19 11:06 sedflix

Hi @sedflix

If I'm correct IntermediateResultProcessorCallback is called after execution of all the analyzers and before the execution of output processors?

Yes. It's called once all analyzers are executed.

Therefore, IntermediateResultProcessorCallback will be an appropriate way to implement the filter as discussed above.

I'd say that's not a good place for that functionality. A better option, IMHO, is to create a new Analyzer. Note that Coming creates a pipe of analyzers, where the results from one analyzer are passed forward to the next. Thus, I would add a new analyzer that takes the pattern detection output and refines the matching.
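
A rough sketch of the "pipe of analyzers" idea, in Python and deliberately not using Coming's real classes or signatures. Every name here (`run_pipeline`, `diff_analyzer`, `pattern_analyzer`, `refinement_analyzer`) is hypothetical, and the refinement condition (source and destination node types must be equal) is an invented example of a check a pattern might not express.

```python
from typing import Any, Callable, Dict, List, Tuple

# An analyzer receives the revision and the results of all earlier analyzers.
Analyzer = Callable[[Dict[str, Any], Dict[str, Any]], Any]


def run_pipeline(
    revision: Dict[str, Any], analyzers: List[Tuple[str, Analyzer]]
) -> Dict[str, Any]:
    """Run analyzers in order, passing earlier results forward by name."""
    results: Dict[str, Any] = {}
    for name, analyze in analyzers:
        results[name] = analyze(revision, results)
    return results


def diff_analyzer(revision, previous):
    # Stand-in for the fine-grained diff step: just expose the changes.
    return revision["changes"]


def pattern_analyzer(revision, previous):
    # Stand-in for pattern mining: keep update operations only.
    return [c for c in previous["diff"] if c["operator"] == "UPD"]


def refinement_analyzer(revision, previous):
    # The new analyzer: refine pattern matches with an extra check
    # (illustrative condition only).
    return [m for m in previous["patterns"] if m["src_type"] == m["dst_type"]]


PIPELINE = [
    ("diff", diff_analyzer),
    ("patterns", pattern_analyzer),
    ("refined", refinement_analyzer),
]
```

The refinement step sits last in the pipe and reads only the pattern step's result, so the output processors see already-filtered instances.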

Jun 12 '19 12:06 martinezmatias

Cool!

Jun 12 '19 12:06 sedflix

FYI: I am implementing a change to avoid the hardcoded "file": "/test". I am changing Gt-Spoon and Coming. PR soon.

Implemented and merged in both GT-Spoon and Coming. PR #78

Jun 12 '19 15:06 martinezmatias

Hey @martinezmatias and @monperrus, what do you think about how to proceed with the quantitative analysis of the repairability module, in particular the false-positive and true-negative cases?

The current dataset lets us consider only true-positive and false-negative cases!

Jun 13 '19 14:06 sedflix

When you have a dataset with ground truth classification (such as DRR), we have all four cases. Correct?
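
As a sketch of what "all four cases" means in practice, assuming each revision has a boolean ground-truth label and a boolean prediction for one repair tool. What counts as "positive" depends on how the dataset is labelled; the names `confusion_counts` and `precision_recall` are illustrative, not part of Coming.

```python
from typing import Dict, Tuple


def confusion_counts(
    predicted: Dict[str, bool], truth: Dict[str, bool]
) -> Dict[str, int]:
    """Count TP, FP, TN and FN over the revisions in the ground truth."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for revision, is_positive in truth.items():
        # A revision with no prediction is treated as "not flagged".
        flagged = predicted.get(revision, False)
        if flagged and is_positive:
            counts["TP"] += 1
        elif flagged:
            counts["FP"] += 1
        elif is_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

If every revision in the ground truth is positive, FP and TN stay at zero by construction, which is why a positives-only dataset yields only true positives and false negatives.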

Jun 14 '19 16:06 monperrus

See "Estimating the Potential of Program Repair Search Spaces with Commit Analysis" (Khashayar Etemadi, Niloofar Tarighat, Siddharth Yadav, Matias Martinez and Martin Monperrus), in Journal of Systems and Software, 2022.

Mar 25 '24 21:03 monperrus