repair search space

Open monperrus opened this issue 6 years ago • 14 comments

Marker issue for @sedflix's work.

monperrus avatar May 21 '19 06:05 monperrus

Some thoughts on the module designs. Note: this is just to get a discussion started.

  • a new mode type named?
  • a new CLI argument called "repair-tool-name"
  • There are two conditions:
    • repair-tool-name is specified: all revisions will be classified as yes or no
    • repair-tool-name is not specified: all revisions will be classified with multiclass labels
  • The module
    • runs a mineinstance job for each repair tool, using the multiple pattern files we have written manually for that tool.
    • The output of the mineinstance job is passed to another filter for that specific repair-tool to get the final output. The filter will be used to check conditions which can't be checked using patterns.
  • the output could be similar to that of mineinstance with an extra field called probable-repair-tools
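
To make the two conditions concrete, here is a minimal sketch of the labelling step, not the actual module. It assumes pattern mining and filtering have already produced the list of tools that matched a revision; the function name `classify_revision` and the tool name `OtherTool` are hypothetical.

```python
from typing import List, Optional, Union


def classify_revision(
    matched_tools: List[str],
    repair_tool_name: Optional[str] = None,
) -> Union[bool, List[str]]:
    """Label one revision from the tools whose patterns and filter matched it."""
    if repair_tool_name is not None:
        # repair-tool-name given: yes/no answer for that single tool.
        return repair_tool_name in matched_tools
    # repair-tool-name omitted: report every tool that matched (multiple labels).
    return sorted(set(matched_tools))


# "OtherTool" is a made-up name used only for illustration.
print(classify_revision(["JMutRepair"], repair_tool_name="JMutRepair"))
print(classify_revision(["OtherTool", "JMutRepair", "JMutRepair"]))
```

Returning a list in the second case leaves room for a revision that falls into several tools' search spaces at once.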

What do you think? @martinezmatias @monperrus

sedflix avatar Jun 04 '19 06:06 sedflix

LGTM.

The name of the module / CLI can be repairability-analysis or analyze-repairability

monperrus avatar Jun 05 '19 06:06 monperrus

Hi @sedflix Perfect.

runs a mineinstance job for each repair-tool with multiple pattern files we have made manually for each repair-tool. The output of the mineinstance job is passed to another filter for that specific repair-tool to get the final output. The filter will be used to check conditions which can't be checked using patterns.

It makes sense. The challenge, as you mention, is to determine which "checks" must go into this new filter and which ones can be incorporated into the "mine instance" analyzer by improving our current pattern specification.

martinezmatias avatar Jun 05 '19 08:06 martinezmatias

Cool. I will try to avoid the use of "checks". If they turn out to be required, I will most probably open an issue here to discuss including such features in the pattern specification itself.

sedflix avatar Jun 05 '19 11:06 sedflix

#74 represents the ongoing work.

The current flow of the module is as follows:

  • apply FineGrainDifftAnalyzer to the input
  • extract all the patterns that need to be mined using fr.inria.coming.repairability.RepairTools

Right now, I'm using the JSonPatternInstanceOutput. The pattern name of the instance specifies its label.

sedflix avatar Jun 11 '19 12:06 sedflix

Does an output like this make sense and is it okay? repairability is an array so that a single revision can be classified into multiple tools.

{
  "instances": [
    {
      "revision": "patch1-Chart-26-jMutRepair",
      "repairability": [
        {
          "tool-name": "JMutRepair",
          "pattern-name": "JMutRepair:unary",
          "instance_detail": [
            {
              "pattern_action": "ANY",
              "pattern_entity": {
                "entity_type": "UnaryOperator",
                "entity_new value": "*",
                "entity_role": "*",
                "entity_parent": "null"
              },
              "concrete_change": {
                "operator": "INS",
                "src_type": "UnaryOperator",
                "dst_type": "null",
                "src": "(!b1)",
                "dst": "null",
                "src_parent_type": "BinaryOperator",
                "dst_parent_type": "null",
                "src_parent": "(!b1) || b2",
                "dst_parent": "null"
              },
              "file": "/test",
              "line": 2538
            }
          ]
        }
      ]
    },
    {
      "revision": "patch1-Chart-7-jMutRepair",
      "repairability": [
        {
          "tool-name": "JMutRepair",
          "pattern-name": "JMutRepair:binary",
          "instance_detail": [
            {
              "pattern_action": "UPD",
              "pattern_entity": {
                "entity_type": "BinaryOperator",
                "entity_new value": "*",
                "entity_role": "*",
                "entity_parent": "null"
              },
              "concrete_change": {
                "operator": "UPD",
                "src_type": "BinaryOperator",
                "dst_type": "BinaryOperator",
                "src": "dataset != null",
                "dst": "dataset == null",
                "src_parent_type": "If",
                "dst_parent_type": "If",
                "src_parent": "if (dataset != null) {\n    return result;\n}",
                "dst_parent": "if (dataset == null) {\n    return result;\n}"
              },
              "file": "/test",
              "line": 2370
            }
          ]
        }
      ]
    }
  ]
}
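
For anyone consuming this output, the following is a rough sketch of how the proposed format could be read to list the tools per revision. It relies only on the field names shown above (`instances`, `revision`, `repairability`, `tool-name`); the helper name `tools_per_revision` is hypothetical, and the sample is trimmed to the fields the helper actually uses.

```python
import json
from typing import Dict, List


def tools_per_revision(report: dict) -> Dict[str, List[str]]:
    """Map each revision to the sorted, de-duplicated tool names reported for it."""
    result: Dict[str, List[str]] = {}
    for instance in report.get("instances", []):
        tools = {entry["tool-name"] for entry in instance.get("repairability", [])}
        result[instance["revision"]] = sorted(tools)
    return result


# Trimmed version of the output above ("instance_detail" left empty for brevity).
sample = json.loads("""
{
  "instances": [
    {"revision": "patch1-Chart-26-jMutRepair",
     "repairability": [{"tool-name": "JMutRepair",
                        "pattern-name": "JMutRepair:unary",
                        "instance_detail": []}]},
    {"revision": "patch1-Chart-7-jMutRepair",
     "repairability": [{"tool-name": "JMutRepair",
                        "pattern-name": "JMutRepair:binary",
                        "instance_detail": []}]}
  ]
}
""")

for revision, tools in tools_per_revision(sample).items():
    print(revision, tools)
```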

sedflix avatar Jun 11 '19 12:06 sedflix

Hi @sedflix I would say that's okay: you added to the instance detection the information that the module needs: 1) the repair tool ("tool-name": "JMutRepair") and 2) the repair applied ("pattern-name": "JMutRepair:binary").

martinezmatias avatar Jun 11 '19 13:06 martinezmatias

Hi @sedflix FYI: I am implementing one change to avoid having the hardcoded "file": "/test". I am changing Gt-Spoon and Coming. PR soon.

martinezmatias avatar Jun 11 '19 14:06 martinezmatias

Hi @martinezmatias, If I'm correct IntermediateResultProcessorCallback is called after execution of all the analyzers and before the execution of output processors? Therefore, IntermediateResultProcessorCallback will be an appropriate way to implement the filter as discussed above. What do you think?

sedflix avatar Jun 12 '19 11:06 sedflix

Hi @sedflix

If I'm correct IntermediateResultProcessorCallback is called after execution of all the analyzers and before the execution of output processors?

Yes. It's called once all analyzers are executed.

Therefore, IntermediateResultProcessorCallback will be an appropriate way to implement the filter as discussed above.

I'd say that's not a good place to put that functionality. A better option IMHO is to create a new Analyzer. Note that Coming creates a pipe of analyzers, where the results from one analyzer are passed forward to the next. Thus, I would add a new analyzer, which takes the pattern detection output and refines the matching.
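
As a language-neutral illustration of that idea (this is not Coming's Java API), here is a small sketch of a pipe in which each analyzer sees the results of the earlier ones. All names (`run_pipeline`, `detect_patterns`, `refine_matches`) and the `changed_lines` condition are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

# An analyzer receives the revision plus the results of all earlier analyzers.
Analyzer = Callable[[dict, Dict[str, object]], object]


def run_pipeline(revision: dict, analyzers: List[Tuple[str, Analyzer]]) -> Dict[str, object]:
    """Run analyzers in order, passing accumulated results forward."""
    results: Dict[str, object] = {}
    for name, analyzer in analyzers:
        results[name] = analyzer(revision, results)
    return results


def detect_patterns(revision: dict, previous: Dict[str, object]) -> List[dict]:
    # Stand-in for pattern mining: return the candidate matches as given.
    return list(revision["candidate_matches"])


def refine_matches(revision: dict, previous: Dict[str, object]) -> List[dict]:
    # Extra check that a pattern cannot express (made-up condition):
    # keep only matches confined to a single changed line.
    return [m for m in previous["patterns"] if m["changed_lines"] == 1]


revision = {"candidate_matches": [
    {"pattern-name": "JMutRepair:binary", "changed_lines": 1},
    {"pattern-name": "JMutRepair:unary", "changed_lines": 3},
]}
results = run_pipeline(revision, [("patterns", detect_patterns), ("refined", refine_matches)])
print([m["pattern-name"] for m in results["refined"]])
```

The refining step stays a regular analyzer, so output processors run after it without needing a separate callback.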

martinezmatias avatar Jun 12 '19 12:06 martinezmatias

Cool!

sedflix avatar Jun 12 '19 12:06 sedflix

FYI: I am implementing one change to avoid having the hardcoded "file": "/test". I am changing Gt-Spoon and Coming. PR soon.

Implemented and merged in both GT-Spoon and Coming. PR #78

martinezmatias avatar Jun 12 '19 15:06 martinezmatias

Hey @martinezmatias and @monperrus, what do you think about how to proceed with the quantitative analysis of the repairability module, in particular the false-positive and true-negative cases?

The current dataset lets us consider only true-positive and false-negative cases!

sedflix avatar Jun 13 '19 14:06 sedflix

When you have a dataset with ground truth classification (such as DRR) we have all four cases. Correct?
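
A minimal sketch of how the four cases could be counted for one repair tool, assuming a ground-truth label per revision is available. The function name `confusion_counts` and the revision names are made up for illustration.

```python
from typing import Dict


def confusion_counts(ground_truth: Dict[str, bool], predicted: Dict[str, bool]) -> Dict[str, int]:
    """Count TP/FP/FN/TN for one tool; revisions missing from `predicted` count as 'no'."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for revision, actual in ground_truth.items():
        guess = predicted.get(revision, False)
        if actual and guess:
            counts["tp"] += 1
        elif not actual and guess:
            counts["fp"] += 1
        elif actual and not guess:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Made-up labels: True means the revision belongs to the tool's search space.
truth = {"rev-a": True, "rev-b": True, "rev-c": False, "rev-d": False}
guess = {"rev-a": True, "rev-b": False, "rev-c": True}
print(confusion_counts(truth, guess))
```

A dataset containing only patches known to come from the tool has no negative ground-truth entries, which is why it can yield only the true-positive and false-negative counts.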

monperrus avatar Jun 14 '19 16:06 monperrus

See Estimating the Potential of Program Repair Search Spaces with Commit Analysis (Khashayar Etemadi, Niloofar Tarighat, Siddharth Yadav, Matias Martinez and Martin Monperrus), In Journal of Systems and Software, 2022

monperrus avatar Mar 25 '24 21:03 monperrus