
Improve 'subtree mostly changed' heuristics

Open Wilfred opened this issue 4 years ago • 5 comments

See, e.g., hunk 5/11 of main.rs in 4883edd90cc8011041a6cee0622805d6bc7847a0, or the dot in res.append in e1ffa2af2a38917133a06c823fd0fc365f5eaa0d.

Wilfred avatar Dec 30 '21 00:12 Wilfred

Old:

fn diff_file() {
    let rhs_binary = true;
    if rhs_binary {
        print!("{}", style::header(display_path, 1, 1, "binary"));
        return;
    }
    let extension = 1;
}

New:

fn diff_file() {
    let rhs_binary = true;
    if rhs_binary {
        return DiffResult {
            path: display_path.into(),
            language: None,
            binary: true,
            lhs_src: "".into(),
            lhs_positions: vec![],
            rhs_positions: vec![],
        };
    }
    let extension = 1;
}

Wilfred avatar Jan 01 '22 19:01 Wilfred

errors.mli in HHVM 71abf8d56763d497e95c1e79e13724e6e103c32d is another good example. Difftastic might benefit from never splitting an LHS top-level node across multiple RHS top-level nodes. IIRC, Autochrome has some ideas in this space.
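
One way to make the "never split" idea concrete is to check a candidate matching after the fact. The sketch below is purely illustrative and is not how difftastic represents matches: `MatchedPair` and `split_lhs_top_levels` are hypothetical names, and it assumes each matched pair of unchanged nodes already knows the index of the top-level node that contains it on each side.

```rust
use std::collections::{HashMap, HashSet};

/// A matched pair of unchanged nodes, identified only by the index of the
/// top-level node containing each side. (Hypothetical representation.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MatchedPair {
    pub lhs_top_level: usize,
    pub rhs_top_level: usize,
}

/// Return the LHS top-level indices whose matched nodes are spread over
/// more than one RHS top-level node, in ascending order.
pub fn split_lhs_top_levels(pairs: &[MatchedPair]) -> Vec<usize> {
    // For each LHS top-level node, collect the set of RHS top-level nodes
    // that any of its descendants were matched against.
    let mut targets: HashMap<usize, HashSet<usize>> = HashMap::new();
    for pair in pairs {
        targets
            .entry(pair.lhs_top_level)
            .or_default()
            .insert(pair.rhs_top_level);
    }

    // An LHS node is "split" if it maps to two or more RHS nodes.
    let mut split: Vec<usize> = targets
        .into_iter()
        .filter(|(_, rhs_nodes)| rhs_nodes.len() > 1)
        .map(|(lhs, _)| lhs)
        .collect();
    split.sort_unstable();
    split
}
```

A non-empty result would mean the matching violates the constraint, so a diff could either reject it or penalise it heavily.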

Wilfred avatar Jan 31 '22 01:01 Wilfred

Giving top-level lists a bigger discount for being wholly novel might help matters.
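
As a rough illustration of what a bigger discount could look like, here is a minimal sketch of a cost function. Everything in it is an assumption made for the example: the `Node` type, the constants, and `wholly_novel_cost` are hypothetical and do not reflect difftastic's actual cost model or values.

```rust
/// Minimal stand-in for a syntax node (hypothetical, not difftastic's type).
pub enum Node {
    Atom(String),
    List(Vec<Node>),
}

/// Number of nodes in the subtree, counting the node itself.
pub fn size(node: &Node) -> u32 {
    match node {
        Node::Atom(_) => 1,
        Node::List(children) => 1 + children.iter().map(size).sum::<u32>(),
    }
}

// Assumed values, chosen only to make the arithmetic easy to follow.
const NOVEL_NODE_COST: u32 = 100;
const NESTED_LIST_DISCOUNT_PERCENT: u32 = 10;
const TOP_LEVEL_LIST_DISCOUNT_PERCENT: u32 = 40;

/// Cost of marking an entire subtree as novel. Lists get a discount relative
/// to marking each node separately, and top-level lists get a larger one.
pub fn wholly_novel_cost(node: &Node, is_top_level: bool) -> u32 {
    let base = size(node) * NOVEL_NODE_COST;
    let discount = match (node, is_top_level) {
        (Node::Atom(_), _) => 0,
        (Node::List(_), true) => TOP_LEVEL_LIST_DISCOUNT_PERCENT,
        (Node::List(_), false) => NESTED_LIST_DISCOUNT_PERCENT,
    };
    base * (100 - discount) / 100
}
```

With these made-up numbers, a wholly novel top-level list is cheaper than the same list nested deeper, which nudges a shortest-path search towards treating a new top-level definition as entirely new rather than matching fragments of it.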

Wilfred avatar Feb 07 '22 08:02 Wilfred

Deadgrep 14c7d6b74c7891ed7294abe1a6f5914948e4ab49 has an interesting example where deadgrep--directory has its body factored out.

Wilfred avatar Feb 08 '22 06:02 Wilfred

But deadgrep bdcdf138cd71b0a5a80ca64b3bd68b7355084757 is an example where the definition of a new function, deadgrep--escape-backslash, is treated as partially reusing existing code.

Wilfred avatar Feb 08 '22 06:02 Wilfred