style-analyzer icon indicating copy to clipboard operation
style-analyzer copied to clipboard

Clean quality report repositories CSV file

Open m09 opened this issue 6 years ago • 4 comments

cf further discussion in this issue (that happened before this edition, cf original issue text): Currently, we list the repositories that we use in the quality report in a CSV file that has 3 columns: url, to and from. Only url and from are used. We should therefore remove the to column and rename from to commit_hash.

Original issue text

Excerpt from quality_report_repos.csv:

url,to,from
https://github.com/reduxjs/redux,902484ed735d38aec06683c847810a7218d8dba2,b307091af4d7e846d9e5f080fb81caf4a8b4aab1

In that repo, one of the files we make predictions for (and actually get errors from) is examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js.

It is present in the from revision:

$ git show b307091af4d7e846d9e5f080fb81caf4a8b4aab1:examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js
// @flow

import { setVisibilityFilter } from '../../actions/visibilityFilter';
import visibilityFilter from '../../reducers/visibilityFilter';

describe('visibilityFilter', () => {
  it('should handle initial state', () => {
    expect(visibilityFilter(undefined, { type: '@@INIT' })).toEqual('SHOW_ALL');
  });

  it('should handle SET_VISIBILITY_FILTER', () => {
    expect(
      visibilityFilter(undefined, setVisibilityFilter('SHOW_ACTIVE'))
    ).toEqual('SHOW_ACTIVE');
  });
});

but not in the to revision:

$ git show 902484ed735d38aec06683c847810a7218d8dba2:examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js
fatal: Path 'examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js' does not exist in '902484ed735d38aec06683c847810a7218d8dba2'

Therefore we should not make predictions for it.

m09 avatar Mar 21 '19 09:03 m09

Did you get it from make report-quality?

We do not use to column from the related csv file. Only to make lookout work. So It uses from column for train and analyze by design.

zurk avatar Mar 21 '19 10:03 zurk

Thanks for the precisions :)

To quote the comment I made on slack:

We should document that we don't use the to column (and probably remove it from csv if possible)

m09 avatar Mar 21 '19 12:03 m09

Good point. We should try. Last time I think about removal it was tricky because it is required to feed two unequal commits to lookout-sdk review call.

zurk avatar Mar 21 '19 12:03 zurk

The from commit and HEAD should work if I remember correctly. from should also probably be renamed to something like rev since it'll be the only one we'll be using.

m09 avatar Mar 21 '19 15:03 m09