microdadosBrasil

Fix no_dic_overlap() function

Open nicolassoarespinto opened this issue 7 years ago • 3 comments

no_dic_overlap() should not split the dictionary at positions that are discontinuous but do not overlap either:


OLD:

10 - 11        10 - 11        15 - 16        15 - 18
11 - 12        11 - 12   +              +
15 - 16    =>
15 - 18


NEW:

10 - 11        10 - 11        15 - 18
11 - 12        11 - 12   +    
15 - 16    =>  15 - 16
15 - 18
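
To make the difference concrete, here is a minimal sketch of the two splitting rules in Python. It is not the package's actual code: `split_fields` and its `split_on_gap` flag are hypothetical names, and it assumes the fields are sorted by start and that `10 - 11` followed by `11 - 12` counts as contiguous rather than overlapping, as the example above treats them.

```python
from typing import List, Tuple

Field = Tuple[int, int]  # (start, end) as written in the dictionary


def split_fields(fields: List[Field], split_on_gap: bool) -> List[List[Field]]:
    """Split fields (sorted by start) into groups that can each be read in one pass.

    split_on_gap=True mimics the OLD rule: any discontinuity starts a new group.
    split_on_gap=False mimics the NEW rule: only an overlap starts a new group.
    """
    groups: List[List[Field]] = []
    for start, end in fields:
        if groups:
            prev_end = groups[-1][-1][1]  # end of the last field in the current group
            overlaps = start < prev_end   # assumption: equal start/end means contiguous
            gap = start > prev_end
            if not overlaps and not (split_on_gap and gap):
                groups[-1].append((start, end))
                continue
        groups.append([(start, end)])
    return groups


fields = [(10, 11), (11, 12), (15, 16), (15, 18)]
old_groups = split_fields(fields, split_on_gap=True)
new_groups = split_fields(fields, split_on_gap=False)
print(len(old_groups), old_groups)
print(len(new_groups), new_groups)
```

With the example positions, the OLD rule produces three groups and the NEW rule two, so fewer separate reads are needed for the same result.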

nicolassoarespinto, Dec 07 '17 17:12

Why does it matter whether we use the old or the new behavior?

Please check whether, after importing, the variables follow the order in which they appear in the dictionary. I think that if there is an overlap, the overlapping variables get moved to a second import round and are then merged onto the end of the file, right?

If it is not too complicated, we should reorder the variables to follow the original dictionary.

Regards, Lucas
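
The reordering asked for above could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not the package's code: `reorder_like_dictionary` is a hypothetical helper, the variable names and values are made up, and each import round is modeled as a plain mapping from column name to values.

```python
from typing import Dict, List


def reorder_like_dictionary(
    columns: Dict[str, List[str]], dic_order: List[str]
) -> Dict[str, List[str]]:
    """Return the columns in the order given by the original dictionary."""
    missing = [name for name in dic_order if name not in columns]
    if missing:
        raise KeyError(f"columns missing after import: {missing}")
    # Python dicts preserve insertion order, so rebuilding in dic_order reorders the columns
    return {name: columns[name] for name in dic_order}


# Hypothetical variable names, in the order they appear in the dictionary
dic_order = ["uf", "municipio", "renda", "idade"]

# "renda" overlapped another field, so it was read in a second round
first_round = {"uf": ["35"], "municipio": ["3550308"], "idade": ["42"]}
second_round = {"renda": ["1500"]}

merged = {**first_round, **second_round}  # second-round columns end up last
reordered = reorder_like_dictionary(merged, dic_order)
print(list(merged))
print(list(reordered))
```

Raising on a missing column is a design choice for the sketch; it makes a dropped variable visible instead of silently returning a shorter table.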

lucasmation, Dec 07 '17 17:12

@lucasmation The end result is the same. For one particular file, the dictionary is split into hundreds of dictionaries because of one big discontinuity, and calling read_fwf hundreds of times is time-consuming. It is actually very simple to fix; I already did so and will push soon. I only opened the issue because it is a change that I would like to document in case anything goes wrong.

nicolassoarespinto, Dec 07 '17 18:12

In the current implementation the variables are not reordered; I will work on that.

nicolassoarespinto, Dec 07 '17 18:12