unicc Performance issues

Trying to build the grammar from postgresql mentioned here https://github.com/phorward/unicc/issues/22#issuecomment-984550196 it takes a long time to generate the parser and looking at the profile it seems that most of the time is spent on size_t plist_union( plist* all, plist* from ) (see bellow) that is doing a naive exponential comparison/insertion between two lists.

Looking through the code I think that creating a sorted list first and then doing a binary search on it to decide to push/add one list element could improve things (not ideal but it's something), but I'm not sure about it because I don't know if the lists can have duplicates.

Any thought about this ?

/usr/bin/time unicc-nb postgresql-13.3.y.par 
104.40user 0.10system 1:44.57elapsed 99%CPU (0avgtext+0avgdata 135872maxresident)k
152inputs+11096outputs (0major+34371minor)pagefaults 0swaps

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 10.95      8.67     8.67 1398180161     0.00     0.00  plist_compare
 10.50     16.98     8.31 12371648     0.00     0.00  plist_union
  9.83     24.76     7.78 3259644116     0.00     0.00  plist_next
  9.61     32.37     7.61  2294100     0.00     0.00  close_item
  7.97     38.68     6.31        1     6.31     6.38  detect_default_productions
  7.06     44.27     5.59     8438     0.00     0.01  lalr1_closure
  5.59     48.69     4.43 2272182630     0.00     0.00  plist_access
  5.48     53.03     4.34 1343718440     0.00     0.00  pccl_count
  5.00     56.99     3.96  6007214     0.00     0.00  list_push
  3.27     59.58     2.59 209994052     0.00     0.00  list_count
  2.69     61.71     2.13 113372032     0.00     0.00  plistel_drop
  2.30     63.53     1.82                             plist_key
  2.08     65.18     1.65 582855415     0.00     0.00  pccl_intersect
  1.91     66.69     1.51     3306     0.00     0.00  pregex_dfa_from_nfa
  1.63     67.98     1.29  1223780     0.00     0.00  list_find
  1.48     69.15     1.18 78921959     0.00     0.00  test_same_kernel
  1.26     70.15     1.00 586092893     0.00     0.00  pccl_compat
  1.12     71.04     0.89        1     0.89     3.53  check_regex_anomalies
  0.86     71.72     0.68 11977201     0.00     0.00  plist_erase
  0.84     72.38     0.67 113382156     0.00     0.00  plist_insert
  0.78     73.00     0.62 27035168     0.00     0.00  pccl_testrange
  0.75     73.59     0.59 179834200     0.00     0.00  sort_symbols
  0.57     74.04     0.45   855253     0.00     0.00  pregex_ptn_to_nfa
  0.44     74.39     0.35                             plist_set_printfn
  0.40     74.71     0.32    11081     0.00     0.00  plist_subsort
  0.37     75.00     0.29 35355186     0.00     0.00  plist_remove
  0.35     75.28     0.28 130690767     0.00     0.00  pmalloc
  0.32     75.53     0.25 247164803     0.00     0.00  pfree
  0.30     75.77     0.24 10357446     0.00     0.00  plist_get_by_ptr
  0.26     75.97     0.21 154645260     0.00     0.00  plist_first
  0.24     76.16     0.19  4939448     0.00     0.00  pccl_create
  0.24     76.35     0.19  1447245     0.00     0.00  pccl_compare
  0.20     76.51     0.16 124362026     0.00     0.00  plist_last

Jul 16 '22 09:07 mingodad

Again looking through the code I can see that instead of using a BitArray for computing first/follow sets it's using plist. Probably plist is fine to use when parsing the grammar but then converting then to arrays and doing all processing on then would improve the performance.

Jul 16 '22 09:07 mingodad

Hello @mingodad, thank you very much for your detailed report on this problem.

You're totally right, the current implementation is a naive exponential comparison of lists, mostly because the lists are self-implemented in libphorward. There was no focus on performance yet, but on usability.

There are couple of things inside unicc that could be improved. But due its very rare use, lack of time and priority on other projects, I currently don't have the time on dealing with this issues, because it might require a deeper examination to resolve the problem properly.

Looking through the code I think that creating a sorted list first and then doing a binary search on it to decide to push/add one list element could improve things (not ideal but it's something), but I'm not sure about it because I don't know if the lists can have duplicates.

Normally, there won't be any duplicates, so your attempt sounds useful. There's also the parray structure which might be useful as well for specific use-cases (this was used in the unfinished unicc2 code base in several situations to increase speed and decrease memory usage), but it requires a lot of effort to rewrite the current implementation from plist to parray.

If you have any proposal, or a quick working solution in form of a pull request, this is very welcome!

Jul 21 '22 09:07 phorward

Adding a cached sorted list array the total time to parse was reduced around 40% and the profile output is now:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 14.87      7.97     7.97  2368251     0.00     0.00  close_item
 13.43     15.17     7.20     8735     0.00     0.00  lalr1_closure
 11.77     21.48     6.31        1     6.31     6.37  detect_default_productions
  9.86     26.77     5.29  6222279     0.00     0.00  list_push
  6.39     30.19     3.43 208112882     0.00     0.00  list_count
  6.32     33.58     3.39 2069372468     0.00     0.00  plist_access
  4.91     36.21     2.63 1860430152     0.00     0.00  plist_next
  3.40     38.03     1.82 65828038     0.00     0.00  bsearch_r
  3.02     39.65     1.62 865039536     0.00     0.00  plist_compare
  2.97     41.24     1.59 1444959779     0.00     0.00  plist_prev
  2.95     42.82     1.58  1206508     0.00     0.00  list_find
  2.72     44.28     1.46 34071855     0.00     0.00  plist_get_by_ptr
  2.19     45.45     1.18 77750733     0.00     0.00  test_same_kernel
  1.72     46.37     0.92  2109590     0.00     0.00  plist_offset
  1.66     47.26     0.89 234329705     0.00     0.00  sort_symbols
  1.65     48.15     0.89 390224369     0.00     0.00  cmp_ordered_list_bsearch
  1.51     48.96     0.81                             ordered_list_compare
  0.95     49.47     0.51 73347230     0.00     0.00  plist_insert
  0.76     49.88     0.41                             plist_set_printfn
  0.76     50.28     0.41                             plist_key
  0.60     50.60     0.32 73337106     0.00     0.00  plistel_drop
  0.52     50.88     0.28     6008     0.00     0.00  plist_subsort
  0.50     51.15     0.27 12956766     0.00     0.00  plist_union
  0.49     51.41     0.26 10123214     0.00     0.00  plist_erase
  0.40     51.63     0.22  1657945     0.00     0.00  recreate_ordered_list
  0.34     51.81     0.18 13145292     0.00     0.00  plist_get
  0.28     51.96     0.15 166448742     0.00     0.00  pfree
  0.28     52.11     0.15 85400518     0.00     0.00  pmalloc
  0.24     52.24     0.13 78722663     0.00     0.00  plist_first
  0.22     52.36     0.12                             plist_riter_access
  0.21     52.47     0.11 26824786     0.00     0.00  list_free
  0.20     52.57     0.11                             list_dup

/usr/bin/time unicc-nb postgresql-13.3.y.par 
60.34user 0.09system 1:00.44elapsed 99%CPU (0avgtext+0avgdata 140336maxresident)k
0inputs+11096outputs (0major+60347minor)pagefaults 0swaps

Jul 22 '22 12:07 mingodad

Profiling with with optimization now I'm getting this:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 38.37     17.40    17.40        1    17.40    18.36  pregex_dfa_from_nfa
 25.76     29.08    11.68                             generate_tables
  9.13     33.22     4.14                             detect_default_productions
  6.68     36.25     3.03  4922934     0.00     0.00  list_push
  4.07     38.10     1.85 12956766     0.00     0.00  plist_union
  3.37     39.63     1.53  1206508     0.00     0.00  list_find
  2.94     40.96     1.34 442540698     0.00     0.00  plist_compare
  2.40     42.05     1.09        1     1.09     1.11  pregex_dfa_minimize
  1.55     42.76     0.71 234330865     0.00     0.00  sort_symbols
  1.03     43.22     0.47 489312439     0.00     0.00  plist_access
  0.82     43.59     0.37 73347229     0.00     0.00  plist_insert
  0.79     43.95     0.36    11971     0.00     0.00  ordered_list_compare
  0.71     44.27     0.32 10123214     0.00     0.00  plist_erase
  0.64     44.56     0.29     6008     0.00     0.00  plist_subsort
  0.33     44.71     0.15  1901053     0.00     0.00  plist_concat
  0.13     44.77     0.06 84101173     0.00     0.00  pmalloc
  0.11     44.82     0.05  8899391     0.00     0.00  seek_rhs_first
  0.11     44.87     0.05      486     0.00     0.00  pregex_ptn_to_NFA
  0.10     44.92     0.05 14962100     0.00     0.00  plist_count
  0.10     44.96     0.05                             pregex_qreplace
  0.09     45.00     0.04 13080956     0.00     0.00  plist_get
  0.09     45.04     0.04                             build_code
  0.08     45.08     0.04                             plist_set_printfn
  0.07     45.11     0.03  8212527     0.00     0.00  plist_get_by_ptr
  0.07     45.14     0.03  2106378     0.00     0.00  plist_diff
  0.07     45.17     0.03    24343     0.00     0.00  list_count
  0.06     45.19     0.03  1208019     0.00     0.00  create_item
  0.04     45.21     0.02  1423413     0.00     0.00  plist_init
  0.04     45.23     0.02  1244475     0.00     0.00  plist_first
  0.04     45.25     0.02   458244     0.00     0.00  pstrrender
  0.04     45.27     0.02    57916     0.00     0.00  find_tabcol
  0.03     45.29     0.02    66068     0.00     0.00  parray_compare
  0.02     45.30     0.01  1037143     0.00     0.00  list_free
  0.02     45.31     0.01     3306     0.00     0.00  free_state
  0.02     45.32     0.01        1     0.01     0.16  _parse
  0.02     45.33     0.01                             pdbl_to_wcs
  0.02     45.34     0.01                             plist_dbgstats

Jul 22 '22 18:07 mingodad

unicc unicc copied to clipboard

Performance issues

unicc
unicc copied to clipboard