fuzion icon indicating copy to clipboard operation
fuzion copied to clipboard

AoC issue: Poor DFA performance

Open fridis opened this issue 2 years ago • 2 comments

The Fuzion Advent-of-Code solution from 12 Dec

dec12_part2 is

  input := (io.stdin.with ()->io.buffered.read_lines).val.filter !=""

  binomial(n, k i64) =>
    for res := i64 1, res * (n+1-i) / i
        i in i64 1 .. k
    else
      res

  record(s String) =>
    (record (s.split[0]
             .as_codepoints)
            (s.split[1]
              .split ","
              .map_sequence (.parse_i32.val)))

  record(springs Sequence codepoint, groups Sequence i32) is

    times5 => (record ((springs ++ ["?".as_codepoints.first]).cycle.take springs.count*5+4).as_array
                      (groups.cycle.take groups.count*5).as_array)

    count(s, g, sr) i64 =>
      if sr > 0
        if      s >= springs.count || springs[s] = "."  then 0
                                                        else count s+1 g   sr-1
      else if sr = 0
        if      s >= springs.count || springs[s] = "."  then count s   g   -1
        else if                       springs[s] = "#"  then 0
        else                                                 count s+1 g   -1
      else
        if      s >= springs.count && g >= groups.count then 1
        else if s >= springs.count                      then 0
        else if g >= groups.count  && springs[s] = "."  then count s+1 g   sr
        else if g >= groups.count  && springs[s] = "#"  then 0
        else if g >= groups.count  && springs[s] = "?"  then count s+1 g   sr
        else if                       springs[s] = "."  then count s+1 g   sr
        else if                       springs[s] = "#"  then count s+1 g+1 groups[g]-1
        else
          for q := 0, q+1 while s+q<springs.count && springs[s+q] = "?" else
            next_is_dot := if s+q >= springs.count || springs[s+q] = "." then 1 else 0
            for ng := 0, ng + 1
                csum := i64 0, csum + cng + cls
            while g+ng <= groups.count && ((0..ng-1).map gi->groups[g+gi] |> sum) + ng - next_is_dot <= q do
              # collapse groups to size 1 including gap:
              q_collapsed := q - ((0..ng-1).map gi->groups[g+gi] |> sum) + next_is_dot

              # we fit ng groups into q question marks
              factor := binomial q_collapsed.as_i64 ng.as_i64
              cng := if factor = 0 then i64 0 else factor * count s+q g+ng sr

              # now try to connect the following one
              cls i64 := if next_is_dot = 0 && g+ng < groups.count
                  lsz := groups[g+ng]
                  for cl := i64 0, cl + cd
                      nl := 1, nl + 1
                  while nl < lsz && q_collapsed-nl >= ng
                    factor2 := binomial (q_collapsed-nl).as_i64 ng.as_i64
                    cd := if factor2 = 0 then i64 0 else ca := count s+q g+ng+1 lsz-nl; factor2 * ca
                  else
                    cl
                else
                  0
            else
              csum

  data := input.map_sequence record

  part1 => data.map_sequence (.count 0 0 -1) |> sum
  part2 => data.map_sequence (.times5)
               .map_sequence (.count 0 0 -1) |> sum

  say "part1 $part1 part2 $part2"

requires almost 10sec to run on an empty input:

> echo "" | time ~/fuzion/clean/fuzion/build/bin/fz -XjavaProf part2.fz
part1 0 part2 0
 + pid33883-fuzion-XjavaProf-flamegraph.svg
27.45user 1.26system 0:09.16elapsed 313%CPU (0avgtext+0avgdata 1765460maxresident)k
 > 

The problem is that this test apparently stresses the DFA (4.6s) and findAllClasses (1.7s) phases: pid33883-fuzion-XjavaProf-flamegraph

fridis avatar Dec 13 '23 07:12 fridis

duplicate of #3391

michaellilltokiwa avatar Jul 16 '24 07:07 michaellilltokiwa

The reason for this is probably different than that of #3391, so I re-open this.

fridis avatar Jul 16 '24 07:07 fridis

updated example:

dec12_part2 is

  _ := mut

  input := ((io.stdin mutate).with (() -> io.buffered.read_lines mutate)).val.filter !=""

  binomial(n, k i64) =>
    for res := i64 1, res * (n+1-i) / i
        i in i64 1 .. k
    else
      res

  record(s String) =>
    (record (s.split[0]
             .as_codepoints)
            (s.split[1]
              .split ","
              .map (.parse_i32.val)))

  record(springs Sequence codepoint, groups Sequence i32) is

    times5 => (record ((springs ++ ["?".as_codepoints.first]).cycle.take springs.count*5+4).as_array
                      (groups.cycle.take groups.count*5).as_array)

    count(s, g, sr) i64 =>
      if sr > 0
        if      s >= springs.count || springs[s] = "."  then 0
                                                        else count s+1 g   sr-1
      else if sr = 0
        if      s >= springs.count || springs[s] = "."  then count s   g   -1
        else if                       springs[s] = "#"  then 0
        else                                                 count s+1 g   -1
      else
        if      s >= springs.count && g >= groups.count then 1
        else if s >= springs.count                      then 0
        else if g >= groups.count  && springs[s] = "."  then count s+1 g   sr
        else if g >= groups.count  && springs[s] = "#"  then 0
        else if g >= groups.count  && springs[s] = "?"  then count s+1 g   sr
        else if                       springs[s] = "."  then count s+1 g   sr
        else if                       springs[s] = "#"  then count s+1 g+1 groups[g]-1
        else
          for q := 0, q+1 while s+q<springs.count && springs[s+q] = "?" else
            next_is_dot := if s+q >= springs.count || springs[s+q] = "." then 1 else 0
            for ng := 0, ng + 1
                csum := i64 0, csum + cng + cls
            while g+ng <= groups.count && ((0..ng-1).map gi->groups[g+gi] |> sum) + ng - next_is_dot <= q do
              # collapse groups to size 1 including gap:
              q_collapsed := q - ((0..ng-1).map gi->groups[g+gi] |> sum) + next_is_dot

              # we fit ng groups into q question marks
              factor := binomial q_collapsed.as_i64 ng.as_i64
              cng := if factor = 0 then i64 0 else factor * count s+q g+ng sr

              # now try to connect the following one
              cls i64 := if next_is_dot = 0 && g+ng < groups.count
                  lsz := groups[g+ng]
                  for cl := i64 0, cl + cd
                      nl := 1, nl + 1
                  while nl < lsz && q_collapsed-nl >= ng
                    factor2 := binomial (q_collapsed-nl).as_i64 ng.as_i64
                    cd := if factor2 = 0 then i64 0 else ca := count s+q g+ng+1 lsz-nl; factor2 * ca
                  else
                    cl
                else
                  0
            else
              csum

  data := input.map record

  part1 => data.map (.count 0 0 -1) |> sum
  part2 => data.map (.times5)
               .map (.count 0 0 -1) |> sum

  say "part1 $part1 part2 $part2"

michaellilltokiwa avatar Oct 24 '24 08:10 michaellilltokiwa

bad DFA performance is not the case anymore, closing:

dec12_part2 is

  LM : mutate is
  input := LM ! ()->
    (io.stdin.reader LM) ! ()->(io.buffered LM .read_lines).filter !=""

  binomial(n, k i64) =>
    for res := i64 1, res * (n+1-i) / i
        i in i64 1 .. k
    else
      res

  record(s String) =>
    (record (s.split[0]
             .codepoints)
            (s.split[1]
              .split ","
              .map (.parse_i32.val)))

  record(springs Sequence codepoint, groups Sequence i32) is

    times5 => (record ((springs ++ ["?".codepoints.first.get]).cycle.take springs.count*5+4).as_array
                      (groups.cycle.take groups.count*5).as_array)

    count(s, g, sr) i64 =>
      if sr > 0
        if      s >= springs.count || springs[s] = "."  then 0
                                                        else count s+1 g   sr-1
      else if sr = 0
        if      s >= springs.count || springs[s] = "."  then count s   g   -1
        else if                       springs[s] = "#"  then 0
        else                                                 count s+1 g   -1
      else
        if      s >= springs.count && g >= groups.count then 1
        else if s >= springs.count                      then 0
        else if g >= groups.count  && springs[s] = "."  then count s+1 g   sr
        else if g >= groups.count  && springs[s] = "#"  then 0
        else if g >= groups.count  && springs[s] = "?"  then count s+1 g   sr
        else if                       springs[s] = "."  then count s+1 g   sr
        else if                       springs[s] = "#"  then count s+1 g+1 groups[g]-1
        else
          for q := 0, q+1 while s+q<springs.count && springs[s+q] = "?" else
            next_is_dot := if s+q >= springs.count || springs[s+q] = "." then 1 else 0
            for ng := 0, ng + 1
                csum := i64 0, csum + cng + cls
            while g+ng <= groups.count && ((0..ng-1).map gi->groups[g+gi]).sum + ng - next_is_dot <= q do
              # collapse groups to size 1 including gap:
              q_collapsed := q - ((0..ng-1).map gi->groups[g+gi]).sum + next_is_dot

              # we fit ng groups into q question marks
              factor := binomial q_collapsed.as_i64 ng.as_i64
              cng := if factor = 0 then i64 0 else factor * count s+q g+ng sr

              # now try to connect the following one
              cls i64 :=
                if next_is_dot = 0 && g+ng < groups.count
                  lsz := groups[g+ng]
                  for cl := i64 0, cl + cd
                      nl := 1, nl + 1
                  while nl < lsz && q_collapsed-nl >= ng
                    factor2 := binomial (q_collapsed-nl).as_i64 ng.as_i64
                    cd := if factor2 = 0 then i64 0 else ca := count s+q g+ng+1 lsz-nl; factor2 * ca
                  else
                    cl
                else
                  0
            else
              csum

  data := input.map record

  part1 => data.map (.count 0 0 -1) .sum
  part2 => data.map (.times5)
               .map (.count 0 0 -1) .sum

  say "part1 $part1 part2 $part2"
238 features 180 types and 1 source files included in fum file.
DFA pre iteration #1: --------------------------------------------------calls:0,values:2,envs:0
DFA pre iteration #2: --------------------------------------------------calls:1410,values:1593,envs:0
DFA pre iteration #3: --------------------------------------------------calls:1612,values:1796,envs:0
DFA pre iteration #4: --------------------------------------------------calls:1634,values:1814,envs:0
DFA pre iteration #5: --------------------------------------------------calls:1655,values:1818,envs:0
DFA pre iteration #6: --------------------------------------------------calls:1657,values:1822,envs:0
DFA pre iteration #7: --------------------------------------------------calls:1665,values:1834,envs:0
DFA pre iteration #8: --------------------------------------------------calls:1670,values:1834,envs:0
DFA pre iteration #9: --------------------------------------------------calls:1670,values:1834,envs:0
DFA pre iteration #10: --------------------------------------------------calls:1670,values:1834,envs:0
DFA pre iteration #11: --------------------------------------------------calls:1670,values:1834,envs:0
DFA pre iteration #12: --------------------------------------------------calls:1670,values:1834,envs:0
DFA pre iteration #13: --------------------------------------------------calls:2666,values:2907,envs:0
DFA pre iteration #14: --------------------------------------------------calls:2713,values:2935,envs:0
DFA pre iteration #15: --------------------------------------------------calls:2723,values:2946,envs:0
DFA pre iteration #16: --------------------------------------------------calls:2733,values:2948,envs:0
DFA pre iteration #17: --------------------------------------------------calls:2734,values:2949,envs:0
DFA real iteration #1: --------------------------------------------------calls:0,values:2949,envs:0
DFA real iteration #2: --------------------------------------------------calls:1607,values:4664,envs:0
DFA real iteration #3: --------------------------------------------------calls:2612,values:5679,envs:0
DFA real iteration #4: --------------------------------------------------calls:2671,values:5712,envs:0
DFA real iteration #5: --------------------------------------------------calls:2722,values:5750,envs:0
DFA real iteration #6: --------------------------------------------------calls:2732,values:5752,envs:0
DFA real iteration #7: --------------------------------------------------calls:2733,values:5753,envs:0
DFA needed 24 iterations (pre/real) (17/7).
Elapsed time for phases: prep 34ms, fe 437ms, createMIR 95ms, ir 8ms, dfa_pre 1579ms, dfa_real 581ms, opt 3ms, be 993ms
``

michaellilltokiwa avatar Oct 06 '25 09:10 michaellilltokiwa