copilot `copilot-c99`: appending values to a stream of arrays produces incorrect C99 code

Description

Appending values to a stream should delay the samples. When (++) is applied to a stream of arrays, the values produced by the C99 backend are incorrect. Consider the following minimal example:

{-# LANGUAGE CPP       #-}
{-# LANGUAGE DataKinds #-}
module Main where

import Copilot.Compile.C99
import Language.Copilot
import Prelude             hiding ((++))

spec :: Spec
spec = trigger "f" true [ arg $ stream]
  where
    stream :: Stream (Array 2 Int16)
    stream = [array [3,4]] ++ rest

    rest :: Stream (Array 2 Int16)
    rest = constant $ array [5,6]

main :: IO ()
main = do
#ifdef INTERPRET
 interpret 5 spec
#else
 reify spec >>= compile "delay"
#endif

When executed in the interpreter, the values produced are consistent with the meaning of append: all values in the array are delayed by one sample:

$ cabal v1-exec -- runhaskell -DINTERPRET Test.hs 
f:      
([3,4]) 
([5,6]) 
([5,6]) 
([5,6]) 
([5,6])

In contrast, when executed in C with the following main:

#include <stdint.h>
#include <stdio.h>
#include "delay.h"

void f (int16_t f_arg0[(2)]) {
  printf("([%d,%d])\n", f_arg0[0], f_arg0[1]);
}

int main(void) {
  int i = 0;

  printf("f:\n");

  for (i=0; i<5; i++) {
    step();
  }
  return 0;
}

the results produced are incorrect:

$ gcc main.c delay.c -o main
$ ./main 
f:
([3,4])
([5,4])
([5,4])
([5,4])
([5,4])

There is an inconsistency between the result from the interpreter and the C99 backend, and it seems clear from the meaning of append that the bug is in the C99 backend.

Trying the same with a stream without an array (with a number, with a struct) does not seem to produce any incorrect results.

Type

Bug: code produced is incorrect.

Additional context

None.

Requester

Ivan Perez

Method to check presence of bug

The results produced by the following spec should be the same in the interpreter and the C99 backend:

{-# LANGUAGE CPP       #-}
{-# LANGUAGE DataKinds #-}
module Main where

import Copilot.Compile.C99
import Language.Copilot
import Prelude             hiding ((++))

spec :: Spec
spec = trigger "f" true [ arg $ stream]
  where
    stream :: Stream (Array 2 Int16)
    stream = [array [3,4]] ++ rest

    rest :: Stream (Array 2 Int16)
    rest = constant $ array [5,6]

main :: IO ()
main = do
#ifdef INTERPRET
 interpret 5 spec
#else
 reify spec >>= compile "delay"
#endif

The output should be exactly the same when the C code is linked with the following main.c:

#include <stdint.h>
#include <stdio.h>
#include "delay.h"

void f (int16_t f_arg0[(2)]) {
  printf("([%d,%d])\n", f_arg0[0], f_arg0[1]);
}

int main(void) {
  int i = 0;

  printf("f:\n");

  for (i=0; i<5; i++) {
    step();
  }
  return 0;
}

Expected result

The C main provided should produce the same values as the interpreted spec.

Desired result

The C main provided should produce the same values as the interpreted spec.

Proposed solution

To be determined.

Further notes

None.

Apr 25 '22 14:04 ivanperez-keera

I believe the issue may have to do with the code that copies values for the next sample. For arrays with Int64s, the code produced by copilot includes the following aids to delay arrays:

  (memcpy)((s0_tmp), ((s0_gen)()), (2));
  (memcpy)(((s0)[s0_idx]), (s0_tmp), (2));

but that is not taking the size of int64_t into account. The following code seems to work:

  (memcpy)((s0_tmp), ((s0_gen)()), (2)*sizeof(int64_t));
  (memcpy)(((s0)[s0_idx]), (s0_tmp), (2)*sizeof(int64_t));

The bug seems to also be present in uses of extern arrays. The code produced to copy the extern arrays for local monitoring does not take the size the type of values in the array (e.g., int64_t) into account.

The functions copilot-core:Copilot.Core.Type.tysize and tylength are suspect, since they are used to generate the memcpys and they do not take the size of the type of elements in the array into account.

Apr 25 '22 14:04 ivanperez-keera

Change Manager: Confirmed that the bug exists.

Aug 04 '22 11:08 ivanperez-keera

Technical Lead: Confirmed that the issue should be addressed.

Aug 04 '22 11:08 ivanperez-keera

Technical Lead: Issue scheduled for fixing in Copilot 3.11.

Fix assigned to: @ivanperez-keera .

Aug 04 '22 11:08 ivanperez-keera

@ivanperez-keera @agoodloe, I have implemented the sizeof operator in language-c99-simple (https://github.com/fdedden/language-c99-simple/pull/12), that should make it straightforward to implement. I have tested the implementation with arrays and nested arrays as well.

Currently that addition is not yet published to hackage, I think it is better to try it out with Copilot first, but I can publish right away it if we want.

Aug 22 '22 13:08 fdedden

Understood. Thanks!

I will draft what the solution to this issue #314 should look like and we can check that it does the right thing.

Aug 22 '22 19:08 ivanperez-keera

I have drafted the solution here: https://github.com/ivanperez-keera/copilot/tree/develop-delay-array

The example above works for me. It would be good if others could also try it @fdedden and maybe also @RyanGlScott .

EDIT: modify URL to branch. Original was incorrect as pointed out by Ryan.

Aug 24 '22 01:08 ivanperez-keera

I tried out that branch, but I still witness the buggy behavior on the program above. (Looking at the commit on that branch, all it does is remove some constraints—are there subsequent commits that haven't been pushed yet?)

Aug 24 '22 12:08 RyanGlScott

Thank you and my apologies. It should have been this one: https://github.com/ivanperez-keera/copilot/tree/develop-delay-array

Aug 24 '22 13:08 ivanperez-keera

Thanks. I was able to build this branch after adding the following to my cabal.project.local:

source-repository-package
  type: git
  location: https://github.com/fdedden/language-c99-simple
  tag: 1482a2a4cc104ce6d3efb4f84e5685d6ecc9979b

Something strikes me as suspicious about the generated C code for the program above:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>

#include "delay.h"

static int16_t s0[(1)][(2)] = {{((int16_t)(3)), ((int16_t)(4))}};
static size_t s0_idx = (0);

int16_t * s0_get(size_t x) {
  return (s0)[((s0_idx) + (x)) % (1)];
}

int16_t * s0_gen(void) {
  return (int16_t [(2)]){((int16_t)(5)), ((int16_t)(6))};
}

bool f_guard(void) {
  return true;
}

int16_t * f_arg0(void) {
  return (s0_get)((0));
}

void step(void) {
  int16_t s0_tmp[(2)];
  int16_t * f_arg_temp0;
  if ((f_guard)()) {
    {(f_arg_temp0) = ((f_arg0)());
     (f)((f_arg_temp0));}
  };
  (memcpy)((s0_tmp), ((s0_gen)()), ((2) * (sizeof(int16_t))));
  (memcpy)(((s0)[s0_idx]), (s0_tmp), ((2) * (sizeof(int16_t))));
  (s0_idx) = (((s0_idx) + (1)) % (1));
}

In particular, the definition of s0_gen seems wrong:

int16_t * s0_gen(void) {
  return (int16_t [(2)]){((int16_t)(5)), ((int16_t)(6))};
}

This will stack-allocate an array and return a reference to the array. Because it is stack allocated, however, the lifetime of the array is limited to the scope of the s0_gen() function, so the pointer that s0_gen() returns will be dangling. This can be seen if you compile the main.c program with and without optimizations:

$ gcc -O0 main.c delay.c -o main && ./main # Without optimizations
f:
([3,4])
([5,6])
([5,6])
([5,6])
([5,6])

$ gcc -O main.c delay.c -o main && ./main # With optimizations
f:
([3,4])
([0,0])
([0,0])
([0,0])
([0,0])

Due to a lucky fluke, the program behaves reasonably without optimizations. We run out of luck when optimizations are enabled, however, as the result is incorrect.

The memcpy-related parts of the code seem reasonable to me. I think if the s0_gen() issue above were addressed, then the generated C code would be correct.

Aug 24 '22 15:08 RyanGlScott

I also tried out some variations of the program above using nested arrays and structs, and they also produce invocations of memcpy that appear valid. I did uncover one compilation issue when trying to use an example involving an array of structs, however. That appears to be a separate issue, so I have opened #373 for this.

Aug 24 '22 15:08 RyanGlScott

@RyanGlScott If I understand what you are saying correctly, the issue is unrelated to the sizeof part of the copy, and, if anywhere, it can manifest in this particular point:

(memcpy)((s0_tmp), ((s0_gen)()), ((2) * (sizeof(int16_t))));

After that, the values are in s0_tmp, so we are good.

I think that a solution to this would be to pass the direction of s0_tmp as an argument to s0_gen for it to store the result in. The generators are only used here as far as I can see:

https://github.com/Copilot-Language/copilot/blob/c5ab11fee16f508250849285609de4d1e82e194f/copilot-c99/src/Copilot/Compile/C99/CodeGen.hs#L101-L105

https://github.com/Copilot-Language/copilot/blob/c5ab11fee16f508250849285609de4d1e82e194f/copilot-c99/src/Copilot/Compile/C99/CodeGen.hs#L123

Aug 24 '22 17:08 ivanperez-keera

I think that a solution to this would be to pass the direction of s0_tmp as an argument to s0_gen for it to store the result in.

Yes, that sounds reasonable to me.

Aug 24 '22 17:08 RyanGlScott

Ok. I think these are two issues then (very closely related, but two different bugs).

One issue has to do with the first memcpy:

  (memcpy)((s0_tmp), ((s0_gen)()), (2));

which should be:

  s0_gen(&s0_tmp);

The second has to do with:

  (memcpy)(((s0)[s0_idx]), (s0_tmp), (2));

which should be:

  (memcpy)(((s0)[s0_idx]), (s0_tmp), (2)*sizeof(int64_t));

I'll close the second as part of this issue and open a separate one for the first.

Aug 24 '22 18:08 ivanperez-keera

I have drafted the solution here: https://github.com/ivanperez-keera/copilot/tree/develop-delay-array

The example above works for me. It would be good if others could also try it @fdedden and maybe also @RyanGlScott .

EDIT: modify URL to branch. Original was incorrect as pointed out by Ryan.

I ran some tests with your changes, and can verified that your implementation works. I will release a new version of language-c99-simple on hackage.

@RyanGlScott I can (unfortunately) confirm that returning an array as a pointer is a bug. Weird though how the optimisation level of GCC makes a difference. I always thought that up to -O2, semantics didn't change.

Aug 28 '22 20:08 fdedden

Weird though how the optimisation level of GCC makes a difference. I always thought that up to -O2, semantics didn't change.

I'm fairly certain that the program in https://github.com/Copilot-Language/copilot/issues/314#issuecomment-1225858862 exhibits undefined behavior, and I doubt that compilers make any guarantees about optimizations preserving semantics for those kinds of situations.

Aug 28 '22 20:08 RyanGlScott

My package has been published (https://hackage.haskell.org/package/language-c99-simple-0.2.2), so the changes made by @ivanperez-keera should work now, if the dependency version is increased accordingly.

Aug 28 '22 20:08 fdedden

Weird though how the optimisation level of GCC makes a difference. I always thought that up to -O2, semantics didn't change.

I'm fairly certain that the program in #314 (comment) exhibits undefined behavior, and I doubt that compilers make any guarantees about optimizations preserving semantics for those kinds of situations.

Yeah that sounds reasonable. On gcc 11.2.1, a -Os also breaks the code.

Aug 28 '22 20:08 fdedden

Thanks a lot for dealing with this quickly and over the weekend!

Aug 28 '22 21:08 ivanperez-keera

Implementor: Solution implemented, review requested.

Aug 30 '22 01:08 ivanperez-keera

Change Manager: Verified that:

Solution is implemented:

[X] The code proposed compiles and passes all tests. Details: Build log: https://app.travis-ci.com/github/Copilot-Language/copilot/builds/254956244

[X] The solution proposed produces the expected result. Details: The following Dockerfile and the accompanying test files check that Copilot behaves the same way in the interpreter and the C99 compiler. The script will run both the interpreter and the compiler for an example that was known to fail, compile the C99 with a program that will print the values of one of the streams (producing the same output as the interpreter), and compare the two outputs. A patch to be applied to the generated C file is necessary due to a different bug that only manifests in specific circumstances but is unrelated to #314:

--- Dockerfile
FROM ubuntu:trusty

RUN apt-get update

RUN apt-get install --yes software-properties-common
RUN add-apt-repository ppa:hvr/ghc
RUN apt-get update

RUN apt-get install --yes ghc-8.6.5 cabal-install-2.4
RUN apt-get install --yes libz-dev

ENV PATH=/opt/ghc/8.6.5/bin:/opt/cabal/2.4/bin:$PWD/.cabal-sandbox/bin:$PATH

RUN echo 1
RUN cabal update
RUN cabal v1-sandbox init
RUN cabal v1-install alex happy
RUN apt-get install --yes git

SHELL ["/bin/bash", "-c"]
ADD main.c /tmp/main.c
ADD Main.hs /tmp/Main.hs
ADD delay.c.patch /tmp/delay.c.patch

CMD git clone $REPO && cd $NAME && git checkout $COMMIT && cd .. \
  && cabal v1-install copilot/copilot**/ \
  && cabal v1-exec -- runhaskell -DINTERPRET /tmp/Main.hs > /tmp/out1 \
  && cabal v1-exec -- runhaskell /tmp/Main.hs \
  && patch -R delay.c /tmp/delay.c.patch \
  && gcc -I. delay.c /tmp/main.c -O0 -o main \
  && ./main > /tmp/out2 \
  && sed -i -e 's/ \+$//g' /tmp/out* \
  && diff /tmp/out1 /tmp/out2 \
  && echo "Success"

--- Main.hs
{-# LANGUAGE CPP       #-}
{-# LANGUAGE DataKinds #-}
module Main where

import Copilot.Compile.C99
import Language.Copilot
import Prelude             hiding ((++))

spec :: Spec
spec = trigger "f" true [ arg $ stream]
  where
    stream :: Stream (Array 2 Int16)
    stream = [array [3,4]] ++ rest

    rest :: Stream (Array 2 Int16)
    rest = constant $ array [5,6]

main :: IO ()
main = do
#ifdef INTERPRET
 interpret 5 spec
#else
 reify spec >>= compile "delay"
#endif

--- main.c
#include <stdint.h>
#include <stdio.h>
#include "delay.h"

void f (int16_t f_arg0[(2)]) {
  printf("([%d,%d])\n", f_arg0[0], f_arg0[1]);
}

int main(void) {
  int i = 0;

  printf("f:\n");

  for (i=0; i<5; i++) {
    step();
  }
  return 0;
}

--- delay.c.patch
16,18c16,17
< void s0_gen(int16_t* pointer) {
<   int16_t res[2] = {((int16_t)(5)), ((int16_t)(6))};
<   memcpy(pointer, res, ((2) * (sizeof(int16_t))));
---
> int16_t * s0_gen(void) {
>   return (int16_t [(2)]){((int16_t)(5)), ((int16_t)(6))};
36c35
<   s0_gen(s0_tmp);
---
>   (memcpy)((s0_tmp), ((s0_gen)()), ((2) * (sizeof(int16_t))));

Command (substitute variables based on new path after merge):

$ docker run -e "REPO=https://github.com/ivanperez-keera/copilot" -e "NAME=copilot" -e "COMMIT=4b8adac0f72bf527257008009ec62778a2ca3b83" -it copilot-verify-314

[X] Implementation is documented. Details: No changes needed.
[X] Change history is clear.
[X] Commit messages are clear.
[X] Changelogs are updated.
[X] Examples are updated. Details: No updates needed.
[X] Required version bumps are evaluated. Details: Bump not needed (internal change function).

Aug 30 '22 01:08 ivanperez-keera

Change Manager: Implementation ready to be merged.

Aug 30 '22 01:08 ivanperez-keera

copilot copilot copied to clipboard

`copilot-c99`: appending values to a stream of arrays produces incorrect C99 code

copilot
copilot copied to clipboard