mailbox: Writes disappear

Open olajep opened this issue 9 years ago • 14 comments

I need to insert nops between writes to the mailbox from Epiphany.

olajep avatar Jan 19 '16 19:01 olajep

Can you try the following:

- Run two scenarios: one with 0 nops inserted and one with 16 nops inserted.
- For each scenario, try two addresses: 1) write to mailboxlo, 2) write to some address in memory.

what do you get?


Andreas Olofsson, CEO/Founder at Adapteva Cell: +1 781 325 6688 Twitter: @adapteva Web: adapteva.com

Linkedin: linkedin.com/in/andreasolofsson

http://www.adapteva.com/

aolofsson avatar Jan 19 '16 20:01 aolofsson

Do you want:

  1. Write to mailbox
  2. Write to DRAM
  3. nops / no nops

or

  1. Write to mailbox
  2. nops / no nops
  3. Write to DRAM
  4. nops / no nops

// Ola

olajep avatar Jan 19 '16 20:01 olajep

I want to see what happens with the exact same Epiphany code, with the only difference being the address being written to. My theory is that you happened on a corner case that affects all writes from Epiphany to FPGA. The reason we may not have seen this is that you wrote a tight assembly loop. Our other tests either use DMA (streaming) or slow C code loops. (In other words, our elink test regression still needs major work!) Would the ftest, matmul, etc. code have the same sequence that you wrote in assembly for the mailbox test?

(a fix has been tested and checked in)

aolofsson avatar Jan 19 '16 21:01 aolofsson

Without nops but with an added str to 0x8f100000, all ten writes to the mailbox arrive. The ARM side reads until E_MAILBOXSTATUS == 0.

00000660 <_main>:
 660:   4a0b 0552       mov r2,0x5550
 664:   754b 0aa2       mov r3,0xaaaa
 668:   060b 4072       mov r16,0x730
 66c:   4aab 1552       movt r2,0x5555
 670:   754b 1aa2       movt r3,0xaaaa
 674:   01eb 5812       movt r16,0x810f
 678:   200b 0002       mov r1,0x0
 67c:   800b 2502       mov r12,0x5000
 680:   407c 0800       strd r2,[r16]
 684:   220b 18f2       movt r1,0x8f10
 688:   954b 3aa2       movt r12,0xaaaa
 68c:   845c 2000       str r12,[r1]
 690:   407c 0800       strd r2,[r16]
 694:   845c 2000       str r12,[r1]
 698:   407c 0800       strd r2,[r16]
 69c:   845c 2000       str r12,[r1]
 6a0:   407c 0800       strd r2,[r16]
 6a4:   845c 2000       str r12,[r1]
 6a8:   407c 0800       strd r2,[r16]
 6ac:   845c 2000       str r12,[r1]
 6b0:   407c 0800       strd r2,[r16]
 6b4:   845c 2000       str r12,[r1]
 6b8:   407c 0800       strd r2,[r16]
 6bc:   845c 2000       str r12,[r1]
 6c0:   407c 0800       strd r2,[r16]
 6c4:   845c 2000       str r12,[r1]
 6c8:   407c 0800       strd r2,[r16]
 6cc:   845c 2000       str r12,[r1]
 6d0:   407c 0800       strd r2,[r16]
 6d4:   845c 2000       str r12,[r1]
 6d8:   194f 0402       rts
 6dc:   0000            beq 6dc <_main+0x7c>
        ...

EDIT: With commit https://github.com/parallella/oh/commit/b26255dfb59bd28b65e48cca0a882a47d64f2aaa
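
The ARM-side check described in this comment (read until E_MAILBOXSTATUS == 0, then count what arrived) can be sketched as a small host-side model. This is only an illustration: `FakeMailbox`, `status`, `read`, and `drain` are hypothetical names, and the FIFO is simulated rather than read from hardware.

```python
from collections import deque


class FakeMailbox:
    """Simulated stand-in for the mailbox FIFO (hypothetical, not the real driver)."""

    def __init__(self, entries):
        self._fifo = deque(entries)

    def status(self):
        # Models E_MAILBOXSTATUS: assumed nonzero while entries remain.
        return 1 if self._fifo else 0

    def read(self):
        return self._fifo.popleft()


def drain(mailbox):
    """Read entries until the status register reports empty."""
    received = []
    while mailbox.status() != 0:
        received.append(mailbox.read())
    return received


# Each strd stores the r2/r3 pair; assuming r2 lands in the low word,
# one message is 0xaaaaaaaa55555550.
MESSAGE = (0xAAAAAAAA << 32) | 0x55555550
messages = drain(FakeMailbox([MESSAGE] * 10))
print(len(messages), "messages received")
```

Comparing `len(messages)` against the number of stores issued (ten in the listing above) is what distinguishes the working case from the one where writes go missing.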

olajep avatar Jan 19 '16 21:01 olajep

Bug is still present in e86567241deabc8f070b61becf07be175394a980 Running through epiphany examples now ...

olajep avatar Jan 19 '16 21:01 olajep

(examples still work)

The magic number with e865672 is 10 cycles (no stalls AFAIK) between each strd

I.e., this works:

00000660 <_main>:
 660:   0a0b 0552       mov r0,0x5550
 664:   354b 0aa2       mov r1,0xaaaa
 668:   460b 0072       mov r2,0x730
 66c:   0aab 1552       movt r0,0x5555
 670:   354b 1aa2       movt r1,0xaaaa
 674:   41eb 1812       movt r2,0x810f
 678:   0874            strd r0,[r2]
 67a:   01a2            nop
 67c:   01a2            nop
 67e:   01a2            nop
 680:   01a2            nop
 682:   01a2            nop
 684:   01a2            nop
 686:   0a0b 0552       mov r0,0x5550
 68a:   354b 0aa2       mov r1,0xaaaa
 68e:   0aab 1552       movt r0,0x5555
 692:   354b 1aa2       movt r1,0xaaaa
 696:   0874            strd r0,[r2]
 698:   01a2            nop
...

Remove one nop and things break (only 6 entries in fifo)
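
The "10 cycles" figure can be cross-checked against the listing by counting the instructions between two consecutive strd. The sketch below does that on a trimmed copy of the listing (opcode bytes dropped). It assumes one instruction per cycle with no stalls, which the comment above only states as "AFAIK"; the helper name `gaps_between_stores` is made up for this example.

```python
LISTING = """\
678: strd r0,[r2]
67a: nop
67c: nop
67e: nop
680: nop
682: nop
684: nop
686: mov r0,0x5550
68a: mov r1,0xaaaa
68e: movt r0,0x5555
692: movt r1,0xaaaa
696: strd r0,[r2]
"""


def gaps_between_stores(listing, store="strd"):
    """Count the instructions separating each pair of consecutive stores."""
    gaps = []
    count = 0
    seen_store = False
    for line in listing.splitlines():
        parts = line.split(":", 1)
        if len(parts) != 2 or not parts[1].strip():
            continue  # skip anything that is not "addr: mnemonic operands"
        mnemonic = parts[1].split()[0]
        if mnemonic == store:
            if seen_store:
                gaps.append(count)
            seen_store = True
            count = 0
        elif seen_store:
            count += 1
    return gaps


gaps = gaps_between_stores(LISTING)
print(gaps)
```

Six nops plus the four mov/movt instructions give a gap of 10; dropping one nop leaves 9, which is the case reported as broken.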

olajep avatar Jan 19 '16 21:01 olajep

@aolofsson

Which are the right values for ELINK_RXDELAY0 and ELINK_RXDELAY1 ?

olajep avatar Jan 19 '16 22:01 olajep

The following should work:

idelay0=0xaaaaaaaa idelay1=0x0000000a

aolofsson avatar Jan 19 '16 23:01 aolofsson

thanks, as expected it doesn't fix this issue but I had to test

// Ola

olajep avatar Jan 19 '16 23:01 olajep

Does the following break?

00000660 <_main>:
 660:   0a0b 0552       mov r0,0x5550
 664:   354b 0aa2       mov r1,0xaaaa
 668:   460b 0072       mov r2,0x0000   ; changed
 66c:   0aab 1552       movt r0,0x5555
 670:   354b 1aa2       movt r1,0xaaaa
 674:   41eb 1812       movt r2,0x8f10  ; changed
 678:   0874            strd r0,[r2]
 67a:   01a2            nop
 67c:   01a2            nop
 67e:   01a2            nop
 680:   01a2            nop
 682:   01a2            nop
 684:   01a2            nop
 686:   0a0b 0552       mov r0,0x5550
 68a:   354b 0aa2       mov r1,0xaaaa
 68e:   0aab 1552       movt r0,0x5555
 692:   354b 1aa2       movt r1,0xaaaa
 696:   0874            strd r0,[r2]
 698:   01a2            nop
...
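
For readers less familiar with the mov/movt idiom: the two changed instructions (at 668 and 674) only retarget the store address. A minimal sketch of the arithmetic, assuming mov loads a zero-extended 16-bit immediate and movt then replaces the upper 16 bits; `mov_movt` is a hypothetical helper, not an SDK function.

```python
def mov_movt(low16, high16):
    """Compose the 32-bit value built by `mov rN,low16` then `movt rN,high16`."""
    if not (0 <= low16 <= 0xFFFF and 0 <= high16 <= 0xFFFF):
        raise ValueError("immediates must fit in 16 bits")
    return (high16 << 16) | low16


# Address used by the earlier listings (mov r2,0x730 / movt r2,0x810f).
mailbox_addr = mov_movt(0x0730, 0x810F)
# Address after the two changed instructions (mov r2,0x0000 / movt r2,0x8f10).
dram_addr = mov_movt(0x0000, 0x8F10)

print(hex(mailbox_addr), hex(dram_addr))
```

So the proposed experiment runs the same store sequence against 0x8f100000, the address used for the extra str earlier in the thread, instead of the mailbox.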

aolofsson avatar Jan 19 '16 23:01 aolofsson

The reason we may not have seen this is that you wrote a tight assembly loop. Our other tests either use DMA (streaming) or slow C code loops. (in other words our elink test regression still needs major work!) Would ftest, matmul, etc code have the same sequence that you wrote in assembly for mailbox test?

This works (will check in complete test tomorrow)

FUNC(etest)
.global SYM(etest)
.align 3
SYM(etest):
    mov r20, %low(EXPECTED_0)
    movt    r20, %high(EXPECTED_0)
    mov r21, %low(EXPECTED_1)
    movt    r21, %high(EXPECTED_1)
    mov r22, %low(EXPECTED_2)
    movt    r22, %high(EXPECTED_2)
    mov r23, %low(EXPECTED_3)
    movt    r23, %high(EXPECTED_3)
    mov r24, %low(EXPECTED_4)
    movt    r24, %high(EXPECTED_4)
    mov r25, %low(EXPECTED_5)
    movt    r25, %high(EXPECTED_5)
    mov r26, %low(EXPECTED_6)
    movt    r26, %high(EXPECTED_6)
    mov r27, %low(EXPECTED_7)
    movt    r27, %high(EXPECTED_7)

    mov r30, %low(RESULT_ADDR)
    movt    r30, %high(RESULT_ADDR)

    str r20,[r30],+1
    str r21,[r30],+1
    str r22,[r30],+1
    str r23,[r30],+1
    str r24,[r30],+1
    str r25,[r30],+1
    str r26,[r30],+1
    str r27,[r30],+1

    ;; repeat w/ 64-bit writes
    strd    r20,[r30],+1
    strd    r22,[r30],+1
    strd    r24,[r30],+1
    strd    r26,[r30],+1

    rts
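
A rough picture of which addresses this test touches, as a sketch rather than a statement about the hardware: it assumes the `+1` post-modify immediate is scaled by the access size (4 bytes for str, 8 for strd) and uses a made-up base address, since the value of RESULT_ADDR is not given here.

```python
def post_increment_addresses(base, accesses):
    """Return (mnemonic, address) for each store with post-increment addressing.

    `accesses` is a list of (mnemonic, size_in_bytes) pairs.
    """
    addr = base
    sequence = []
    for mnemonic, size in accesses:
        sequence.append((mnemonic, addr))
        addr += size  # assumption: "+1" advances by one element of this size
    return sequence


BASE = 0x8F100000  # hypothetical RESULT_ADDR, chosen only for illustration
plan = [("str", 4)] * 8 + [("strd", 8)] * 4
sequence = post_increment_addresses(BASE, plan)

for mnemonic, addr in sequence:
    print(f"{mnemonic:4s} -> {addr:#010x}")
```

Under these assumptions every store goes to a new, increasing address, unlike the mailbox test, where each strd targets the same address.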

olajep avatar Jan 19 '16 23:01 olajep

(result_addr is in DRAM)

olajep avatar Jan 19 '16 23:01 olajep

Found the bug!! NASTY!!!!!

Remembered that we have a long-forgotten mode in the Epiphany chip elink (not implemented in the FPGA elink) that creates bursts when you write doubles to the same address. (F**K!) So the writes were likely coming in as bursts. Looks like the mailbox works fine when you write in "int"s (I tested it on the board with consecutive writes) (see "mailbox_test" in elink/sw0)

Should not be a big deal to implement burst with increment set to 0 instead of 8; for now let's work with ints.

Please verify.
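
One way to picture the difference between an increment of 8 and an increment of 0 is the toy model below. It is not the elink RTL: it simply assumes the receiver rebuilds each beat's address as base + beat * increment, and it does not try to reproduce the entry counts observed on the board. The function names are hypothetical.

```python
def burst_beat_addresses(base, beats, increment):
    """Addresses a receiver would assign to each beat of one burst."""
    return [base + beat * increment for beat in range(beats)]


def beats_hitting(target, base, beats, increment):
    """Count how many beats of the burst land exactly on `target`."""
    return sum(1 for addr in burst_beat_addresses(base, beats, increment)
               if addr == target)


MAILBOX = 0x810F0730  # store address from the listings in this thread

# Ten doubles written to the same address, received as one burst.
hits_inc8 = beats_hitting(MAILBOX, MAILBOX, beats=10, increment=8)
hits_inc0 = beats_hitting(MAILBOX, MAILBOX, beats=10, increment=0)
print(hits_inc8, hits_inc0)
```

In this model only the first beat reaches the mailbox when the receiver steps the address by 8, while an increment of 0 delivers all ten, which is the behavior the proposed fix is after.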

aolofsson avatar Jan 20 '16 04:01 aolofsson

32-bit (str) writes work with 1f42630f1cf0a17fe14e9f847669598a8734fa79

olajep avatar Jan 20 '16 11:01 olajep