can-utils

cansend remote request only works every other time

Open bigguiness opened this issue 2 years ago • 29 comments

I am trying to test a CAN interface on a Zynq board and have problems with cansend remote requests.

# candump can0 &
[1] 659
# cansend can0 001#R
# cansend can0 001#R
  can0  001   [0]  remote request
  can0  001   [0]  remote request
  can0  001   [2]  00 01
# cansend can0 001#R
# cansend can0 001#R
  can0  001   [0]  remote request
  can0  001   [0]  remote request
  can0  001   [2]  00 01

But, non remote requests always work:

# cansend can0 002#0000
  can0  002   [2]  00 00
  can0  004   [0] 
# cansend can0 002#0001
  can0  002   [2]  00 01
  can0  004   [0]

Any ideas?

bigguiness avatar Feb 06 '23 21:02 bigguiness

well... that formatted badly...

bigguiness avatar Feb 06 '23 21:02 bigguiness

fixed formatting - use 3 backticks instead of 1 for multi line code

marckleinebudde avatar Feb 07 '23 07:02 marckleinebudde

Use candump any,0:0,#FFFFFFFF -cexdtA to get more information.

Looking at your output:

# cansend can0 001#R
# cansend can0 001#R
  can0  001   [0]  remote request
  can0  001   [0]  remote request

You sent 2 remote requests....

  can0  001   [2]  00 01

...but only receive 1 answer. But you expect 2 answers - is this the problem?

Who is sending the answer to the RTR messages? Can you add a 2nd CAN adapter to the bus to see if can0 properly sends 2 RTR messages?

marckleinebudde avatar Feb 07 '23 07:02 marckleinebudde

Thanks for the reply.

Yes, I was expecting an answer for each request.

I have an Arduino based device connected to the CAN bus that receives the message and answers. It is only seeing the 2nd message.

I also have a logic analyzer connected to the CAN TX/RX signals going to the transceiver and an oscilloscope connected to the CANH/CANL output of the transceiver. On the first RTR message I see no activity. For the second RTR I see the expected activity on both the LA and the scope.

The weird thing is it only happens for the RTR messages...

root@s6:/proc# candump any,0:0,#FFFFFFFF -cexdtA &
[1] 945
root@s6:/proc# cansend can0 003#R
root@s6:/proc# cansend can0 003#R
 (2023-02-07 07:18:45.626450)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:18:45.626452)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:18:45.626779)  can0  RX - -  003   [3]  07 00 00
root@s6:/proc# cansend can0 002#5555
 (2023-02-07 07:18:54.095936)  can0  TX - -  002   [2]  55 55
 (2023-02-07 07:18:54.096142)  can0  RX - -  004   [0] 
root@s6:/proc# cansend can0 003#R
root@s6:/proc# cansend can0 003#R
 (2023-02-07 07:19:00.175716)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:00.175719)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:00.176056)  can0  RX - -  003   [3]  07 55 55
root@s6:/proc# cansend can0 002#aaaa
 (2023-02-07 07:19:06.345800)  can0  TX - -  002   [2]  AA AA
 (2023-02-07 07:19:06.346013)  can0  RX - -  004   [0] 
root@s6:/proc# cansend can0 003#R
root@s6:/proc# cansend can0 003#R
 (2023-02-07 07:19:10.505904)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:10.505906)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:10.506226)  can0  RX - -  003   [3]  07 AA AA
root@s6:/proc# cansend can0 002#0000
 (2023-02-07 07:19:16.755913)  can0  TX - -  002   [2]  00 00
 (2023-02-07 07:19:16.756145)  can0  RX - -  004   [0] 
root@s6:/proc# cansend can0 003#R
root@s6:/proc# cansend can0 003#R
 (2023-02-07 07:19:22.925631)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:22.925634)  can0  TX - -  003   [0]  remote request
 (2023-02-07 07:19:22.925959)  can0  RX - -  003   [3]  07 00 00
root@s6:/proc# 

bigguiness avatar Feb 07 '23 15:02 bigguiness

Your two sent RTR frames are sent with nearly no time gap - definitely much faster than you can invoke the second cansend on the terminal. Is it possible that the Arduino or its CAN transceiver is in some sleep mode?

What happens if you send some other CAN frame right before the RTR frames?

hartkopp avatar Feb 07 '23 17:02 hartkopp

Same result.

My Arduino code does not sleep. It just loops checking for CAN messages and then processes them.

The Arduino board is a simple CAN connected GPIO expander with 16 outputs and 8 inputs.

  • 001#R is a "version" request; expected response is 001 [2] 00 01 (0x0100 = 1.0)
  • 002#lohi is a "set output" command, where lo sets outputs 0-7 and hi sets outputs 8-15; expected response is 004 [0] (ACK)
  • 003#R is a "query" command; expected response is 003 [3] in lo hi, where in is the state of inputs 0-7 and lo hi is the current state of outputs 0-7 and 8-15
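The message set above can be sketched on the host side roughly as follows. This is only an illustrative Python sketch, not the Teensy firmware: all function and constant names are hypothetical, and the low-byte-first reading of the version (00 01 read as 0x0100) is inferred from the example in this comment.

```python
from typing import Dict, Tuple

# Hypothetical names for the CAN IDs described in this comment.
ID_VERSION = 0x001      # RTR request; reply carries 2 data bytes
ID_SET_OUTPUTS = 0x002  # data frame: lo (outputs 0-7), hi (outputs 8-15)
ID_STATUS = 0x003       # RTR request; reply carries in, lo, hi
ID_ACK = 0x004          # empty frame acknowledging ID_SET_OUTPUTS


def decode_version(data: bytes) -> Tuple[int, int]:
    """Return (major, minor), assuming low byte first: 00 01 -> 0x0100 -> 1.0."""
    if len(data) != 2:
        raise ValueError("version reply must be 2 bytes")
    value = data[0] | (data[1] << 8)
    return value >> 8, value & 0xFF


def encode_outputs(outputs: int) -> bytes:
    """Pack a 16-bit output mask as lo, hi for a 002#lohi frame."""
    if not 0 <= outputs <= 0xFFFF:
        raise ValueError("output mask must fit in 16 bits")
    return bytes([outputs & 0xFF, outputs >> 8])


def decode_status(data: bytes) -> Dict[str, int]:
    """Split a 3-byte status reply into the input byte and the 16-bit output mask."""
    if len(data) != 3:
        raise ValueError("status reply must be 3 bytes")
    return {"inputs": data[0], "outputs": data[1] | (data[2] << 8)}
```

For example, the reply `003 [3] 07 55 55` seen after `cansend can0 002#5555` would decode to inputs 0x07 and outputs 0x5555 under these assumptions.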

The Arduino board works fine with another system (PowerPC based, not sure what driver is used off hand).

I'm trying to test the CAN interface on a Xilinx Zynq based system using the xilinx_can driver. It seems to be working other than needing two 'cansend' messages for every remote request.

It seems like the first 'cansend' doesn't actually do anything. 'ifconfig' does not show any increase in the TX or RX bytes for the first request. On the second request I see the TX and RX bytes increment.

Confused...

bigguiness avatar Feb 07 '23 18:02 bigguiness

Also, if I do the first 'cansend', which does not do anything, and then wait a couple of minutes before doing the second 'cansend', I still get the candump output showing the two RTR frames with nearly no time gap.

root@s6:/proc# cansend can0 001#R
 (2023-02-07 11:21:29.565825)  can0  TX - -  001   [0]  remote request
 (2023-02-07 11:21:29.565828)  can0  TX - -  001   [0]  remote request
 (2023-02-07 11:21:29.566113)  can0  RX - -  001   [2]  00 01

bigguiness avatar Feb 07 '23 18:02 bigguiness

Maybe this will help.

xilinx_can e0008000.can can0: bitrate error 0.0%
root@s6:~# ifconfig can0 up
IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
root@s6:~# ifconfig can0   
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:21 

root@s6:~# candump any,0:0,#FFFFFFFF -cexdtA &
[1] 538
root@s6:~# cansend can0 001#R
root@s6:~# ifconfig can0
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:21 

root@s6:~# cansend can0 001#R
 (2023-02-07 11:32:15.427178)  can0  TX - -  001   [0]  remote request
 (2023-02-07 11:32:15.427181)  can0  TX - -  001   [0]  remote request
 (2023-02-07 11:32:15.427466)  can0  RX - -  001   [2]  00 01
root@s6:~# ifconfig can0
can0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          UP RUNNING NOARP  MTU:16  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:10 
          RX bytes:2 (2.0 B)  TX bytes:0 (0.0 B)
          Interrupt:21 

The first cansend does not show any TX packets in the ifconfig output. The second shows 2 TX packets and 1 RX packet.
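To spot this pattern in a longer capture, the candump output can be scanned for TX frames that are logged almost simultaneously. The following is a rough Python sketch that assumes the log lines look exactly like the `candump ... -cexdtA` output pasted in this thread; the function name and the 100 µs threshold are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
#  (2023-02-07 11:32:15.427178)  can0  TX - -  001   [0]  remote request
LINE_RE = re.compile(
    r"^\s*\((?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\)\s+"
    r"(?P<iface>\S+)\s+(?P<dir>TX|RX)\s+\S\s+\S\s+"
    r"(?P<id>[0-9A-Fa-f]+)\s+\[(?P<len>\d+)\]"
)


def back_to_back_tx(lines, max_gap_us=100):
    """Return (id_a, id_b, gap_us) for consecutive TX frames logged within max_gap_us."""
    hits = []
    prev = None  # (timestamp, can_id) of the previous TX frame
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("dir") != "TX":
            continue  # skip shell prompts, RX frames and anything unparseable
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        if prev is not None:
            gap = ts - prev[0]
            if timedelta(0) <= gap <= timedelta(microseconds=max_gap_us):
                gap_us = gap // timedelta(microseconds=1)
                hits.append((prev[1], m.group("id"), gap_us))
        prev = (ts, m.group("id"))
    return hits
```

Fed the three candump lines from the capture above, this reports one pair of 001 frames 3 µs apart, which is far shorter than the time a frame occupies the bus, so the two TX entries cannot reflect two separately transmitted frames at the moments they were requested.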

bigguiness avatar Feb 07 '23 18:02 bigguiness

Ok, let's check systematically. Can you get some rough timestamps with a 2nd system, maybe with the Arduino?

In one terminal start candump any,0:0,#FFFFFFFF -cexdtA, in another terminal start the following:

# 64 CAN Frames
cangen can0 -Ii -Li -p 10 -n64

# 64 RTR CAN Frames
cangen can0 -Ii -Li -p 10 -n64 -R

You should see 64 CAN frames, one about every 200 ms (data is random):

 (2023-02-07 20:06:34.652177)  mcp251xfd0  TX - -  000   [0] 
 (2023-02-07 20:06:34.852589)  mcp251xfd0  TX - -  001   [1]  FE
 (2023-02-07 20:06:35.052919)  mcp251xfd0  TX - -  002   [2]  F9 0F
 (2023-02-07 20:06:35.253225)  mcp251xfd0  TX - -  003   [3]  3C 7D 08
 (2023-02-07 20:06:35.453543)  mcp251xfd0  TX - -  004   [4]  F6 DB 3C 06
 (2023-02-07 20:06:35.653867)  mcp251xfd0  TX - -  005   [5]  41 5C E0 38 BC
 (2023-02-07 20:06:35.854190)  mcp251xfd0  TX - -  006   [6]  1F A7 50 3A 10 2F
 (2023-02-07 20:06:36.054500)  mcp251xfd0  TX - -  007   [7]  6D 8A 32 44 66 3A A9
 (2023-02-07 20:06:36.254814)  mcp251xfd0  TX - -  008   [8]  EC 5E 71 2C 00 48 6E 1A

Then 64 RTR CAN Frames:

 (2023-02-07 20:07:35.191925)  mcp251xfd0  TX - -  000   [0]  remote request
 (2023-02-07 20:07:35.392334)  mcp251xfd0  TX - -  001   [1]  remote request
 (2023-02-07 20:07:35.592619)  mcp251xfd0  TX - -  002   [2]  remote request
 (2023-02-07 20:07:35.792916)  mcp251xfd0  TX - -  003   [3]  remote request
 (2023-02-07 20:07:35.993219)  mcp251xfd0  TX - -  004   [4]  remote request
 (2023-02-07 20:07:36.193508)  mcp251xfd0  TX - -  005   [5]  remote request
 (2023-02-07 20:07:36.393795)  mcp251xfd0  TX - -  006   [6]  remote request
 (2023-02-07 20:07:36.594091)  mcp251xfd0  TX - -  007   [7]  remote request
 (2023-02-07 20:07:36.794387)  mcp251xfd0  TX - -  008   [8]  remote request

Compare the output of candump with the Arduino.

marckleinebudde avatar Feb 07 '23 19:02 marckleinebudde

I will need to modify my Arduino code to produce some kind of timestamp.

In the meantime I did notice this:

root@s6:~# cansend can0 001#R
# no response, now request the 'status'
root@s6:~# cansend can0 003#R
 (2023-02-07 12:37:27.676845)  can0  TX - -  001   [0]  remote request   # sent the 'version' request
 (2023-02-07 12:37:27.676847)  can0  TX - -  003   [0]  remote request   # sent the 'status' request
 (2023-02-07 12:37:27.677136)  can0  RX - -  001   [2]  00 01                  # response was the 'version'
# request the 'status'
root@s6:~# cansend can0 003#R
# no response, request the 'version'
root@s6:~# cansend can0 001#R
 (2023-02-07 12:38:48.896847)  can0  TX - -  003   [0]  remote request  # sent the 'status' request
 (2023-02-07 12:38:48.896850)  can0  TX - -  001   [0]  remote request  # sent the 'version request'
 (2023-02-07 12:38:48.897183)  can0  RX - -  003   [3]  07 D0 CD           # response was the 'status'

Is there some kind of buffering of the remote requests?

bigguiness avatar Feb 07 '23 19:02 bigguiness

> Is there some kind of buffering of the remote requests?

Seems so - but that's a bug.

Can you send a bug report to [email protected], add Appana Durga Kedareswara rao <[email protected]> and Naga Sureshkumar Relli <[email protected]> on Cc.

marckleinebudde avatar Feb 07 '23 19:02 marckleinebudde

Never actually submitted a bug report. What is supposed to be included?

bigguiness avatar Feb 07 '23 20:02 bigguiness

  • text only mail - no HTML :)
  • kernel version
  • hardware (board)
  • describe what you do
  • what happens
  • what you expect to happen (basically what's in https://github.com/linux-can/can-utils/issues/405#issuecomment-1421347483)
  • also include the output of candump for cangen can0 -Ii -Li -p 10 -n8, so that we see that it works for non-RTR messages

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

The text only is a problem. My work has seriously screwed up our mail server. Everything gets converted to HTML.

I'll try to figure it out.

bigguiness avatar Feb 07 '23 20:02 bigguiness

Hmmm... Non RTR messages work fine:

root@s6:~# cansend can0 001#R
 (2023-02-07 13:16:51.216848)  can0  TX - -  001   [0]  remote request
 (2023-02-07 13:16:51.216851)  can0  TX - -  001   [0]  remote request
 (2023-02-07 13:16:51.217139)  can0  RX - -  001   [2]  00 01
root@s6:~# cansend can0 001# 
 (2023-02-07 13:17:06.347146)  can0  TX - -  001   [0] 
 (2023-02-07 13:17:06.347433)  can0  RX - -  001   [2]  00 01

Is the 001# message actually valid?

bigguiness avatar Feb 07 '23 20:02 bigguiness

> Is the 001# message actually valid?

Yes, you can send messages without data. Seems the other side doesn't care if it's an RTR or not?
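The difference between `001#R` and `001#` is only the RTR flag; both carry zero data bytes. A small Python sketch of how these frame strings could be interpreted is shown below. It covers only a simplified subset of the cansend syntax (3-digit standard IDs, `#R`, and `#` followed by hex data), and `Frame` and `parse_frame_spec` are hypothetical names, not part of can-utils.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    can_id: int   # 11-bit standard identifier
    rtr: bool     # True for a remote request
    data: bytes   # payload, empty for RTR frames


def parse_frame_spec(spec: str) -> Frame:
    """Parse a simplified cansend-style string such as 001#R, 001# or 002#0000."""
    id_part, sep, payload = spec.partition("#")
    if not sep or len(id_part) != 3:
        raise ValueError("expected <3 hex digit id>#...")
    can_id = int(id_part, 16)
    if payload[:1] in ("R", "r"):
        # An optional length digit after R is ignored in this sketch.
        return Frame(can_id, True, b"")
    # Optional '.' separators between data bytes are dropped before decoding.
    return Frame(can_id, False, bytes.fromhex(payload.replace(".", "")))
```

With this, `parse_frame_spec("001#R")` and `parse_frame_spec("001#")` yield the same ID and an empty payload and differ only in the `rtr` field, which is why a receiver that dispatches on the ID alone answers both.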

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

@marckleinebudde would it make sense for one of us to simply write this bug report and put Hartley and his e-mail in CC? The code in https://elixir.bootlin.com/linux/latest/source/drivers/net/can/xilinx_can.c#L633 really looks tricky as the bitstream engine has a pretty unusual register layout.

hartkopp avatar Feb 07 '23 20:02 hartkopp

@hartkopp Good idea, feel free to forward the bug report, include a link to this issue.

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

Must not. The Arduino board is a Teensy 3.2 and I'm just using the CAN libraries provided by the Teensyduino support for that board.

My PowerPC code that uses the board always sent the RTR message so I assumed it was needed.

bigguiness avatar Feb 07 '23 20:02 bigguiness

If you handle all messages on the Teensy in software, you can check whether the RTR bit is set in your RX handler and decide to handle it somehow "special". With some controllers you can "pre-load" a message into the controller. It's sent automatically by the CAN controller hardware once it receives a certain CAN ID with the RTR bit set.
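As a rough illustration of such an RX handler, here is a host-side Python model (not Teensy code) that only answers the version and status IDs when the RTR bit is set and only accepts the set-output command as a data frame. The function name, the `state` dictionary and the policy itself are assumptions made for this sketch.

```python
VERSION = bytes([0x00, 0x01])  # version reply bytes as seen earlier in this thread


def handle_frame(can_id, rtr, data, state):
    """Return the reply as (can_id, data), or None if the frame is ignored.

    state is a dict with the keys "inputs", "lo" and "hi" (one byte each).
    """
    if can_id == 0x001 and rtr:
        return 0x001, VERSION
    if can_id == 0x002 and not rtr and len(data) == 2:
        state["lo"], state["hi"] = data[0], data[1]
        return 0x004, b""  # empty ACK frame
    if can_id == 0x003 and rtr:
        return 0x003, bytes([state["inputs"], state["lo"], state["hi"]])
    return None  # unknown ID, or RTR bit not as expected
```

A handler written this way would stay silent for `001#` and reply only to `001#R`.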

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

Ah, I'm not doing that on the Teensy. Basically it's just doing:

void loop() {
  CAN_message_t msg;

  updateInputs();

  // Dispatch on the CAN ID only; the RTR bit is never inspected.
  if (can0.read(msg)) {
    switch (msg.id) {
    case 0x001:    sendVersion();  break;
    case 0x002:    setOutputs(msg, 0x004);  break;    // 0x004 is the ACK response
    case 0x003:    sendStatus();   break;
    default:
      break;
    }
  }
}

Simple, stupid, but it works...

bigguiness avatar Feb 07 '23 20:02 bigguiness

so no one's looking at the RTR bit :)

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

Guess not. But still probably a bug in the xilinx driver.

Thanks for all your help!

bigguiness avatar Feb 07 '23 20:02 bigguiness

For the reference: https://lore.kernel.org/all/[email protected]

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

> Guess not. But still probably a bug in the xilinx driver.

This could be either a bug in the driver - or even in the xilinx CAN IP core. Let's see. Many thanks for the report!

hartkopp avatar Feb 07 '23 20:02 hartkopp

@bigguiness are you using a hardware xilinx CAN core or one in the FPGA?

marckleinebudde avatar Feb 07 '23 20:02 marckleinebudde

The hardware CAN.

Also, this is with a kernel/rootfs created using Petalinux 2020.2.

root@s6:~# uname -a
Linux s6 5.4.0-xilinx-v2020.2 #1 SMP PREEMPT Fri Feb 3 17:27:45 UTC 2023 armv7l GNU/Linux

bigguiness avatar Feb 07 '23 21:02 bigguiness

FYI: opening and closing backticks go into separate lines.

marckleinebudde avatar Feb 07 '23 21:02 marckleinebudde

Hi,

Thanks for letting us know. We are able to reproduce the issue on our end and are looking into it. We will get back to you.

Thanks, Neeli Srinivas

sneeli-git avatar Feb 21 '23 10:02 sneeli-git