python-can icon indicating copy to clipboard operation
python-can copied to clipboard

SocketCan eventually ran out of memory

Open jizhoul opened this issue 2 years ago • 3 comments

Describe the bug

When running socketcan to send message for more than 48 hours, it eventually would crash with "Cannot allocate memory" error message

To Reproduce

from signal import signal, SIGINT
from sys import exit
import can
import serial
import time
import os
import subprocess
os.system('/usr/local/bin/startSMAGsUsbCan.sh can5')
#
# when board configured as socketcan, bring up link first:
# sudo ip link set can0 up type can bitrate 500000
#
#

def getSignedNumber(number, bitLength):
    mask = (2 ** bitLength) - 1
    if number & (1 << (bitLength - 1)):
        return number | ~mask
    else:
        return number & mask

def getbytes(integer):
    return divmod(integer, 0x100)
    
def handler(signal_received, frame):
    # Handle any cleanup here
    print('SIGINT or CTRL-C detected. Exiting gracefully')
    exit(0)

if __name__ == "__main__":
  # Tell Python to run the handler() function when SIGINT is recieved
  signal(SIGINT, handler)

  print('Running. Press CTRL-C to exit.')

  print ("Can bus init")
  canBusTTY = "/dev/ttyACM0"
  canBus = can.interface.Bus(bustype='socketcan', channel="can5", bitrate=500000, txqueuelen=10000)
  canBus.shutdown()
  print ("Can bus init done")

  while True:
    os.system('/usr/local/bin/startSMAGsUsbCan.sh can5')
    canBus = can.interface.Bus(bustype='socketcan', channel="can5", bitrate=500000, txqueuelen=10000)
    time.sleep(2)
    # Do nothing and hog CPU forever until SIGINT received.
    
    SoC_HD = 99.5
    Soc = int(SoC_HD)
    Req_Charge_HD = 100
    Req_Charge_A = 100.0
    Req_Discharge_HD = 30
    Req_Discharge_A = 200.0
    Max_V_HD = 56
    Max_V = 60.0
    Min_V = 46.0 
    
   #breakup some of the values for CAN packing
    SoC_HD = int(SoC_HD*100)
    SoC_HD_H, SoC_HD_L = getbytes(SoC_HD)
    Req_Charge_HD = int(Req_Charge_A*10)
    Req_Charge_H, Req_Charge_L = getbytes(Req_Charge_HD)
    Req_Discharge_HD = int(Req_Discharge_A*10)
    Req_Discharge_H, Req_Discharge_L = getbytes(Req_Discharge_HD)
    Max_V_HD = int(Max_V*10)
    Max_V_H, Max_V_L = getbytes(Max_V_HD)
    Min_V_HD = int(Min_V*10)
    Min_V_H, Min_V_L = getbytes(Min_V_HD)


    msg = can.Message(arbitration_id=0x351, 
      data=[Max_V_L, Max_V_H, Req_Charge_L, Req_Charge_H, Req_Discharge_L, Req_Discharge_H, Min_V_L, Min_V_H],
      is_extended_id=False)
    msg2 = can.Message(arbitration_id=0x355,
      data=[Soc, 0x00, 0x64, 0x0, SoC_HD_L, SoC_HD_H],
      is_extended_id=False)
    msg3 = can.Message(arbitration_id=0x356,
      data=[0x00, 0x00, 0x00, 0x0, 0xf0, 0x00],
      is_extended_id=False)
    msg4 = can.Message(arbitration_id=0x35a,
      data=[0x00, 0x00, 0x00, 0x0, 0x00, 0x00, 0x00, 0x00],
      is_extended_id=False)
    msg5 = can.Message(arbitration_id=0x35e,
      data=[0x42, 0x41, 0x54, 0x52, 0x49, 0x55, 0x4d, 0x20],
      is_extended_id=False)
    msg6 = can.Message(arbitration_id=0x35f,
      data=[0x03, 0x04, 0x0a, 0x04, 0x76, 0x02, 0x00, 0x00],
      is_extended_id=False)
    
    try:
        canBus.send(msg)
        time.sleep(.100)

        canBus.send(msg2)
        time.sleep(.100)

        canBus.send(msg3)
        time.sleep(.100)

        canBus.send(msg4)
        time.sleep(.100)

        canBus.send(msg5)
        time.sleep(.100)
        
        canBus.send(msg6)
        time.sleep(.100)

        print("Sent 6 frames on {}".format(canBus.channel_info))
        canBus.shutdown()

    except (can.CanError) as e:
      print("CAN BUS Transmit error (is controller missing?): %s" % e.message)
	
    pass

the shell script: /usr/local/bin/startSMAGsUsbCan.sh

#!/bin/sh

# called from udev ADD action to bring up the related link
#  
#
# $1 is the device (e.g., can10)
#
# these actions are only required prior to v2.90~18

if [ ! -f "/etc/venus/newUdevRules" ] ; then
	# bring the link up
	/sbin/ip link set $1 down
	/sbin/ip link set $1 up type can bitrate 500000
	/sbin/ip link set $1 txqueuelen 1000
	
	# restart services so they see this new interface
	svc -t /service/can-bus-bms.$1
	svc -t /service/dbus-systemcalc-py
	svc -t /service/dbus-adc
	svc -t /service/dbus-modbus-client
	svc -t /service/netmon
fi

Expected behavior

I expect the loop to keep running and sending messages

Additional context

OS and version: Venus OS v2.93 Python version: python3.8.13 python-can version: 4.1.0 python-can interface/s (if applicable):

Traceback and logs

jizhoul avatar Apr 14 '23 11:04 jizhoul

Did you try taking tracemalloc snapshots? Maybe that could help finding the problem

zariiii9003 avatar Apr 16 '23 14:04 zariiii9003

Hello @jizhoul ,

I don't think this is a bug in Socketcan.

The memory issues I noticed in your script:

  • opening and closing the Bus instance in each loop
  • creating new Message objects in each loop, even though the data of each message is static (doesn't change over time)

Even if the data of a Message were to change, you can just assign new data bytes to the Message object instance, no need to discard the old one and create a completely new one each time the data changes.

In the refactored script I listed below I made the following memory improvements:

  • added a function to prepare the Message objects
  • used subprocess.check_call instead of os.system, as recommended by the official documentation
  • used with to instantiate the SocketcanBus and ensure proper cleanup of resources

Other minor improvements:

  • streamlined the while loop and displaying of messages
  • renamed the functions to follow python style guidelines
  • added type annotations

Try this script:

from __future__ import annotations
from sys import exit
from can.interfaces.socketcan import SocketcanBus
from can import Message, CanError
from time import sleep
from subprocess import check_call, CalledProcessError


#
# when board configured as socketcan, bring up link first:
# sudo ip link set can0 up type can bitrate 500000
#
#

def get_signed_number(number: int, bit_length: int) -> int:
    mask = (2 ** bit_length) - 1
    if number & (1 << (bit_length - 1)):
        return number | ~mask
    else:
        return number & mask


def get_bytes(value: int) -> tuple[int, int]:
    return divmod(value, 0x100)

    
def prepare_messages() -> list[Message]:
    """Prepare the CAN messages"""
    SoC_HD = 99.5
    Soc = int(SoC_HD)
    Req_Charge_A = 100.0
    Req_Discharge_A = 200.0
    Max_V = 60.0
    Min_V = 46.0

    # breakup some of the values for CAN packing
    SoC_HD = int(SoC_HD * 100)
    SoC_HD_H, SoC_HD_L = get_bytes(SoC_HD)
    Req_Charge_HD = int(Req_Charge_A * 10)
    Req_Charge_H, Req_Charge_L = get_bytes(Req_Charge_HD)
    Req_Discharge_HD = int(Req_Discharge_A * 10)
    Req_Discharge_H, Req_Discharge_L = get_bytes(Req_Discharge_HD)
    Max_V_HD = int(Max_V * 10)
    Max_V_H, Max_V_L = get_bytes(Max_V_HD)
    Min_V_HD = int(Min_V * 10)
    Min_V_H, Min_V_L = get_bytes(Min_V_HD)

    return [
        Message(arbitration_id=0x351,
                data=[Max_V_L, Max_V_H, Req_Charge_L, Req_Charge_H, Req_Discharge_L, Req_Discharge_H, Min_V_L,
                      Min_V_H],
                is_extended_id=False),
        Message(arbitration_id=0x355,
                data=[Soc, 0x00, 0x64, 0x0, SoC_HD_L, SoC_HD_H],
                is_extended_id=False),
        Message(arbitration_id=0x356,
                data=[0x00, 0x00, 0x00, 0x0, 0xf0, 0x00],
                is_extended_id=False),
        Message(arbitration_id=0x35a,
                data=[0x00, 0x00, 0x00, 0x0, 0x00, 0x00, 0x00, 0x00],
                is_extended_id=False),
        Message(arbitration_id=0x35e,
                data=[0x42, 0x41, 0x54, 0x52, 0x49, 0x55, 0x4d, 0x20],
                is_extended_id=False),
        Message(arbitration_id=0x35f,
                data=[0x03, 0x04, 0x0a, 0x04, 0x76, 0x02, 0x00, 0x00],
                is_extended_id=False)
    ]


if __name__ == "__main__":
    # The bash script shouldn't really be called from the python script, especially if it requires sudo rights, since
    # that means the python script needs to be started with sudo, and that is pointless escalation of rights for
    # just sending some CAN messages repeatedly.
    print("Configuring can interface: can5")
    try:
        check_call(['/usr/local/bin/startSMAGsUsbCan.sh', 'can5'])
    except CalledProcessError as cpe:
        print("Failed to configure CAN interface, script exited with code: ", cpe.returncode)
        exit(cpe.returncode)

    print("Preparing messages...")
    messages = prepare_messages()
    count = len(messages)

    # Using with will ensure the bus is properly closed when it is no longer needed
    with SocketcanBus(channel="can5", bitrate=500000, txqueuelen=10000) as can_bus:
        display_message = f"Sent {count} frames on {can_bus.channel_info}"
        print('Running. Press CTRL-C to exit.')
        while True:
            try:
                print("Waiting for 2 seconds...")
                sleep(2)
                print("Sending messages...")
                for message in messages:
                    can_bus.send(message)
                    sleep(0.1)
                print(display_message)
            except KeyboardInterrupt:
                print('SIGINT or CTRL-C detected. Exiting gracefully')
                break
            except CanError as error:
                print(f"CAN BUS Transmit error (is controller missing?): {error}")

AlexanderRavenheart avatar Nov 06 '23 20:11 AlexanderRavenheart

I also wonder why the txqueuelen is set to 10000 ?!? Usually the default is 10.

When the transmission rate of CAN frames is much higher than the available bandwidth the tx queue is filled up over time. But as @AlexanderRavenheart pointed out the the shell script: /usr/local/bin/startSMAGsUsbCan.sh is called every time in the loop. And shutting down the interface flushes the tx queue.

hartkopp avatar Nov 07 '23 17:11 hartkopp