srsRAN gNB freezes when running with isolated CPUs (commit 4ac5300)

Open • aibtw opened this issue 1 year ago • 1 comment

Issue Description

I am trying to run the gNB with isolated CPUs. As soon as the gNB starts processing, the whole system freezes, and I can see that two of the isolated cores are at 100% load. The system then requires a hard reboot.

Setup Details

  • Commit 4ac5300
  • i9-13900K CPU (hyper-threading enabled, isolated cores 1-18)
  • 128GB RAM
  • RAN550 RU

Expected Behavior

The gNB should run normally and spread its load across multiple cores as configured in the config file.

Actual Behaviour

The system freezes, and the gNB drives two cores to 100% load.
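
For anyone trying to capture which cores are saturated before the machine locks up, here is a minimal sketch that derives per-core load from two snapshots of /proc/stat. The function names and the sample counter values are made up for illustration; only the /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice) is taken from the Linux documentation. It is not part of srsRAN.

```python
import time


def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_ticks, total_ticks) from the content of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only per-core lines ("cpu0", "cpu1", ...), skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Idle time is idle + iowait; guest times are already counted in user/nice,
        # so only the first eight counters are summed for the total.
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])
        counters[fields[0]] = (total - idle, total)
    return counters


def busy_percent(before, after):
    """Return 'cpuN' -> load in percent between two parse_proc_stat() results."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta_total = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return result


def sample_live(interval=1.0, path="/proc/stat"):
    """Take two snapshots of /proc/stat on a live Linux system."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return busy_percent(before, after)


# Made-up snapshots: cpu0 is mostly idle, cpu2 is busy for the whole interval.
SNAPSHOT_A = "cpu 200 0 100 1700 0 0 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0 0 0\ncpu2 100 0 50 850 0 0 0 0 0 0\n"
SNAPSHOT_B = "cpu 310 0 100 1790 0 0 0 0 0 0\ncpu0 110 0 50 940 0 0 0 0 0 0\ncpu2 200 0 50 850 0 0 0 0 0 0\n"

load = busy_percent(parse_proc_stat(SNAPSHOT_A), parse_proc_stat(SNAPSHOT_B))
saturated = sorted(cpu for cpu, pct in load.items() if pct >= 99.0)
```

With the made-up snapshots, `saturated` contains only `cpu2`. On a real system, `sample_live()` would have to run (and log its result) before the freeze sets in.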

Steps to reproduce the problem

My setup for isolating the cores is:

  • edit /etc/default/grub
  • Add GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt isolcpus=1-18 nohz_full=1-18 rcu_nocbs=1-18 kthread_cpus=0,19-31 rcu_nocb_poll mitigations=off skew_tick=1 selinux=0 enforcing=0 tsc=reliable nmi_watchdog=0 softlockup_panic=0 audit=0 intel_pstate=disable nosoftlockup hugepagesz=1G hugepages=8 hugepagesz=2M hugepages=0 default_hugepagesz=1G pcie_aspm=off"
  • update GRUB (e.g. sudo update-grub)
  • reboot
  • run the gNB as usual, e.g. ./gnb -c conf_file

The expert_execution section of the config file:
expert_execution:
  cell_affinities:
    -
     l1_dl_cpus: 2,3
     l1_ul_cpus: 4,5
     l2_cell_cpus: 6,7 
     ru_cpus: 10,11,12,13  
     l1_dl_pinning: mask                              # Optional TEXT. Sets the policy used for assigning CPU cores to L1 downlink tasks.
     l1_ul_pinning: mask                              # Optional TEXT. Sets the policy used for assigning CPU cores to L1 uplink tasks.
     l2_cell_pinning: mask                            # Optional TEXT. Sets the policy used for assigning CPU cores to L2 cell tasks.
     ru_pinning: mask                                 # Optional TEXT. Sets the policy used for assigning CPU cores to Radio Unit tasks.
  affinities:
    isolated_cpus: 1-18
    low_priority_cpus: 8,9
  threads: 
    upper_phy: 
     pdsch_processor_type: auto                    # Optional TEXT (auto). Sets the PDSCH processor type. Supported: [auto, generic, concurrent, lite].
     nof_pusch_decoder_threads: 6                  # Optional UINT (1). Sets the number of threads used to decode PUSCH.
     nof_ul_threads: 6                             # Optional UINT (1). Sets the number of upper PHY threads to process uplink.
     nof_dl_threads: 6                             # Optional UINT (1). Sets the number of upper PHY threads to process downlink.
    lower_phy:
      execution_profile: quad                            # Optional TEXT. Sets the lower physical layer executor profile. Supported: [single, dual, quad].
    ofh: 
      enable_dl_parallelization: 1                  # Optional BOOLEAN. Sets the Open Fronthaul downlink parallelization flag. Supported: [0, 1].
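
The kernel command line and the CPU lists in the config above can be cross-checked with a short script. The following is a rough sketch, not an srsRAN tool: the helper names (`parse_cpu_list`, `kernel_param`, `cross_check`) are invented here, the values are copied from this report, and it assumes `isolcpus` is a plain CPU list without flags such as `domain` or `managed_irq`.

```python
def parse_cpu_list(text):
    """Expand a Linux-style CPU list such as "0,19-31" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def kernel_param(cmdline, name):
    """Return the value of name=value from a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def cross_check(cmdline, gnb_cpus):
    """Compare the gNB CPU lists against the kernel's isolcpus set."""
    isolated = parse_cpu_list(kernel_param(cmdline, "isolcpus") or "")
    sets = {name: parse_cpu_list(value) for name, value in gnb_cpus.items()}
    # CPUs the gNB is pinned to that the kernel has not isolated.
    outside = {name: sorted(cpus - isolated) for name, cpus in sets.items()}
    # CPUs that appear in more than one gNB list.
    names = sorted(sets)
    overlaps = {
        (a, b): sorted(sets[a] & sets[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if sets[a] & sets[b]
    }
    # Isolated CPUs that no gNB list refers to.
    used = set().union(*sets.values()) if sets else set()
    return {"outside": outside, "overlaps": overlaps, "unused": sorted(isolated - used)}


# Relevant part of the command line from this report; on a live system the
# full string could be read from /proc/cmdline instead.
CMDLINE = "isolcpus=1-18 nohz_full=1-18 rcu_nocbs=1-18 kthread_cpus=0,19-31"

# CPU lists from the expert_execution section of this report.
GNB_CPUS = {
    "l1_dl_cpus": "2,3",
    "l1_ul_cpus": "4,5",
    "l2_cell_cpus": "6,7",
    "ru_cpus": "10,11,12,13",
    "low_priority_cpus": "8,9",
}

report = cross_check(CMDLINE, GNB_CPUS)
```

For the values in this report, every list lies inside 1-18, no two lists overlap, and CPUs 1 and 14-18 are isolated but not referenced by any list. That only confirms the two configurations are consistent with each other; it does not by itself explain the freeze.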

Additional Information

The release version (24.04) doesn't get stuck.

aibtw • Aug 19 '24 13:08