srsRAN_Project No stable connection between GNB and UE

Issue Description

When running the GNB and UE, an RRC connection is established, but only sporadically. No PDU session is ever established, so no IP address is assigned to the UE. I get the same result with a wireless and with a wired connection (using a 30 dB cable attenuator).

With previous versions of srsRAN_Project and srsRAN_4G and the same setup, we got a stable RRC connection and a PDU session (#269). After following the new instructions in the srsRAN_Project tutorial (srate: 23.04 and channel_bandwidth_MHz: 20), the UE cannot even find the GNB and establish a connection.

We ran the performance script.
uhd_usrp_probe works on all used devices as it should
We also checked that the core is working by using UERANSIM.
Changing the Tx and Rx gains didn't fix the issue.

All configuration files and logs are attached below.

Setup Details

srsRAN_Project commit: 0b2702cca
srsRAN_4G commit: eea87b1d8

UE: B200mini, Ubuntu 20.04, UHD 3.15.0.0, Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz (Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1, NUMA node(s): 1)

GNB: X310, Ubuntu 20.04, UHD 3.15.0.0, Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz (Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1, NUMA node(s): 1)

Core: Open5GS v2.6.6, Ubuntu 20.04, Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz (Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1, NUMA node(s): 1)

Expected Behavior

PDU session establishment and IP address assignment, as in the tutorial. (Screenshot: 275524024-55014b3b-dbb3-470c-b31a-55d8a4c93a1c)

Actual Behaviour

The core starts and connects to the AMF, and the UE starts and connects to the GNB. The connection lasts less than 1 second and is then lost, and no PDU session is established. No registration of the UE can be seen in the core. (Screenshot: ue_V23 11_rrc_connected)

Steps to reproduce the problem

As stated above, the UE, GNB and core run on separate machines; the config files can be found below. We are using the srsRAN_Project GNB and the srsRAN_4G UE.

ue.zip core(2).zip gnb.zip

Jan 18 '24 11:01 dhiaboujebha

PDSCH seems to be ok, but there is a lot of CRC=KO reported in gnb log for PUSCH.

Please try to tune the time_adv_nsamples parameter in the srsUE config, for example time_adv_nsamples = 300
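To get a feel for what a given time_adv_nsamples value means in time, here is a minimal sketch that converts a sample count into microseconds. It assumes the srsUE runs at the 23.04 MHz sample rate quoted from the tutorial earlier in this thread; the helper name is hypothetical, and if your srate differs the numbers scale accordingly.

```python
def samples_to_microseconds(nsamples: int, srate_hz: float) -> float:
    """Convert a number of baseband samples into a duration in microseconds."""
    return nsamples / srate_hz * 1e6


# Assumption: "srate: 23.04" from the tutorial means 23.04 MHz.
SRATE_HZ = 23.04e6

for nsamples in (20, 100, 300):
    offset_us = samples_to_microseconds(nsamples, SRATE_HZ)
    print(f"time_adv_nsamples = {nsamples:>3} -> {offset_us:.2f} us")
```

At this sample rate, the suggested value of 300 corresponds to roughly 13 microseconds of offset.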

Jan 18 '24 11:01 pgawlowicz

@pgawlowicz we have tried changing time_adv_nsamples several times, to values between 20 and 300, but it didn't solve the problem.

Jan 18 '24 11:01 dhiaboujebha

hmm, so if you go back to BW=10MHz it works and with 20MHz it does not work?

Jan 18 '24 11:01 pgawlowicz

@pgawlowicz it doesn't work in either case, but with BW=10MHz we get an RRC connection for a few seconds. With BW=20MHz we don't get an RRC connection at all.

Jan 18 '24 11:01 dhiaboujebha

could you connect both gnb and ue USRPs to the same clock source?

Jan 18 '24 11:01 pgawlowicz

@pgawlowicz Unfortunately, we are in a room that is almost completely shielded by metal, and we cannot sync them in this room yet. However, we might be able to try connecting them to the Leo Bodnar clock this afternoon.

What confuses us is that the RRC connection was really stable in the earlier case mentioned above (#269), and now it is not anymore.

Jan 18 '24 11:01 dhiaboujebha

Could you revert to the previous srsUE release (commit: fa56836) and check? also would be good to test with the previous gNB release, do you remember which one was used back then?

Jan 18 '24 12:01 pgawlowicz

Hi @dhiaboujebha, we had the same error yesterday. Did you update the Open5GS core network? If so, check the config files, because their structure has changed. In particular, the /etc/open5gs/nrf.yaml config has been added.

We still have some connection issues, but fine-tuning the time_adv_nsamples parameter in the srsue_conf file allowed us to connect the two radios. The connection does not seem to be stable; we establish the PDU session, obtaining the UE IP address, but after a few seconds, the connection appears to be lost.

Jan 24 '24 09:01 Rinelli96

@Rinelli96 Thanks for your comment. No, we have not updated the Open5GS core. Can you please attach the config files that you changed? Can you also tell me which releases of Open5GS, srsRAN_Project and srsRAN_4G you are using?

Jan 24 '24 09:01 dhiaboujebha

Open5GS version: 2.7.0
srsRAN_4G version: commit eea87b1d8
srsRAN_Project version: commit 0b2702cca
Open5GS Configs.zip

BR

Jan 24 '24 11:01 Rinelli96

          Could you revert to the previous srsUE release (commit: fa56836) and check? also would be good to test with the previous gNB release, do you remember which one was used back then?

@pgawlowicz sorry for the late response. We tried the srsUE release (commit: fa56836) with the srsRAN_Project commit 0b2702, but we still face the same issue: RRC connected for only 1 second, and then the scheduling request fails. Next, we will try to connect both SDRs to the same clock source. Any further advice would be very helpful.

Jan 31 '24 13:01 dhiaboujebha

The problem still persists, so we switched to another setup in which the hardware devices use an external clock. We are facing some problems with the new setup as well (#442). I'll close this issue, as we are not working on this setup anymore. Thanks for your help :)

Feb 01 '24 14:02 dhiaboujebha

I analyzed the gnb and ue logs and found that the scheduling request failure that we get directly after the RRC Connected message is caused by this:

2024-02-21T09:32:45.194542 [RRC ] [W] ue=0 "RRC Setup Procedure" timed out after 720ms
2024-02-21T09:32:45.314596 [DU-MNG ] [I] ue=0 proc="UE Delete": Procedure started....
2024-02-21T09:32:45.314831 [DU-F1 ] [I] ue=0 c-rnti=0x4608 GNB-DU-UE-F1AP-ID=0 GNB-CU-UE-F1AP-ID=0: F1 UE context removed.
2024-02-21T09:32:45.316110 [DU-MNG ] [I] ue=0 proc="UE Delete": Procedure finished successfully.
2024-02-21T09:32:45.316385 [CU-UEMNG] [I] ue=0 removed
2024-02-21T09:32:45.319649 [UL-PHY1 ] [I] [ 551.9] PUCCH: rnti=0x4608 format=1 prb1=105 prb2=na symb=[0, 14) cs=6 occ=4 sr=no t=159.0us
2024-02-21T09:32:45.319786 [MAC ] [I] [ 552.4] rnti=17928: Discarding UCI PDU. Cause: No UE with provided RNTI exists.
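One detail that is easy to miss in this excerpt is that the MAC layer prints the RNTI in decimal (17928) while the other layers print it in hex (0x4608); they are the same UE. The following is a minimal sketch of how the excerpt could be parsed to make that visible. The line layout assumed by the regular expression (timestamp, layer tag, level tag, message) is inferred from the lines quoted here, not from a documented log format, and the function name is hypothetical.

```python
import re

# Three entries copied from the gNB log excerpt quoted in this thread.
LOG = '''\
2024-02-21T09:32:45.194542 [RRC ] [W] ue=0 "RRC Setup Procedure" timed out after 720ms
2024-02-21T09:32:45.314831 [DU-F1 ] [I] ue=0 c-rnti=0x4608 GNB-DU-UE-F1AP-ID=0 GNB-CU-UE-F1AP-ID=0: F1 UE context removed.
2024-02-21T09:32:45.319786 [MAC ] [I] [ 552.4] rnti=17928: Discarding UCI PDU. Cause: No UE with provided RNTI exists.
'''

# Assumed layout: timestamp, [layer], [level], free-form message.
LINE_RE = re.compile(r"^(\S+) \[([^\]]+)\] \[(\w)\] (.*)$")
RNTI_RE = re.compile(r"rnti=(0x[0-9a-fA-F]+|\d+)")


def parse_line(line):
    """Split one log line into fields; return None if the layout does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp, layer, level, message = match.groups()
    rnti_match = RNTI_RE.search(message)
    # int(..., 0) accepts both the hex (0x4608) and the decimal (17928) spelling.
    rnti = int(rnti_match.group(1), 0) if rnti_match else None
    return {"timestamp": timestamp, "layer": layer.strip(), "level": level,
            "rnti": rnti, "message": message}


entries = [e for e in map(parse_line, LOG.splitlines()) if e is not None]
for entry in entries:
    rnti = "-" if entry["rnti"] is None else hex(entry["rnti"])
    print(entry["timestamp"], entry["layer"], entry["level"], rnti)
```

Read this way, the excerpt shows the UE context for 0x4608 being removed after the RRC setup timeout, and uplink control information for the same RNTI arriving about 5 ms later, when the gNB no longer knows the UE.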

Here are the gnb.log and the ue.log.

ue_b200mini.conf.txt gnb_x310_v2_15_02.yml.txt

Any explanation? Thank you in advance

Feb 21 '24 09:02 dhiaboujebha

I see the rrc Setup message not being ACKed:

2024-02-21T09:32:44.526527 [SCHED   ] [I] [   473.1] DL HARQ rnti=0x4608 cell=0 h_id=0: Discarding HARQ process tb=0 with tbs=325. Cause: Maximum number of reTxs 4 exceeded

And no rrcSetupComplete is ever sent back by the UE. It is strange, because I can see the UE sending back positive CSI reports:

2024-02-21T09:32:44.546661 [UL-PHY1 ] [I] [   474.6] PUCCH: rnti=0x4608 format=2 prb=[104, 105) prb2=na symb=[10, 12) csi1=1111 t=168.9us

So, it seems that the UE is struggling with PUCCH Format 1.

Feb 21 '24 10:02 frankist

@frankist Thanks for your comment! After making some changes to the setup, it has finally worked, but the behavior is still sporadic. I will list the changes made and the configs used in our setup. Both hardware devices are connected to 2 antennas each, with no external clock and no cable attenuator. Here is the working setup configuration:

working_setup_X310_B200mini.zip

Open5GS Core: v2.7.0-86-g41d8934
srsRAN_Project: commit 0b2702cca
srsRAN-4G: 23.11, commit ec29b0c1f
UHD: 3.15.0.0-2build5
Ubuntu: 20.04

Steps to get this result:

I tried different combinations of software versions:

(Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 fa5683 / srsRAN_Project : 0b2702)
(Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 eea87b1d8 / srsRAN_Project : 0b2702)
(Open5GS : v2.6.6-26-ge12b1be / srsRAN_4G : release 23_04_1 eea87b1d8 / srsRAN_Project : 374200d)

All of these attempts gave the same result: RRC connected for less than 1 second, and then the scheduling request failed.

=> Then I updated the core to v2.7.0 and replaced the USRP X310 with another one (I think it's an older device), and it worked.

I tried three X310s with the same configuration (adjusting addr in device_args), but only one of them worked (the one with the address 192.168.11.2). I also tried two B200minis, and only one of them worked. Do you have any explanation for that?

The connection now lasts between 10 seconds and about 1 minute. The next challenge is to make it stable and reliable. What can I change to get a more stable connection? :v:

Feb 22 '24 11:02 dhiaboujebha

Hey, your latest logs show a slightly different story. I see many underflows in the gnb log (e.g. [W] Real-time failure in RF: underflow or real-time failure in RF: underflow) that is making the UE lose sync. Can you assign more cores to your gnb application (4 is a bit too low) and increase the number of PHY threads?
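To quantify how often this happens, a minimal sketch like the one below counts underflow warnings in a log. It matches only the message text quoted above, case-insensitively because both spellings appear; the function name is hypothetical, and the sample lines reuse text quoted in this thread rather than a full real log.

```python
from typing import Iterable

# Message text as quoted in this thread, lowercased for case-insensitive matching.
UNDERFLOW_TEXT = "real-time failure in rf: underflow"


def count_underflows(lines: Iterable[str]) -> int:
    """Count the lines that report an RF underflow."""
    return sum(1 for line in lines if UNDERFLOW_TEXT in line.lower())


sample = [
    "[W] Real-time failure in RF: underflow",
    "real-time failure in RF: underflow",
    "PUCCH: rnti=0x4608 format=2 prb=[104, 105) prb2=na symb=[10, 12) csi1=1111 t=168.9us",
]
print(count_underflows(sample))  # 2
```

Because the function accepts any iterable of strings, an open file handle for the gnb log can be passed in directly, which makes it easy to compare the count before and after changing the thread or core settings.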

Feb 22 '24 12:02 frankist

We only have 4 cores available, as you can see here in the specs:

UE: B200mini, Ubuntu 20.04, UHD 3.15.0.0, Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz (Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1, NUMA node(s): 1)

GNB: X310, Ubuntu 20.04, UHD 3.15.0.0, Intel(R) Core(TM) i7-7700T CPU @ 2.90GHz (Thread(s) per core: 1, Core(s) per socket: 4, Socket(s): 1, NUMA node(s): 1)

In the beginning we also thought our problems might be related to the limited physical resources. But we were told it is most likely not a problem.

          @tilldroemmer, double-check that your core and UE configurations match. Normally you see this type of issue when there is a problem with the APN configuration, or how the UE is registered in the core. I would be surprised if it has to do with the available resources.

Originally posted by @brendan-mcauliffe in https://github.com/srsran/srsRAN_Project/issues/269#issuecomment-1798655684

We also could not find any minimum resource requirements in the specs. Can you help us with the resource requirements?

Feb 28 '24 08:02 dhiaboujebha

We are working on HW recommendations, similar to what we have for the srsRAN-4G project: https://docs.srsran.com/projects/4g/en/latest/app_notes/source/hw_packs/source/index.html

Regarding your setup, 4 cores do not seem to be enough. By default, we use 8 threads - please see the expert_execution section in the gnb config reference.

You might try to reduce BW and then add the following section to your gnb config:

expert_execution:
  threads:
    non_rt:
      nof_non_rt_threads: 1                   # Optional UINT (4). Sets the number of non real time threads for processing of CP and UP data in upper layers.
    upper_phy:
      pdsch_processor_type: auto              # Optional TEXT (auto). Sets the PDSCH processor type. Supported: [auto, generic, concurrent, lite].
      nof_pusch_decoder_threads: 1            # Optional UINT (1). Sets the number of threads used to decode PUSCH.
      nof_ul_threads: 1                       # Optional UINT (1). Sets the number of upper PHY threads to process uplink.
      nof_dl_threads: 1                       # Optional UINT (1). Sets the number of upper PHY threads to process downlink.

But still, I am not sure whether this setup will work correctly.

Feb 29 '24 08:02 pgawlowicz

@pgawlowicz we upgraded the gNB computer to a more powerful one with 8 cores. The setup behaves better since then, but it is still not reliable: we always get an RRC connection, but it takes several tries to establish a PDU session. Yesterday we installed a new network card in the gNB computer to connect the X310 over a 10 Gbit link, so we now use the full MTU size of 8000. After some tests, this does not seem to make a big difference.

Mar 21 '24 09:03 dhiaboujebha

@dhiaboujebha could you provide the output of the console trace from both the gnb and srsUE? You need to press t in both consoles to activate trace logging.

You might also be interested in this discussion, which covers a similar issue: https://github.com/srsran/srsRAN_Project/issues/489#issuecomment-1994954337

Mar 21 '24 09:03 pgawlowicz

@pgawlowicz here are the traces of the gnb and the ue: trace_ue_21_03.txt trace_gnb_21_03.txt. The PDU session now lasts for a long period. The only problem we are still facing is the first session, which is hard to establish.

Mar 21 '24 15:03 dhiaboujebha

hmm, could you try to reduce the TX gain of the gnb and try again? It seems that the signal received at UE is very good (snr>30dB). But the signal from UE at gnb is bad. Then you might try to increase srsUE tx_gain

Mar 21 '24 17:03 pgawlowicz

@pgawlowicz we tried to reduce the Tx gains of the gNB and we kept the Tx gains of the UE at the maximum. Here you can find the traces of the tests done. tx_10.zip tx_15.zip tx_20.zip

Mar 26 '24 14:03 dhiaboujebha

PDSCH SNR around 20dB looks good. Now you need to also improve uplink. I have seen that the rx_gain in gnb is already 30 and tx_gain in srsUE is 80. Maybe it is already too much and the link is saturated. Could you try to reduce both a bit and see if PUSCH SNR increases?

Mar 26 '24 14:03 pgawlowicz

@dhiaboujebha any update on this issue?

Apr 11 '24 08:04 pgawlowicz

@pgawlowicz sorry for the delay. We made some changes in the tx gains of ue and rx gains of gnb as you said. Here are the traces: traces_tx_70_rx_25.zip traces_tx_70_rx_20.zip traces_tx_60_rx_20.zip

Apr 16 '24 10:04 dhiaboujebha

hi @dhiaboujebha, are you still working on this setup?

Jul 16 '24 08:07 pgawlowicz

@pgawlowicz we changed our setup to a COTS UE. We are now using a Quectel instead of the B200mini. Both the old setup and the new one are working, by the way, so I'll close the issue. Thanks for your support during all this time. :)

Jul 16 '24 08:07 dhiaboujebha