srsRAN_4G icon indicating copy to clipboard operation
srsRAN_4G copied to clipboard

srsUE keeps doing RRC reconnect until after a while it just gets disconnected

Open akhila-rao opened this issue 4 years ago • 5 comments

Issue Description

I used srsue with USRP X310 and srsenb with X310. The UE is able to connetc to the eNB. But after less than a minute, it begins to say warning radio link failure and keeps reconnecting with higher rnti values. Meanwhile I am able to send downlink traffic over the network with sometime a few seconds of traffic missing as the UE reconnects. But after a while there is a failure and the RRC connect is rejected. On some occassions after like 100-500s the connection just automatically comes back up again. It is very unstable. I have checked that I am setting the TX powers in a range that it should be able to connect. The devices are also kept around 1 meter apart, so I do not know where the radio link failure is coming from.

here is what I see

Opening 1 channels in RF device=default with args=addr=193.10.65.8,type=x300,num_recv_frames=1500,num_send_frames=1500 [INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_3.15.0.0-release [INFO] [LOGGING] Fastpath logging disabled at runtime. [INFO] [UHD RF] RF UHD Generic instance constructed [INFO] [X300] X300 initialization sequence... [INFO] [X300] Maximum frame size: 1472 bytes. [INFO] [X300] Radio 1x clock: 184.32 MHz [INFO] [GPS] Found an internal GPSDO: LC_XO, Firmware Rev 0.929a [INFO] [0/DmaFIFO_0] Initializing block control (NOC ID: 0xF1F0D00000000000) [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1309 MB/s) [INFO] [0/DmaFIFO_0] BIST passed (Throughput: 1291 MB/s) [INFO] [0/Radio_0] Initializing block control (NOC ID: 0x12AD100000000001) [INFO] [0/Radio_1] Initializing block control (NOC ID: 0x12AD100000000001) [INFO] [0/DDC_0] Initializing block control (NOC ID: 0xDDC0000000000000) [INFO] [0/DDC_1] Initializing block control (NOC ID: 0xDDC0000000000000) [INFO] [0/DUC_0] Initializing block control (NOC ID: 0xD0C0000000000000) [INFO] [0/DUC_1] Initializing block control (NOC ID: 0xD0C0000000000000) [WARNING] [UDP] The current recv_buff_size of 2426666 is less than the minimum recommended size of 14400000 and may result in dropped packets on some NICs [WARNING] [UDP] The current send_buff_size of 2426666 is less than the minimum recommended size of 14400000 and may result in dropped packets on some NICs Opening USRP channels=1, args: addr=193.10.65.8,type=x300,num_recv_frames=1500,num_send_frames=1500,master_clock_rate=184.32e6 Waiting PHY to initialize ... done! Attaching UE... . . . Found Cell: Mode=FDD, PCI=1, PRB=50, Ports=1, CFO=-2.7 KHz . Found Cell: Mode=FDD, PCI=1, PRB=50, Ports=1, CFO=-4.1 KHz Found PLMN: Id=00101, TAC=7 Random Access Transmission: seq=10, tti=3061, ra-rnti=0x2 Random Access Complete. c-rnti=0x46, ta=1 RRC Connected Network attach successful. IP: 172.16.0.2 Software Radio Systems RAN (srsRAN) 18/10/2021 14:6:55 TZ:0 Warning: Detected Radio-Link Failure RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure") Random Access Transmission: seq=42, tti=4091, ra-rnti=0x2 Reestablishment OK RRC Connected Random Access Complete. c-rnti=0x47, ta=1 Warning: Detected Radio-Link Failure RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure") Selected cell no longer suitable: Going to RRC IDLE RRC IDLE . . . Found Cell: Mode=FDD, PCI=1, PRB=50, Ports=1, CFO=-4.0 KHz S-TMSI match in paging message Service Request with cause mt-Access. Random Access Transmission: seq=27, tti=8871, ra-rnti=0x2 RRC Connected Random Access Complete. c-rnti=0x49, ta=0 Service Request successful. /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3 /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation Warning: Detected Radio-Link Failure RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure") Random Access Transmission: seq=20, tti=971, ra-rnti=0x2 Random Access Complete. c-rnti=0x4a, ta=0 Reestablishment OK RRC Connected /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3 /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3 /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3 /home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation

Sometimes it also does

RRC Connection Reestablishment to PCI=286, EARFCN=3350 (Cause: "otherFailure") Random Access Transmission: seq=0, tti=1611, ra-rnti=0x2 Random Access Complete. c-rnti=0xf055, ta=4 Reestablishment Reject. Going to RRC IDLE RRC IDLE

Setup Details

srsRAN version 21.04.0 UHD version 3.15 Ubuntu version 18.04.1 for srsrenb, 20.04.2 for srsue 1GbE connection between USRP and host laptop X310 hardware with UBX daughterboards for both enb and ue SISO model of operation I have set CPU to performance I have set network buffers using sudo sysctl -w net.core.rmem_max=2426666 sudo sysctl -w net.core.wmem_max=2426666

Expected Behavior

UE stays connected to the eNB

Actual Behaviour

UE keeps disconnecting and reconnecting

Steps to reproduce the problem

I run i Mbps iperf traffic downlink over the network and just wait after setting it up.

Additional Information

The UE additionally also reports some errors srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3

I do not know if these are related. I have made a separate issue post about the aforementioned errors Also I have played around with setting different TX and RX gains and not succeeded it taking this error away.

enb_conf.txt epc_conf.txt ue_conf.txt

akhila-rao avatar Oct 20 '21 16:10 akhila-rao

Can you provide logs for both eNB and UE in info? Also make sure you don't have lates/underrungs, etc

andrepuschmann avatar Oct 25 '21 19:10 andrepuschmann

The enb log file was huge (450 MB from 1 hour of enb being up) and the interface did not accept it even after I zipped it. So I am attaching the first 30K lines and the last 30K lines as separate log files for enb. I hope we can work with this.

I rarely have any lates or underruns. Once is several setups I might see like 10 lates in the entire run duration. But this rarely happens.

ue.log epc.log enb_tail.log enb_head.log

akhila-rao avatar Oct 26 '21 09:10 akhila-rao

A few things: it seems you've configured a metrics interval which is extremely short. I see this a lot in the logs:

09:57:59.768522 [RLC    ] [I] lcid=0, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.768527 [RLC    ] [I] lcid=1, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.768532 [RLC    ] [I] lcid=2, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.768536 [RLC    ] [I] lcid=3, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.768550 [PDCP   ] [I] lcid=1, rx_rate_mbps=0.13 (real=0.13), tx_rate_mbps=0.16 (real=0.16)
09:57:59.768554 [PDCP   ] [I] lcid=2, rx_rate_mbps=0.02 (real=0.02), tx_rate_mbps=0.05 (real=0.05)
09:57:59.768559 [PDCP   ] [I] lcid=3, rx_rate_mbps=0.56 (real=0.56), tx_rate_mbps=143.45 (real=143.57)
09:57:59.768577 [ENB    ] [W] Interval less than 100ms, skipping measurement.
09:57:59.778509 [RLC    ] [I] lcid=0, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.778518 [RLC    ] [I] lcid=1, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.778522 [RLC    ] [I] lcid=2, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.778527 [RLC    ] [I] lcid=3, rx_rate_mbps=0.00 (real=0.00), tx_rate_mbps=0.00 (real=0.00)
09:57:59.778541 [PDCP   ] [I] lcid=1, rx_rate_mbps=0.13 (real=0.13), tx_rate_mbps=0.16 (real=0.17)
09:57:59.778545 [PDCP   ] [I] lcid=2, rx_rate_mbps=0.02 (real=0.02), tx_rate_mbps=0.05 (real=0.05)
09:57:59.778550 [PDCP   ] [I] lcid=3, rx_rate_mbps=0.56 (real=0.56), tx_rate_mbps=143.45 (real=143.85)
09:57:59.778570 [ENB    ] [W] Interval less than 100ms, skipping measurement.

Set that back to the default value (1s). Then also post your stdout output, i.e. the console trace when you start and hit t+Enter.

Also, I encourage to use the lastest srsRAN version which is 21.10.

andrepuschmann avatar Nov 26 '21 10:11 andrepuschmann

I have tried it with a lower log rate and it still has the same behaviour. I actually need logging every 10ms for my application which is why I have kept it on. Is it not possible to log at this rate and still maintain a stable system ? I didn't see it overloading the CPU or anything which is why I have kept it at this level. Also I had issues with an older version and was advised to switch to the latest version which I did. I know there is an even newer version now, but I can't keep changing versions everytime there is a new one because every time I do it is a different set of problems that I have to begin solving from scratch again. Do you have any advice for me if I do not want to keep updating the versions ? Do you think it is likely that this problem will be solved if I update the version ? Also Is it not possible to print logs every 10 ms and still have a stable system ? Thanks again. Sorry If I sound frustrated. I have just been unsuccessful in setting up a stable system for a very long time now.

akhila-s-rao avatar Nov 26 '21 10:11 akhila-s-rao

Try to solve your issues step-by-step. I asked you to set the default metrics interval to 1s because otherwise the logs are overwhelmed with it. We can't help if we can't diagnose your issue.

andrepuschmann avatar Nov 26 '21 15:11 andrepuschmann