AQC113 card unstable under Linux 6.14.5

Open NyaomiDEV opened this issue 10 months ago • 3 comments

Sometimes I boot my computer only to find that the PCIe card I am using, a TP-Link TX401, is spamming PCIe AER errors to my kernel log. The errors themselves are innocuous (they are reported as correctable), but the constant logging will eventually wear down my SSD, so I have to reboot two or three times before the driver finally handles the card correctly.

The spam:

kernel: pcieport 0000:00:03.2: AER: Correctable error message received from 0000:11:00.0
kernel: atlantic 0000:11:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
kernel: atlantic 0000:11:00.0:   device [1d6a:04c0] error status/mask=00000001/0000e000
kernel: atlantic 0000:11:00.0:    [ 0] RxErr                  (First)
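
For anyone reading the `error status/mask` line above: as a rough sketch, the two hex values can be decoded bit by bit. The bit positions below follow the PCIe AER Correctable Error Status register layout, and the short names are the ones lspci prints. The helper names (`decode_bits`, `decode_status_mask`) are made up for illustration and are not part of any tool.

```python
import re

# Bit positions in the PCIe AER Correctable Error Status/Mask registers,
# labelled with the short names that lspci prints (CESta / CEMsk).
CORRECTABLE_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "AdvNonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}

# Matches the "error status/mask=XXXXXXXX/XXXXXXXX" part of the kernel log line.
STATUS_MASK_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def decode_bits(value):
    """Return the names of all known bits that are set in value."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items()) if value & (1 << bit)]


def decode_status_mask(line):
    """Return (status_names, mask_names) for a log line, or None if it has no status/mask field."""
    match = STATUS_MASK_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    mask = int(match.group(2), 16)
    return decode_bits(status), decode_bits(mask)


line = "kernel: atlantic 0000:11:00.0:   device [1d6a:04c0] error status/mask=00000001/0000e000"
print(decode_status_mask(line))
# → (['RxErr'], ['AdvNonFatalErr', 'CorrIntErr', 'HeaderOF'])
```

So the only status bit set is RxErr (a physical-layer receiver error), and the three masked bits match the `CEMsk` line in the lspci output below.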

The card in lspci:

11:00.0 Ethernet controller: Aquantia Corp. AQtion AQC113 NBase-T/IEEE 802.3an Ethernet Controller [Antigua 10G] (rev 03)
        Subsystem: Aquantia Corp. Device 0001
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 144
        IOMMU group: 3
        Region 0: Memory at fc000000 (64-bit, non-prefetchable) [size=512K]
        Region 2: Memory at fc0a0000 (64-bit, non-prefetchable) [size=4K]
        Region 4: Memory at fbc00000 (64-bit, non-prefetchable) [size=4M]
        Expansion ROM at fc080000 [disabled] [size=128K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75W TEE-IO-
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x4, ASPM not supported
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 16GT/s, Width x4
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                         AtomicOpsCtl: ReqEn-
                         IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
                         10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: Upstream Port
        Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00000200
        Capabilities: [d0] Vital Product Data
                Product Name: Marvell AQtion Network Adapter
                Read-only fields:
                        [PN] Part number: 00B1E113
                        [V0] Vendor specific: MAC Addr: <Redacted>
                        [V1] Vendor specific: Bundle Version: 1.5.38
                        [V2] Vendor specific: Fw Version: 1.2.122
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                End
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
                        ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr+ HeaderOF+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [148 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed+ WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [168 v1] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [178 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [198 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [1bc v1] Lane Margining at the Receiver
                PortCap: Uses Driver-
                PortSta: MargReady+ MargSoftReady-
        Capabilities: [1d4 v1] Latency Tolerance Reporting
                Max snoop latency: 1048576ns
                Max no snoop latency: 1048576ns
        Capabilities: [1dc v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=10us PortTPowerOnTime=14us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=32768ns
                L1SubCtl2: T_PwrOn=14us
        Capabilities: [1ec v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2ec v1] Data Link Feature <?>
        Capabilities: [2f8 v1] Precision Time Measurement
                PTMCap: Requester+ Responder- Root-
                PTMClockGranularity: Unimplemented
                PTMControl: Enabled- RootSelected-
                PTMEffectiveGranularity: Unknown
        Capabilities: [304 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Kernel driver in use: atlantic
        Kernel modules: atlantic

There are reports of people having issues with the AQC113C, though I don't think mine is that model of the chip, so I find myself pretty lonely in this. The card was bought new and operates fine otherwise.

NyaomiDEV, May 08 '25 18:05

I'm also running 6.14.5 (just upgraded to 6.14.6) and have a TP-Link TX401. I haven't seen any issues so far.

zeroepoch, May 15 '25 08:05

> I'm also running 6.14.5 (just upgraded to 6.14.6) and have a TP-Link TX401. I haven't seen any issues so far.

You need to have the PCIe AER capability enabled in the BIOS to see those messages, and of course even if you did see them, the card would still have worked anyway.
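
A rough way to check whether the kernel is tracking AER for the card at all, without waiting for log spam, is to read the per-device counters from sysfs. This is only a sketch: it assumes a kernel that exposes the `aer_dev_correctable` attribute under `/sys/bus/pci/devices/<BDF>/`, the exact counter labels vary between kernel versions, and the function names are made up for illustration.

```python
from pathlib import Path


def parse_aer_counters(text):
    """Parse lines of the form '<label> <count>' into a dict.

    Labels may contain spaces on some kernels, so split on the last space only.
    """
    counters = {}
    for line in text.splitlines():
        name, _, count = line.strip().rpartition(" ")
        if name and count.isdigit():
            counters[name] = int(count)
    return counters


def read_correctable_counters(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the correctable-error counters for a device, or None if the
    attribute is missing (for example when AER is not in use for it)."""
    path = Path(sysfs_root) / bdf / "aer_dev_correctable"
    if not path.exists():
        return None
    return parse_aer_counters(path.read_text())


# 0000:11:00.0 is the address of the card in the lspci output above.
counters = read_correctable_counters("0000:11:00.0")
if counters is None:
    print("no AER counters exposed for this device")
else:
    for name, count in counters.items():
        print(f"{name}: {count}")
```

If the attribute is missing, that would be consistent with AER being disabled or unavailable on that system, in which case the messages never reach the kernel log.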

After further research, it seems that this new Aquantia chip is less well supported and the driver sometimes initializes it slightly wrong. A reboot fixes it, which is what I've been doing lately.

But it only seldom happens.

NyaomiDEV, May 15 '25 17:05

Got it. I'm pretty sure I don't have PCIe AER enabled in my BIOS, since I don't recall seeing that option after going over the various settings a few times. I was just providing some evidence that it works in case that was a useful data point, but it sounds like you already know when it can and cannot work.

> After further research, it seems that this new Aquantia chip is less well supported and the driver sometimes initializes it slightly wrong. A reboot fixes it, which is what I've been doing lately.

I've also noticed that driver development for this chip is not very active. This repo hasn't been updated in many years and the latest Linux kernel source is still quite similar other than small compatibility fixes over time.

As for my own gripes, WOL is not working on Linux, while it works great on Windows (covered in another issue here).

zeroepoch, May 15 '25 20:05