mt76 icon indicating copy to clipboard operation
mt76 copied to clipboard

mt76: lock mt76_queue before access

Open ptpt52 opened this issue 3 years ago • 7 comments

This may fix potential tx hang

ptpt52 avatar Apr 09 '21 15:04 ptpt52

@nbd168

xback avatar Apr 09 '21 15:04 xback

How does it fix a potential tx hang?

nbd168 avatar Apr 09 '21 17:04 nbd168

no, it would never happens on this code, but it should be lock.

ptpt52 avatar Apr 09 '21 18:04 ptpt52

The dma_map_single is intentionally left outside of the lock for performance reasons

nbd168 avatar Apr 09 '21 18:04 nbd168

The dma_map_single is intentionally left outside of the lock for performance reasons

true but we still have to lock before access q maybe do some changes to this to skip the dma_map_single lock.

ptpt52 avatar Apr 09 '21 18:04 ptpt52

diff --git a/dma.c b/dma.c
index 6ea58aec..8d1d21cb 100644
--- a/dma.c
+++ b/dma.c
@@ -333,6 +333,11 @@ mt76_dma_tx_queue_skb_raw(struct mt76_dev *dev, struct mt76_queue *q,
        buf.len = skb->len;
 
        spin_lock_bh(&q->lock);
+       if (q->queued + 1 >= q->ndesc - 1) {
+               dma_unmap_single(dev->dev, buf.addr, buf.len, DMA_TO_DEVICE);
+               spin_unlock_bh(&q->lock);
+               return -ENOMEM;
+       }
        mt76_dma_add_buf(dev, q, &buf, 1, tx_info, skb, NULL);
        mt76_dma_kick_queue(dev, q);
        spin_unlock_bh(&q->lock);

what about this changes? re-check the q->queued. @nbd168

ptpt52 avatar Apr 09 '21 18:04 ptpt52

Sorry @nbd168 for insulting you.

sorry

ghost avatar Jan 13 '22 22:01 ghost

@ptpt52 Should it "goto error'" instead of "return -ENOMEM;", otherwise it will miss out on a dev_kfree_skb(skb);

Brain2000 avatar Feb 24 '23 23:02 Brain2000

@ptpt52 Also it now looks like dev->dev is now dev->dma_dev

Brain2000 avatar Feb 24 '23 23:02 Brain2000

I think this PR should be dropped.

ptpt52 avatar Feb 25 '23 01:02 ptpt52

I think this PR should be dropped.

I spotted this same potential issue a week before I read this thread. Since we both noticed the same thing, I thought it might be worth a shot, and manually patched your code change into my local build.

I'm in the process of tracking down an mt76 kernel panic and to the point where I'm grasping for straws poring through the source code, making manual modifications on a weekly basis, hoping that something will work.

Brain2000 avatar Feb 25 '23 02:02 Brain2000

@Brain2000
fine. if you get the crash log?

ptpt52 avatar Feb 25 '23 03:02 ptpt52

@ptpt52 Yes, that's on my todo list. I'm hooking up the serial port this week, because the ramoops doesn't work because the Belkin RT3200 crashes in such a way that it must be fully power cycled to get it back online, wiping the ram...

I could try the mtdoops but I don't know how to split the partition properly so it can safely save to one of the partitions, and I can't find a lot of documentation surrounding that subject.

Brain2000 avatar Feb 25 '23 04:02 Brain2000