dma_ip_drivers icon indicating copy to clipboard operation
dma_ip_drivers copied to clipboard

BUG: libQDMA: libqdma_init() fails with DEBUG_SPINLOCK enabled

Open mclayton1000 opened this issue 4 years ago • 10 comments

Hello everyone, hi @karenx-xilinx , hi @houlz0507 , hi @sonals , hi @akum-xilinx , hi @pranjalv-xilinx, hi @sujathabanoth-xlnx ,

on Linux with DEBUG_SPINLOCK option enabled i get:

[ 250.590576] libqdma_init: dma req. opaque data size too big 136 > 128.

This happens in libqdma_init() and is due to the fact that the structure qdma_sgt_req_cb has a field of type wait_queue_head_t which is a structure that contains a field of type spinlock_t. spinlock_t size depends on DEBUG_SPINLOCK configuration option.

A possible fix is to turn the field wq of structure qdma_sgt_req_cb into a pointer to avoid this issue.

Best regards.

mclayton1000 avatar Nov 20 '20 09:11 mclayton1000

Do you have a patch for this? Curious to be able to try it out.

hmaarrfk avatar Dec 01 '20 05:12 hmaarrfk

Hi @hmaarrfk ,

thanks for the reply. The basic patch is quite simple.

In libqdma/qdma_descq.h just turn wq field in struct qdma_sgt_req_cb into a pointer. Then fix the accesses to wq in: libqdma/qdma_st_c2h.c libqdma/libqdma_export.c libqdma/qdma_descq.c

Finally, in qdma_request_submit(), dynamically alloc cb:

2316 //cb = qdma_req_cb_get(req);
2317 cb = kmalloc(sizeof(*cb), GFP_KERNEL);
2318 if (cb == NULL)
2319 return -ENOMEM;

Obviously cb must then be freed.

Best regards.

mclayton1000 avatar Dec 02 '20 10:12 mclayton1000

So is it that the implementation of linux/wait changes based on kernel options:

#include <linux/wait.h>

#define qdma_wait_queue                 wait_queue_head_t
#define qdma_waitq_init                 init_waitqueue_head
#define qdma_waitq_wakeup               wake_up_interruptible
#define qdma_waitq_wait_event           wait_event_interruptible
#define qdma_waitq_wait_event_timeout   wait_event_interruptible_timeout

hmaarrfk avatar Dec 02 '20 14:12 hmaarrfk

Hi @hmaarrfk ,

from Linux kernel source code.

From file include/linux/wait.h, the definition of wait_queue_head_t is:

struct wait_queue_head {
        spinlock_t              lock;
        struct list_head        head;
};
typedef struct wait_queue_head wait_queue_head_t;

From file include/linux/spinlock_types.h, the definition of spinlock_t is:

typedef struct spinlock {
        union {
                struct raw_spinlock rlock;

#ifdef CONFIG_DEBUG_LOCK_ALLOC
#define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))
                struct {
                        u8 __padding[LOCK_PADSIZE];
                        struct lockdep_map dep_map;
                };
#endif
        };
} spinlock_t;

The spinlock_t structure dimension depends on kernel configuration and so every structure which includes it.

mclayton1000 avatar Dec 02 '20 14:12 mclayton1000

I see, so you are proposing a startegy that would allow you to be blind to the implementation of spinlock so as to make it easier to debug.

hmaarrfk avatar Dec 02 '20 16:12 hmaarrfk

Honestly, I don't think that xilinx is too active on these forums. Sometimes, they will review submitted patches as PRs. You can try to do that since you seem to have it already figured out.

I've tagged this as a potential way for us to debug things. But I don't have time to look into this more deeply myself.

hmaarrfk avatar Dec 02 '20 16:12 hmaarrfk

Hi @hmaarrfk ,

I think that if the dimension of a structure must be of a certain size as reported in the following message:

[ 250.590576] libqdma_init: dma req. opaque data size too big 136 > 128.

the dimension of the structure should be independent from the kernel configuration.

Turning that field into a pointer is a possible solution.

mclayton1000 avatar Dec 02 '20 16:12 mclayton1000

Honestly, I don't think that xilinx is too active on these forums. Sometimes, they will review submitted patches as PRs. You can try to do that since you seem to have it already figured out.

I've tagged this as a potential way for us to debug things. But I don't have time to look into this more deeply myself.

Thank you for your reply and support. I already signal this problem on XRT repository. Let's see if anyone from Xilinx respond or not.

mclayton1000 avatar Dec 02 '20 16:12 mclayton1000

xref: https://github.com/Xilinx/XRT/issues/4487

hmaarrfk avatar Dec 02 '20 16:12 hmaarrfk

Hello,

My name is Mark Harfouche. I am not affiliated with Xilinx in any way. Over the years of using QDMA, I've been wanted better community organization.

I've created a fork of dma_ip_drivers which I intend to maintain and work with the community at large to improve.

The fork can be found https://github.com/hmaarrfk/dma_ip_drivers

For now, I am stating the main goals of the repository in https://github.com/hmaarrfk/dma_ip_drivers/issues/2

If you are interested in working together, feel free to open an issue or PR to my fork.

Best,

Mark

hmaarrfk avatar Aug 22 '22 04:08 hmaarrfk