raise RuntimeError('each element in list of batch should be of equal size') RuntimeError: each element in list of batch should be of equal size

Open alice-cool opened this issue 4 years ago • 6 comments

alice-cool avatar Mar 18 '21 13:03 alice-cool

I think the irregular (ragged) list A built in get_item() in dataset.py may be what causes the error.

alice-cool avatar Mar 18 '21 14:03 alice-cool

in dataset.py

A = [[] for k in range(self.n_node)] 
for triple in data[i]["graph"]:
        A[triple[0]].append((triple[1], triple[2]))

It is confusing. The first line says the number of elements in A equals self.n_node (the maximum number of nodes in one graph), but the last two lines say that the len of A will be modified by the id of node, such as triple[0], i.e. the maximum node id.

alice-cool avatar Mar 19 '21 08:03 alice-cool

    @staticmethod
    def find_max_syz(data, num):
        max_syz = 0
        for i in range(len(data)):
            listnum = [0 for k in range(num)]
            for j in range(len(data[i]['graph'])):
                listnum[data[i]['graph'][j][0]] = listnum[data[i]['graph'][j][0]]+1
            if max_syz < max(listnum):
                max_syz = max(listnum)

        return max_syz

    @staticmethod
    def find_max_node_id(data):
        max_num_id = 0
        for i in range(len(data)):
            for triple in data[i]["graph"]:
                if triple[0] > max_num_id:
                    max_num_id = triple[0]
                if triple[2] > max_num_id:
                    max_num_id = triple[2]
        return max_num_id

self.n_node_types = self.find_max_node_id(data)

            A = [[] for k in range(self.n_node_types)]

            for triple in data[i]["graph"]:
                A[triple[0]].append((triple[1], triple[2]))

            print("syz:", self.syz_num)
            # Pad every row of A to self.syz_num entries. A separate loop
            # variable is used so the sample index i is not overwritten.
            for node in range(len(A)):
                for k in range(self.syz_num - len(A[node])):
                    A[node].append((0, 0))

            A_list.append(A)
            data_idx.append(i)

alice-cool avatar Mar 19 '21 09:03 alice-cool

in dataset.py

A = [[] for k in range(self.n_node)] 
for triple in data[i]["graph"]:
        A[triple[0]].append((triple[1], triple[2]))

It is confusing. The first line says the number of elements in A equals self.n_node (the maximum number of nodes in one graph), but the last two lines say that the len of A will be modified by the id of node, such as triple[0], i.e. the maximum node id.

Hi! Sorry about the trouble. I have not maintained this repo for about two years, so I don't remember debugging-level details like this anymore. I would suggest trying some of the toy data under the data/ directory to see how it works. As for the code you pasted here, I don't think the claim "last two lines say that the len of A will be modified by the id of node" holds. The last two lines only update the contents of A, without changing its size (i.e., len(A)). Hope it helps!
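
A minimal check of that point on toy data (the triples and n_node below are made up, and the triple layout (source, edge_type, target) is an assumption):

```python
n_node = 4
graph = [(0, 1, 2), (0, 2, 3), (3, 1, 1)]

A = [[] for _ in range(n_node)]
for src, edge_type, dst in graph:
    A[src].append((edge_type, dst))  # grows the inner list A[src] only

outer_len = len(A)                    # still n_node
inner_lens = [len(row) for row in A]  # differs from node to node
print(outer_len, inner_lens)          # 4 [2, 0, 0, 1]
```

The outer length never changes, but the inner lists end up with different lengths, which is one plausible source of an "equal size" complaint when samples are batched.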

entslscheia avatar Mar 20 '21 03:03 entslscheia

Thanks for your help, I will try.

alice-cool avatar Mar 20 '21 03:03 alice-cool

I ran my modified code and it threw a memory error, so my approach produces a big, sparse A list. I said len(A) would be modified for this reason: suppose every graph has at most 3 edges. With your code, A is initialized with length 3. If the set of graph samples contains 100 different nodes, node ids go up to 100. In the loop, A is indexed as A[triple[0]], so triple[0] can be 100, and len(A) would be replaced by 100. That is just my opinion. Thanks for your help.

for triple in data[i]["graph"]:
        A[triple[0]].append((triple[1], triple[2]))
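
A quick way to test this scenario in plain Python, using a toy list rather than the repo's data: indexing past the end of a list raises IndexError instead of extending the list, so len(A) stays at its initial value.

```python
A = [[] for _ in range(3)]  # initialized with length 3, as in the example above

try:
    A[100].append((1, 2))   # index 100 is out of range for a list of length 3
    grew = True
except IndexError:
    grew = False

print(grew, len(A))         # False 3
```

So if node ids can exceed the initial length, the loop fails outright rather than silently producing a longer A.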

alice-cool avatar Mar 22 '21 01:03 alice-cool