
Model architecture problem

Open mzh-lin opened this issue 3 years ago • 11 comments

Dear authors,

I have some questions about the differences between the code and the formulas in the predefined concept module. It seems that in the paper, market capitalization is used as the initial weights of the predefined concepts. I can understand these steps in the code:

        market_value_matrix = market_value.reshape(market_value.shape[0], 1).repeat(1, concept_matrix.shape[1])
        stock_to_concept = concept_matrix * market_value_matrix
        
        stock_to_concept_sum = torch.sum(stock_to_concept, 0).reshape(1, -1).repeat(stock_to_concept.shape[0], 1)
        stock_to_concept_sum = stock_to_concept_sum.mul(concept_matrix)

        stock_to_concept_sum = stock_to_concept_sum + (torch.ones(stock_to_concept.shape[0], stock_to_concept.shape[1]).to(device))
        stock_to_concept = stock_to_concept / stock_to_concept_sum
        hidden = torch.t(stock_to_concept).mm(x_hidden)
        
        hidden = hidden[hidden.sum(1)!=0]
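
(For readers skimming the thread: below is a minimal, self-contained sketch of what these lines compute, using made-up sizes; the variable names mirror the snippet above, but the dummy data and shapes are my own, not from the repository.)

        import torch

        N, K, D = 4, 3, 8                      # dummy sizes: stocks, concepts, hidden dim
        device = torch.device("cpu")
        market_value = torch.rand(N)           # market capitalization per stock
        concept_matrix = torch.randint(0, 2, (N, K)).float()  # 1 if stock i has concept j
        x_hidden = torch.rand(N, D)            # encoded stock representations

        # weight each stock's concept memberships by its market cap
        market_value_matrix = market_value.reshape(N, 1).repeat(1, K)
        stock_to_concept = concept_matrix * market_value_matrix
        # per-concept normalizer (the +1 matches the torch.ones term above)
        stock_to_concept_sum = torch.sum(stock_to_concept, 0).reshape(1, -1).repeat(N, 1)
        stock_to_concept_sum = stock_to_concept_sum.mul(concept_matrix) + torch.ones(N, K).to(device)
        stock_to_concept = stock_to_concept / stock_to_concept_sum
        # concept representation = market-cap-weighted sum of member stocks, shape (K, D)
        hidden = torch.t(stock_to_concept).mm(x_hidden)
        hidden = hidden[hidden.sum(1) != 0]    # drop concepts with no member stocks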

But I can't understand the following steps, which I can't find in the paper's formulas:

        stock_to_concept = x_hidden.mm(torch.t(hidden))
        # stock_to_concept = cal_cos_similarity(x_hidden, hidden)
        stock_to_concept = self.softmax_s2t(stock_to_concept)
        hidden = torch.t(stock_to_concept).mm(x_hidden)

Does it aim to modify the stock-to-concept weights? I can't find the answer in the paper. Could you give me some hints about this? I'd appreciate your response.

mzh-lin avatar Nov 28 '21 08:11 mzh-lin

I think this part of the code corresponds to Section 4.2.2; it aims to correct the predefined concepts' representations.
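
(To make that concrete: the correction step re-estimates the stock-to-concept weights from the learned representations instead of keeping the market-cap initialization. A toy, self-contained sketch; the softmax dimension and sizes are my assumptions, not copied from the repository:)

        import torch

        x_hidden = torch.rand(4, 8)             # 4 stocks, hidden dim 8
        hidden = torch.rand(3, 8)               # 3 concept representations from the first step

        scores = x_hidden.mm(torch.t(hidden))   # (4, 3): dot-product score per (stock, concept)
        weights = torch.softmax(scores, dim=0)  # normalize into weights (softmax_s2t in the code)
        hidden = torch.t(weights).mm(x_hidden)  # (3, 8): corrected concept representations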

BoruiXu avatar Dec 01 '21 06:12 BoruiXu

The reply from @a919480698 is correct.

Wentao-Xu avatar Dec 01 '21 12:12 Wentao-Xu

Dear Authors,

I have a question about this part of the code, too. Why is Line 2 commented out? When reproducing your results, do we need to uncomment this line?

        stock_to_concept = x_hidden.mm(torch.t(hidden))
        # stock_to_concept = cal_cos_similarity(x_hidden, hidden)
        stock_to_concept = self.softmax_s2t(stock_to_concept)
        hidden = torch.t(stock_to_concept).mm(x_hidden)

Michelia-L avatar Dec 05 '21 04:12 Michelia-L

> Dear Authors,
>
> I have a question about this part of the code, too. Why is Line 2 commented out? When reproducing your results, do we need to uncomment this line?
>
>         stock_to_concept = x_hidden.mm(torch.t(hidden))
>         # stock_to_concept = cal_cos_similarity(x_hidden, hidden)
>         stock_to_concept = self.softmax_s2t(stock_to_concept)
>         hidden = torch.t(stock_to_concept).mm(x_hidden)

I think maybe Line 2 and Line 3 implement different similarity methods. The author mentioned the Line 2 method (cosine) in the paper. I tried uncommenting Line 2 and commenting out Line 3, and I could still reproduce the results.

BoruiXu avatar Dec 05 '21 04:12 BoruiXu

> > Dear Authors, I have a question about this part of the code, too. Why is Line 2 commented out? When reproducing your results, do we need to uncomment this line?
> >
> >         stock_to_concept = x_hidden.mm(torch.t(hidden))
> >         # stock_to_concept = cal_cos_similarity(x_hidden, hidden)
> >         stock_to_concept = self.softmax_s2t(stock_to_concept)
> >         hidden = torch.t(stock_to_concept).mm(x_hidden)
>
> I think maybe Line 2 and Line 3 implement different similarity methods. The author mentioned the Line 2 method (cosine) in the paper. I tried uncommenting Line 2 and commenting out Line 3, and I could still reproduce the results.

Do you mean Line 2 and Line 1? I think Line 3 is just a normalization step.

Michelia-L avatar Dec 05 '21 04:12 Michelia-L

YES. Sorry, I actually meant Line 2 and Line 1.

BoruiXu avatar Dec 05 '21 04:12 BoruiXu

> > YES. Sorry, I actually meant Line 2 and Line 1.
>
> Thank you! I am not sure whether the author uses an attention mechanism instead of cosine similarity to describe the degree of connection between stocks and concepts. Does Line 1 correspond to the "attention mechanism"? Section 4.4 of the paper notes that "we apply the attention mechanism to learn the importance of each concept for a stock," but I cannot find it in either the code or the formulas of the paper.

I think the author uses Line 2 (cosine) to realize the attention mechanism; calculating cosine similarity is an implementation of an attention mechanism. But I am not sure.
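
(To make the analogy concrete, here is a minimal sketch. The body of cal_cos_similarity is my assumption of the usual row-wise normalized dot product, as its name suggests; it is not copied from the repository.)

        import torch

        def cal_cos_similarity(x, y, eps=1e-6):
            # pairwise cosine similarity between rows of x (N, D) and rows of y (K, D)
            x_norm = torch.norm(x, dim=1, keepdim=True)   # (N, 1)
            y_norm = torch.norm(y, dim=1, keepdim=True)   # (K, 1)
            return x.mm(torch.t(y)) / (x_norm.mm(torch.t(y_norm)) + eps)

        x_hidden = torch.rand(4, 8)                       # 4 stocks, hidden dim 8
        hidden = torch.rand(3, 8)                         # 3 concept representations
        scores = cal_cos_similarity(x_hidden, hidden)     # (4, 3) query-key similarities
        weights = torch.softmax(scores, dim=1)            # attention weights over concepts
        context = weights.mm(hidden)                      # (4, 8) attended concept info
        # Line 1 (x_hidden.mm(torch.t(hidden))) computes the same scores without
        # length normalization, so the two lines differ only in score scaling.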

BoruiXu avatar Dec 05 '21 08:12 BoruiXu

> > > YES. Sorry, I actually meant Line 2 and Line 1.
> >
> > Thank you! I am not sure whether the author uses an attention mechanism instead of cosine similarity to describe the degree of connection between stocks and concepts. Does Line 1 correspond to the "attention mechanism"? Section 4.4 of the paper notes that "we apply the attention mechanism to learn the importance of each concept for a stock," but I cannot find it in either the code or the formulas of the paper.
>
> I think the author uses Line 2 (cosine) to realize the attention mechanism; calculating cosine similarity is an implementation of an attention mechanism. But I am not sure.

Thank you anyway!

Michelia-L avatar Dec 05 '21 12:12 Michelia-L

> I think this part of the code corresponds to Section 4.2.2; it aims to correct the predefined concepts' representations.

I got it! Thanks very much! I am still a little confused about formula (7) in Section 4.2.2 and the steps that follow. I can understand that this part aims to implement formula (6) in Section 4.2.2:

        stock_to_concept = x_hidden.mm(torch.t(hidden))
        # stock_to_concept = cal_cos_similarity(x_hidden, hidden)
        stock_to_concept = self.softmax_s2t(stock_to_concept)
        hidden = torch.t(stock_to_concept).mm(x_hidden)

Formula (7) then adds a fully connected layer, but the following code seems to jump to Section 4.4; I cannot find the fully connected layer corresponding to formula (7):

        concept_to_stock = cal_cos_similarity(x_hidden, hidden)
        concept_to_stock = self.softmax_t2s(concept_to_stock)

        e_shared_info = concept_to_stock.mm(hidden)
        e_shared_info = self.fc_es(e_shared_info)

Do we need to implement formula (7) before constructing concept_to_stock? @Wentao-Xu Thanks!
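
(For readers following along: if formula (7) is indeed a fully connected projection of the corrected concept representations, one hypothetical placement would be the sketch below. The layer name fc_concept and the LeakyReLU activation are my guesses, not the authors' code.)

        # hypothetical sketch, not the repository's code:
        # apply formula (7)'s fully connected layer to the corrected
        # concept representations before computing concept_to_stock
        hidden = torch.t(stock_to_concept).mm(x_hidden)
        hidden = self.fc_concept(hidden)                  # fc_concept: guessed layer name
        hidden = torch.nn.functional.leaky_relu(hidden)   # activation is also a guess
        concept_to_stock = cal_cos_similarity(x_hidden, hidden)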

mzh-lin avatar Dec 06 '21 02:12 mzh-lin

Dear Authors, I have a question about the LeakyReLU activation function. It seems that the function is used three times in your paper, but in the code it is used only once. [screenshots attached]

sdumyh avatar Apr 27 '22 06:04 sdumyh

Hi, there may be some errors in the paper's equations. Please take the code as the correct standard.

Wentao-Xu avatar Apr 27 '22 08:04 Wentao-Xu