amazon-dsstne
Why does the main key appear in its own recommendation result?
We have a product recommendation app based on DSSTNE. We found some erroneous output in which the main sku_no appears in its own recommendation result, as in the line below.
1801760 4894204,0.926:4894201,0.926:4894202,0.926:4894203,0.925:4894205,0.925:4894200,0.925:1801760,0.777:4530481,0.574:4898690,0.549:4898693,0.548
We checked the input file: the main sku_no on the left does not appear in its own feature data on the right.
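A minimal sketch of how such lines could be detected automatically, assuming each output line is a key, then whitespace, then colon-separated `item,score` pairs as in the sample above. The function name `find_self_recommendations` is hypothetical and is not part of DSSTNE.

```python
def find_self_recommendations(lines):
    """Return the keys that show up among their own recommended items."""
    offenders = []
    for line in lines:
        # Split the key from the recommendation list on the first whitespace run.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip empty or malformed lines
        key, recs = parts
        # Each recommendation is "item,score"; keep only the item id.
        items = [pair.split(",")[0].strip() for pair in recs.split(":") if pair.strip()]
        if key in items:
            offenders.append(key)
    return offenders


# The line quoted in this issue, used as sample input.
sample = [
    "1801760 4894204,0.926:4894201,0.926:4894202,0.926:4894203,0.925:"
    "4894205,0.925:4894200,0.925:1801760,0.777:4530481,0.574:"
    "4898690,0.549:4898693,0.548"
]
print(find_self_recommendations(sample))
```

The same function can be run over the input file to confirm that no key lists itself there.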
The JSON config we used is below.
{
"Version": 0.7,
"Name": "AE",
"Kind": "FeedForward",
"SparsenessPenalty": {
"p": 0.5,
"beta": 2.0
},
"ShuffleIndices": false,
"Denoising": {
"p": 0.4
},
"ScaledMarginalCrossEntropy": {
"oneTarget": 1.0,
"zeroTarget": 0.0,
"oneScale": 30.0,
"zeroScale": 1.0
},
"Layers": [{
"Name": "Input0",
"Kind": "Input",
"N": "auto",
"DataSet": "gl_input",
"Sparse": true
}, {
"Name": "Hidden",
"Kind": "Hidden",
"Type": "FullyConnected",
"Source": ["Input0"],
"N": 512,
"Activation": "Sigmoid",
"Sparse": true
}, {
"Name": "Output",
"Kind": "Output",
"Type": "FullyConnected",
"Source": ["Hidden"],
"DataSet": "gl_output",
"N": "auto",
"Activation": "Sigmoid",
"Sparse": true
}],
"ErrorFunction": "ScaledMarginalCrossEntropy"
}
How did you create the datasets here?
1801760 is the sku_no of a product. We use historical order lines to build the data set. If two sku_no values were bought in the same order, we treat them as related; the more often this happens, the stronger the relevance. For example:
Order A has 3 sku_no: SKU1, SKU2, SKU3
Order B has 2 sku_no: SKU2, SKU4
Order C has 3 sku_no: SKU1, SKU3, SKU5
We then build the data set as below.
SKU1 SKU2,1:SKU3,2:SKU5,1
SKU2 SKU1,1:SKU3,1:SKU4,1
SKU3 SKU1,2:SKU2,1:SKU5,1
SKU4 SKU2,1
SKU5 SKU1,1:SKU3,1
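The counting rule described above can be sketched roughly as follows. This is an illustration only, not the code we actually run: the function names `build_cooccurrence` and `format_lines` are hypothetical, and the tab between the key and its feature list is an assumption about the expected input layout.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(orders):
    """Count, for every SKU, how many orders it shares with each other SKU."""
    counts = defaultdict(Counter)
    for skus in orders:
        # De-duplicate within an order so a repeated SKU is counted once,
        # then visit every ordered pair (a, b) with a != b.
        for a, b in permutations(sorted(set(skus)), 2):
            counts[a][b] += 1
    return counts


def format_lines(counts, sep="\t"):
    """Render one 'key<sep>sku,count:sku,count' line per SKU (sep is assumed)."""
    lines = []
    for key in sorted(counts):
        feats = ":".join(f"{sku},{n}" for sku, n in sorted(counts[key].items()))
        lines.append(f"{key}{sep}{feats}")
    return lines


orders = [
    ["SKU1", "SKU2", "SKU3"],  # Order A
    ["SKU2", "SKU4"],          # Order B
    ["SKU1", "SKU3", "SKU5"],  # Order C
]
for line in format_lines(build_cooccurrence(orders)):
    print(line)
```

Because only pairs of distinct SKUs are counted, a key never appears in its own feature list under this rule.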