wepon comments

Results: 21 comments by wepon

[April 2015] Revamp Benchmarks, move to Titan-X (Digits box)

Good job! Thanks

Shouldn't it be "code is the best documentation"?

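Thanks for the correction. If the code is hard to follow, it may be because you are not familiar with Eigen and ctypes. For both, reading the documentation to pick up the basic usage is enough; look up the rest when you need it.

- https://docs.python.org/2/library/ctypes.html
- https://zhuanlan.zhihu.com/p/20152309
- https://www.cnblogs.com/houkai/p/6347408.html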

The implementation on the [tgboost-python](https://github.com/wepe/tgboost/tree/tgboost-python) branch does not support categorical features: it treats every input as a numeric feature, so users have to preprocess categorical features themselves. The implementation on the [master branch](https://github.com/wepe/tgboost) does support categorical features:

> Handle categorical feature, TGBoost order the categorical feature by their statistic (Gradient_sum / Hessian_sum) on each tree node, then conduct split finding as numeric feature.
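The ordering step quoted above can be sketched roughly as follows. This is a minimal illustration and not TGBoost's actual code: the function name `order_categories`, the toy gradients, and the assumption that every category has a positive Hessian sum are all hypothetical.

```python
from collections import defaultdict

def order_categories(categories, grads, hessians):
    """Rank category values by sum(gradient) / sum(hessian) over the samples on a node."""
    g_sum = defaultdict(float)
    h_sum = defaultdict(float)
    for c, g, h in zip(categories, grads, hessians):
        g_sum[c] += g
        h_sum[c] += h
    # Sort categories by their statistic; the resulting rank can then be
    # treated like a numeric feature value during split finding.
    # Assumes h_sum[c] > 0 for every category.
    ordered = sorted(g_sum, key=lambda c: g_sum[c] / h_sum[c])
    return {c: rank for rank, c in enumerate(ordered)}

# Toy data, made up for illustration.
cats = ["a", "b", "a", "c", "b"]
grads = [0.5, -1.0, 0.3, 2.0, -0.5]
hess = [1.0, 1.0, 1.0, 1.0, 1.0]

mapping = order_categories(cats, grads, hess)
encoded = [mapping[c] for c in cats]
print(mapping, encoded)
```

Because the statistic depends on the samples that reach a node, the ordering would have to be recomputed per node, as the quoted description says.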

gbm.py contains two min_child_weight entries

Thanks for pointing that out; I wrote one line too many. min_sample_split is the minimum number of samples required for a split.

Error when running cnn.py

This code uses Theano as its backend, while the latest version of Keras defaults to the TensorFlow backend. Check your `.keras/keras.json` file; see https://github.com/wepe/MachineLearning/issues/13#issuecomment-252790897

Error when running cnn.py

Look for `.keras/keras.json` on the C: drive. The file contents must be:

```
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
```
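To check an existing file against these settings, something like the following may help. It is only a sketch: the helper name `check_keras_config` is hypothetical, and the default location `~/.keras/keras.json` (the user's home directory) is an assumption about where the file lives on your machine.

```python
import json
import os

def check_keras_config(path, expected_backend="theano", expected_ordering="th"):
    """Return a list of (key, found, expected) mismatches in a keras.json file."""
    with open(path) as f:
        cfg = json.load(f)
    expected = {"backend": expected_backend, "image_dim_ordering": expected_ordering}
    return [(k, cfg.get(k), v) for k, v in expected.items() if cfg.get(k) != v]

# Assumed default location: the .keras folder in the user's home directory.
default_path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
if os.path.exists(default_path):
    for key, found, want in check_keras_config(default_path):
        print(f"{key}: found {found!r}, expected {want!r}")
```

An empty result means both the backend and the image dimension ordering already match the values above.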

Hello, I don't quite understand the rank features you mention in your solution

The comparison is made within the same feature dimension.
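One common reading of a rank feature, given here only as an assumption about what the solution did, is to replace each value with its position among all values of the same feature column. The function name `rank_feature` and the sample values are hypothetical.

```python
def rank_feature(values):
    """Replace each value with its rank (1 = smallest) within one feature column."""
    # Indices of the samples, sorted by their value in this column.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

column = [3000, 12000, 500, 8000]  # made-up values of a single feature
print(rank_feature(column))
```

Ties are broken by original sample order here, since Python's sort is stable; other tie-handling rules are equally plausible.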

Hello, I don't quite understand the rank features you mention in your solution

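We did not consider this, but the problem you describe does exist. In engineering practice people probably would not do it this way; discretization is the more common choice.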
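Discretization, mentioned above as the usual alternative, could look like the sketch below. The cut points in `edges` are hypothetical; in practice they would be chosen from the data, for example as quantiles.

```python
from bisect import bisect_right

def discretize(values, edges):
    """Map each value to the index of the bin it falls in, given sorted bin edges."""
    # A value equal to an edge is placed in the bin above that edge.
    return [bisect_right(edges, v) for v in values]

edges = [1000, 5000, 10000]  # hypothetical cut points
print(discretize([3000, 12000, 500, 8000], edges))
```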

Hello, I'd like to ask about some of the feature engineering techniques in the case study

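1, 2. Features can be redundant with one another. You could train on several groups of features together and then filter them once more. I don't quite remember how we did it at the time.
3. Features with poor discriminative power may never be selected during training. After training, the model can output a feature score; any feature missing from it was never selected during the whole boosting process.
4. How many features to keep depends on the data at hand and can be determined through cross-validation.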
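Point 3 can be checked with a few lines of code. This is a sketch under assumptions: the feature names and scores are made up, and `feature_score` stands in for whatever score mapping the trained model outputs (xgboost's `Booster.get_fscore()`, for instance, returns a dict of this shape).

```python
def unused_features(all_features, feature_score):
    """Return the features that never appear in the model's feature score output."""
    return [f for f in all_features if f not in feature_score]

all_features = ["f0", "f1", "f2", "f3"]   # hypothetical feature names
feature_score = {"f0": 12, "f2": 3}       # hypothetical scores from a trained model
print(unused_features(all_features, feature_score))
```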

ImportError: No module named keras.preprocessing.image

https://keras.io/preprocessing/image/ has always been there. First check whether Keras is installed correctly.
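A quick way to run that check from Python is sketched below. The helper name `can_import` is hypothetical; it only reports whether a module can be imported in the current environment and does not diagnose why an import fails.

```python
import importlib

def can_import(module_name):
    """Return True if the module can be imported in the current environment."""
    try:
        importlib.import_module(module_name)
        return True
    except Exception:
        # ImportError, or any error raised while the package initializes.
        return False

for name in ("keras", "keras.preprocessing.image"):
    print(name, "OK" if can_import(name) else "missing")
```

If `keras` itself is reported as missing, the interpreter running the script is likely not the one Keras was installed into.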