
Could the processors be categorized?

Open quantcn opened this issue 3 years ago • 3 comments

Hello, could the built-in processors listed below be categorized: which are shared processors, which are learning processors, and which are inference processors? Please also indicate which ones process features and which ones process labels.

DropnaProcessor: processor that drops N/A features.
DropnaLabel: processor that drops N/A labels.
TanhProcess: processor that uses tanh to process noise data. For features or labels?
ProcessInf: processor that handles infinity values, it will be replaced by the mean of the column. For features or labels?
Fillna: processor that handles N/A values, which will fill the N/A value by 0 or other given number. For features or labels?
MinMaxNorm: processor that applies min-max normalization. For features or labels?
ZscoreNorm: processor that applies z-score normalization. For features or labels?
RobustZScoreNorm: processor that applies robust z-score normalization. For features or labels?
CSZScoreNorm: processor that applies cross sectional z-score normalization. For features or labels?
CSRankNorm: processor that applies cross sectional rank normalization. For features or labels?
CSZFillna: processor that fills N/A values in a cross sectional way by the mean of the column. For features or labels?

quantcn avatar Sep 13 '22 09:09 quantcn

I don't think this is really necessary, as any of the processors can be applied to either type of column.

pop0121 avatar Sep 13 '22 10:09 pop0121

Do you mean that, for example, ZscoreNorm and CSZScoreNorm can each be used as either an inference processor or a learning processor?

quantcn avatar Sep 13 '22 12:09 quantcn

In fact, DropnaLabel and CSRankNorm are usually used in the example implementations.

I assume that, if you want to have some learning objectives other than Rank IC, you may also need other processors.
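As a rough illustration of the two points above, here is a minimal sketch of handler arguments in the style of the qlib benchmark workflow configs. The processor class names come from this thread; the particular combination, the `fields_group` values, and the `summarize` helper are assumptions made for illustration, not a recommended setup.

```python
# Sketch of handler kwargs, written as a plain Python dict.
# Assumption: processors that take a `fields_group` kwarg act on that
# column group ("feature" or "label"), which is why the same class can
# serve either kind of column.
handler_kwargs = {
    # Applied when preparing data for inference (features only here).
    "infer_processors": [
        {"class": "ProcessInf", "kwargs": {}},
        {"class": "CSZScoreNorm", "kwargs": {"fields_group": "feature"}},
        {"class": "Fillna", "kwargs": {"fields_group": "feature"}},
    ],
    # Applied when preparing data for training (labels are available).
    "learn_processors": [
        {"class": "DropnaLabel"},
        {"class": "CSRankNorm", "kwargs": {"fields_group": "label"}},
    ],
}


def summarize(processors):
    """Hypothetical helper: list (class name, column group) pairs."""
    return [(p["class"], p.get("kwargs", {}).get("fields_group")) for p in processors]


for stage, processors in handler_kwargs.items():
    print(stage, summarize(processors))
```

Swapping `"label"` for `"feature"` in the `CSRankNorm` entry would apply the same rank normalization to the feature columns instead.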

pop0121 avatar Sep 13 '22 13:09 pop0121

I think this is necessary. I can google what CSRankNorm means, but I do not know when to use it.

lerit avatar Sep 27 '22 14:09 lerit

Some related issues: https://github.com/microsoft/qlib/issues/789 https://github.com/microsoft/qlib/pull/879 https://github.com/microsoft/qlib/issues/602

lerit avatar Sep 27 '22 14:09 lerit

In most benchmark examples, CSRankNorm is used to create the RankIC learning label.

But in other cases, you may also use this processor to make the feature column more robust against outliers and non-linearity.
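To make the idea concrete, here is a minimal pandas sketch of cross-sectional rank normalization. It is not qlib's implementation: the function name `cs_rank_norm`, the toy data, and the choice to centre percentile ranks by subtracting 0.5 are all assumptions, and qlib's `CSRankNorm` may scale the result differently.

```python
import pandas as pd


def cs_rank_norm(df, cols):
    """Rank `cols` within each datetime and centre the percentile ranks.

    Assumes `df` has a MultiIndex with a level named "datetime".
    """
    out = df.copy()
    # Percentile rank within each day, in (0, 1].
    ranked = df.groupby(level="datetime")[cols].rank(pct=True)
    out[cols] = ranked - 0.5
    return out


# Toy data: two days, three instruments, one extreme value on the first day.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2022-09-13", "2022-09-14"]), ["A", "B", "C"]],
    names=["datetime", "instrument"],
)
df = pd.DataFrame({"label": [0.01, 0.02, 5.0, -0.03, 0.0, 0.01]}, index=index)

normed = cs_rank_norm(df, ["label"])
print(normed)
```

Because only the ordering within each day matters, the extreme value 5.0 ends up with the same normalized value as the much smaller top value on the second day. This is the sense in which the transform is robust to outliers, whether it is applied to a label or to a feature column.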

pop0121 avatar Sep 28 '22 01:09 pop0121

This issue is stale because it has been open for three months with no activity. Remove the stale label or comment on the issue otherwise this will be closed in 5 days

github-actions[bot] avatar Dec 27 '22 03:12 github-actions[bot]