Example value for preprocessing_fn param for Transform

Open sidoki opened this issue 5 years ago • 8 comments

Hello folks,

Can anyone provide an example value for the preprocessing_fn param of Transform? From the documentation:

The path to python function that implements a 'preprocessing_fn'.

What kind of path? Is it like 'library_name.module_name.func_name'? For module_file, it's clear that we need to provide the file path of the Python file where the preprocessing_fn is implemented. What about preprocessing_fn? I can't find clear documentation on this one.

Thanks in advance!

sidoki avatar Oct 16 '20 10:10 sidoki

@rmothukuru this issue should be on the TFX repo because it's referring to TFX documentation (of a preprocessing_fn).

@sakinaljana please take a look at this guide and let us know if you have any further questions - https://www.tensorflow.org/tfx/tutorials/transform/simple?hl=en#transform_create_a_preprocessing_function

zoyahav avatar Oct 19 '20 08:10 zoyahav

Thanks @zoyahav for the response, but it doesn't seem to answer my question. My question is about what value to pass to the preprocessing_fn param of the Transform class.

For module_file it's clear that a file path goes there, like:

Transform(module_file="path/to/module/file/where/preprocessing_fn/located.py")

What about preprocessing_fn? What should I fill in? From the documentation it seems to be a string value, but I'm not sure what to put there.

Transform(preprocessing_fn=???)

Thanks!

sidoki avatar Oct 19 '20 10:10 sidoki

From https://www.tensorflow.org/tfx/api_docs/python/tfx/components/Transform#args:

The path to python function that implements a 'preprocessing_fn'. See 'module_file' for expected signature of the function. Exactly one of 'module_file' or 'preprocessing_fn' must be supplied.

It looks like the value should be a dotted Python module path to the preprocessing_fn, importable in the Python runtime. Example: https://github.com/tensorflow/tfx/blob/cc17b3713a055fdf170680af69eecf534501db8d/tfx/experimental/templates/taxi/pipeline/configs.py#L50

Another example in test: https://github.com/tensorflow/tfx/blob/cc17b3713a055fdf170680af69eecf534501db8d/tfx/components/transform/component_test.py#L98
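To make the "module path" idea concrete, here is a minimal sketch of how a dotted string such as 'package.module.func_name' can be turned into a callable. The helper name resolve_dotted_path and the path 'my_package.preprocessing.preprocessing_fn' are hypothetical, and TFX's own loader may differ in detail; the sketch uses a standard-library function as a stand-in so it runs without TFX installed.

```python
import importlib
import os


def resolve_dotted_path(path):
    """Resolve 'package.module.func_name' to the object it names.

    Hypothetical helper for illustration only; not part of TFX.
    """
    # Everything before the last dot is the module, the rest is the attribute.
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in example: resolve a standard-library function by its dotted path.
join_fn = resolve_dotted_path("os.path.join")
print(join_fn is os.path.join)

# With TFX, the same kind of string would be passed to the component
# (assumed usage; the module path below is hypothetical and must be
# importable wherever the pipeline runs):
#   Transform(..., preprocessing_fn="my_package.preprocessing.preprocessing_fn")
```

The practical difference from module_file is that the string is an import path, not a file path, so the package containing the function has to be importable in the environment that executes the Transform component.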

zoyahav avatar Oct 19 '20 11:10 zoyahav

Ah ok, thanks @zoyahav, let me try it.

sidoki avatar Oct 19 '20 15:10 sidoki

It works, @zoyahav. Maybe we should put this somewhere in the documentation, so that anyone who needs it later knows how to use it?

sidoki avatar Oct 22 '20 02:10 sidoki

@zhitaoli would it be possible to update the documentation?

zoyahav avatar Oct 22 '20 08:10 zoyahav

@sakinaljana,

Please refer to this example of passing a preprocessing_fn function to the Transform Beam API, and to the same example when working with InteractiveContext. Thank you!

singhniraj08 avatar Sep 26 '22 06:09 singhniraj08

Note that the example above is relevant only when using tf.transform directly, i.e. without a TFX Transform component. When using tf.transform within a TFX Transform component, the snippets in https://github.com/tensorflow/tfx/issues/2672#issuecomment-712059294 apply.

zoyahav avatar Sep 26 '22 10:09 zoyahav

Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments if you still have questions. Thank you!

singhniraj08 avatar Dec 09 '22 04:12 singhniraj08