
Warning: tensorflow.contrib.learn.python.learn.preprocessing.text) is deprecated and will be removed in a future version. Instructions for updating: Please use tensorflow/transform or tf.data.

Open nidhikamath91 opened this issue 6 years ago • 10 comments

Hello,

I am using TensorFlow on Linux, and while using tensorflow.contrib.learn.python.learn.preprocessing I get the warnings below:

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version. Instructions for updating: Use the retry module or similar alternatives.

WARNING:tensorflow:From /tmp/anyReader-376H566fJpAUSEt/anyReader-376qtSRQxT2gOiq.tmp:67: VocabularyProcessor.__init__ (from tensorflow.contrib.learn.python.learn.preprocessing.text) is deprecated and will be removed in a future version. Instructions for updating: Please use tensorflow/transform or tf.data.

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py:154: CategoricalVocabulary.__init__ (from tensorflow.contrib.learn.python.learn.preprocessing.categorical_vocabulary) is deprecated and will be removed in a future version. Instructions for updating: Please use tensorflow/transform or tf.data.

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py:170: tokenizer (from tensorflow.contrib.learn.python.learn.preprocessing.text) is deprecated and will be removed in a future version. Instructions for updating: Please use tensorflow/transform or tf.data.

How do I eliminate them?

nidhikamath91 avatar May 30 '18 09:05 nidhikamath91
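If the goal is just to hide these messages rather than migrate, the deprecation warnings go through TensorFlow's Python-side logger (at least in TF 1.x), so one common workaround is to raise that logger's level. A minimal sketch:

```python
import logging

# Raise the "tensorflow" logger above WARNING so deprecation
# messages are hidden. This only silences the output -- the
# underlying contrib APIs are still deprecated.
logging.getLogger("tensorflow").setLevel(logging.ERROR)
```

Note that this hides all TensorFlow warnings, so it can also mask genuinely useful ones.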

The VocabularyProcessor class is deprecated as of (I believe) TensorFlow v1.8. The reason is that they want to encourage you to use the Datasets API. I used this code as a starting point: https://github.com/LightTag/BibSample/blob/master/preppy.py

Hope this helps!

vsocrates avatar Jun 22 '18 00:06 vsocrates
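For readers who just want to see what fit_transform was doing, here is a rough pure-Python stand-in (a sketch with a hypothetical helper name, not the preppy.py code): it assigns each whitespace token an integer id and right-pads every document to a fixed length, which is essentially the contract any Datasets-API replacement has to honor.

```python
# Minimal stand-in for tf.contrib.learn's VocabularyProcessor.fit_transform.
# Hypothetical helper for illustration only.

def fit_transform(texts, max_document_length, pad_id=0):
    """Map each whitespace token to an integer id, pad/truncate to a fixed length."""
    vocab = {}  # token -> id; id 0 is reserved for padding
    sequences = []
    for text in texts:
        ids = []
        for token in text.split()[:max_document_length]:
            if token not in vocab:
                vocab[token] = len(vocab) + 1  # ids start at 1
            ids.append(vocab[token])
        ids += [pad_id] * (max_document_length - len(ids))  # right-pad
        sequences.append(ids)
    return sequences, vocab

seqs, vocab = fit_transform(["the cat sat", "the dog"], max_document_length=4)
# seqs -> [[1, 2, 3, 0], [1, 4, 0, 0]]
```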

Did you solve it? I have the same problem, and the explanation above still seems confusing to me.

LilyDreamZhao avatar Sep 12 '18 08:09 LilyDreamZhao

I used the exact code in preprocessing/text.py and it worked. I did not understand the explanation in the new workaround.

nidhikamath91 avatar Sep 12 '18 08:09 nidhikamath91

You can use this:

tokenizer = tf.keras.preprocessing.text.Tokenizer(oov_token="<UNK>")
tokenizer.fit_on_texts(x_text)
x = tokenizer.texts_to_sequences(x_text)

x = tf.keras.preprocessing.sequence.pad_sequences(x, maxlen=max_document_length, padding='post', truncating='post')

ShaneTian avatar Apr 13 '19 08:04 ShaneTian

You can use this:

tokenizer = tf.keras.preprocessing.text.Tokenizer(oov_token="<UNK>")
tokenizer.fit_on_texts(x_text)
x = tokenizer.texts_to_sequences(x_text)

x = tf.keraspreprocessing.sequence.pad_sequences(x, maxlen=max_document_length, padding='post', truncating='post')

Hi, please correct that to tf.keras.preprocessing. Also, can you help me transform the code below to tf.keras, please:

vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_sequence_length)
x_data = np.array(list(vocab_processor.fit_transform(data)))
vocab_size = len(vocab_processor.vocabulary_)
print(vocab_size)

bhuvanshukla avatar Aug 07 '19 07:08 bhuvanshukla

@bhuvanshukla were you able to resolve your issue? I am also facing the same issue.

KoustubhPhalak avatar Feb 19 '20 09:02 KoustubhPhalak

@bhuvanshukla were you able to resolve this issue ? I am also facing the same issue

anantvir avatar Apr 19 '20 21:04 anantvir

You can use this:

tokenizer = tf.keras.preprocessing.text.Tokenizer(oov_token="<UNK>")
tokenizer.fit_on_texts(x_text)
x = tokenizer.texts_to_sequences(x_text)

x = tf.keraspreprocessing.sequence.pad_sequences(x, maxlen=max_document_length, padding='post', truncating='post')

Hi, please correct that to tf.keras.preprocessing. Also, can you help me transform the code below to tf.keras, please:

vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_sequence_length)
x_data = np.array(list(vocab_processor.fit_transform(data)))
vocab_size=len(vocab_processor.vocabulary_)
print(vocab_size)

Please correct me if I am wrong. Here is what I learned from lukas' example:

from tensorflow.keras.preprocessing import text, sequence

tokenizer = text.Tokenizer(num_words=VOCAB_SIZE)
tokenizer.fit_on_texts(x_train)
x_train = tokenizer.texts_to_sequences(x_train)
x_train = sequence.pad_sequences(x_train, maxlen=MAX_SEQUENCE_LENGTH)
x_test = tokenizer.texts_to_sequences(x_test)
x_test = sequence.pad_sequences(x_test, maxlen=MAX_SEQUENCE_LENGTH)

In this way, the tokenizer keeps only the top VOCAB_SIZE words (via num_words) and pads every sequence to MAX_SEQUENCE_LENGTH, so you don't need VocabularyProcessor's vocab_size.

franklinqin0 avatar Apr 22 '20 09:04 franklinqin0
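On the vocab_size point: Keras' Tokenizer keeps a word_index ranked by frequency, and num_words caps how many of those ids texts_to_sequences will emit. A minimal pure-Python mimic of that assumed behavior (hypothetical helpers, not the Keras implementation):

```python
from collections import Counter

# Mimic of what tf.keras' Tokenizer is assumed to do with num_words:
# rank words by frequency and keep only ids below num_words.

def build_word_index(texts):
    counts = Counter(w for t in texts for w in t.split())
    # Most frequent word gets id 1; id 0 is reserved for padding.
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common())}

def texts_to_sequences(texts, word_index, num_words):
    # Words ranked at or above num_words are dropped; assumes every
    # word was seen when the index was built.
    return [[word_index[w] for w in t.split() if word_index[w] < num_words]
            for t in texts]

word_index = build_word_index(["the cat sat on the mat", "the dog"])
vocab_size = len(word_index) + 1  # +1 for the padding id, as in Keras examples
```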

Thanks Franklin Qin for the suggestion, will try this one.

KoustubhPhalak avatar Apr 22 '20 10:04 KoustubhPhalak

The VocabularyProcessor class is deprecated as of (I believe) TensorFlow v1.8. The reason is that they want to encourage you to use the Datasets API. I used this code as a starting point: https://github.com/LightTag/BibSample/blob/master/preppy.py

Hope this helps!

Your script is quite explicit. It would be good if the following call could be converted to an equivalent in TensorFlow 2.x: VocabularyProcessor(max_sequence_length, min_frequency=min_word_frequency)

mikechen66 avatar Oct 20 '20 11:10 mikechen66
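On the min_frequency question: there is no direct equivalent argument on tf.keras' Tokenizer, but the same effect can be approximated by counting tokens yourself and dropping rare ones before building the index. A hedged sketch (whitespace tokenization assumed; VocabularyProcessor's exact threshold semantics may differ):

```python
from collections import Counter

def build_vocab(texts, min_frequency=0):
    # Count whitespace tokens across the corpus.
    counts = Counter(tok for t in texts for tok in t.split())
    # Id 0 is reserved for padding, id 1 for unknown/rare tokens.
    vocab = {"<UNK>": 1}
    for tok, n in counts.most_common():
        if n >= min_frequency:
            vocab[tok] = len(vocab) + 1
    return vocab

def transform(text, vocab, max_sequence_length):
    # Map tokens to ids, falling back to <UNK>, then right-pad with 0.
    ids = [vocab.get(tok, vocab["<UNK>"]) for tok in text.split()]
    ids = ids[:max_sequence_length]
    return ids + [0] * (max_sequence_length - len(ids))
```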