cae
Error when running train_classifier.py
I ran train_classifier.py successfully on my PC with the following command:
PS F:\PycharmProjects\cae> python .\train_classifier.py -b 128 -o "./out/" -l "(64)5c-2p-(64)3c-2p-(64)3c-2p" -fc 10 -ds "mnist"
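For reference, here is my reading of the -l layer string: a hypothetical parser, purely to illustrate the format as I understand it — "(N)Kc" being a conv layer with N filters and a K×K kernel, and "Kp" a K×K pooling layer. The repository's actual parsing may differ.

```python
import re

def parse_layers(spec):
    """Parse a layer string like "(64)5c-2p-(64)3c-2p" into layer tuples.

    Assumed grammar (not confirmed against the repo):
      (N)Kc -> ("conv", N filters, KxK kernel)
      Kp    -> ("pool", KxK pooling window)
    """
    layers = []
    for token in spec.split("-"):
        conv = re.fullmatch(r"\((\d+)\)(\d+)c", token)
        if conv:
            layers.append(("conv", int(conv.group(1)), int(conv.group(2))))
            continue
        pool = re.fullmatch(r"(\d+)p", token)
        if pool:
            layers.append(("pool", int(pool.group(1))))
            continue
        raise ValueError(f"unrecognized layer token: {token!r}")
    return layers
```

For example, parse_layers("(64)5c-2p-(64)3c-2p-(64)3c-2p") would yield three conv layers of 64 filters interleaved with 2×2 pooling.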
However, it crashed on the first training step:
Could you have a look at it? Thanks in advance!
I'm taking a look at it; it's probably due to some updates.
Hi, I think I've messed some things up in the master branch. I believe I've fixed them; if the problem continues, please check out the old branch. In addition, I see that the README is not clear enough, so I have added some more content — please read it. Since I've used this code for my own research, it's not exactly the same as the paper.
From README:
train_classifier.py trains a linear SVM and saves the embeddings of the previously trained autoencoder. To train the autoencoder, you should use train_autoencoder.py. The output directory in train_classifier.py is the directory where your autoencoder is saved (the naming could have been better). Therefore, this model does not include the softmax layer in the network as discussed in the reference, so the -fc parameter given while training is not connected to a classifier; it is a bottleneck between the encoder and the decoder.
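To illustrate, the classification stage described above amounts to roughly the following sketch. It uses scikit-learn's LinearSVC; the function name and the idea that embeddings arrive as plain feature arrays are assumptions for illustration, not the repository's actual code.

```python
from sklearn.svm import LinearSVC

def classify_embeddings(emb_train, y_train, emb_test, y_test):
    """Fit a linear SVM on autoencoder embeddings and return test accuracy.

    emb_train / emb_test are assumed to be 2-D arrays of encoder outputs
    (one embedding vector per image), produced by the trained autoencoder.
    """
    svm = LinearSVC()
    svm.fit(emb_train, y_train)
    return svm.score(emb_test, y_test)
```

The point is that the SVM never sees raw pixels: it only sees the bottleneck embeddings, which is why the -fc size matters for the autoencoder but is not itself a classifier layer.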
When I execute python train_autoencoder.py -e 100 -i 10 -b 64 -l "(64)5c-2p-(64)3c-2p-(64)3c-2p" -lr 0.001 -tb 0 -o out -fc 512 -s 1000 -ds mnist
(the parameters here may not be correct :smile: ) I get:
Using TensorFlow backend.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
2018-02-27 18:00:08.661598: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-27 18:00:08.778751: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-02-27 18:00:08.779022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.46GiB
2018-02-27 18:00:08.779039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
Forming encoder
Forming decoder
Forming L2 optimizer with learning rate 0.001
Preprocessing
Started training.
Train steps: 3437
Output of nvidia-smi (so we can see that it's working):
| NVIDIA-SMI 384.111 Driver Version: 384.111 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 Off | 00000000:01:00.0 On | N/A |
| 0% 47C P2 83W / 230W | 7902MiB / 8112MiB | 56% Default |
+-------------------------------+----------------------+----------------------+
I hope this was helpful and suits you well. Have a nice day!
Thanks for your reply! I will check it!