When running, I get an error

Open gaojunhui68 opened this issue 7 years ago • 16 comments

When running, I get an error:

Traceback (most recent call last):
  File "example.py", line 9, in <module>
    model.train(ckpt_dir='ckpt')
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 763, in train
    gen_samples, self.train_samples, self.ord_dict, results)
  File "D:\mypy\ORGAN-master\organ\mol_metrics.py", line 185, in compute_results
    results[objective] = np.mean(reward(verified_samples, train_data))
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 743, in batch_reward
    for sample in samples]
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 743, in <listcomp>
    for sample in samples]
  File "D:\mypy\ORGAN-master\organ\mol_metrics.py", line 117, in decode
    ''.join([ord_dict[o] for o in ords]))
  File "D:\mypy\ORGAN-master\organ\mol_metrics.py", line 117, in <listcomp>
    ''.join([ord_dict[o] for o in ords]))
KeyError: 'O'

gaojunhui68 avatar Nov 29 '17 00:11 gaojunhui68

Traceback (most recent call last):
  File "example.py", line 18, in <module>
    model.train() # Proceeds with the training
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 763, in train
    gen_samples, self.train_samples, self.ord_dict, results)
  File "D:\mypy\ORGAN-master\organ\music_metrics.py", line 252, in compute_results
    results[key] = np.mean(reward(samples, train_samples))
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 743, in batch_reward
    for sample in samples]
  File "D:\mypy\ORGAN-master\organ\__init__.py", line 743, in <listcomp>
    for sample in samples]
  File "D:\mypy\ORGAN-master\organ\music_metrics.py", line 60, in decode
    return ' '.join(unpad([ord_dict[o] for o in ords]))
  File "D:\mypy\ORGAN-master\organ\music_metrics.py", line 60, in <listcomp>
    return ' '.join(unpad([ord_dict[o] for o in ords]))
KeyError: 'b'

gaojunhui68 avatar Nov 29 '17 05:11 gaojunhui68

Hi @gaojunhui68,

The two errors come from different sources: the first uses the molecular metrics, and the second uses the music metrics.

In both cases, it looks like you are using different training sets in the run and the checkpoint, so the engine is unable to decode (because the internal dictionary does not recognize the features). If you give me more information (i.e., the actual file that you run), I'll be able to help further.

Regards, Carlos

couteiral avatar Nov 29 '17 10:11 couteiral

Hi @couteiral,

Yes, the first is using the molecular metrics, and the second is using the music metrics.

For the first, the code of example.py is below:

import organ
from organ import ORGAN
model = ORGAN('test', 'mol_metrics', params={'PRETRAIN_DIS_EPOCHS': 1})
model.load_training_set('data/toy.csv')
model.set_training_program(['novelty'], [1])
model.load_metrics()
model.train(ckpt_dir='ckpt')

For the second, the code of example.py is below:

from organ import ORGAN

model = ORGAN('test', 'music_metrics') # Loads a ORGANIC with name 'test', using music metrics
model.load_training_set('data/music_small.txt') # Loads the training set
model.set_training_program(['tonality'], [50]) # Sets the training program as 50 epochs with the tonality metric
model.load_metrics() # Loads all the metrics
model.train() # Proceeds with the training

In both cases, the error occurs in train, after pretraining has finished.

Please help me. Thanks, Junhui Gao

gaojunhui68 avatar Nov 29 '17 13:11 gaojunhui68

Hi @gaojunhui68,

First, the music metrics seem to be buggy. I am afraid I didn't work on them myself, but I'll get in touch with someone involved and get back to you.

Regarding the molecular metrics, you get a KeyError, which is the error that a Python dictionary raises when a key not in the dictionary is requested. The following is happening: when you try to decode the embedding coordinates to SMILES strings, you are passing the wrong value to the dictionary, and the code crashes.

In particular, you are passing 'O' to the ord_dict, which is the dictionary containing the mapping from the embedding to the SMILES strings, so something is wrong in there. However, I just ran exactly the same code from the actual repo, and I could not find any problem like yours.
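
To illustrate the failure mode described above, here is a minimal sketch. It is not ORGAN's actual code: the toy `ord_dict` and this simplified `decode` are assumptions made purely for illustration. A dictionary keyed by integer indices raises a KeyError as soon as it is indexed with a character instead.

```python
# Hypothetical toy vocabulary; the real ord_dict is built by ORGAN
# from the training set and will look different.
ord_dict = {0: 'C', 1: 'O', 2: 'N', 3: '_'}


def decode(ords, ord_dict):
    """Map a sequence of integer indices back to a string."""
    return ''.join(ord_dict[o] for o in ords)


# Integer indices are valid keys, so this decodes without error.
print(decode([0, 0, 1], ord_dict))

# A string is iterated character by character, and characters are
# not keys of ord_dict, so the lookup fails with a KeyError.
try:
    decode('CCO', ord_dict)
except KeyError as err:
    print('KeyError:', err)
```

If the KeyError in the traceback names a SMILES character such as 'O', that suggests the value reaching decode is text rather than a sequence of indices.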

Could you share your pretraining files, so I can have a look at them? Also, are you sure that there is nothing wrong with your 'toy.csv' training set?

Cheers, Carlos

couteiral avatar Nov 29 '17 16:11 couteiral

Hi @couteiral,

I had the same error as @gaojunhui68. What I did was:

git clone https://github.com/gablg1/ORGAN.git
pip install -r requirements.txt
python example.py

The error message was

Traceback (most recent call last):
  File "example.py", line 8, in <module>
    model.train(ckpt_dir='ckpt')
  File "/home/yoshikawa/ORGAN/organ/__init__.py", line 763, in train
    gen_samples, self.train_samples, self.ord_dict, results)
  File "/home/yoshikawa/ORGAN/organ/mol_metrics.py", line 183, in compute_results
    results[objective] = np.mean(reward(verified_samples, train_data))
  File "/home/yoshikawa/ORGAN/organ/__init__.py", line 743, in batch_reward
    for sample in samples]
  File "/home/yoshikawa/ORGAN/organ/mol_metrics.py", line 115, in decode
    ''.join([ord_dict[o] for o in ords]))
KeyError: 'O'

I used pyenv. Neither anaconda3-5.0.0 nor anaconda2-5.0.0 worked.

n-yoshikawa avatar Dec 13 '17 07:12 n-yoshikawa

Hi @couteiral,

Any update on the previous posts? The error message I got is KeyError: 'N'

I was also running example.py. It happened at "model.train(ckpt_dir='ckpt')", after the pre-training had finished. It looks like it happened at the same spot as the KeyError: 'O'.

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/trial0/ORGAN/organ/__init__.py", line 763, in train
    gen_samples, self.train_samples, self.ord_dict, results)
  File "/home/trial0/ORGAN/organ/mol_metrics.py", line 183, in compute_results
    results[objective] = np.mean(reward(verified_samples, train_data))
  File "/home/trial0/ORGAN/organ/__init__.py", line 743, in batch_reward
    for sample in samples]
  File "/home/trial0/ORGAN/organ/__init__.py", line 743, in <listcomp>
    for sample in samples]
  File "/home/trial0/ORGAN/organ/mol_metrics.py", line 115, in decode
    ''.join([ord_dict[o] for o in ords]))
  File "/home/trial0/ORGAN/organ/mol_metrics.py", line 115, in <listcomp>
    ''.join([ord_dict[o] for o in ords]))
KeyError: 'N'

Is there any way to pinpoint which SMILES string caused the problem? Please let me know if you need any further details.

Thanks in advance! Toushi

toushi68 avatar Apr 09 '18 14:04 toushi68

It looks like this issue is related to the data set. In the toy set, I found more than 30 entries with empty SMILES, i.e. they have NumAtom and Name, but the smiles column is empty. From there I further refined the data set with RDKit. Across these trials I got different KeyErrors: 'C', '[', 'O'. This means something is still wrong with the data set, or a filter is needed before processing the data, as "ORGANIC" does.
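
A pre-filter of the kind suggested above could look roughly like the sketch below. It is an illustration only, not the filter that ORGANIC uses: the helper name `filter_smiles_rows`, the `smiles` column name, and the in-memory sample data are all assumptions. It uses only the standard library and does no chemical validation, which would need a toolkit such as RDKit.

```python
import csv
import io


def filter_smiles_rows(rows, column='smiles'):
    """Split rows into those with a non-blank SMILES entry and the rest."""
    kept, dropped = [], []
    for row in rows:
        value = (row.get(column) or '').strip()
        (kept if value else dropped).append(row)
    return kept, dropped


# Tiny in-memory stand-in for a training file; the header names are assumed.
sample = io.StringIO(
    "NumAtom,Name,smiles\n"
    "3,ethanol,CCO\n"
    "2,unknown,\n"
    "1,methane,C\n"
)
kept, dropped = filter_smiles_rows(csv.DictReader(sample))
print(len(kept), 'kept;', len(dropped), 'dropped')
```

Returning the dropped rows as well makes it easy to inspect which entries were removed before training.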

toushi68 avatar Apr 10 '18 13:04 toushi68

Hi @couteiral, I found that the error is caused by the function mm.decode(ords, ord_dict). The ords argument may be either a string or a list, which is why the error occurs. I think there should be two different decode() functions.

Could you run the code with your music_small.txt dataset again to help fix it? Thanks.

yippp avatar Apr 19 '18 08:04 yippp

Hi @couteiral,

Here is some simple debugging. If I insert a print in decode, like this:

def decode(ords, ord_dict):
    print(ords)  # check
    return unpad(''.join([ord_dict[o] for o in ords]))

Here are the last few lines printed out before it crashes:

.........
[11 1 2 1 1 2 1 1 2 1 4 11 7 21 21 21]
[ 1 2 11 1 1 9 11 10 2 11 21 21 21 21 21 21]
[ 1 2 1 1 9 8 10 8 21 21 21 21 21 21 21 21]
[11 2 1 1 1 3 4 3 9 7 8 10 6 4 21 21]
[ 8 1 7 3 4 5 5 3 5 6 4 21 21 21 21 21]
N#CC(O)F
Traceback (most recent call last):
  File "/home/trial3/ORGAN/organ/__init__.py", line 763, in train
    gen_samples, self.train_samples, self.ord_dict, results)
  File "/home/trial3/ORGAN/organ/mol_metrics.py", line 185, in compute_results
    results[objective] = np.mean(reward(verified_samples, train_data))
  File "/home/trial3/ORGAN/organ/__init__.py", line 743, in batch_reward
    for sample in samples]
  File "/home/trial3/ORGAN/organ/mol_metrics.py", line 117, in decode
    return unpad(''.join([ord_dict[o] for o in ords]))
KeyError: 'N'

It looks like this is an already decoded SMILES string, which should not be sent back to decode again. Any idea what's going on? Thanks! Toushi68
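
One possible workaround for the double decoding seen above is a wrapper that returns strings unchanged and only decodes index sequences. This is a sketch under stated assumptions, not a fix taken from the repository: `safe_decode`, the toy dictionary, and the '_' padding character are hypothetical. It also only masks the symptom and does not explain why decoded strings reach decode in the first place.

```python
def safe_decode(ords, ord_dict, pad_char='_'):
    """Decode integer indices; pass already-decoded strings through."""
    if isinstance(ords, str):
        # Already text, so there is nothing left to look up.
        return ords
    # int() also accepts NumPy integer scalars, so arrays work too.
    text = ''.join(ord_dict[int(o)] for o in ords)
    # Strip trailing padding (assumed to be a single repeated character).
    return text.rstrip(pad_char)


# Hypothetical toy dictionary with index 2 standing in for padding.
toy_dict = {0: 'N', 1: 'C', 2: '_'}
print(safe_decode([0, 1, 2, 2], toy_dict))
print(safe_decode('N#CC(O)F', toy_dict))
```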

toushi68 avatar May 01 '18 02:05 toushi68

Hello @couteiral

I ran my code line by line to see where the problem occurs. The error comes up right after importing the dataset with model.load_training_set('data/toy.csv').

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/Users/akilhylton/Desktop/ORGAN/organ/__init__.py", line 242, in load_training_set
    self.char_dict) for sam in to_use]
  File "/Users/akilhylton/Desktop/ORGAN/organ/__init__.py", line 242, in <listcomp>
    self.char_dict) for sam in to_use]
  File "/Users/akilhylton/Desktop/ORGAN/organ/mol_metrics.py", line 383, in encode
    return [char_dict[c] for c in pad(new_smi, max_len)]
  File "/Users/akilhylton/Desktop/ORGAN/organ/mol_metrics.py", line 383, in <listcomp>
    return [char_dict[c] for c in pad(new_smi, max_len)]
KeyError: '.'

Any solutions?

k105la avatar May 22 '18 14:05 k105la

I have the same error as @ahylton19.

Traceback (most recent call last):
  File "example.py", line 5, in <module>
    model.load_training_set('data/toy.csv')
  File "/home/yuma_kajihara/projects/ORGAN/organ/__init__.py", line 242, in load_training_set
    self.char_dict) for sam in to_use]
  File "/home/yuma_kajihara/projects/ORGAN/organ/__init__.py", line 242, in <listcomp>
    self.char_dict) for sam in to_use]
  File "/home/yuma_kajihara/projects/ORGAN/organ/mol_metrics.py", line 384, in encode
    return [char_dict[c] for c in pad(new_smi, max_len)]
  File "/home/yuma_kajihara/projects/ORGAN/organ/mol_metrics.py", line 384, in <listcomp>
    return [char_dict[c] for c in pad(new_smi, max_len)]
KeyError: '.'

Kajiyu avatar Jun 18 '18 08:06 Kajiyu

I have the same error as @Kajiyu.

Traceback (most recent call last):
  File "example.py", line 5, in <module>
    model.load_training_set('data/toy.csv')
  File "/media/projects/ORGAN/organ/__init__.py", line 242, in load_training_set
    self.char_dict) for sam in to_use]
  File "/media/projects/ORGAN/organ/__init__.py", line 242, in <listcomp>
    self.char_dict) for sam in to_use]
  File "/media/projects/ORGAN/organ/mol_metrics.py", line 383, in encode
    return [char_dict[c] for c in pad(new_smi, max_len)]
  File "/media/projects/ORGAN/organ/mol_metrics.py", line 383, in <listcomp>
    return [char_dict[c] for c in pad(new_smi, max_len)]
KeyError: '.'

Any solutions?

xuzhang5788 avatar Jun 22 '18 21:06 xuzhang5788

We will be updating the repo soon. We expect the changes to be incorporated by Monday, Tuesday at the latest, and they should fix these issues. Stay tuned.

beangoben avatar Jun 22 '18 22:06 beangoben

I have the same error as @xuzhang5788. It seems they are not going to update it?

kristery avatar Dec 04 '18 08:12 kristery

I checked the table of SMILES symbols, and it seems that '.' is treated as a kind of bond symbol (it separates disconnected fragments), which the mol_metrics.py file does not include. To fix the error, please modify line 315 to

chars = chars + ['-', '=', '#', '.']

I guess it works as long as you add '.' to chars.
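
To check in advance which symbols a data set uses that the character list does not cover, something like the following sketch might help. The helper `missing_symbols` and the toy vocabulary are hypothetical, not part of ORGAN. The check is also purely character-level, so two-letter atoms such as Cl or Br are reported by their individual letters, which is cruder than how a real SMILES tokenizer would treat them.

```python
def missing_symbols(smiles_list, vocabulary):
    """Map each uncovered character to the first SMILES that contains it."""
    known = set(vocabulary)
    missing = {}
    for smi in smiles_list:
        for ch in smi:
            if ch not in known and ch not in missing:
                missing[ch] = smi
    return missing


# Assumed toy vocabulary; the real list lives in mol_metrics.py.
vocabulary = ['C', 'O', 'N', '(', ')', '-', '=', '#']
smiles = ['CCO', 'C(=O)O', '[Na+].[Cl-]']

for symbol, example in missing_symbols(smiles, vocabulary).items():
    print(repr(symbol), 'first seen in', example)
```

Running this over the training file before calling load_training_set would show whether '.' is the only missing symbol or just the first one encountered.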

kristery avatar Dec 04 '18 08:12 kristery

When running, I get an error:

Traceback (most recent call last):
  File "example.py", line 8, in <module>
    model.train(ckpt_dir='ckpt')
  File "/home/zy/ORGAN/ORGAN-master/organ/__init__.py", line 745, in train
    self.pretrain()
  File "/home/zy/ORGAN/ORGAN-master/organ/__init__.py", line 670, in pretrain
    _, g_loss, g_pred = self.generator.pretrain_step(self.sess,batch)
  File "/home/zy/ORGAN/ORGAN-master/organ/generator.py", line 210, in pretrain_step
    outputs = session.run([self.pretrain_updates,self.pretrain_loss,self.g_predictions],
  File "/home/zy/.conda/envs/zy_2/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 967, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/zy/.conda/envs/zy_2/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1166, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)#np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/zy/.local/lib/python3.8/site-packages/numpy/core/_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Any solutions? Thank you for your answer.

xueyuanyuan0410 avatar Oct 23 '21 04:10 xueyuanyuan0410