DrugEx
Package code, make pip installable, and add CLI
Closes #1
This is a really big PR, so very quickly it does a few things:
- Reorganizes the repository to follow Python community standards
- Adds `setup.cfg` and `setup.py` and updates imports so it can be pip installed (+ updated README)
- Adds a command line interface using `click` (+ updated README)
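For readers unfamiliar with click, a command in this style could look roughly like the sketch below. The parameter names follow the `main` signature visible in the traceback later in this thread, and `-d`/`-o` match the invocation shown there; the long option names, the defaults, and the echoed output are assumptions made for illustration, not the PR's actual code.

```python
import click


@click.command()
@click.option("-d", "--input-directory", type=click.Path(), required=True,
              help="Directory containing the input data.")
@click.option("-o", "--output-directory", type=click.Path(), required=True,
              help="Directory to write results to.")
@click.option("--batch-size", type=int, default=512, show_default=True,
              help="Batch size (the default here is an assumption).")
@click.option("--cuda/--no-cuda", default=False,
              help="Run on the GPU instead of the CPU.")
@click.option("--use-tqdm/--no-tqdm", default=True,
              help="Show progress bars.")
def main(input_directory, output_directory, batch_size, cuda, use_tqdm):
    """Hypothetical entry point; real code would start training here."""
    device = "cuda" if cuda else "cpu"
    click.echo(
        f"input={input_directory} output={output_directory} "
        f"batch_size={batch_size} device={device} tqdm={use_tqdm}"
    )
```

Calling `main()` at the bottom of the module, or registering it as a console script entry point in the packaging metadata, is what exposes the command on the shell.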
As I'm making updates, I'm running the code to see if I can reproduce the results. I've had to change a few things around in the process:
- Apply pep8 and flake8 standards such as documentation style in modules, variable names, etc.
- Reorganize code that lived in `if __name__ == '__main__'` when there were global variables
- Add more logging with `tqdm`
- Make sure sad computers (like mine) can run on CPU
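The move out of `if __name__ == '__main__'` and the CPU fallback can be sketched as follows. This is a generic illustration of the pattern rather than the code in this PR: `choose_device` and `run` are hypothetical names, the default batch size is an assumption, and in real code the availability flag would come from `torch.cuda.is_available()`.

```python
def choose_device(cuda_requested, cuda_available):
    """Return "cuda" only if it was requested and is actually available."""
    return "cuda" if cuda_requested and cuda_available else "cpu"


def run(input_directory, output_directory, batch_size=512, cuda=False,
        cuda_available=False):
    """Hypothetical training entry point.

    Values that used to be module-level globals set inside the
    ``if __name__ == '__main__'`` block are now explicit parameters,
    so the function can be imported and called from a CLI or a test.
    """
    device = choose_device(cuda, cuda_available)
    # Real code would build the data loaders and model here.
    return {
        "input_directory": input_directory,
        "output_directory": output_directory,
        "batch_size": batch_size,
        "device": device,
    }


# Requesting CUDA on a machine without a GPU falls back to the CPU.
config = run("data/", "output/", cuda=True, cuda_available=False)
print(config["device"])
```

Keeping the configuration in function arguments also means nothing runs at import time, which is what allows the same module to be both imported and run with `python -m`.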
I'm submitting it as a draft PR. After the conference in Sheffield, I was excited to get back to work and try it out, but it was more than I could handle (and more than my computer could process) in one day. I'll continue updating the other scripts so they can be run with the CLI until I've reproduced everything, then I'll give feedback on what I wasn't able to reproduce.
I would be happy to explain any of the changes I made in more detail, or give you some resources that I've read along the way while I was learning all of this packaging stuff.
After some number of hours, I got the following error. I'm pretty sure it's not due to any changes I've made, but maybe you have some ideas.
$ python -m drugex.pretrainer -d data/ -o output/
Exploitation network begins to be trained...
/Users/cthoyt/dev/DrugEx/src/drugex/util.py:119: FutureWarning: read_table is deprecated, use read_csv instead, passing sep='\t'.
df = pd.read_table(df)
Traceback (most recent call last):
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 74, in <module>
output_directory=output_directory,
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 69, in main
def main(input_directory, output_directory, batch_size, cuda, use_tqdm):
File "/Users/cthoyt/dev/DrugEx/src/drugex/pretrainer.py", line 34, in _main_helper
if not os.path.exists(net_pr_pickle_path):
File "/Users/cthoyt/dev/DrugEx/src/drugex/model.py", line 419, in fit
best_error = np.inf
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in __next__
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/Users/cthoyt/.virtualenvs/hbp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 560, in <listcomp>
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 135, in __getitem__
encoded = self.voc.encode(self.tokens[i])
File "/Users/cthoyt/dev/DrugEx/src/drugex/util.py", line 83, in encode
arr[i] = self.tk2ix[char]
KeyError: '[BH-]'
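Since the `KeyError` only surfaced after hours of training, one option would be to scan the tokenized data against the vocabulary before training starts. The sketch below is an assumption about how such a check could look: `find_missing_tokens` is a hypothetical helper, `tk2ix` mirrors the token-to-index mapping seen in the traceback, and the tiny vocabulary and molecules are made up for the example.

```python
from collections import Counter


def find_missing_tokens(tokenized_molecules, tk2ix):
    """Count tokens that occur in the data but are absent from the vocabulary."""
    missing = Counter()
    for tokens in tokenized_molecules:
        for token in tokens:
            if token not in tk2ix:
                missing[token] += 1
    return missing


# Made-up vocabulary and data, for illustration only.
tk2ix = {"C": 0, "N": 1, "O": 2, "(": 3, ")": 4}
molecules = [["C", "C", "O"], ["C", "[BH-]", "N"]]

missing = find_missing_tokens(molecules, tk2ix)
if missing:
    print("Tokens missing from the vocabulary:", dict(missing))
```

Running a check like this up front would report every unknown token at once instead of failing on the first one partway through an epoch.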
Hi,
The reason is that '[BH-]' is not contained in data/voc.txt; you can add it manually.
Cheers,
Xuhan
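Adding the token by hand works; for more than a handful of tokens, the same fix could be scripted. The sketch below assumes the vocabulary file stores whitespace-separated tokens, one per line, which may not match the actual layout of `data/voc.txt`; `add_tokens_to_vocabulary` is a hypothetical helper, and the demo writes to a temporary file rather than the real vocabulary.

```python
import tempfile
from pathlib import Path


def add_tokens_to_vocabulary(voc_path, new_tokens):
    """Append tokens not yet in the file and return the ones that were added."""
    path = Path(voc_path)
    text = path.read_text()
    existing = set(text.split())
    to_add = []
    for token in new_tokens:
        if token not in existing and token not in to_add:
            to_add.append(token)
    if to_add:
        # Make sure the new tokens start on their own line.
        separator = "" if text == "" or text.endswith("\n") else "\n"
        path.write_text(text + separator + "\n".join(to_add) + "\n")
    return to_add


# Demo on a temporary file standing in for the vocabulary.
with tempfile.TemporaryDirectory() as tmp:
    voc = Path(tmp) / "voc.txt"
    voc.write_text("C\nN\nO")
    added = add_tokens_to_vocabulary(voc, ["[BH-]", "C"])
    updated = voc.read_text()
    print(added)
```

Because tokens already present are skipped, running the helper a second time with the same tokens leaves the file unchanged.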
------------------ Original Message ------------------ From: "Charles Tapley Hoyt" [email protected]; Sent: Monday, June 24, 2019, 11:10 PM; To: "XuhanLiu/DrugEx" [email protected]; Cc: "Subscribed" [email protected]; Subject: Re: [XuhanLiu/DrugEx] Package code, make pip installable, and add CLI (#3)
Quite a few years later, but we now have an improved version that also includes CLI, packaging and other goodies: https://github.com/CDDLeiden/DrugEx :)