BERN2
I am running into same issue as @yuetieqi-meow
Installation completed without issues, but when I ran the test, nohup_bern2.out showed:
`[01/Oct/2022 17:54:12.841469] id: 9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2
[Errno 111] Connection refused
[01/Oct/2022 17:54:12.899379] [9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2] GNormPlus 0.0003619194030761719 sec
[01/Oct/2022 17:54:13.614587] [9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2] tmVar 2.0 0.7597899436950684 sec
[01/Oct/2022 17:54:13.801885] [9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2] Multi-task NER 0.9021279811859131 sec, #entities: 2
Traceback (most recent call last):
  File "/home/kun/anaconda3/envs/bern2/lib/python3.7/shutil.py", line 566, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: './resources/GNormPlusJava/output/9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2.PubTator' -> './resources/tmVarJava/input/9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2.PubTator.PubTator.Gene'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/kun/Storage/BERN2/bern2/bern2.py", line 107, in annotate_text
    output = self.tag_entities(text, base_name)
  File "/media/kun/Storage/BERN2/bern2/bern2.py", line 376, in tag_entities
    shutil.move(output_gnormplus, input_tmvar_gene)
  File "/home/kun/anaconda3/envs/bern2/lib/python3.7/shutil.py", line 580, in move
    copy_function(src, real_dst)
  File "/home/kun/anaconda3/envs/bern2/lib/python3.7/shutil.py", line 266, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/kun/anaconda3/envs/bern2/lib/python3.7/shutil.py", line 120, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './resources/GNormPlusJava/output/9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2.PubTator'`
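Both tracebacks come from `shutil.move` being called on a GNormPlus output file that does not exist. As a rough illustration only, a guard like the one below would report the missing file more directly. `move_tool_output` and its `tool_name` parameter are hypothetical names for this sketch and are not part of BERN2.

```python
import os
import shutil


def move_tool_output(src, dst, tool_name):
    """Move a tool's output file, failing early with a clearer message.

    Hypothetical helper for illustration; BERN2 calls shutil.move directly.
    """
    if not os.path.isfile(src):
        # The upstream tool never wrote its result, so say which tool it was.
        raise FileNotFoundError(
            f"{tool_name} produced no output at {src!r}; "
            f"check that the {tool_name} service is running"
        )
    return shutil.move(src, dst)
```

With a guard like this, the error would name the tool that failed to produce output instead of surfacing as a nested `FileNotFoundError` from inside `shutil`.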
No issues found in the other log files:
nohup_disease_normalize.out:
Sieve loading .. 2628 ms, Ready
nohup_gene_normalize.out:
Ready (port 18888)
nohup_gnormplus.out:
Starting GNormPlus Service at 127.0.1.1:18895
Loading Gene Dictionary : Processing ...
Loading Gene Dictionary : Processing Time:9.951sec
Ready
nohup_multi_ner.out:
`MTNER init_t 13.013 sec.
0it [00:00, ?it/s] 1it [00:00, 34.71it/s]
Prediction: 0%| | 0/1 [00:00<?, ?it/s]
Prediction: 100%|██████████| 1/1 [00:00<00:00, 1.37it/s]
Prediction: 100%|██████████| 1/1 [00:00<00:00, 1.37it/s]`
nohup_tmvar.out:
`Starting tmVar 2.0 Service at 127.0.1.1:18896
Reading POS tagger model from lib/taggers/english-left3words-distsim.tagger ... done [1.5 sec].
Loading tmVar : Processing Time:1.739sec
Ready
input/9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2.PubTator - (PubTator format) : Processing Time:0.521sec
ner success = 9f85ebe7f122e750ca113ce20ef88f448960721e701cd1cc3e1650e2.PubTator`
When I ran the test case, nohup_bern2.out, nohup_multi_ner.out, and nohup_tmvar.out were updated, but nohup_gnormplus.out was not. (The two normalize.out logs were not updated either, presumably because normalization never runs once NER hits the error.) It looks as if GNormPlus was not executed at all. Any ideas?
Originally posted by @kunlu-ou in https://github.com/dmis-lab/BERN2/issues/24#issuecomment-1264506002
Update: I did find a file in ./resources/GNormPlusJava/input, but nothing in the output directory.
Another update: while the server was running, I checked the PIDs using:
ps auxww | grep GNormPlusServer.main.jar | grep -v grep | awk '{print $2}' | sort -r
It is empty. All other services have pids.
I suspect that GNormPlus never ran and therefore produced no results for the downstream steps. The puzzling thing is that nohup_gnormplus.out shows no error or other feedback.
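The same liveness check can be done from Python, which is handy when debugging from inside server.py. The sketch below roughly mirrors the `ps ... | grep GNormPlusServer.main.jar` pipeline above. It assumes the third-party `psutil` package is installed; `cmdline_matches` and `find_pids` are illustrative names, not BERN2 functions.

```python
import os


def cmdline_matches(cmdline, jar_name="GNormPlusServer.main.jar"):
    """Return True if any command-line argument names the given jar file."""
    return any(os.path.basename(arg) == jar_name for arg in cmdline)


def find_pids(jar_name="GNormPlusServer.main.jar"):
    """Return PIDs of running processes whose command line includes the jar."""
    # Imported lazily so cmdline_matches stays usable without psutil.
    import psutil

    pids = []
    for proc in psutil.process_iter(["pid", "cmdline"]):
        # cmdline can be None when the process details are not accessible.
        cmdline = proc.info["cmdline"] or []
        if cmdline_matches(cmdline, jar_name):
            pids.append(proc.info["pid"])
    return sorted(pids, reverse=True)
```

Calling `find_pids()` returns an empty list when no GNormPlusServer process is found, which corresponds to the empty `ps` output described above.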
Hi @kunlu-ou
Thanks for reporting the issue. The script we've provided runs GNormPlusServer w/ nohup so that it can run in the background. According to what you've experienced, it seems that GNormPlusServer failed to run in the background.
So, could you run it directly w/o nohup? That would help us to figure out the reason for the issue.
cd resources
cd GNormPlusJava
java -Xmx16G -Xms16G -jar GNormPlusServer.main.jar 18895
Thanks for your response, @mjeensung. I have some interesting findings:
1. Running GNormPlusServer alone w/o nohup works fine. See below:
java -Xmx16G -Xms16G -jar GNormPlusServer.main.jar 18895
Starting GNormPlus Service at 127.0.1.1:18895
Loading Gene Dictionary : Processing Time:5.748sec
Ready
I can verify its PID using `ps auxww | grep GNormPlusServer.main.jar | grep -v grep | awk '{print $2}' | sort -r`.
2. After this, I left GNormPlusServer running and ran the commands in run_bern2.sh in the terminal one by one. Everything seemed to be OK until I ran:
env "PATH=$PATH" nohup python -u server.py \
--mtner_home ./multi_ner \
--mtner_port 18894 \
--gnormplus_home ./resources/GNormPlusJava \
--gnormplus_port 18895 \
--tmvar2_home ./resources/tmVarJava \
--tmvar2_port 18896 \
--gene_norm_port 18888 \
--disease_norm_port 18892 \
--use_neural_normalizer \
--port 8888 \
>> logs/nohup_bern2.out 2>&1 &
After starting server.py, the GNormPlusServer.main.jar process got killed automatically.
3. At this time, if I manually start GNormPlusServer.main.jar using:
java -Xmx16G -Xms16G -jar GNormPlusServer.main.jar 18895
It starts fine, but starting it kills the server.py process. If I then start server.py again, that kills the GNormPlusServer.main.jar process.
There seems to be a conflict between the server.py process and the GNormPlusServer.main.jar process: whenever one starts, the other is killed. Any ideas why this could happen?
Hi @kunlu-ou, thanks for sharing.
That seems really odd. Did you see any error logs when one of them was killed (on either the server.py side or the gnormplus side)? Could you try a port number other than '18895' for gnormplus? Are other modules such as tmvar or multiner still alive?
No errors in any logs when the processes were killed, neither in nohup_bern2.out nor in nohup_gnormplus.out. Other modules look fine: I do have output files in /multi_ner/output and /resources/tmVarJava/output.
I tried different ports for gnormplus (1889, 19991, 18899), but it still does not work.
I noticed that the BERN2 server runs on localhost (i.e. 127.0.0.1), but gnormplus starts on 127.0.1.1. Could that be a problem? It is really odd.
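One way to check whether the address matters is to test directly whether each service port accepts TCP connections on localhost. The sketch below uses only the standard library. The port numbers are the ones passed to server.py earlier in this thread, and `is_port_open` is an illustrative helper, not part of BERN2. The log does not say which connection produced `[Errno 111] Connection refused`, so treat this as a diagnostic aid rather than a confirmed explanation.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the server.py invocation quoted in this thread.
    services = {
        "multi_ner": 18894,
        "gnormplus": 18895,
        "tmvar2": 18896,
        "gene_norm": 18888,
        "disease_norm": 18892,
    }
    for name, port in services.items():
        print(name, port, is_port_open("localhost", port))
```

If gnormplus reports `False` while the others report `True`, nothing is listening on that port, which would point to the process being gone rather than to the 127.0.1.1 address.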
Hi @kunlu-ou, sorry for the late response.
GNormPlus on my server also starts on 127.0.1.1, so it wouldn't be a problem.
It seems difficult to figure out the problem without debugging, so it would help to find the exact line that kills GNormPlus while BERN2 is loading. Could you try overwriting `__init__()` in bern2/bern2.py with the following code?
    def __init__(self,
                 gnormplus_home,
                 gnormplus_port,
                 tmvar2_home,
                 tmvar2_port,
                 mtner_home,
                 mtner_port,
                 gene_norm_port,
                 disease_norm_port,
                 cache_port,
                 gnormplus_host='localhost',
                 tmvar2_host='localhost',
                 mtner_host='localhost',
                 cache_host='localhost',
                 time_format='[%d/%b/%Y %H:%M:%S.%f]',
                 max_word_len=50,
                 seed=2019,
                 use_neural_normalizer=True,
                 keep_files=False):
        import psutil

        # Checkpoint 0: GNormPlusServer must be running before BERN2 loads.
        assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0, 0

        self.time_format = time_format

        print(datetime.now().strftime(self.time_format), 'BERN2 LOADING..')
        random.seed(seed)
        np.random.seed(seed)

        if not os.path.exists('./output'):
            os.mkdir('output')

        # Checkpoint 1: still running after seeding and creating ./output.
        assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0, 1

        # delete prev. version outputs
        if not keep_files:
            delete_files('./output')
            delete_files(os.path.join(gnormplus_home, 'input'))
            delete_files(os.path.join(tmvar2_home, 'input'))
            delete_files(os.path.join('./multi_ner', 'input'))
            delete_files(os.path.join('./multi_ner', 'tmp'))
            delete_files(os.path.join('./multi_ner', 'output'))

        # Checkpoint 2: still running after deleting previous outputs.
        assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0, 2

        # FOR NER
        self.gnormplus_home = gnormplus_home
        self.gnormplus_host = gnormplus_host
        self.gnormplus_port = gnormplus_port

        self.tmvar2_home = tmvar2_home
        self.tmvar2_host = tmvar2_host
        self.tmvar2_port = tmvar2_port

        self.mtner_home = mtner_home
        self.mtner_host = mtner_host
        self.mtner_port = mtner_port

        self.max_word_len = max_word_len

        # Checkpoint 3: still running after storing the NER settings.
        assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0, 3

        # FOR NEN
        self.normalizer = Normalizer(
            gene_port=gene_norm_port,
            disease_port=disease_norm_port,
            use_neural_normalizer=use_neural_normalizer
        )

        # (Optional) For caching, use mongodb
        try:
            client = MongoClient(cache_host, cache_port, serverSelectionTimeoutMS=2000)
            client.server_info()
            self.caching_db = client.bern2_v1_1.pmid
        except Exception:
            self.caching_db = None

        print(datetime.now().strftime(self.time_format), 'BERN2 LOADED..')

        # Checkpoint 4: still running after the normalizer and cache are set up.
        assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0, 4

I just added the following line at several checkpoints to find when GNormPlusServer dies while loading BERN2. The assertion fails as soon as no GNormPlusServer process is found, and the number after the comma in each assert identifies the checkpoint that failed.
assert len([p.cmdline() for p in psutil.process_iter() if 'GNormPlusServer.main.jar' in p.cmdline()]) > 0
If you have any follow-up questions, please re-open this issue.