mercury
NA BMI: cannot transfer data larger than 16 MB
When transferring large data with the NA BMI plugin, Tang is reporting the following error:
# NA -- Error -- /global/homes/w/wzhang5/software/mercury/src/na/na_bmi.c:1773
[E 15:01:01.028255] src/io/bmi/bmi_tcp/bmi-tcp.c line 1313: Error: BMI message too large!
# na_bmi_get(): BMI_post_recv() failed
# HG -- Error -- /global/homes/w/wzhang5/software/mercury/src/mercury_bulk.c:679
# hg_bulk_transfer_pieces(): Could not transfer data
# HG -- Error -- /global/homes/w/wzhang5/software/mercury/src/mercury_bulk.c:817
# hg_bulk_transfer(): Could not transfer data pieces
# HG -- Error -- /global/homes/w/wzhang5/software/mercury/src/mercury_bulk.c:1477
# HG_Bulk_transfer(): Could not transfer data
Could not read bulk data
[E 15:01:01.029567] [bt] /global/homes/w/wzhang5/software/bmi/build/lib/libbmi.so(BMI_tcp_post_send_list+0x15f) [0x2aaaab80774f]
[E 15:01:01.029603] [bt] /global/homes/w/wzhang5/software/bmi/build/lib/libbmi.so(BMI_post_send+0x50) [0x2aaaab80d3e0]
[E 15:01:01.029607] [bt] /global/homes/w/wzhang5/software/mercury/build/bin/libna.so.0.9.0(+0x67e8) [0x2aaaab3ef7e8]
[E 15:01:01.029610] [bt] /global/homes/w/wzhang5/software/mercury/build/bin/libna.so.0.9.0(+0x6d1b) [0x2aaaab3efd1b]
[E 15:01:01.029613] [bt] /global/homes/w/wzhang5/software/mercury/build/bin/libna.so.0.9.0(NA_Progress+0x254) [0x2aaaab3ed824]
[E 15:01:01.029616] [bt] /global/homes/w/wzhang5/software/mercury/build/bin/libmercury.so.0.9.0(+0x56b9) [0x2aaaaacd36b9]
[E 15:01:01.029619] [bt] /global/homes/w/wzhang5/software/mercury/build/bin/libmercury.so.0.9.0(HG_Core_progress+0xf) [0x2aaaaacd6e6f]
[E 15:01:01.029622] [bt] /global/homes/w/wzhang5/software/SoMeta2/api/build/bin/pdc_server.exe(main+0x2e9) [0x404b89]
[E 15:01:01.029625] [bt] /lib64/libc.so.6(__libc_start_main+0xf5) [0x2aaaacc7aac5]
[E 15:01:01.029628] [bt] /global/homes/w/wzhang5/software/SoMeta2/api/build/bin/pdc_server.exe() [0x404c05]
# NA -- Error -- /global/homes/w/wzhang5/software/mercury/src/na/na_bmi.c:2154
# na_bmi_progress_rma(): BMI_post_send() failed
# NA -- Error -- /global/homes/w/wzhang5/software/mercury/src/na/na_bmi.c:1892
# na_bmi_progress_unexpected(): Could not make RMA progress
# NA -- Error -- /global/homes/w/wzhang5/software/mercury/src/na/na_bmi.c:1812
# na_bmi_progress(): Could not make unexpected progress
# HG -- Error -- /global/homes/w/wzhang5/software/mercury/src/mercury_core.c:2143
# hg_core_progress_na(): Could not make NA Progress
# HG -- Error -- /global/homes/w/wzhang5/software/mercury/src/mercury_core.c:3489
# HG_Core_progress(): Could not make progress
# NA -- Error -- /global/homes/w/wzhang5/software/mercury/src/na/na_bmi.c:1773
# na_bmi_get(): BMI_post_recv() failed
There seems to be a TCP_MODE_REND_LIMIT limit set to 16 MB in src/io/bmi/bmi_tcp/bmi-tcp.c
@carns Phil, are you aware of that limit?
Yes, that's right unfortunately. The reason there is a limit at all (conceptually) is that it constrains the amount of data that will be streamed in a socket between control headers. If it is arbitrarily large, then other messages that you would like to send over the socket will be starved.
It doesn't matter for memory usage though, since, as the #define name implies, this only affects rendezvous mode.
In PVFS we didn't hit this limit, because PVFS itself would chunk up data into smaller units before issuing BMI operations. Does Mercury have the ability to do that on a bulk transfer by any chance?
OK, yes, we should be able to do that, either at the HG bulk level by returning the number of bytes transmitted, or at the NA plugin level directly.
A short-term workaround would be to crank up that #define if we don't have chunking capability yet :) BMI should technically work with a larger limit, and will actually perform OK too until you have multiple transfers on the same address pair simultaneously.
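For reference, the chunking idea discussed above could look roughly like the sketch below. It is only an illustration, not Mercury or BMI code: chunked_transfer, chunk_xfer_fn, record_chunk and EXAMPLE_MAX_CHUNK are hypothetical names, the 16 MB value is taken from the limit reported in this issue, and it assumes a chunk of exactly that size is still accepted by the transport. The callback stands in for whatever call actually moves one piece of data.

```c
#include <stddef.h>

/* Hypothetical callback: move `len` bytes starting at `offset`.
 * Returns 0 on success, nonzero on failure. Not a Mercury or BMI API. */
typedef int (*chunk_xfer_fn)(size_t offset, size_t len, void *arg);

/* Assumed per-operation limit, based on the 16 MB figure in this issue. */
#define EXAMPLE_MAX_CHUNK ((size_t)16 * 1024 * 1024)

/* Split one logical transfer of `total` bytes into pieces of at most
 * `max_chunk` bytes. On return, *transferred (if non-NULL) holds the number
 * of bytes handed off successfully, so a caller can resume or report it. */
static int chunked_transfer(size_t total, size_t max_chunk,
                            chunk_xfer_fn xfer, void *arg,
                            size_t *transferred)
{
    size_t offset = 0;
    int ret = 0;

    if (max_chunk == 0 || xfer == NULL) {
        if (transferred != NULL)
            *transferred = 0;
        return -1;
    }

    while (offset < total) {
        size_t remaining = total - offset;
        size_t len = remaining < max_chunk ? remaining : max_chunk;

        ret = xfer(offset, len, arg);
        if (ret != 0)
            break; /* stop at the first failed piece */
        offset += len;
    }

    if (transferred != NULL)
        *transferred = offset;
    return ret;
}

/* Example callback that only records what it was asked to transfer. */
struct chunk_stats {
    size_t calls;   /* number of pieces issued */
    size_t bytes;   /* total bytes across all pieces */
    size_t largest; /* size of the largest piece */
};

static int record_chunk(size_t offset, size_t len, void *arg)
{
    struct chunk_stats *stats = arg;

    (void)offset; /* unused in this recording-only callback */
    stats->calls++;
    stats->bytes += len;
    if (len > stats->largest)
        stats->largest = len;
    return 0;
}
```

With a 40 MB request and EXAMPLE_MAX_CHUNK as the limit, record_chunk is called three times (16 MB, 16 MB, 8 MB). Reporting the byte count back to the caller matches the "return the number of bytes transmitted" option mentioned for the HG bulk level; whether the loop belongs there or inside the NA plugin is left open here, as in the thread.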