add NetCDF engine
In geoscience, especially in meteorology and climate science, the NetCDF format is used extensively for numerical model output. Many observations, including satellite products, are also in NetCDF format.
With the increasing power of supercomputers, numerical weather prediction (NWP) models and climate models run at very high spatial resolutions and write output at high frequency. Writing this large volume of output takes a significant part of the total run time. If numerical models could use ADIOS2 to handle asynchronous reading and writing (even updating) of NetCDF files, model efficiency would improve significantly. An efficient, scalable, and unified parallel IO layer such as ADIOS would benefit the numerical modeling community a lot.
I know NetCDF very well and would like to contribute a NetCDF engine to ADIOS2, but I do not know the process for adding an engine to ADIOS. If someone can show me how to do that, I think I can do the rest.
@YongjunZHENG thanks for sharing the interesting work and your plans. Just wanted to give you a brief overview: adios2 is for the most part open source, but new functionality (especially a new production engine) is always a large task. I'd check with @pnorbert and @chuckatkins of the dev team on the integration challenges with API bindings, hardware, performance, CI and packaging when adding a new dependency. @guj did the HDF5 work, which is a nice benchmark for using an existing library as a backend (that could be a starting point). We have contributing guidelines in the wiki, which are pretty standard. Also, @pnorbert is more aware of some work done on NetCDF converters. Hope this helps.
Hi @YongjunZHENG,
Can you please explain in more detail how you see ADIOS helping the community? What do you want to achieve by adding a NetCDF engine? What do you expect it to provide beyond using the NetCDF library directly?
One assumption may be that ADIOS could speed up processing (reading and writing NetCDF files). This is not possible. ADIOS can provide good performance and good scalability to its own file format but not to other formats.
A unified I/O API that lets you choose output targets at runtime and read back an input file regardless of its format (BP, HDF5, CDF) sounds good to us. That's what we provide with the HDF5 engine. It directly uses the HDF5 API and library underneath to provide compatibility with the HDF5 format, but obviously cannot provide any better performance or scalability than HDF5 can achieve itself. We always wanted something similar for NetCDF but had no time to do it.
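As a rough illustration of "one API, output target chosen at runtime", here is a minimal Python sketch. It does not use ADIOS2, HDF5 or NetCDF at all: the names `WRITERS`, `write`, `write_text` and `write_json` are hypothetical, and the two toy "formats" only stand in for real engines.

```python
import json


def write_text(variables):
    # Toy "format" 1: one name=value pair per line.
    return "\n".join(f"{name}={value}" for name, value in sorted(variables.items()))


def write_json(variables):
    # Toy "format" 2: a JSON object.
    return json.dumps(variables, sort_keys=True)


# Hypothetical registry: maps an engine name chosen at runtime to a backend.
WRITERS = {"text": write_text, "json": write_json}


def write(variables, engine="text"):
    """Single entry point; the caller's code is the same for every backend."""
    try:
        writer = WRITERS[engine]
    except KeyError:
        raise ValueError(f"unknown engine {engine!r}") from None
    return writer(variables)


data = {"nx": 4, "ny": 2}
as_text = write(data, engine="text")
as_json = write(data, engine="json")
```

The application code stays the same when the engine name changes, but each backend still does the actual work, so it cannot be faster than the library behind it.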
However, the semantics and limitations of the APIs are different, so there is no way to support 100% of the functionality of another library/format. E.g. there is no update function for existing data in ADIOS. Everything is write once in our data model. An unlimited dimension is also too generic for us: we force users to go forward step by step to produce an array over time, and at read time, the time has to be handled separately, not as an extra spatial dimension.
Another approach may be better suited to a particular community: a data schema (naming conventions, meshes) and an I/O library targeting the community, with multiple drivers under it to use HDF5, ADIOS2 and PNetCDF. E.g. OpenPMD is being developed for particle physics, PIO/SCORPIO for climate, ASDF for seismology. Of course, each of these is a big quest and takes many years to gain a foothold in the community.
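To make the step-based, write-once model concrete, here is a small Python toy. It is not the ADIOS2 API: `StepSeries` and all of its methods are hypothetical, and it simplifies further by allowing each variable to be written only once per step. The point is only that time is selected by step index, separately from the spatial selection, and that there is no call to modify a closed step.

```python
class StepSeries:
    """Hypothetical write-once, step-by-step container (illustration only)."""

    def __init__(self):
        self._steps = []       # one dict of variables per completed step
        self._current = None   # the step being written, if any

    def begin_step(self):
        if self._current is not None:
            raise RuntimeError("previous step is still open")
        self._current = {}

    def put(self, name, values):
        if self._current is None:
            raise RuntimeError("put() called outside a step")
        if name in self._current:
            # Simplification of this toy: one write per variable per step.
            raise ValueError(f"{name!r} was already written in this step")
        self._current[name] = tuple(values)

    def end_step(self):
        if self._current is None:
            raise RuntimeError("no step is open")
        self._steps.append(self._current)   # closed steps are never reopened
        self._current = None

    def read(self, name, step, start=0, count=None):
        # Time is chosen with `step`; `start`/`count` select in space only.
        data = self._steps[step][name]
        stop = len(data) if count is None else start + count
        return data[start:stop]


series = StepSeries()
for t in range(3):
    series.begin_step()
    series.put("temperature", [280.0 + t, 281.0 + t, 282.0 + t])
    series.end_step()

last = series.read("temperature", step=2)
window = series.read("temperature", step=1, start=1, count=2)
```

A NetCDF record dimension would instead let a reader slice time like any other axis, and let a writer go back and overwrite an earlier record.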
OpenPMD https://github.com/openPMD/openPMD-standard ASDF https://seismic-data.org/ SCORPIO https://e3sm.org/scorpio-parallel-io-library/
For the reason explained above, we would be very happy if you added an engine for compatibility with the NetCDF format and its users, but we don't want you to start working on it for the wrong reasons.
Thank you Norbert
Hi @pnorbert ,
Thank you very much for your detailed explanation!
Many numerical weather prediction models and climate models usually read initial files at the beginning of the run, but frequently write history files during the run. Using asynchronous IO, models just send the output data to ADIOS, then return immediately to continue the run; ADIOS takes care of how to efficiently write the data into the NetCDF files. In this way, models can significantly reduce IO time.
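The pattern I have in mind can be sketched in plain Python with a background thread. This is only an illustration of the idea, not ADIOS2 code: `AsyncWriter` is a hypothetical name, and the `sink` callable stands in for whatever eventually writes the NetCDF file.

```python
import queue
import threading


class AsyncWriter:
    """Hypothetical stand-in for a staging writer (illustration only)."""

    def __init__(self, sink):
        self._sink = sink                  # callable doing the slow write
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def put(self, step, data):
        # Copy the data so the model may overwrite its buffer right away.
        self._queue.put((step, list(data)))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:               # sentinel: no more output
                break
            self._sink(*item)

    def close(self):
        self._queue.put(None)
        self._thread.join()                # wait for pending output


written = []
writer = AsyncWriter(lambda step, data: written.append((step, data)))

field = [0.0, 0.0, 0.0]
for step in range(3):
    field = [x + 1.0 for x in field]       # stand-in for one model time step
    writer.put(step, field)                # returns without waiting for the write
writer.close()
```

The model loop only pays for the copy in `put()`; the cost of the actual write is hidden as long as the writer keeps up with the output rate.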
As you mentioned, applications directly using PNetCDF or NetCDF based on parallel HDF5 can achieve good performance with proper parallel strategies, but this depends on the underlying network and parallel file system. Developers need a good understanding of the underlying memory, network, and storage architectures. By hiding the underlying complex software and hardware, ADIOS offers a unified, simple, yet efficient API for IO, especially staging (asynchronous IO). It also flattens the learning curve for developers, many of whom are physicists.
With regard to the update function, its absence would occasionally be inconvenient. The unlimited dimension is usually not a problem: because the data to be output are huge, we usually write separate files.
I am aware of domain-specific libraries such as PIO and XIOS in our community, which are very promising. Personally, I like to use ADIOS at least for my own work because of its unified and simple API and hierarchical structure.
Thank you Yongjun
Thanks for the explanation. So you want to use extra processes that act as IO processes, stage the data asynchronously from the application processes to the IO processes, and then write to NetCDF files? For a scenario like this, you have two options: write the IO component as a parallel application that reads using the ADIOS API and then writes output either with PNetCDF or with ADIOS (if ADIOS had a NetCDF engine). For the latter, you/we need to add an engine that is capable of writing to NetCDF format.
You have to implement the reading part anyway (although the adios2_reorganize utility is already provided for global arrays, so you could probably use that). So the NetCDF engine would be additional work to support writing the data out to the file system (and to provide read capability from NetCDF files).
In this case, one has to do the same as with the HDF5 engine: use the parallel/serial NetCDF library and API and implement the ADIOS2 Engine class with it.
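Schematically, such an engine is an adapter between the step-based calls and the backend library. The Python sketch below only shows the shape of that mapping under stated assumptions: `Engine` is a simplified, hypothetical interface (not the real ADIOS2 Engine class), `FakeNcFile` is an in-memory stand-in for a NetCDF-like library handle, and mapping each step to one record along a time dimension is just one possible design. It also assumes every variable is written exactly once in every step.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical, simplified engine interface (illustration only)."""

    @abstractmethod
    def begin_step(self): ...

    @abstractmethod
    def put(self, name, values): ...

    @abstractmethod
    def end_step(self): ...

    @abstractmethod
    def close(self): ...


class FakeNcFile:
    """In-memory stand-in for a NetCDF-like library handle."""

    def __init__(self):
        self.records = {}      # variable name -> list of records along "time"
        self.closed = False

    def def_var(self, name):
        self.records.setdefault(name, [])   # no-op if already defined

    def put_record(self, name, index, values):
        records = self.records[name]
        if index != len(records):
            raise IndexError("records must be appended in order")
        records.append(list(values))

    def close(self):
        self.closed = True


class NcLikeEngine(Engine):
    """Maps each engine step to one record along the time dimension."""

    def __init__(self, handle):
        self._handle = handle
        self._step = -1
        self._in_step = False

    def begin_step(self):
        self._step += 1
        self._in_step = True

    def put(self, name, values):
        if not self._in_step:
            raise RuntimeError("put() called outside a step")
        self._handle.def_var(name)
        self._handle.put_record(name, self._step, values)

    def end_step(self):
        self._in_step = False

    def close(self):
        self._handle.close()


nc = FakeNcFile()
engine = NcLikeEngine(nc)
for t in range(2):
    engine.begin_step()
    engine.put("pressure", [1000.0 + t, 990.0 + t])
    engine.end_step()
engine.close()
```

A real engine would also have to deal with dimensions, attributes, data types and parallel decomposition, which is where most of the effort is likely to go.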
That would be great if you could do it, and we will help you. However, I would like to make sure that the staging step from your application to the IO processes performs well enough. There is no point in writing the file writer if you eventually throw it away because the asynchronous IO does not work for your case as well as you hoped.
I am also very interested in NetCDF output from ADIOS2. In that regard, it appears that there was an old ADIOS bp2ncd utility that could convert ADIOS files to NetCDF files. What is the status of that tool? Is it relevant for ADIOS2 BP files?
I see that there seems to be a replacement for the conversion tools - "adios_reorganize". It appears NetCDF functionality has been dropped, is that correct? Just trying to understand the full state of NetCDF and ADIOS2 interoperability.