NDBC data
https://www.ndbc.noaa.gov/
See an example here: https://github.com/saeed-moghimi-noaa/prep_obs_ca
# Line 250
#coops_ndbc_obs_collector.py
#################
@retry(stop_max_attempt_number=5, wait_fixed=3000)
def get_ndbc(start, end, bbox, sos_name='waves', datum='MSL', verbose=True):
"""
Function to read NDBC data
###################
sos_name = waves
all_col = (['station_id', 'sensor_id', 'latitude (degree)', 'longitude (degree)',
'date_time', 'sea_surface_wave_significant_height (m)',
'sea_surface_wave_peak_period (s)', 'sea_surface_wave_mean_period (s)',
'sea_surface_swell_wave_significant_height (m)',
'sea_surface_swell_wave_period (s)',
'sea_surface_wind_wave_significant_height (m)',
'sea_surface_wind_wave_period (s)', 'sea_water_temperature (c)',
'sea_surface_wave_to_direction (degree)',
'sea_surface_swell_wave_to_direction (degree)',
'sea_surface_wind_wave_to_direction (degree)',
'number_of_frequencies (count)', 'center_frequencies (Hz)',
'bandwidths (Hz)', 'spectral_energy (m**2/Hz)',
'mean_wave_direction (degree)', 'principal_wave_direction (degree)',
'polar_coordinate_r1 (1)', 'polar_coordinate_r2 (1)',
'calculation_method', 'sampling_rate (Hz)', 'name'])
sos_name = winds
all_col = (['station_id', 'sensor_id', 'latitude (degree)', 'longitude (degree)',
'date_time', 'depth (m)', 'wind_from_direction (degree)',
'wind_speed (m/s)', 'wind_speed_of_gust (m/s)',
'upward_air_velocity (m/s)', 'name'])
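Since the wave and wind retrievals return different column sets, a collector usually wants every station's dataframe to share an identical layout. A minimal sketch of one way to do that, assuming the column names from the snippet above (the `normalize_columns` helper itself is hypothetical, not part of the collector):

```python
import pandas as pd

# Expected columns for sos_name='winds', from the list above (subset for brevity).
WINDS_COLS = [
    "station_id", "date_time", "wind_from_direction (degree)",
    "wind_speed (m/s)", "wind_speed_of_gust (m/s)",
]

def normalize_columns(df: pd.DataFrame, all_col: list[str]) -> pd.DataFrame:
    """Hypothetical helper: reindex to the expected column set so every
    station's dataframe has the same columns, filling missing ones with NaN."""
    return df.reindex(columns=all_col)

# Toy retrieval result that is missing some of the expected columns.
raw = pd.DataFrame({
    "station_id": ["46042"],
    "date_time": ["2024-05-08T10:00:00Z"],
    "wind_speed (m/s)": [7.2],
})
fixed = normalize_columns(raw, WINDS_COLS)
print(list(fixed.columns))
```

This trades some NaN padding for a predictable schema; whether that trade is worth it is part of the merge-vs-dict discussion further down.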
See also: https://pypi.org/project/ndbc-api/ https://github.com/cdjellen/ndbc-api
@AliS-Noaa @aliabdolali
What is your preferred web API for downloading NDBC data?
Thanks
Hello Saeed,
Here are two ways I usually get the NDBC data:
https://github.com/NOAA-EMC/WW3-tools/blob/develop/ww3tools/downloadobs/wfetchbuoy.py
This is a good tool as well:
https://pypi.org/project/NDBC/
Cheers,

Ali Salimi-Tarazouj, Ph.D.
Physical Scientist, Coastal Engineer
Lynker at NOAA/NWS/NCEP/EMC
From a correspondence with one of our colleagues:

Near-real-time observations from NWS Fixed Buoys, NWS C-MAN stations, and many ROOA-operated buoys and coastal stations are available on the ndbc.noaa.gov web site. I don't know if NDBC has an API yet, but one can obtain their obs via HTTPS or DODS/OPeNDAP: https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf

However, I found that someone has written `ndbc-api` to "parse whitespace-delimited oceanographic and atmospheric data distributed as text files for available time ranges, on a station-by-station basis" (https://pypi.org/project/ndbc-api/). I also found ndbc.py at https://pypi.org/project/NDBC/. I imagine there are many others out there.
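The whitespace-delimited text files mentioned above can be parsed directly with pandas. A minimal sketch of the stdmet layout (two header rows: names, then units; `MM` marks missing values); the sample payload below is illustrative, not real observations:

```python
import io

import pandas as pd

# Illustrative payload in the realtime2 stdmet text layout described in the
# NDBC web data guide (values are made up, not real observations).
SAMPLE = """\
#YY  MM DD hh mm WDIR WSPD GST  WVHT   DPD   APD MWD   PRES  ATMP  WTMP  DEWP  VIS PTDY  TIDE
#yr  mo dy hr mn degT m/s  m/s     m   sec   sec degT   hPa  degC  degC  degC  nmi  hPa    ft
2024 05 08 10 50 200  7.0  9.0   1.2     8   5.5 210 1015.2  14.1  13.5  12.0 99.0 +0.0 99.00
2024 05 08 10 40 210  6.5  8.1   1.1     8   5.4 215 1015.4  14.0  13.5  12.1 99.0 +0.0 99.00
"""

def parse_stdmet(text: str) -> pd.DataFrame:
    """Parse a whitespace-delimited stdmet file: the first header row carries
    the column names, the second carries the units, the rest is data."""
    names = text.splitlines()[0].lstrip("#").split()
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",
        skiprows=2,          # skip both header rows
        names=names,
        na_values=["MM"],    # NDBC's missing-value marker
    )

df = parse_stdmet(SAMPLE)
print(df[["YY", "MM", "DD", "hh", "mm", "WSPD"]])
```

For live data one would fetch the station's text file over HTTPS (the per-station URL pattern is documented in the web data guide linked above) and pass the response body to the same parser.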
During our meeting on June 5th we discussed the following items/tasks related to NDBC data:
- Using the `ndbc-api` package, an alternative package, or writing from scratch
  - For now let's continue with a third-party package and resolve the following issues
- Possible issues with the third-party package license
  - `ndbc-api` is MIT-licensed, if we end up using it
- Is the third-party package already on conda-forge, or is there a plan for it to be?
  - If not, explore other NDBC packages
  - If we help add the Conda package, will the original developer maintain it?
  - Should we go ahead and just create and maintain a conda package ourselves? (ideally not)
- Is raw data available? (NDBC itself seems to apply QC)
- The issue of fetching station data one by one
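On the last point: since the per-station calls are independent, one common workaround is to parallelize them with a thread pool. A minimal sketch, where `fetch_station` is a stand-in for whatever single-station call the chosen package provides (it fabricates a tiny dataframe here so the sketch runs offline):

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def fetch_station(station_id: str) -> pd.DataFrame:
    """Stand-in for a single-station fetch (e.g. one HTTP request per station)."""
    return pd.DataFrame({"station_id": [station_id], "wvht": [1.0]})

def fetch_many(station_ids: list[str], max_workers: int = 8) -> dict[str, pd.DataFrame]:
    """Fetch several stations concurrently, keeping one dataframe per station."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frames = pool.map(fetch_station, station_ids)
    return dict(zip(station_ids, frames))

results = fetch_many(["41001", "46042", "tplm2"])
print(sorted(results))
```

Threads are a reasonable fit here because the real work is I/O-bound; this does not change the upstream package's one-station-at-a-time API, it just overlaps the waiting.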
Todo:
- [ ] @abdu558 to start a PR when the code is ready
- [x] @abdu558 to contact the third-party package developer and ask about the "conda"-related questions. Ans: they are open to creating and maintaining a conda package
- [x] @SorooshMani-NOAA to contact NDBC about raw data [email protected]
- [x] @abdu558 to check whether the web API provides multi-station retrieval. Ans: the upstream package uses a plain for loop for multiple stations
Hi @pmav99, today we discussed @abdu558's NDBC implementation. I suggested that he implement everything based on the "new" API (as in #125), but instead of using `_ndbc_api.py` as the file name, just use `ndbc.py`. What do you think?
Also we discussed whether to combine all data into a single dataframe or not and whether to keep the missing value columns, etc. I suggested discussing those in the next group meeting next week.
@abdu558, can you please summarize your questions here as well so that we can discuss them more constructively next week?
@abdu558, I forgot to ask: what is the state of the conda package for ndbc-api? You said they are open to creating the conda package themselves, right?
Yes, they did create it and said it would take a few days or so for it to show up.
Response from NDBC:
[...] We do not have an API though we are hopeful to develop one in the future.
Our FAQs might be a good place to start with your quality control questions: https://www.ndbc.noaa.gov/faq/
Thanks Soroosh. More on QC here: https://www.ndbc.noaa.gov/faq/qc.shtml. There is an exhaustive guide on the QC methodology (2009 version), and all the QC flags are summarized in Appendix E.
You answered most of them, but the one I'm not 100% sure about is what happens with multiple stations:

- an extra column called station id is added and the data of the different stations are combined into a single dataframe, or
- the output is a dictionary which maps each id to a dataframe of that station's data (this is the one I'm not 100% sure of)
@abdu558 different providers return different data. For example, when you try to retrieve data from a bunch of IOC stations you will end up with dataframes with different numbers of columns and different column names. E.g.:
https://www.ioc-sealevelmonitoring.org/bgraph.php?code=aden&output=tab&period=0.5&endtime=2018-06-07 https://www.ioc-sealevelmonitoring.org/bgraph.php?code=abed&output=tab&period=0.5&endtime=2018-06-07
Merging these will result in a bunch of columns full of NaNs. This is problematic because NaNs are floats and consume quite a bit of RAM; if you are retrieving hundreds or thousands of stations for many years, this can quickly add up.

Furthermore, since you can't really know in advance which column will have data for each station, you will end up calling .dropna() for every station id you want to process. That can also be problematic, because the provider might return NaNs anyhow and you might want to differentiate between those cases.
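The effect is easy to demonstrate with toy stand-ins for two stations reporting different sensor columns (made-up values, not real IOC output): concatenating them fills the gaps with NaN, which silently upcasts the affected columns to float64.

```python
import pandas as pd

# Two stations with disjoint column sets, as with the IOC examples above.
a = pd.DataFrame({"rad": [1, 2]})           # integer readings from one sensor
b = pd.DataFrame({"prs": [3], "enc": [4]})  # entirely different columns

combined = pd.concat({"aden": a, "abed": b}, names=["station", "i"])
print(combined)
# Missing cells become NaN, so every affected column is upcast to float64 --
# this is the RAM cost mentioned above.
print(combined.dtypes)
```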
Alternatively, you can just avoid merging in the first place. If somebody wants to merge the dictionary it is trivial to do so. E.g.:
```python
import pandas as pd

data = {
    "st1": pd.DataFrame(index=["2020", "2021"], data={"var1": [111, 222]}),
    "st2": pd.DataFrame(
        index=["2021", "2022", "2023"],
        data={"var2": [1, 2, 3], "var3": [0, float("nan"), float("nan")]},
    ),
}
merged = pd.concat(data, names=["station_id", "time"]).reset_index(level=0)
print(data)
print(merged)
```
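Going the other way is just as trivial, which is another argument for keeping the dictionary as the canonical form. A sketch of the round trip on toy data (same shape as the example above), recovering per-station frames with `groupby` and dropping the all-NaN columns each station never reported:

```python
import pandas as pd

data = {
    "st1": pd.DataFrame(index=["2020", "2021"], data={"var1": [111, 222]}),
    "st2": pd.DataFrame(index=["2021"], data={"var2": [1]}),
}
merged = pd.concat(data, names=["station_id", "time"]).reset_index(level=0)

# Round trip: recover the per-station dict from the merged frame.
split = {
    sid: grp.drop(columns="station_id").dropna(axis="columns", how="all")
    for sid, grp in merged.groupby("station_id")
}
print(split["st1"])
```

Note that the `dropna(how="all")` step is exactly the per-station cleanup discussed above; starting from the dictionary avoids it entirely.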