pvlib-python icon indicating copy to clipboard operation
pvlib-python copied to clipboard

Update CEC module and inverter tables with latest SAM files

Open kandersolar opened this issue 3 years ago • 15 comments

  • [x] Closes #1345
  • [x] I am familiar with the contributing guidelines
  • [x] Tests added
  • ~[ ] Updates entries in docs/sphinx/source/reference for API changes.~
  • [x] Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • ~[ ] New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.~
  • [x] Pull request is nearly complete and ready for detailed review.
  • [x] Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

It's a little concerning how many rows got dropped in each table (compare the line counts in the diff). A very brief and non-comprehensive comparison suggests that the upstream CEC files did indeed drop a lot of rows, so I guess it is what it is.

kandersolar avatar Mar 24 '22 20:03 kandersolar

Not sure why the checks aren't showing up for the latest commit; github's webhooks have been having issues this week, but githubstatus.com doesn't show anything amiss at the moment. Closing and reopening...

kandersolar avatar Mar 24 '22 21:03 kandersolar

I guess this is ready for review. I'm still very confused about the missing rows. For example there's not a single Canadian Solar module in either of these two tables on the CEC website, so they're not in these SAM files either. Am I missing something?

  • https://solarequipment.energy.ca.gov/Home/PVModuleList
  • https://www.energy.ca.gov/media/2367

kandersolar avatar Mar 24 '22 22:03 kandersolar

Could we have a file that is the union of the current and older files? Perhaps with a date or version label added?

adriesse avatar Mar 24 '22 22:03 adriesse

A merged file seems like the best outcome, but it might be difficult to achieve. See https://github.com/pvlib/pvlib-python/issues/761#issuecomment-527606811 and https://sam.nrel.gov/forum/forum-general/3779-missing-pv-modules-and-inverters.html

When they open in a few hours I will call the CEC to ask about these removals; maybe there is some mistake.

kandersolar avatar Mar 25 '22 12:03 kandersolar

The CEC has been removing equipment from their list when certifications expire. https://www.energy.ca.gov/files/equipment-noticed-removal

Some time ago, I had a conversation wtih SAM developers about retaining obsolete listings. The compromise was to retain old versions of the SAM files but I've lost track of that location.

cwhanse avatar Mar 25 '22 13:03 cwhanse

I doubt it's technically difficult. Alternatively pvlib can just add rather than replace new versions of the files.

adriesse avatar Mar 25 '22 13:03 adriesse

I doubt it's technically difficult.

It would seem not, but in fact is an onerous task to merge these files. The NREL process relies on the CEC list's manufacturer and equipment identifiers to assign a key, and CEC changes the format of those fields every few years. To make matters worse, submitters provide the values so when a company or brand changes hands, the module identifier may or may not change. The result is the same module with different keys; to avoid duplicate or competing entries, one would have to merge these lines manually.

Alternatively pvlib can just add rather than replace new versions of the files.

Yes, at the risk of getting user questions about different parameter sets and record structures.

I am convinced the long-term solution is to build out pvlib/pvmodules to have public code that generates the parameters, and to curate the inputs by frequently pulling the CEC list, identifying new items, and adding the content to a pvmodules database, thus accumulating an archival list in a common format.

cwhanse avatar Mar 25 '22 13:03 cwhanse

To make my view clear, @adriesse suggested "Alternatively pvlib can just add rather than replace new versions of the files."

For this PR, I agree with this suggestion.

cwhanse avatar Mar 25 '22 14:03 cwhanse

pvsystem.retrieve_sam currently has no concept of different versions of the same table. Would we add dates to the accepted name values, e.g. retrive_sam(name='cecmod-2021-12-01')? Would name='cecmod' point to the new or the old table?

kandersolar avatar Mar 25 '22 15:03 kandersolar

this might break existing code, but my preference would be to reference the file by name. Maybe not even ship it with pvlib, but tell folks you can retrieve from NREL/SAM GitHub repo: NREL/SAM/deply/lbraries/CEC_Modules.csv

For example:

  1. download the CSV from NREL/SAM gh repo
  2. retrieve_sam(file='path/to/locally/downloaded/CEC_Modules.csv')

Are there missing steps, EG: is the SAM file pre-processed somehow for pvlib?

Another option, would be to leverage the Version column in CEC_Modules.csv and just merge or append the lists together, and force the user to select the version they want as a separate parameter:

retrieve_sam(name='cecmod', version='SAM 2021.12.02')  # use versions=None for latest

there's also a Date field

mikofski avatar Mar 25 '22 15:03 mikofski

tell folks you can retrieve from NREL/SAM GitHub repo

The issue here is that SAM decided to replace the file at each SAM release, so the latest version has the missing/dropped module problem. The archive of parameters for modules dropped by the CEC is the set of files for previous SAM versions.

leverage the Version column

that's a good idea.

cwhanse avatar Mar 25 '22 16:03 cwhanse

I've added a version parameter to switch between the two versions. I've kept the version data files separate instead of stacking them into one table, in part because they don't have exactly the same columns and in part because I just thought it was cleaner. Is this more or less what folks were thinking or am I barking up the wrong tree?

Edit: I couldn't get the CEC office on the phone so I emailed them instead. No reply yet.

kandersolar avatar Mar 25 '22 20:03 kandersolar

Personally I would recommend posting those files somewhere online (zenodo?), removing them from the distribution, and forcing users to manually specify a file path. (I think this is similar to Mark's first suggestion). I'm not too thrilled with complicating the pvlib api for file access. But no worries if you want to go this way.

wholmgren avatar Mar 25 '22 20:03 wholmgren

SAM decided to replace the file at each SAM release

I believe I can crawl their GitHub history to retrieve the older files

the long-term solution is to build out pvlib/pvmodules to have public code that generates the parameters, and to curate the inputs by frequently pulling the CEC list, identifying new items, and adding the content to a pvmodules database

That’s half of what pvfree already does. If only I had more time to work on it. And I feel your pain, merging and parsing constantly shifting model & mfg names is why I made the database to begin with

mikofski avatar Mar 26 '22 04:03 mikofski