Refactor `convert_input.R` to Optimise Workflow

Open Sweetdevil144 opened this issue 8 months ago • 0 comments

Tasks

  • [x] Combine the multiple `PEcAn.DB::db.query` calls so that the required data is fetched at once, and base further checks on that result.

Condense database lookups so that they happen all at once, in one or a couple of steps, rather than spread out through every code path.
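
One way to picture the condensed lookup is the minimal sketch below; it is not the actual `convert_input` code. The helper name `fetch_input_context`, the `query_fn` argument, and the BETY-style table and column names (`inputs`, `dbfiles`, `machines`) are assumptions for illustration. `query_fn` defaults to `PEcAn.DB::db.query` but is replaced by a stub here so the example runs without a database.

```r
# Hypothetical helper: fetch an input record together with its files and
# host machine in ONE query, instead of separate lookups per code path.
# Table and column names are assumed (BETY-style) and may need adjusting.
fetch_input_context <- function(input_id, con,
                                query_fn = PEcAn.DB::db.query) {
  # Coerce to a number so that arbitrary text never reaches the SQL string
  input_id <- as.numeric(input_id)
  stopifnot(length(input_id) == 1, !is.na(input_id))
  sql <- paste0(
    "SELECT i.id AS input_id, i.site_id, i.format_id, ",
    "i.start_date, i.end_date, ",
    "d.id AS dbfile_id, d.file_name, d.file_path, ",
    "m.hostname ",
    "FROM inputs i ",
    "LEFT JOIN dbfiles d ON d.container_type = 'Input' ",
    "AND d.container_id = i.id ",
    "LEFT JOIN machines m ON m.id = d.machine_id ",
    "WHERE i.id = ", format(input_id, scientific = FALSE)
  )
  query_fn(sql, con)
}

# Usage with a stub in place of the database (no PEcAn.DB needed).
# The returned values are made up for illustration.
n_queries <- 0
stub_query <- function(query, con) {
  n_queries <<- n_queries + 1
  data.frame(input_id = 42, site_id = 7, format_id = 33,
             hostname = "localhost", stringsAsFactors = FALSE)
}
ctx <- fetch_input_context(42, con = NULL, query_fn = stub_query)

# Later checks read from `ctx` instead of issuing new queries.
has_local_copy <- any(ctx$hostname == "localhost")
```

Injecting the query function also makes each extracted step testable without a live BETY connection, which fits the goal of depending less on the database.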

  • [ ] Reuse, reduce, and pass function arguments wherever possible, rather than looking information up in the DB every time it is needed.

How can we be sure that we have the info available to us offline?

I’m thinking less of maintaining the same information in another location and more of thinking about things

  • that can be passed around as function arguments rather than looked up again
  • that are currently looked up but never used
  • that are currently used but that we can find a way to do without entirely, etc.

--> Chris in Slack Link
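
The first of these points could look roughly like the sketch below. The function `prepare_met_request`, its arguments, the stub lookup, and the site id and coordinates are all hypothetical; the point is only the pattern of looking a value up once and handing it on as an argument.

```r
# Hypothetical step function: it accepts `site` as an argument and only
# falls back to a lookup when the caller did not supply one.
prepare_met_request <- function(site_id, start_date, end_date,
                                site = NULL, lookup_site = NULL) {
  if (is.null(site)) {
    if (is.null(lookup_site)) {
      stop("Provide either `site` or a `lookup_site` function")
    }
    site <- lookup_site(site_id)
  }
  list(site_id = site_id, lat = site$lat, lon = site$lon,
       start_date = start_date, end_date = end_date)
}

# The caller looks the site up once (a stub stands in for the database)...
lookups <- 0
lookup_site_stub <- function(site_id) {
  lookups <<- lookups + 1
  list(lat = 40.1, lon = -88.2)
}
site <- lookup_site_stub(772)

# ...and passes it to every later step, so no further lookups happen.
req_a <- prepare_met_request(772, "2004-01-01", "2004-12-31", site = site)
req_b <- prepare_met_request(772, "2005-01-01", "2005-12-31", site = site)
```

Keeping the fallback lookup optional means existing callers that do not yet have the site record in hand could still work while code paths are migrated one at a time.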

  • [ ] The big-picture goal is to make the entire workflow less dependent on `PEcAn.DB`.

Low Priority Tasks

These may be decided after discussion with the PEcAn admins. They can be ignored for now, but they can also be tackled.

  • [ ] Improper variable declarations inside restricted code blocks, which may cause variable out-of-scope errors
  • [ ] Small support as discussed with Chris: remove support for the `db.site.lat.lon` function completely and replace it with `query.site(params)[c("lat", "lon")]`
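
The `db.site.lat.lon` replacement might be sketched as a thin wrapper like the one below. The name `site_lat_lon` is hypothetical, and the sketch assumes that `PEcAn.DB::query.site(site.id, con)` returns a one-row data frame with `lat` and `lon` columns (or `NULL` when the site is missing), which should be checked against the installed PEcAn.DB version. On a data frame, selecting both columns is written `[c("lat", "lon")]`.

```r
# Hypothetical drop-in for db.site.lat.lon(): derive lat/lon from the full
# site record. `query_site` defaults to PEcAn.DB::query.site (behaviour
# assumed, see above) and is replaced by a stub below.
site_lat_lon <- function(site_id, con, query_site = PEcAn.DB::query.site) {
  site <- query_site(site_id, con)
  if (is.null(site) || nrow(site) == 0) {
    stop("Site not found: ", site_id)
  }
  # Same shape as a list(lat = ..., lon = ...) return value
  as.list(site[1, c("lat", "lon")])
}

# Stub standing in for the database; the values are made up.
stub_query_site <- function(site.id, con) {
  data.frame(id = site.id, sitename = "example", lat = 40.1, lon = -88.2)
}
ll <- site_lat_lon(772, con = NULL, query_site = stub_query_site)
```

Because the full site record is already in hand after this call, callers that also need other site fields could reuse it rather than querying again.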

Suggestion

Maybe we can try the following as a better starting point:

  • [ ] Break `convert_input` and `do_conversions` into several smaller functions. Once this is done, it should become much clearer where the file can be optimised.
  • [ ] Eliminate bad system config. For example, in `convert_input` we have `} else if (ensemble > 1) {`, where `ensemble` was previously declared to be of type bool.
  • [ ] Eliminate other WARN types that shouldn't be occurring.
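
On the `ensemble` point: in R, a logical compared with a number is coerced, so `FALSE > 1` and `TRUE > 1` are both `FALSE`, and a branch guarded by `ensemble > 1` can never run while `ensemble` is a logical. A minimal sketch of one possible cleanup follows; the helper name `normalise_ensemble` and the choice to treat `TRUE`/`FALSE` as a single member are assumptions, not the current PEcAn behaviour.

```r
# Hypothetical helper: turn the `ensemble` argument into an explicit member
# count so later branches compare integers, never a logical.
normalise_ensemble <- function(ensemble) {
  stopifnot(length(ensemble) == 1)
  if (is.logical(ensemble)) {
    # Assumption: TRUE/FALSE carries no count, so treat it as a single run
    return(1L)
  }
  n <- suppressWarnings(as.integer(ensemble))
  if (is.na(n) || n < 1) {
    stop("`ensemble` must be FALSE or a number >= 1")
  }
  n
}

n_members <- normalise_ensemble(FALSE)     # single run
is_ensemble <- normalise_ensemble(10) > 1  # integer comparison, as intended
```

Normalising once at the top of the function would let every later condition rely on a single, documented type.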

Context

The proposed changes are part of the GSoC project aimed at optimizing the PEcAn workflow. Currently, the workflow involves multiple database lookups and queries, which can lead to performance issues.

Possible Implementation

TBD, or as mentioned in the task points above.

Sweetdevil144 · Jun 19 '24 11:06