pyclaw icon indicating copy to clipboard operation
pyclaw copied to clipboard

Plotting failure from trying to read num_ghosts in example inputs

Open gridley opened this issue 2 years ago • 16 comments

Hi,

I recently was hoping to plot some output from the examples, and I found the solver stage working fine. However, for plotting, it seems there is a slight issue. The fort.tXXXX files do not have the num_ghost field, but this is attempted to be read by the plotting utility in fileio/ascii.py's read_t function.

The fix would be either writing num_ghost in the output files, or not failing when an attempt is made to read this field and the EOF gets reached.

If reproducing this is not straightforward, happy to give specifics on my setup.

gridley avatar Aug 22 '22 20:08 gridley

Usually that has to do with a version discrepancy as we added ghost cell values when we added (awhile ago) binary output. Can you provide a reproduction or at least version numbers that you are at?

mandli avatar Aug 22 '22 23:08 mandli

Sure! This was from git cloning and installing via setup.py yesterday. Couldn't get pip to work. I am afraid that the fortran binaries and the python code are inconsistent versions, but don't know how I could check that.

This came when I tried to plot the example output for the 1D dam break problem. Just giving htmlplot=True on the command line reproduces it.

gridley avatar Aug 23 '22 16:08 gridley

I ran a few tests that made me realize that the pip problems may be indicative of a wider problem. What errors did you get when you tried to install with pip?

mandli avatar Aug 24 '22 00:08 mandli

Here you go!

Collecting clawpack
  Downloading clawpack-5.9.0.tar.gz (5.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.8/5.8 MB 27.8 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Discarding https://files.pythonhosted.org/packages/7b/3a/6b727ea26bf2945946e474ce93988722bdadffe6b9c1c1ec3935ed0806ec/clawpack-5.9.0.tar.gz (from https://pypi.org/simple/clawpack/): Requested clawpack from https://files.pythonhosted.org/packages/7b/3a/6b727ea26bf2945946e474ce93988722bdadffe6b9c1c1ec3935ed0806ec/clawpack-5.9.0.tar.gz has inconsistent version: filename has '5.9.0', but metadata has '0.0.0'
  Downloading clawpack-5.8.2.tar.gz (5.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 33.8 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      error: Multiple top-level packages discovered in a flat-layout: ['pyclaw', 'classic', 'amrclaw', 'geoclaw', 'visclaw', 'riemann', 'clawpack', 'clawutil'].

      To avoid accidental inclusion of unwanted files or directories,
      setuptools will not proceed with this build.

      If you are trying to create a single distribution with multiple packages
      on purpose, you should not rely on automatic discovery.
      Instead, consider the following options:

      1. set up custom discovery (`find` directive with `include` or `exclude`)
      2. use a `src-layout`
      3. explicitly set `py_modules` or `packages` with a list of names

      To find more information, look for "package discovery" on setuptools docs.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

gridley avatar Aug 31 '22 20:08 gridley

@gridley Please install using the command given on this page:

pip install --src=$HOME/clawpack_src --user -e \ git+https://github.com/clawpack/[email protected]#egg=clawpack-v5.9.0 \ --use-deprecated=legacy-resolver

ketch avatar Sep 01 '22 08:09 ketch

The error comes from trying to read the number of ghost cells here: pyclaw/src/pyclaw/fileio/ascii.py in the read_t function

with open(path,'r') as f:
        t = read_data_line(f)
        num_eqn = read_data_line(f, data_type=int)
        nstates = read_data_line(f, data_type=int)
        num_aux = read_data_line(f, data_type=int)
        num_dim = read_data_line(f, data_type=int)
        num_ghost = read_data_line(f, data_type=int)

ghost cells are not in the generated fort file.

CaliShulz avatar Sep 28 '22 14:09 CaliShulz

Do you have an older version of the installation? This extra line was added awhile ago to help with binary IO. It also implies that you were able to install things so cannot be the same error as you were having before.

mandli avatar Sep 28 '22 15:09 mandli

It seems this is actually a real bug, introduced in 6e284b91a975ef580bb60c53cf0119821cc9ec1c.

The PyClaw ascii writer does not write num_ghost to the file, because PyClaw doesn't need it. I guess that GeoClaw and AMRClaw do write num_ghost to the file? Is there a reason for that?

ketch avatar Oct 07 '22 08:10 ketch

Indeed, writing num_ghost to the output is not possible with the current setup in PyClaw, since only the solver knows about num_ghost, and the file writer doesn't have access to the solver. In my view, from a high-level software design perspective, information like num_ghost should not be in the output since it isn't actually a property of the solution (unless you are writing ghost cell values to the output file?). Presumably there is a reason that I don't know about.

Paging @rjleveque and @mandli.

ketch avatar Oct 07 '22 08:10 ketch

We had to start printing num_ghost when we added the binary output option in amrclaw/geoclaw, since the binary dump is the entire q array, including ghost cells. So we need to know how many cells to strip off after reading in to Python.

Perhaps for PyClaw you could just always write out num_ghost = 0 since you never need to strip cells when reading back in? When reading ASCII files I think this number is ignored anyway, although for more consistency we could also write num_ghost = 0 in the ASCII case for amrclaw/geoclaw too, and interpret this as the number of ghost cells in the output (not the computation).

rjleveque avatar Oct 07 '22 13:10 rjleveque

We really need to have this be uniform across PyClaw and the other packages as we rely on this for plotting. We also should have access to the number of ghost cells so we can simply put the correct number in there. I am a bit puzzled as to how this slipped through the cracks though. This has been in the code though for a bit so I am a bit confused.

mandli avatar Oct 07 '22 14:10 mandli

Binary output has been in for a long time but I recently (in the commit @ketch referenced) tried to streamline things and use the same read_t function in all repos, so I must have missed the fact that the pyclaw version wasn’t reading num_ghost.

rjleveque avatar Oct 07 '22 19:10 rjleveque

It slipped through because the PyClaw tests aren't functioning (my fault).

I feel that the root of the problem is that in some cases we are writing to file some values (the ghost cell q values) that we actually don't want. The most elegant solution would be to avoid writing those values, rather than to modify the code everywhere else to remove them later. Is that possible?

There are multiple easy ways to "fix" the problem on the PyClaw end, but cleaning up bad code was the point of the commit from @rjleveque , and I feel like we're now just going to exchange one ugly bit of code for another.

ketch avatar Oct 10 '22 13:10 ketch

It turns out that it's actually useful to have the ghost cell values in the output for some applications, in particular if you want to compute the vorticity of the velocity field as we have done in some tsunami applications. This requires differencing the velocity on patches, and using centered-differences everywhere is good to avoid artifacts at patch edges in AMR simulations, which requires having the ghost cell values.

But that is still only an AMR issue that probably doesn't apply in PyClaw. We could go back to having slightly different versions of the read_t function if that seems preferable for PyClaw.

rjleveque avatar Oct 10 '22 13:10 rjleveque

If MPI is used in PyClaw, does each MPI process write out its own grid? In this case, it might be useful to have ghost cells for the reason Randy mentioned above (computing vorticity).

Another reason for ghost cells is that in any scenario where the output consists of multiple grids, contour lines will generally break at patch boundaries. With the availability of ghost cells, one can imagine improving the graphics to get continuous contour line plots.

donnaaboise avatar Oct 10 '22 15:10 donnaaboise

Okay, you have valid reasons for including the ghost cell values in output. :-)

I will prepare a patch so this doesn't break PyClaw. Probably @rjleveque's suggestion to just write num_ghost=0 is the cleanest.

ketch avatar Oct 11 '22 13:10 ketch