sw4 Memory/computation enhancement through bypassing recording of sides over source precursor

This PR allows for an option to save memory usage of FWI through bypassing recording fields at boundaries before waves reach the boundaries or over a fraction of source precursor. The estimated minimum time to reach boundaries for P- or S-wave is generated during travel time preprocessing and can be updated if the velocity model receives significant changes. A test on the Gaussian anomaly confirms no adverse effect on inversion convergence.

Example usage: mrun skip_precursor=yes (no by default)

Mar 01 '22 21:03 wliuphd

When the new option skip_precursor is activaded, the code does not pass the tests under pytest-sw4mopt/. The first test gradtest/grad.in is reported as a PASS, but looking at the numbers, the relative difference in the gradient is actually big (I should revise the tolerance settings in the python script), the tests under hesstest/hesstest.in and onepoint/inv.in are reported as FAIL. If I understand the code correctly, the adjoint/backward problem is not solved all the way down to time zero, could that be the explanation for the failure to reproduce the reference results ?

Mar 03 '22 18:03 bjorn2

When skip_precursor is turned on, the adjoint/backward problem is not solved all the way down to time zero. That could be the explanation for the failure to reproduce the reference results.

Mar 03 '22 18:03 wliuphd

Wei, could you fix that problem, and see if it helps ?

Mar 03 '22 18:03 bjorn2

The code shall pass all tests with skip_precursor off. Is that the case? Since the reference data for gradtest and hesstest assume time zero, the option with skip_precursor on shall not be compared for those tests?

Mar 03 '22 18:03 wliuphd

Once your new feature is properly implemented all of the current test need to pass when skip_precursor is enabled. Follow Bjorn's advise above. To properly compute the gradient, the adjoint equation must be solved all the way back to time zero. It is incorrect to stop it just because the field is zero in the outflow boundary.

Mar 03 '22 18:03 andersp

The gradient is cut within t0 before reaching zero because the cross-correlation contribution is confined very locally to the source and shall have little effect on velocity perturbation. It is equivalent to muting gradient around source. I will take a look at gradtest and hesstest to see the difference in gradient.

Mar 03 '22 19:03 wliuphd

Wei, just to clarify. I have not studied your code in detail, but assuming that you save memory by not storing the solution at the outflow boundaries for the first time steps when it is zero, then I think it is reasonable that it should be loss-free, ie., that the computed gradient should be identical with and without the feature turned on. If that is not how you designed the algorithm, then we need to discuss more. However, I thought it was a problem with the implementation. The idea that the observed differences in gradients is caused by not solving to time zero in the backward solver, is just a suggestion where to start looking, but it is possible that there is another explanation.

Mar 04 '22 17:03 bjorn2

The original design was lossy to save compute time when the gradient is spatially localized to the source. The new revision (less aggressive) removes such a time stepping truncation by estimating the minimum time reaching boundaries or a default fraction of source t0. The revision has passed tests on grad, hess, midpoint, and inv, even when skip_precursor is enabled.

Mar 22 '22 22:03 wliuphd

@wliuphd I am quite concerned about your changes. In moptmain.C I see code with a lot of ad hoc scaling factors based on t0, but nothing about the source frequency. This must mean that you are assume that the frequency is sufficiently high in the time function. You have to get away from all such assumptions. The only safe way of doing that it to actually check the amplitude of the velocity on the outflow boundary. It is only safe to not record the motion when the amplitude is smaller than some small threshold value. However, and more seriously, I am actually not sure why you are working on this at all? Instead it would be more productive to take a look at our milestones and identify what's needed for material inversion with topography.

Here is a snippet of code from moptmain.C that makes me question your approach: int step_to_record = ft==NULL? int(a_Sources[0]->getTimeOffset()/dt0.38) : int(t_trunc0.9/dt);
if(myrank==0) printf("t_trunc=%g\tstep-to-record=%d out of two truncation criteria %d and %d\n", t_trunc, step_to_record, int(a_Sources[0]->getTimeOffset()/dt0.38), int(t_trunc0.9/dt));

Mar 24 '22 15:03 andersp

The main cutoff criterion is the minimum propagation time (t_trunc) from source to the sides based on the velocity model. The main motivation is to save memory storage for high resolution simulation so we can run 1 Hz FWI using less resources.

Mar 24 '22 18:03 wliuphd

The fundamental problem with your approach is that it doesn't take the spread of the source time function into account. For this reason there will be cases where the truncation will lead to inaccurate gradients. I suggest you withdraw this PR and instead focus on the upcoming milestones

Mar 24 '22 18:03 andersp

The frequency-dependent precursor will still take the minimum propagation time (t_trunc based on the velocity model) to reach boundaries, so the saving on recording before t_trunc ought to not affect the reconstruction of the spread of the source time function and its associated gradient. But I agree with you to put aside this PR since the default value involving scaling may need a second thought. I will move on to the other milestone.

Mar 25 '22 00:03 wliuphd

sw4 sw4 copied to clipboard

Memory/computation enhancement through bypassing recording of sides over source precursor

sw4
sw4 copied to clipboard