
OpenFAST computational time for floating systems when using tight coupling

RBergua opened this issue 5 months ago • 14 comments

As part of the OC7 Phase II project (floating offshore wind turbine with flexible substructure in SubDyn), we have started using OpenFAST dev-tc.

We are running the same model with the same settings in OpenFAST v4.1.1 and OpenFAST dev-tc.

We are observing that the computational time is significantly higher when using the tight coupling solution.

Below you can see a comparison for a model with gravity-only conditions, still water, and running 7 platform surge offsets that are 10 seconds long each. [Image: computational-time comparison]

For reference, in all cases we are using executables compiled in 64-bit and single precision.

We are also comparing the computational time when including the rigid-body reference point in SubDyn: https://github.com/OpenFAST/openfast/pull/2782

The same test has been performed for the same system using gravity-only conditions, still water, and a simulation that is 500 seconds long. [Image: computational-time comparison]

For reference, the simulations performed do not have any issues related to convergence or stability. Similar performance has been observed for other floating system configurations (e.g., USFLOWT).

RBergua · Jul 29 '25 19:07

Dear @RBergua,

Thanks for reporting this issue. Can you confirm the solver settings used for v4.1.1 (DT, InterpOrder, NumCrctn, DT_UJac, UJacSclFact) and for dev-tc (DT, InterpOrder, NumCrctn, RhoInf, ConvTol, MaxConvIter, DT_UJac, UJacSclFact)? Have you tried playing with the solver settings in dev-tc to improve the performance, e.g., increasing DT?

@deslaughter -- I presume this impact on performance is related to the size of the Jacobian, which not only includes the inputs of the implicitly coupled modules (ED, SD, HD), which would have existed in loose coupling, but also all of the inputs and accelerations of the tightly coupled modules (ED, SD). Also, any update on including an option for loose coupling in the dev-tc branch?

Best regards,

jjonkman · Jul 29 '25 20:07

Are you using MoorDyn, or MAP++?

andrew-platt · Jul 29 '25 22:07

Hi @jjonkman,

Both versions use the same settings (RhoInf, ConvTol, MaxConvIter only apply to dev-tc):

       0.01   DT              - Recommended module time step (s)
          2   InterpOrder     - Interpolation order for input/output time history (-) {1=linear, 2=quadratic}
          1   NumCrctn        - Number of correction iterations (-) {0=explicit calculation, i.e., no corrections}
        1.0   RhoInf          - Numerical damping parameter for tight coupling generalized-alpha integrator (-) [0.0 to 1.0]
       1e-4   ConvTol         - Convergence iteration error tolerance for tight coupling generalized alpha integrator (-)
         20   MaxConvIter     - Maximum number of convergence iterations for tight coupling generalized alpha integrator (-)          
      99999   DT_UJac         - Time between calls to get Jacobians (s)
    1000000   UJacSclFact     - Scaling factor used in Jacobians (-) 

In the USFLOWT project (DLC 6.1) I was unable to remove the correction iteration or increase the time step. The solver would diverge.

Let me test here with the OC7 Phase II model for correction iterations and time steps and come back to you.

RBergua · Jul 29 '25 22:07

In the OC7 Phase II model we are using MAP++, @andrew-platt. In the USFLOWT model we are using MoorDyn.

RBergua · Jul 29 '25 22:07

Thanks for clarifying, @RBergua. Another thing you could try is reducing RhoInf to 0.5 (as a good balance between accuracy and numerical damping) and MaxConvIter to 6 (which is what I thought @deslaughter recommended).

Perhaps @deslaughter has other suggestions.

jjonkman · Jul 29 '25 23:07

@jjonkman The last time that I did performance profiling, the majority of the computation was in building the solver Jacobian, not solving for state and input updates. Though, with 20 convergence iterations, that may not be the case.

@RBergua If you can send me the model, I'll run it through the profiler and see where it's spending the most time. I have a feeling that we're needing to update the Jacobian very frequently and that the HydroDyn Jacobian is dominating the time, but it's best to run it and find out. If you look at the output file for the NumUJac field, you can see how often the Jacobian is being updated (assuming you're outputting at every time step). If it's less than once per second, we're going to be really slow.
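
A minimal post-processing sketch along these lines (my own, not an official OpenFAST utility) can count the NumUJac increments directly from the text .out file. The file name is a placeholder, and the header parsing assumes the usual layout with a channel-name row starting with "Time" followed by a units row:

    # Sketch: report the times at which the solver Jacobian is recomputed,
    # by watching the NumUJac output channel increase.
    import numpy as np

    def jacobian_update_times(out_file, channel="NumUJac"):
        with open(out_file) as f:
            lines = f.readlines()
        # Locate the channel-name row (assumed to start with "Time"),
        # with the units row immediately after it and data after that.
        for i, line in enumerate(lines):
            cols = line.split()
            if cols and cols[0] == "Time":
                header, first_data = cols, i + 2
                break
        data = np.loadtxt(lines[first_data:])
        t = data[:, header.index("Time")]
        n_ujac = data[:, header.index(channel)]
        # Times at which the Jacobian counter increments
        return t[np.where(np.diff(n_ujac) > 0)[0] + 1]

    # e.g.: updates = jacobian_update_times("model.out"); print(len(updates), updates[:5])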

deslaughter · Jul 30 '25 15:07

@jjonkman Using RhoInf = 0.5 and MaxConvIter = 6 does not improve the computational time much. I'm still in the ballpark of 4,800 seconds for a simulation that is 500 seconds long, whereas OpenFAST v4.1.1 runs it in 1,820 seconds.

@deslaughter The model does not seem to be recomputing Jacobians. It seems to find a converged solution after 5 iterations. See below for reference. Also, keep in mind that this is not a challenging loading condition (gravity-only in still water). For the USFLOWT (DLC 6.1) it computes Jacobians many times and becomes incredibly slow. [Image: convergence-iteration output]

I'm preparing the model to be shared with you, @deslaughter.

RBergua · Jul 30 '25 17:07

@RBergua Thanks for providing the model; it was helpful to run the profile and see where the time was being spent. I agree that the solver Jacobian is only being calculated at the beginning of the simulation and doesn't really impact the overall runtime. Solving the Jacobian five times per time step is taking about 80% of the simulation time, and HydroDyn_CalcOutput alone is taking 11%. For this model, the solver Jacobian has 40 states, 4416 inputs, and 4410 outputs. About 4000 of the inputs and outputs are in the HydroDyn Morison motion and loads meshes. Short of reworking the transfer of the platform mesh to the Morison mesh, I don't think there's much of a way to speed it up other than reducing the size of the Morison mesh.
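
To put those numbers in rough perspective, here is a back-of-the-envelope sketch using only the figures quoted in this thread (not profiler output); the flop count assumes triangular substitutions against an already-factored Jacobian, which may not match the actual implementation:

    # Rough scale of the linear-algebra work implied by the numbers above:
    # DT = 0.01 s, 500 s simulation, ~5 convergence iterations per step, and a
    # solver system of roughly 40 states + 4416 inputs.
    dt, t_final = 0.01, 500.0
    iters_per_step = 5
    n_sys = 40 + 4416                      # approximate solver system size

    n_steps = int(t_final / dt)            # 50,000 time steps
    n_solves = n_steps * iters_per_step    # ~250,000 linear solves
    flops_per_solve = 2 * n_sys**2         # forward + back substitution (approx.)
    print(f"{n_solves:,} solves of a {n_sys}x{n_sys} system, "
          f"~{n_solves * flops_per_solve / 1e12:.1f} Tflop in substitutions alone")

If the substitutions are indeed the dominant piece, the per-iteration cost scales with the square of the system size, which is why trimming the Morison mesh is the most direct lever.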

@jjonkman Wouldn't we still see an issue here even doing the loose-coupling approach inside this framework because the dYdu, dUdu, and dUdy matrices are still going to be quite large?

deslaughter · Jul 30 '25 19:07

Thanks for the feedback, @deslaughter. Definitely, the HydroDyn discretization could be improved on our side. It's too fine and doesn't provide any added value.

I guess the fundamental question is: why is OpenFAST dev-tc significantly slower than OpenFAST v4.1.1? And I understand that the answer is that the Jacobian system is solved five times per time step due to the convergence iterations.

RBergua · Jul 30 '25 20:07

Btw, regarding the question about sensitivity to the time step and correction iterations: I'm not observing differences in computational time between using NumCrctn = 0 or 1. Also, I'm not sure about the role of NumCrctn when using MaxConvIter in dev-tc. Finally, if I try to increase the time step significantly (e.g., from 0.01 s to 0.05 s), I end up with convergence issues.

RBergua · Jul 30 '25 20:07

Thanks for these added details.

I agree with @deslaughter that the size of the Jacobian should not be much different between loose and tight coupling; if I understand correctly, the Jacobian size is 4416 in loose coupling and 4416 + 40/2 = 4436 in tight coupling. The extra computational cost in tight coupling is thus likely related to the Newton iterations, with tight coupling iterating 5 times per time step and loose coupling only ever iterating once, because it does not have a tolerance check. This suggests that increasing ConvTol might be a possible way to improve the solution time of tight coupling, although I'm not sure if this would lead to a divergence of the states.

@RBergua -- Can you see if increasing ConvTol improves the solution time without suffering divergence? Regarding the role of NumCrctn and MaxConvIter in tight coupling, NumCrctn controls an outer-loop iteration and MaxConvIter controls an inner-loop iteration (i.e., the Newton iteration) within the solver. I would think increasing NumCrctn from 0 to 1 would always increase the computational expense, although perhaps this is minimal if ConvTol is met on the first pass of the Newton iteration within the correction step. But if this is the case, setting NumCrctn = 0 should be OK.
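
To illustrate the nesting, here is a toy scalar example of my own (not the glue-code source): NumCrctn wraps the outer correction pass, while MaxConvIter and ConvTol bound the inner Newton loop, which reuses a Jacobian that is held fixed (analogous to the large DT_UJac used here):

    # Toy, self-contained illustration of the NumCrctn / MaxConvIter nesting.
    def step(x, dt, NumCrctn=1, MaxConvIter=6, ConvTol=1e-4):
        # Toy implicit update x_new = x + dt*f(x_new) with f(x) = -x**3.
        f = lambda z: -z**3
        J = 1.0 - dt * (-3.0 * x**2)        # Jacobian built once and reused
        x_new = x                           # predictor
        for _ in range(NumCrctn + 1):       # outer correction loop (in OpenFAST this
                                            # would also re-advance the other modules)
            for k in range(MaxConvIter):    # inner Newton loop
                r = x_new - x - dt * f(x_new)
                if abs(r) < ConvTol:        # converged within tolerance
                    break
                x_new -= r / J              # correction with the stored Jacobian
        return x_new, k + 1

    print(step(1.0, 0.01))                  # second value ~ convergence iterations used

In this picture, NumCrctn = 0 with a healthy MaxConvIter already gives the implicit solve, which would match the observation above that an added correction pass costs little once ConvTol is met on its first inner iteration.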

Best regards,

jjonkman · Jul 31 '25 13:07

@jjonkman Increasing the convergence tolerance (e.g., from 1E-4 to 1E-3) allows higher convergence errors and therefore fewer convergence iterations. This translates into better computational times. But it is still much slower than OpenFAST v4.1.1, since 4 iterations per time step are necessary. [Images: computational time and convergence-iteration results with ConvTol = 1E-3]

Also, I'm not sure if this larger convergence tolerance will be enough to avoid the solver diverging in more challenging simulation conditions.

RBergua · Jul 31 '25 17:07

@RBergua I was able to take ConvTol up to 7e-2, which resulted in 2 convergence iterations and a runtime comparable to OpenFAST v4.1.1 (~1,600 seconds on my laptop). I can't speak much to the quality of the solution: some of the outputs are very similar, but some are not. The model didn't experience any convergence issues, but the states probably aren't converging as well. I wonder if the error exceeding the tolerance is due to the large forces at the tower-platform interface. Increasing UJacSclFact may help because the overall mass of the system is much higher.
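
For anyone wanting to repeat this trade-off study systematically, something like the following sweep could be scripted. The executable and model file names are placeholders, and the edit assumes ConvTol sits on a standard "value  ConvTol  - description" line, as in the settings block quoted earlier in this thread:

    # Sketch: rewrite ConvTol in a copy of the .fst file, run it, and time it.
    import subprocess, time

    def write_with_convtol(template_fst, new_fst, convtol):
        with open(template_fst) as f, open(new_fst, "w") as g:
            for line in f:
                toks = line.split()
                if len(toks) >= 2 and toks[1] == "ConvTol":
                    # keep the "ConvTol - description" part, swap in the new value
                    line = f"{convtol:>12g}   " + line.split(None, 1)[1]
                g.write(line)

    for tol in (1e-4, 1e-3, 1e-2, 7e-2):
        fst = f"sweep_ConvTol_{tol:g}.fst"
        write_with_convtol("model.fst", fst, tol)              # placeholder model name
        t0 = time.perf_counter()
        subprocess.run(["openfast_dev_tc", fst], check=True)   # placeholder executable
        print(f"ConvTol={tol:g}: {time.perf_counter() - t0:.0f} s wall clock")

That would at least make the runtime/accuracy comparison across ConvTol values reproducible from a single script.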

deslaughter · Jul 31 '25 18:07

I'm leaving this here for traceability. In a floating system with 10+ potential-flow bodies, it seems that the radiation memory effect has a significant impact on computational time when using the OpenFAST tight coupling.

For example, using RdtnMod = 0 (no memory-effect calculation) in HydroDyn, a simulation 2,568 seconds long takes 14,156 s of computational time.

When I define RdtnMod = 1 (convolution), the same simulation takes 40,419 s. Therefore, when including the radiation memory effect, the computational time increases by a ratio of ~2.9.
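
For context on why the convolution is sensitive to the number of bodies: as I understand RdtnMod = 1, the radiation memory load has the Cummins form below, where q collects the (up to 6N) body degrees of freedom for N potential-flow bodies, A_inf is the infinite-frequency added mass, and K(t) is the retardation kernel matrix, so the kernel grows like (6N)^2 with N:

    F_{\mathrm{rad}}(t) \;=\; -A_\infty\,\ddot{q}(t) \;-\; \int_0^{t} K(t-\tau)\,\dot{q}(\tau)\,\mathrm{d}\tau,
    \qquad A_\infty,\; K(t) \in \mathbb{R}^{6N \times 6N}

If that integral (or its state-space surrogate) ends up being re-evaluated inside every convergence iteration of every time step, that would be consistent with the penalty showing up mainly under tight coupling.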

For reference, the settings in OpenFAST tight coupling are:

        0.5   RhoInf          - Numerical damping parameter for tight coupling generalized-alpha integrator (-) [0.0 to 1.0]
       1e-3   ConvTol         - Convergence iteration error tolerance for tight coupling generalized alpha integrator (-)
         15   MaxConvIter     - Maximum number of convergence iterations for tight coupling generalized alpha integrator (-)    

I performed this same sensitivity analysis in OpenFAST v4.1.2 (current official release). The computational time is 7,439 s without the radiation memory effect and 9,820 s with it, so a relatively small increase (ratio of ~1.3). This confirms that the large computational-time penalty is only observed in the OpenFAST tight-coupling versions. This issue seems to be related to the Jacobians.

RBergua · Nov 13 '25 21:11