add RMS output process

Open miguelmaso opened this issue 3 years ago • 6 comments

📝 Description This output process is based on a custom output process in the shallow water application. I think it can be useful to more people, since it serves generic purposes:

  • Run a convergence analysis over several meshes (different analysis stages)
  • Track the convergence and consumed time of a simulation
  • Compare several numerical strategies

An example of usage can be found in that ProjectParameters from the Examples repo. One process computes the error and another process prints it.

There are some points of discussion:

  • Where to place the process: core or statistics
  • Whether to remove the dependency on statistics app
  • Whether to include more norms (L1, L2, Linf) or always RMS
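For reference, the candidate norms differ only in how the nodal values are reduced. A minimal sketch in plain Python, assuming unweighted nodal values (no nodal area or mesh weighting); `error_norms` is a hypothetical helper for illustration, not part of Kratos or of this PR:

```python
import math

def error_norms(errors):
    """Return the discrete L1, L2, Linf and RMS norms of a sequence of values."""
    values = [abs(e) for e in errors]
    if not values:
        raise ValueError("at least one value is required")
    l1 = sum(values)                            # sum of absolute values
    l2 = math.sqrt(sum(v * v for v in values))  # Euclidean norm
    linf = max(values)                          # largest absolute value
    rms = l2 / math.sqrt(len(values))           # L2 norm scaled by sqrt(n)
    return {"L1": l1, "L2": l2, "Linf": linf, "RMS": rms}

norms = error_norms([3.0, -4.0])
```

Since RMS is the L2 norm divided by sqrt(n), it stays comparable across meshes with different node counts, which may be why it is the default here.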

Multithreading is not supported; it must be handled by the user. There is no test yet; I will add one once the interface is defined.

🆕 Changelog

  • Add output process

miguelmaso avatar Jul 15 '22 08:07 miguelmaso

Small question: Why do you say multithreading is not supported in this?

Usually I use this process to compare convergence across different meshes and configurations. In that case, several analyses print their output to the same file (the header is written only once, and the multiple instances of this process open the file in append mode). The matrix of analyses can be executed in serial or in parallel, using multithreading, but that is up to the user. If parallel execution is chosen, the user must take care of synchronizing the different processes writing to the file.
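The append-mode behaviour described here can be sketched as follows. This is an illustration in plain Python, not the code of the PR; `append_row` and the column names are hypothetical. Note that the "is the file empty?" check and the write are not atomic, which is exactly the race that concurrent runs would need a lock for:

```python
import os
import tempfile

def append_row(path, header, row):
    """Append one tab-separated row; write the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a") as f:  # append mode: earlier rows are preserved
        if needs_header:
            f.write("#" + "\t".join(header) + "\n")
        f.write("\t".join(str(v) for v in row) + "\n")

# Two analyses writing one after the other (serial execution).
path = os.path.join(tempfile.mkdtemp(), "rms.dat")
header = ["label", "num_nodes", "HEIGHT_ERROR"]
append_row(path, header, ["rv_none", 205, 0.0844])
append_row(path, header, ["rv_none", 1111, 0.0381])

with open(path) as f:
    lines = f.read().splitlines()
```

In serial execution this yields one header line followed by one row per analysis.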

miguelmaso avatar Jul 15 '22 13:07 miguelmaso

As far as I can tell, what this PR proposes can be achieved with https://github.com/KratosMultiphysics/Kratos/blob/master/applications/StatisticsApplication/python_scripts/spatial_statistics_process.py (the JSON file will be more verbose in that case).

Is it possible to log several simulations into the same file?

Does this process print the number of elements, number of nodes, time step and elapsed time for each simulation?

miguelmaso avatar Jul 15 '22 13:07 miguelmaso

Example of a possible output file:

# RMS for model part 'model_part' over all the domain
#label 	 num_nodes 	 num_elems 	 time_step 	 time 	 computational_time	HEIGHT_ERROR	EXACT_HEIGHT	VELOCITY_ERROR_X	EXACT_VELOCITY_X	MOMENTUM_ERROR_X	EXACT_MOMENTUM_X
rv_none	205	320	0.003713885035815434	0.5009955878415503	1.4812877178192139	0.08443763457037692	1.032771925200107	0.6859292810760017	2.4965563578217855	0.21088240990646087	1.8231852562010373
rv_none	205	320	0.0035061379670925374	1.0006578041898906	2.760655164718628	0.05057606466988017	1.032900632160087	0.6328896179335078	3.0100894611568165	0.2774466718367607	2.1984821741279905
rv_none	1111	2000	0.0014820345216492448	0.5006353128872691	8.004278421401978	0.03812638852778361	1.0327923473045526	0.5484631613412106	2.499571430135307	0.09771124361304509	1.8254231986383578
rv_none	1111	2000	0.0013759711975064315	1.0009243405007073	15.14770770072937	0.02261707823094805	1.0327976642476138	0.339146076112535	3.0111093646813454	0.09560924583660134	2.1990078453345854
rv_none	4221	8000	0.0007415421299297176	0.5005762586918633	49.51382875442505	0.020834116230009678	1.0327953605377793	0.48320775319162895	2.500065036414818	0.046110024019105765	1.825789003408485
rv_none	4221	8000	0.0006814532496336456	1.0004292069005358	97.92517423629761	0.012677700586638257	1.0327955429500142	0.3579278467309017	3.0092113893035064	0.046697134042931306	2.1976172435031125
rv_none	11356	21978	0.00044683486105767667	0.5001078886649574	166.68614292144775	0.013413000156179692	1.0327972308138385	0.23927685045869662	2.492669180020766	0.027556938938745352	1.8286469155623106
rv_none	11356	21978	0.0004098447184079923	1.00026801475676	292.87325739860535	0.00817061582600882	1.0327944471691128	0.30631691238173664	3.0176116552835857	0.029276891243542477	2.1971613872737503
rv_none	101101	200000	0.00014849449352055677	0.5000765966061458	3348.798399209976	0.005078752045754917	1.0327955586744189	0.28615774112837794	2.5042346269147466	0.009438556818494119	1.8288343920393069
rv_none	101101	200000	0.00013540836131398226	1.0000778016973852	6073.097289800644	0.0037587585537864126	1.0327955586942887	0.2653393972366121	3.0078555799131172	0.010597157658604977	2.1966271338222456
gj_none	205	320	0.003363852641861398	0.5001015608483605	1.15057373046875	0.1265676791600171	1.0327673271425972	1.960765868416477	2.5040265957460313	0.5245525998791926	1.8286324804035157
gj_none	205	320	0.003646923103607036	1.000255818282279	2.127110481262207	0.05570544891311507	1.032899935623704	0.8189726773172601	3.008543323509102	0.4785828017834921	2.197351438179642
... (more simulations)

This is very handy: it can be imported as a table (e.g. using pandas), and plotting convergence graphs is straightforward.
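As an illustration of how such a table can be post-processed, here is a minimal sketch using only the standard library (pandas would work just as well). It takes three of the rows above at time ≈ 0.5, trimmed to a few columns, and estimates the observed convergence order between consecutive meshes. The helpers `read_table` and `observed_orders` are hypothetical, and the relation h ~ N^(-1/2) between mesh size and node count is an assumption for a 2D mesh:

```python
import csv
import io
import math

# Three rows from the sample output, trimmed to a few columns for brevity.
SAMPLE = (
    "#label\tnum_nodes\ttime\tHEIGHT_ERROR\n"
    "rv_none\t205\t0.5009955878415503\t0.08443763457037692\n"
    "rv_none\t1111\t0.5006353128872691\t0.03812638852778361\n"
    "rv_none\t4221\t0.5005762586918633\t0.020834116230009678\n"
)

def read_table(text):
    """Parse tab-separated text whose first line is a '#'-prefixed header."""
    lines = text.splitlines()
    header = [name.strip() for name in lines[0].lstrip("#").split("\t")]
    reader = csv.reader(lines[1:], delimiter="\t")
    return [dict(zip(header, row)) for row in reader if row]

def observed_orders(rows, column, dim=2):
    """Estimate the convergence order between consecutive meshes, with h ~ N^(-1/dim)."""
    orders = []
    for coarse, fine in zip(rows, rows[1:]):
        n_ratio = float(fine["num_nodes"]) / float(coarse["num_nodes"])
        h_ratio = n_ratio ** (-1.0 / dim)
        e_ratio = float(fine[column]) / float(coarse[column])
        orders.append(math.log(e_ratio) / math.log(h_ratio))
    return orders

rows = read_table(SAMPLE)
orders = observed_orders(rows, "HEIGHT_ERROR")
```

The real file also starts with a free-text comment line before the header, so a reader for it would need to skip that first line.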

It is also possible to check whether speed and accuracy are preserved from one release to another.

miguelmaso avatar Jul 15 '22 13:07 miguelmaso

Usually I use this process to compare convergence across different meshes and configurations. In that case, several analyses print their output to the same file (the header is written only once, and the multiple instances of this process open the file in append mode). The matrix of analyses can be executed in serial or in parallel, using multithreading, but that is up to the user. If parallel execution is chosen, the user must take care of synchronizing the different processes writing to the file.

I would say leaving this to the user is dangerous. What if we write a separate file for each configuration into a folder? Then there are no synchronization problems and the user does not have to do anything extra. Folder-structured output is already supported in the existing StatisticsApplication.

Is it possible to log several simulations into the same file? Does this process print the number of elements, number of nodes, time step and elapsed time for each simulation?

No, it does not allow writing several simulations into the same file, because of the synchronization problem. Yes, we can output the number of elements and the number of nodes (basically the container size). I have not included it yet, but it is doable in the current configuration.

Time and time step will be given as an output depending on the type of the output control variable

# Spatial statistics process output
# Kratos version               : 9.0."Dev"-e65e0909b7-FullDebug
# Timestamp                    : 2022-07-18 12:13:44.206723
# Method Name                  : distribution
# Norm type                    : magnitude
# Container type               : condition_non_historical
# Modelpart name               : test_model_part
# Output control variable name : STEP
# ----------------------------------------------------------------------
# Headers:
# OutputControlVariableValue PRESSURE_Min PRESSURE_Max VELOCITY_Min VELOCITY_Max LOAD_MESHES_Min LOAD_MESHES_Max GREEN_LAGRANGE_STRAIN_TENSOR_Min GREEN_LAGRANGE_STRAIN_TENSOR_Max
2 6.0 27.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 20.784609690826528 93.53074360871938 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 20.12461179749811 67.08203932499369 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 120.0 330.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125
4 10.0 45.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 34.64101615137755 155.88457268119896 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 33.54101966249684 111.80339887498948 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 200.0 550.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125
6 14.0 63.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 48.49742261192856 218.23840175367854 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 46.95742752749558 156.52475842498527 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 280.0 770.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125
8 18.0 81.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 62.353829072479584 280.59223082615813 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 60.37383539249432 201.24611797498108 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 360.0 990.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125
10 22.0 99.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 76.2102355330306 342.9460598986377 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 73.79024325749306 245.96747752497686 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125 440.0 1210.0 0.0 0.125 0.125 0.125 0.0 0.125 0.125 0.0 0.125 0.125 0.0 0.125
# End of file

The above format can also be read easily with pandas or numpy for plotting. It is only missing number_of_container_items in the list of columns, which can be added easily. If you like, we can also introduce write formatting.

sunethwarna avatar Jul 18 '22 10:07 sunethwarna

Usually I use this process to compare convergence across different meshes and configurations. In that case, several analyses print their output to the same file (the header is written only once, and the multiple instances of this process open the file in append mode). The matrix of analyses can be executed in serial or in parallel, using multithreading, but that is up to the user. If parallel execution is chosen, the user must take care of synchronizing the different processes writing to the file.

I would say leaving this to the user is dangerous. What if we write a separate file for each configuration into a folder? Then there are no synchronization problems and the user does not have to do anything extra. Folder-structured output is already supported in the existing StatisticsApplication.

An automatic way to get one file per simulation, and to merge them later, is to add a timestamp to the file name, e.g.:

"file_name" : "<model_part>-<datetime>.dat"
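How such placeholders might be expanded can be sketched in a few lines of Python. This is only an assumption about the behaviour; `expand_file_name` and the timestamp format are hypothetical and do not reflect an existing Kratos function:

```python
from datetime import datetime

def expand_file_name(template, model_part_name, now=None):
    """Replace the <model_part> and <datetime> placeholders in a file name template."""
    now = now or datetime.now()
    # Assumed format; it avoids ':' so the name is valid on all common file systems.
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return template.replace("<model_part>", model_part_name).replace("<datetime>", stamp)

name = expand_file_name(
    "<model_part>-<datetime>.dat",
    "model_part",
    datetime(2022, 7, 18, 12, 13, 44),
)
# name == "model_part-20220718-121344.dat"
```

With a one-second resolution, two simulations started within the same second would still collide, so a parallel launcher might need a finer stamp or an extra label.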

Time and time step will be given as an output depending on the type of the output control variable

I'm afraid TIME will be given as an output control variable, but not the time step.

The above format can also be read easily with pandas or numpy for plotting. It is only missing number_of_container_items in the list of columns, which can be added easily. If you like, we can also introduce write formatting.

This would be very useful. Moreover, a flexible fields structure would be optimal in order to compare / track simulations. Aside from number_of_container_items the user may be interested in the time step, element size, elapsed time, datetime...

miguelmaso avatar Jul 18 '22 15:07 miguelmaso

I would be in favour of extending the StatisticsApplication with the various ideas from @miguelmaso.

mpentek avatar Jul 19 '22 15:07 mpentek