Feature Request/Issues (affects latest version): Analyze list of seeds, and export to CSV.

Open Delvin4519 opened this issue 2 years ago • 3 comments

As the title says.

The GUI analyzer should be able to analyze either the existing list of seeds in the GUI or a file of seeds, and output the results to a CSV file.

Delvin4519, May 10 '22 19:05

The seed analysis can now use the matching seed list as input in v2.3.0.

Cubitect, Aug 07 '22 16:08

Several issues/problems with the matching seed list input in v2.3.0 release.

(Updated to include the v2.3.3 release, as it is still affected by these issues.)

  1. It seems to write each seed across multiple lines. This causes two problems. First, the file size grows much faster, since each biome is printed on a new line and written again for every seed. Second, it is hard to compare seeds, since each seed takes a varying number of lines in the text file.

This could be resolved by printing each seed on a single line when the "matching seed list" is toggled, which would reduce the file size, since each biome/structure name would be listed only once, in the header row. Note that biomes/structures that aren't found in the area would need to be marked as 0 so that the columns align properly for each seed.

Proper example:

seed,biome_jungle,biome_mesa,biome_iceSpikes,biome_megaTaiga,biome_plains
2235,32952,2359,3855,20356,3985
15353,0,3583,3469,34346,439
768496,329085,0,0,23689,0
148294,5982,34986,0,23895,295355
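
The one-row-per-seed layout above can be sketched in a few lines of Python. This is only an illustration of the requested format, not the tool's actual export code; `write_wide_csv` and the sample counts are hypothetical.

```python
import csv
import io

def write_wide_csv(results, columns, out):
    """Write one row per seed; biomes missing from a seed's result become 0."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["seed"] + columns)
    for seed, counts in results.items():
        writer.writerow([seed] + [counts.get(name, 0) for name in columns])

# Hypothetical per-seed biome counts; biomes that were not found are simply absent.
results = {
    2235: {"biome_jungle": 32952, "biome_mesa": 2359},
    15353: {"biome_mesa": 3583},
}
columns = ["biome_jungle", "biome_mesa"]

buf = io.StringIO()
write_wide_csv(results, columns, buf)
print(buf.getvalue())
```

Filling missing entries with 0 keeps every row the same width, so the columns line up regardless of which biomes a seed contains.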

This might be a problem if a user analyzes structure locations, but that option should probably be restricted to the number of structures, without structure locations, if "matching seed list" is toggled.

  2. If a user attempts to analyze a large number of seeds, it might stall or lag the program. The program should automatically export the results to a CSV file if the "matching seed list" is larger than 4,096 seeds. Alternatively, if that is easier to implement, a matching seed list could simply always be exported to a CSV file.

  3. For larger analyses, it might be helpful to allow switching from 1:1 biome scale to 1:4 biome scale to speed up the analysis when high precision (1:1) isn't strictly necessary (1:4 still works).

  4. It seems that if one unselects "restrict to map selections", the program crashes when the seed list is selected and the MC version is 1.7-1.12.

Delvin4519, Aug 07 '22 16:08

Affects v2.4.0 and v2.4.1

Delvin4519, Sep 12 '22 13:09

The analysis function has been completely redesigned now for v2.5.0 with many of these issues in mind. The biomes and structure analysis now have dedicated tabs since the UI and export requirements are rather distinct.

Cubitect, Oct 01 '22 14:10

> The analysis function has been completely redesigned now for v2.5.0 with many of these issues in mind. The biomes and structure analysis now have dedicated tabs since the UI and export requirements are rather distinct.

It seems that analyzing biomes with a seed list still causes the program to hang in v2.5.0. It's probably best for the default behavior of a biome analysis with a seed list to automatically ask the user to export to CSV if there are more than 64 seeds in the list, and to avoid displaying the biome analysis of hundreds of seeds in the GUI; otherwise the program will hang.

Delvin4519, Oct 01 '22 15:10

The analysis is run in a different thread and, when used with the matching seed list, it notifies the UI to update on every seed that finishes, so it should not block the UI. Furthermore, the analysis can be aborted at any time, even while a single seed is being examined. All this works perfectly fine for me, with thousands of seeds as well as with a single seed and a lengthy analysis.

Cubitect, Oct 01 '22 21:10

I did some more testing, trying to reproduce your hanging issue, and I found that with a large seed list the biome statistics table starts to struggle, but only on Windows. I'm guessing this is what you are experiencing. Qt's implementation of a table on Windows is apparently not very performant and cannot handle that many updates once the table has reached a certain size. A mitigation might be to limit the updates to the table and only update every 10 or 100 seeds later on...
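
The batching idea can be illustrated with a small, framework-independent Python sketch. It is not the viewer's Qt code; `analyze_seeds`, `analyze`, `update_ui` and `batch_size` are hypothetical names chosen for the example.

```python
def analyze_seeds(seeds, analyze, update_ui, batch_size=100):
    """Analyze seeds one by one, but hand results to the UI only in batches."""
    pending = []
    for seed in seeds:
        pending.append(analyze(seed))
        if len(pending) >= batch_size:
            update_ui(pending)   # one table update for the whole batch
            pending = []
    if pending:
        update_ui(pending)       # flush the remainder at the end

# Record how many rows each UI update would receive for 250 seeds.
calls = []
analyze_seeds(range(250), lambda s: (s, 0), lambda rows: calls.append(len(rows)))
print(calls)  # [100, 100, 50]
```

With a batch size of 100, 250 seeds trigger three table updates instead of 250.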

Cubitect, Oct 01 '22 22:10

I have changed how the UI is updated when a seed is analyzed. It's a workaround that somewhat goes against Qt's advice, but it should cause significantly fewer performance issues and should hopefully no longer make the tool unresponsive. If it still freezes for you, then I may need more detail on the use case.

Cubitect, Oct 03 '22 13:10

It's a bit better, but the GUI still slows down after thousands of seeds have been analyzed, and the analyzer then slows down as well, so a direct export to CSV would still be helpful.

It would also be much improved if the rows and columns could be swapped for seed lists, so that seeds are in rows and biomes are in columns.

Also, the separating value should be a "," not ";".


Additional fixes that need to be made for v2.5.2+:

  1. Swap rows and columns for seed list analysis.
  2. The separator should be "," not ";".
  3. Seed values should be wrapped in double quotes (") and treated as strings, so that external programs display them correctly.
  4. Negative seed values need to be saved as signed 64-bit numbers (down to -2^63), not as unsigned values (up to 2^64).
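
Items 3 and 4 can be illustrated with a short Python sketch. This shows only the requested output format and assumes nothing about the tool's internals; `to_signed64` and `format_seed` are hypothetical helper names.

```python
def to_signed64(value):
    """Reinterpret an integer as a signed 64-bit (two's complement) value."""
    value &= 0xFFFFFFFFFFFFFFFF          # keep only the low 64 bits
    return value - (1 << 64) if value >= (1 << 63) else value

def format_seed(value, quote='"'):
    """Render a seed as quoted text so spreadsheet tools keep all digits."""
    return f"{quote}{to_signed64(value)}{quote}"

print(to_signed64(18446744073709551615))  # -1
print(format_seed(18446744073709551615))  # "-1"
```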

Delvin4519, Oct 05 '22 19:10

I have tried to address the UI performance issue again in update 2.6.0. The UI will now only receive updates as long as doing so does not affect the calculations to a significant degree.

> Additional fixes that need to be made for v2.5.2+:

  1. The biome statistics are now listed in a table with one row for each seed.

  2. CSV is not a standardized format and ";" is just as well supported as "," and is by and large a better choice for a separator, since a comma is much more common in cells than a semicolon. This is especially true with locales that use a comma for the decimal point, where every floating point number includes a comma.

  3. Assuming you are trying to import the CSV in Excel, I am afraid there is no good way of getting it to import a CSV that just 'works'. The problem is that Excel insists on interpreting the cells in frustrating ways that depend on the system locale and will therefore vary from system to system. Issues that I have encountered include: floating point numbers getting completely wrong values (decimal point mix-up), numbers being converted to dates and even IP addresses, numbers with 11 digits being converted to scientific notation, etc. Wrapping numbers in double quotes has no effect. The best I have found is that you can wrap cells in single quotes to force them to be interpreted as text, but then the single quotes become part of the cell.

I have added two new options to the settings menu, where the column separator and cell quotation can be specified. This is probably the best I can offer regarding this.
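
As a rough illustration of what a configurable separator and cell quotation amount to, here is a sketch using Python's standard `csv` module. It is not the viewer's implementation; `export_table` and its parameters are hypothetical.

```python
import csv
import io

def export_table(header, rows, out, separator=";", quote=None):
    """Write a table with a chosen column separator and optional cell quoting."""
    if quote:
        # Quote every cell with the given character, including numbers.
        writer = csv.writer(out, delimiter=separator, quotechar=quote,
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
    else:
        writer = csv.writer(out, delimiter=separator, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)

plain = io.StringIO()
export_table(["seed", "plains"], [[-42, 0.5]], plain)
print(plain.getvalue())

quoted = io.StringIO()
export_table(["seed", "plains"], [[-42, 0.5]], quoted, separator=",", quote="'")
print(quoted.getvalue())
```

The first call produces semicolon-separated, unquoted cells; the second uses commas and wraps every cell in single quotes.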

Cubitect, Nov 13 '22 18:11