Performance on simulated SKA data

Open parphyam opened this issue 5 years ago • 9 comments

Hi, I tried using the software to read some simulated SKA data (generated with the OSKAR simulator in the CASA Measurement Set format, .ms). It gave me the error shown in the attached text file, error.txt. The file is not corrupted as far as I can tell (it works fine in CASA!). I also checked with a file simulated in CASA using MWA coordinates, and that worked fine!

parphyam avatar May 06 '20 20:05 parphyam

Looks like the history-to-string code in pyuvdata assumes that the following columns all have the same length (ntimes), which it seems is not always true. Thanks for bringing this to our attention.

I'm personally not sure what the best solution is but hopefully it will be easier to find out now that we know where the issue is.

@bhazelton any ideas on this one? Naively, I would add an if tbrow < len(table) check inside the list comprehension that builds the newline string, but that assumes ntimes is the length of the longest column.

app_params = history_table.getcol("APP_PARAMS")["array"]
cli_command = history_table.getcol("CLI_COMMAND")["array"]
application = history_table.getcol("APPLICATION")
message = history_table.getcol("MESSAGE")
obj_id = history_table.getcol("OBJECT_ID")
obs_id = history_table.getcol("OBSERVATION_ID")
origin = history_table.getcol("ORIGIN")
priority = history_table.getcol("PRIORITY")
times = history_table.getcol("TIME")

ntimes = len(times)
tables = [
    app_params,
    cli_command,
    application,
    message,
    obj_id,
    obs_id,
    origin,
    priority,
    times,
]
for tbrow in range(ntimes):
    message_str += str(message[tbrow])
    newline = ";".join([str(table[tbrow]) for table in tables]) + "\n"
    history_str += newline
    if tbrow < ntimes - 1:
        message_str += "\n"
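
A minimal sketch of the guard suggested above, using plain Python lists as stand-ins for the columns returned by getcol. The helper name rows_to_history and the sample values are hypothetical and are not part of pyuvdata:

```python
def rows_to_history(tables, ntimes):
    """Join row `tbrow` of every column, skipping columns that are too short.

    Hypothetical helper for illustration only; `tables` is a list of sequences.
    """
    lines = []
    for tbrow in range(ntimes):
        # Only index columns that actually contain this row.
        fields = [str(table[tbrow]) for table in tables if tbrow < len(table)]
        lines.append(";".join(fields))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Stand-in columns: APPLICATION is one row shorter than TIME.
times = [1.0, 2.0, 3.0]
application = ["oskar", "oskar"]
message = ["a", "b", "c"]
history_str = rows_to_history([application, message, times], len(times))
```

Note that skipping a short column shifts the remaining fields to the left in that row, so a reader of the history string can no longer tell which column a value came from. Rows beyond ntimes in any longer column are also dropped.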

mkolopanis avatar May 07 '20 15:05 mkolopanis

@mkolopanis that sounds reasonable to me, but I'm not really an expert on CASA measurement sets. It is a bit curious that different columns from the same table can have different lengths; I would not have expected that.

bhazelton avatar May 08 '20 18:05 bhazelton

maybe we could just take the max of the lengths of all those columns :thinking:
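
A rough sketch of that idea, assuming the columns behave like plain Python sequences. The function name history_rows, the empty-string fill value, and the sample data are assumptions for illustration, not pyuvdata code. itertools.zip_longest iterates up to the longest column and pads the shorter ones:

```python
from itertools import zip_longest


def history_rows(tables, fill=""):
    """Build one ';'-joined line per row, padding short columns with `fill`.

    Hypothetical helper for illustration only.
    """
    # zip_longest stops at the longest column, i.e. max(len(t) for t in tables).
    return [
        ";".join(str(value) for value in row)
        for row in zip_longest(*tables, fillvalue=fill)
    ]


# Stand-in columns of unequal length.
application = ["oskar", "oskar"]
message = ["a", "b", "c"]
times = [1.0, 2.0, 3.0]

rows = history_rows([application, message, times])
history_str = "\n".join(rows) + "\n"
```

Padding keeps every field in its column position, so a missing value shows up as an empty field rather than shifting the rest of the row.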

mkolopanis avatar May 08 '20 23:05 mkolopanis

@parphyam could you provide an example file so that we can develop a fix? Ideally as small a file as possible to recreate the error so that we can include it in our unit tests.

bhazelton avatar May 11 '20 16:05 bhazelton

@parphyam We'd like to fix this, can you get us an example file with this problem? If it's a large file then a link where we can download it from would work fine.

bhazelton avatar May 25 '20 23:05 bhazelton

@parphyam we would like to fix this problem, but we can't tell if our fix will actually fix your problem without an example file. Can you give us a link where we can get a file that causes this error? Otherwise we will close this issue, but we'll be happy to reopen it if you can get us a file.

bhazelton avatar Jul 01 '20 15:07 bhazelton

Hi, extremely sorry for replying so late! I can provide an example file. I'll upload it and send you a link to the exact file that gave me the error (it may take a couple of days owing to the poor internet in my area, sorry). Thank you so much!

parphyam avatar Aug 27 '20 16:08 parphyam

@parphyam great! Thanks for letting us know, no problem on the timing.

bhazelton avatar Aug 27 '20 16:08 bhazelton