Simbad multi-object search behaviour
Currently, if you call Simbad.query_objects (or Simbad.query_region) with many objects, the returned table can have a different number of rows than the input, which makes comparing input and output very difficult. This is because Simbad can return multiple rows for a single object (Centaurus, for example) or no rows at all for an unrecognised object. The returned object IDs aren't necessarily the input IDs either, so you can't use them to search the returned table. You can query one object at a time, but for a few thousand objects that is quite slow.
Would it be possible to change this behaviour, and do the same for multiple-coordinate searches? I'm not sure of the best way to handle this. Returning the original search names/coordinates, and/or returning the same number of rows in the same order as the input, could work. Even just a row label that identifies results as belonging to the same search group would be fine. A search by multiple coordinates (or multiple region criteria) would benefit greatly from the second or third behaviour, since at the moment it is very difficult to tell which object belongs to which region search. The same behaviour for Ned would be even better :)
For example:

    rac = ['144.696458 -60.09181', '203.426453 -65.99033']
    Simbad.query_region(SkyCoord(rac, unit=u.deg), '1d')

would return:

| MAIN_ID | RA | DEC | OTYPE | GROUP |
|---|---|---|---|---|
| | h m s | d m s | | |
| object | unicode13 | unicode13 | object | int |
| IC 2501 | 9 38 47.146 | -60 5 30.52 | PN | 1 |
| TYC 9003-1531-1 | 13 33 42.8988 | -65 59 11.376 | Star | 2 |
| NGC 5189 | 13 33 32.86 | -65 58 27.1 | PN | 2 |
| TYC 9003-1874-1 | 13 33 27.265 | -65 58 27.9 | Star | 2 |
| TYC 9003-654-1 | 13 33 25.988 | -66 0 14.359 | Star | 2 |
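
A minimal sketch of the GROUP idea, assuming each coordinate is queried separately and the per-query results are then stacked. This is not how astroquery behaves today: `label_groups` is a hypothetical helper, and plain dicts stand in for table rows, using a few rows from the table above.

```python
def label_groups(per_query_rows):
    """Flatten per-query results, tagging each row with a 1-based GROUP.

    per_query_rows: one list of row dicts per input coordinate, in input order.
    A query that matched nothing contributes no rows, but its group number
    is still consumed, so GROUP always points back at the input position.
    """
    labelled = []
    for group, rows in enumerate(per_query_rows, start=1):
        for row in rows:
            labelled.append({**row, "GROUP": group})
    return labelled


# Stand-in results for two separate region searches (rows taken from the table above)
per_query = [
    [{"MAIN_ID": "IC 2501", "OTYPE": "PN"}],
    [{"MAIN_ID": "TYC 9003-1531-1", "OTYPE": "Star"},
     {"MAIN_ID": "NGC 5189", "OTYPE": "PN"}],
]

for row in label_groups(per_query):
    print(row["GROUP"], row["MAIN_ID"], row["OTYPE"])
```

Keeping the group number tied to the input position means an empty search leaves a gap in the numbering rather than shifting later groups.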
Multi-object example:

    from astroquery.eso import Eso
    from astroquery.simbad import Simbad

    # Log in to the ESO archive
    eso = Eso()
    login = input('ESO Login: ')
    eso.login(login)

    # Configure Simbad
    Sim = Simbad()
    Sim.add_votable_fields('otype')
    Sim.ROW_LIMIT = 1e6
    Sim.TIMEOUT = 500

    # Query the ESO archive for MUSE object frames
    eso.ROW_LIMIT = 1e12
    table = eso.query_instrument('muse', night_flag=0, column_filters={'dp_type': 'OBJECT'})

    # Remove duplicate names, then query them all at once
    oname = list(set(table['Object']))
    onar = Sim.query_objects(oname)

    # The two counts differ, which is the problem described above
    print(len(onar['MAIN_ID']), len(oname))

Somehow this slipped by me, but yes, this should be possible and maybe even straightforward. A PR implementing it would be welcome, otherwise maybe we can tackle this next time there's a hack session.
I have no solution for this to post yet but I'll put it here if I manage to make one before the next hack session :)
I want to query many names in Simbad, to figure out which ones resolve and don't.
This is what I tried: https://gist.github.com/cdeil/ad1ffdd724878f4d72d25a117d92d5a5
It doesn't give me what I want, the issues I have are:
- The number of rows in `table` plus the number of entries in `table.errors` doesn't match the number of names I queried!?
- And the result table doesn't contain the name I queried, i.e. I can't easily figure out which row corresponds to which query name?
Is there a way to do this currently with query_objects? Or do I have to run one query_object per object? Will SIMBAD block me if I run ~ 100 queries, possibly a few times?
@cdeil query_objects sends the list of names to SIMBAD in a single form. I believe what happened is:
- Several entries (7) resulted in no match, but were recognized as valid names (I'm uncertain about this)
- Several entries (another 7) somehow errored - perhaps they did not parse properly? I'm again not sure why
- Both of the above categories are simply excluded from the results.
This is a good question for the CDS folks. I suggest e-mailing them directly to see if there's a way to get a table returned with blanks for missing fields or something similar.
Hi, regarding question 2: if you are using SIMBAD scripts, you can get the names you submitted with %OBJECT; for VOTable fields, use TYPED_ID. SIMBAD blocks you if you send more than 6 queries in the same second, and a single query can contain a list of up to 10000 names.
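
For anyone scripting around those two limits, here is a rough sketch that splits a long name list into batches and spaces the requests out. `query_in_batches` and `chunked` are hypothetical helpers, not part of astroquery; the `query` argument is whatever callable sends one batch (for instance `Simbad.query_objects`), and staying at 5 requests per second is just a conservative assumption under the 6-per-second figure.

```python
import time

MAX_NAMES_PER_QUERY = 10000  # list-size limit quoted above
MAX_QUERIES_PER_SECOND = 6   # rate limit quoted above


def chunked(names, size):
    """Yield consecutive slices of `names`, each holding at most `size` items."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def query_in_batches(names, query, batch_size=MAX_NAMES_PER_QUERY,
                     max_per_second=MAX_QUERIES_PER_SECOND - 1,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `query(batch)` once per batch, pausing between calls.

    Returns the list of per-batch results in input order. `sleep` and
    `clock` are injectable so the pacing can be checked without waiting.
    """
    min_interval = 1.0 / max_per_second
    results = []
    last_sent = None
    for batch in chunked(list(names), batch_size):
        if last_sent is not None:
            wait = min_interval - (clock() - last_sent)
            if wait > 0:
                sleep(wait)
        last_sent = clock()
        results.append(query(batch))
    return results


# Usage (needs network access, so it is left commented out):
# from astroquery.simbad import Simbad
# tables = query_in_batches(my_names, Simbad.query_objects)
```

With ~100 names a single batch is enough, so the pacing only matters when looping over one-object queries or very long lists.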
The errors generated by this list of names are all here:

    ::error:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
    [4] Identifier not found in the database : GAL 292.2-00.5
    [5] 'PWN G292.15-0.54': No known catalog could be found
    [10] Identifier not found in the database : GAL 292.2-00.5
    [11] Identifier not found in the database : GAL 318.2+00.1
    [13] Identifier not found in the database : GAL 292.2-00.5
    [14] 'PWN G292.15-0.54': No known catalog could be found
    [22] 'AX J150436-5824': this identifier has an incorrect format for catalog: AX : ASCA satellite, X-ray
    [25] Identifier not found in the database : GAL 327.15-01.04
    [26] Identifier not found in the database : GAL 327.1-01.1
    [30] 'PWN G18.5-0.4': No known catalog could be found
    [33] Identifier not found in the database : GAL 018.6-00.2
    [52] Identifier not found in the database : GAL 030.8-00.2
    [61] Identifier not found in the database : GAL 033.2-00.6
    [68] Identifier not found in the database : GAL 042.8+00.6

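
If you want to work with those errors programmatically, a sketch like the following splits the block into (number, message) pairs. It assumes every entry starts with a bracketed integer, as in the output above; my guess is that the number identifies the position of the name in the submitted list, but that is not confirmed here, and `parse_simbad_errors` is a hypothetical helper rather than an astroquery function.

```python
import re

# One entry: "[n]" followed by text, up to the next "[m]" or the end of the input
ERROR_ENTRY = re.compile(r"\[(\d+)\]\s*(.*?)(?=\s*\[\d+\]|\s*$)", re.DOTALL)


def parse_simbad_errors(text):
    """Return a list of (number, message) tuples from a SIMBAD error block."""
    return [(int(num), msg.strip()) for num, msg in ERROR_ENTRY.findall(text)]


# Two entries copied from the error output above
sample = (
    "[4] Identifier not found in the database : GAL 292.2-00.5 "
    "[5] 'PWN G292.15-0.54': No known catalog could be found"
)

for number, message in parse_simbad_errors(sample):
    print(number, message)
```

A message that itself contained a bracketed number would be split incorrectly, so treat this as a starting point only.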
OK, so we should probably add the %OBJECT column in the query_objects query in astroquery.
Well, my memory is awful. This PR: https://github.com/astropy/astroquery/pull/496 addresses the issue of keeping the input name in the output results.
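
Once the queried name is carried in the output, lining results up with the input is a dictionary lookup. A minimal sketch, assuming each result row exposes the name you typed under a `TYPED_ID` key (the field mentioned above); `align_to_input` is a hypothetical helper, and plain dicts stand in for table rows.

```python
def align_to_input(names, rows, key="TYPED_ID"):
    """Return one entry per input name, in input order.

    Each entry is the first row whose `key` equals the name, or None when
    the name produced no row (unresolved or errored).
    """
    by_name = {}
    for row in rows:
        by_name.setdefault(row[key], row)  # keep the first match per name
    return [by_name.get(name) for name in names]


# Illustrative input: the middle name is one that was not found above
names = ["IC 2501", "GAL 292.2-00.5", "NGC 5189"]
rows = [
    {"TYPED_ID": "IC 2501", "MAIN_ID": "IC 2501"},
    {"TYPED_ID": "NGC 5189", "MAIN_ID": "NGC 5189"},
]

aligned = align_to_input(names, rows)
for name, row in zip(names, aligned):
    print(name, "->", row["MAIN_ID"] if row else "no match")
```

The output has the same length and order as the input, which is the behaviour asked for at the top of this issue.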