Speed of looking up properties

Open az0 opened this issue 7 years ago • 5 comments

I am looping through entities and looking up multiple properties for each (7 in my real project, 3 in the attached toy example). Each property slows it down, so it will take hours to go through all the entities. Is there a way to speed this up please?

from wikidata.client import Client

client = Client()

p_givenname = client.get('P735')
p_surname = client.get('P734')
p_dob = client.get('P569')


def get_entity(wikidata_id):
    entity = client.get(wikidata_id, load=True)

    givenname = entity[p_givenname].label
    surname = entity[p_surname].label
    dob = entity[p_dob]
    print('%s %s %s' % (givenname, surname, dob))

w_ids = ['Q498805',
         'Q482745',
         'Q186',
         'Q1363428',
         'Q299700',
         'Q196223',
         'Q488828',
         'Q490120']


import datetime as dt

# Time the full loop over all IDs.
n0 = dt.datetime.now()
for w_id in w_ids:
    get_entity(w_id)
n1 = dt.datetime.now()
print('elapsed time: ', n1 - n0)
print('record count: ', len(w_ids))

az0 avatar Oct 05 '17 05:10 az0

Am I incorrectly using the library, or is there an issue in the library? I left my program running for days, and it did not finish.

az0 avatar Apr 05 '18 15:04 az0

I would probably use a SPARQL query instead (https://query.wikidata.org/).

k----n avatar Apr 06 '18 18:04 k----n

I used SPARQL to get the IDs, such as Q498805, to feed into the wikidata library, but I could not figure out how to get all the metadata out of SPARQL itself.

az0 avatar Apr 07 '18 03:04 az0

Use the "wdt" prefix in the predicate.

SELECT * WHERE
{
     wd:Q498805 wdt:P569 ?o .
     SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
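
To cover several entities and properties in one request, the same "wdt" pattern can be combined with a VALUES clause. The following is only a sketch, not something tested against the original project: the function names `build_query` and `run_query` and the User-Agent string are made up for illustration, and it assumes the public endpoint at https://query.wikidata.org/sparql returns JSON when asked with `format=json`.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint of the Wikidata Query Service.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(entity_ids, language="en"):
    """Build one SPARQL query that covers every entity ID in the list."""
    values = " ".join("wd:%s" % qid for qid in entity_ids)
    return """
SELECT ?item ?givennameLabel ?surnameLabel ?dob WHERE {
  VALUES ?item { %s }
  OPTIONAL { ?item wdt:P735 ?givenname . }
  OPTIONAL { ?item wdt:P734 ?surname . }
  OPTIONAL { ?item wdt:P569 ?dob . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "%s" }
}
""" % (values, language)


def run_query(query):
    """Send the query over HTTP and return the list of result rows."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        SPARQL_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "wikidata-lookup-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    w_ids = ["Q498805", "Q482745", "Q186"]
    for row in run_query(build_query(w_ids)):
        # Each row maps a variable name to a dict with a "value" key;
        # variables from unmatched OPTIONAL patterns are simply absent.
        print(
            row.get("givennameLabel", {}).get("value"),
            row.get("surnameLabel", {}).get("value"),
            row.get("dob", {}).get("value"),
        )
```

This sends one HTTP request for the whole batch instead of several per entity. An entity with more than one value for a property (two given names, say) will come back as several rows, so the output may need grouping by `?item`.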

k----n avatar Apr 07 '18 04:04 k----n

@k----n I expanded on that, and it helps. I will keep working on it to add the labels and filters I was trying to get. Thanks.

az0 avatar Apr 07 '18 04:04 az0