sparql.anything
How to handle an NDJSON (newline-delimited JSON) file?
Hello,
I have this NDJSON file (its media type is application/x-ndjson according to this doc) with three JSON objects separated by a newline ( \n ). I would like to construct a graph describing the name and address of each business:
{"business_id":"Pns2l4eNsfO8kk83dixA6A","name":"Abby Rappoport, LAC, CMQ","address":"1616 Chapala St, Ste 2","city":"Santa Barbara","state":"CA","postal_code":"93101","latitude":34.4266787,"longitude":-119.7111968,"stars":5.0,"review_count":7,"is_open":0,"attributes":{"ByAppointmentOnly":"True"},"categories":"Doctors, Traditional Chinese Medicine, Naturopathic\/Holistic, Acupuncture, Health & Medical, Nutritionists","hours":null}
{"business_id":"mpf3x-BjTdTEA3yCZrAYPw","name":"The UPS Store","address":"87 Grasso Plaza Shopping Center","city":"Affton","state":"MO","postal_code":"63123","latitude":38.551126,"longitude":-90.335695,"stars":3.0,"review_count":15,"is_open":1,"attributes":{"BusinessAcceptsCreditCards":"True"},"categories":"Shipping Centers, Local Services, Notaries, Mailbox Centers, Printing Services","hours":{"Monday":"0:0-0:0","Tuesday":"8:0-18:30","Wednesday":"8:0-18:30","Thursday":"8:0-18:30","Friday":"8:0-18:30","Saturday":"8:0-14:0"}}
{"business_id":"tUFrWirKiKi_TAnsVWINQQ","name":"Target","address":"5255 E Broadway Blvd","city":"Tucson","state":"AZ","postal_code":"85711","latitude":32.223236,"longitude":-110.880452,"stars":3.5,"review_count":22,"is_open":0,"attributes":{"BikeParking":"True","BusinessAcceptsCreditCards":"True","RestaurantsPriceRange2":"2","CoatCheck":"False","RestaurantsTakeOut":"False","RestaurantsDelivery":"False","Caters":"False","WiFi":"u'no'","BusinessParking":"{'garage': False, 'street': False, 'validated': False, 'lot': True, 'valet': False}","WheelchairAccessible":"True","HappyHour":"False","OutdoorSeating":"False","HasTV":"False","RestaurantsReservations":"False","DogsAllowed":"False","ByAppointmentOnly":"False"},"categories":"Department Stores, Shopping, Fashion, Home & Garden, Electronics, Furniture Stores","hours":{"Monday":"8:0-22:0","Tuesday":"8:0-22:0","Wednesday":"8:0-22:0","Thursday":"8:0-22:0","Friday":"8:0-23:0","Saturday":"8:0-23:0","Sunday":"8:0-22:0"}}
I tried setting the media type to application/x-ndjson:
PREFIX fx: <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix ex: <https://www.example.com/>
construct {
  ?business a ex:BusinessPlace ;
    ex:address ?address ;
    ex:name ?name.
}
where {
  service <x-sparql-anything:> {
    fx:properties fx:location "file:///home/my/app/yelp_sample_valid.json" ;
      fx:media-type "application/x-ndjson" .
    ?root rdf:type fx:root ;
      xyz:address ?address ;
      xyz:name ?name ;
      xyz:business_id ?business_id
    BIND(iri(concat(str(ex:), encode_for_uri(?business_id))) AS ?business)
  }
}
But this is the error I get:
Traceback (most recent call last):
File "/home/maximeb/.local/bin/sparql-anything", line 8, in <module>
sys.exit(main())
File "/home/maximeb/.local/lib/python3.10/site-packages/pysparql_anything/cli.py", line 82, in main
sa.main(java_args[1])
File "/home/maximeb/.local/lib/python3.10/site-packages/pysparql_anything/sparql_anything_reflection.py", line 70, in main
self.reflection.main(args)
File "jnius/jnius_export_class.pxi", line 876, in jnius.JavaMethod.__call__
File "jnius/jnius_export_class.pxi", line 1059, in jnius.JavaMethod.call_staticmethod
File "jnius/jnius_utils.pxi", line 79, in jnius.check_exception
jnius.JavaException: JVM exception occurred: java.lang.NullPointerException
If I use the JSON media type instead, the resulting knowledge graph contains only the first JSON object:
@prefix ex: <https://www.example.com/> .
@prefix fx: <http://sparql.xyz/facade-x/ns/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xyz: <http://sparql.xyz/facade-x/data/> .
ex:Pns2l4eNsfO8kk83dixA6A
rdf:type ex:BusinessPlace ;
ex:address "1616 Chapala St, Ste 2" ;
ex:name "Abby Rappoport, LAC, CMQ" .
My idea is to write a script that iterates over the lines of the NDJSON file and passes each line to the query through the fx:content option. But maybe you have a better solution or idea.
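The per-line idea above could be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not a confirmed SPARQL Anything recipe: the helper names `sparql_string_literal` and `iter_ndjson_queries` are hypothetical, each line is assumed to hold one complete JSON object, and `fx:content` is assumed to accept that object as an escaped string literal alongside the JSON media type. The sketch only builds the query strings and does not run them.

```python
import json
from typing import Iterator

# CONSTRUCT query from the question, with fx:location replaced by fx:content.
# Braces are doubled because the template is filled in with str.format().
QUERY_TEMPLATE = """\
PREFIX fx: <http://sparql.xyz/facade-x/ns/>
PREFIX xyz: <http://sparql.xyz/facade-x/data/>
PREFIX ex: <https://www.example.com/>
CONSTRUCT {{
  ?business a ex:BusinessPlace ;
    ex:address ?address ;
    ex:name ?name .
}}
WHERE {{
  SERVICE <x-sparql-anything:> {{
    fx:properties fx:content {content} ;
      fx:media-type "application/json" .
    ?root a fx:root ;
      xyz:address ?address ;
      xyz:name ?name ;
      xyz:business_id ?business_id .
    BIND(IRI(CONCAT(STR(ex:), ENCODE_FOR_URI(?business_id))) AS ?business)
  }}
}}
"""


def sparql_string_literal(text: str) -> str:
    """Quote text as a SPARQL string literal (hypothetical helper).

    Backslashes are escaped first, then double quotes, so the JSON text
    survives the SPARQL parser unchanged.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def iter_ndjson_queries(path: str) -> Iterator[str]:
    """Yield one CONSTRUCT query per non-empty line of an NDJSON file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            # Parse and re-serialize: this validates the line and guarantees a
            # single-line JSON text. ensure_ascii=False keeps non-ASCII
            # characters literal instead of emitting \uXXXX escapes.
            record = json.dumps(json.loads(line), ensure_ascii=False)
            yield QUERY_TEMPLATE.format(content=sparql_string_literal(record))
```

Each generated query could then be run the same way as the original one, and the resulting graphs merged. An invalid line raises `json.JSONDecodeError` at the offending record instead of being silently skipped.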