pathling
Extension support for primitive elements
Our current implementation of extension support does not include extensions on primitive elements.
This issue covers adding that support and assessing any impact on query performance.
I feel as though we should deprioritise this until such time as we find a solid use case in need of it.
Dear @johngrimes, I am currently trying to work with extensions. Is this what you mean by "primitive elements"?
(taken from https://simplifier.net/oncology/operation)
I was expecting to be able to extract the value of the codeable concept, but the "extension_url" is as far as I can get (extension_enabled = True).
Hi @jasminziegler,
Using this example resource: https://simplifier.net/FirstProfile3/Procedure-Operation-example-1/~json
This expression works for me:
extension('http://dktk.dkfz.de/fhir/StructureDefinition/onco-core-Extension-OPIntention').valueCodeableConcept.coding.display
Returns one row with "palliativ".
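To illustrate what that expression selects, here is a minimal plain-Python sketch that walks the same path over a dictionary. The resource below is a hand-written, heavily trimmed stand-in (an assumption, not the actual example from simplifier.net), and extension_displays is a hypothetical helper, not part of Pathling.

```python
OP_INTENTION_URL = (
    "http://dktk.dkfz.de/fhir/StructureDefinition/onco-core-Extension-OPIntention"
)

# Hand-written, trimmed stand-in for the Procedure resource (assumption):
# only the fields touched by the expression are included.
procedure = {
    "resourceType": "Procedure",
    "extension": [
        {
            "url": OP_INTENTION_URL,
            "valueCodeableConcept": {"coding": [{"display": "palliativ"}]},
        }
    ],
}


def extension_displays(resource, url):
    """Hypothetical helper mimicking
    extension(url).valueCodeableConcept.coding.display on a plain dict."""
    displays = []
    for ext in resource.get("extension", []):
        # extension('...') keeps only the extensions with a matching URL.
        if ext.get("url") != url:
            continue
        concept = ext.get("valueCodeableConcept", {})
        # .coding.display flattens every display value across all codings.
        for coding in concept.get("coding", []):
            if "display" in coding:
                displays.append(coding["display"])
    return displays


print(extension_displays(procedure, OP_INTENTION_URL))
```

This only mirrors the traversal for a single resource; Pathling evaluates the expression over encoded Spark data, which is not shown here.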
Hi @johngrimes, thanks for your quick reply! I might have confused the pathling fhir-server implementation and the pathling python api - is this functionality also available in the pathling python api?
As of this morning, yes! 🙂
Here is the newly minted documentation on how to do FHIRPath query using the library: https://pathling.csiro.au/docs/libraries/fhirpath-query
I'd love to hear any feedback you might have!
Awesome, thank you @johngrimes! (We are waiting for the maven artifact and are ready for testing the new and exciting features :) ) edit: @chgl found version 6.2.1 :)
You can use this one: https://central.sonatype.com/artifact/au.csiro.pathling/library-api/6.2.1
Got it upgraded and installed! However, I am getting an AttributeError: 'DataFrame' object has no attribute 'extract'. Not sure what I am missing here, because according to this example https://pathling.csiro.au/docs/libraries/fhirpath-query, reading in data with "pc.read..." will produce a pyspark DataFrame.
'PATHLING_VERSION': '6.2.1', 'APACHE_SPARK_VERSION': '3.3.2', Python 3.10.10 Scala version 2.12.15
Could you please provide your example that you tested with my sample resource?
Here's the code I used:
from pathling import PathlingContext
pc = PathlingContext.create(enable_extensions=True)
data = pc.read.ndjson("/Users/gri306/Desktop")
result = data.extract("Procedure", columns=[
    "extension('http://dktk.dkfz.de/fhir/StructureDefinition/onco-core-Extension-OPIntention')"
    ".valueCodeableConcept.coding.display"
])
result.show(truncate=False)
The code pc.read.ndjson(...) should return a DataSource. The ndjson method is only one of a number of data source builder methods. The extract method should return a DataFrame.
Hi @jasminziegler, just checking back to see if you got it all working.
Hi @johngrimes , thanks for checking back!
We are currently in a hurry to get all our previous operations (without the ones newly added in v6.2.1) working with real data from our clinical systems. Due to the huge amount of data, we are running into resource issues (it requires a lot of RAM), which is a Spark issue rather than an issue on your side. Since we are performing many operations, we are creating tasks of very large size. Our next attempt will be to save intermediate tables and see if that improves performance, because we suspect that the task graph is being reconstructed from scratch each time we call a Spark action, which results in ever-growing task graphs. Happy to hear any ideas on this from your experience.
After we get this up and running, we will get back to upgrading + testing your new features which we are still excited about and are happy to provide feedback as soon as possible.
Hi @jasminziegler,
From the sounds of it, your partitions might be too large.
Would you be able to share your query plan?
df.explain(True)
The query plan is endlessly long - you are right. Also, my stage task size is very large. I am trying to implement checkpoints right now; hopefully that is a useful solution. We do not have any Apache Spark expertise at our institution so far, so please excuse my off-topic questions!
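For readers following the checkpoint idea, here is a rough sketch of truncating the lineage every few transformations. It assumes a checkpoint directory has already been set with spark.sparkContext.setCheckpointDir(...); checkpoint_every, steps and every are hypothetical names chosen for illustration, and this is not the code used in this thread.

```python
def checkpoint_every(df, steps, every=3):
    """Apply each function in `steps` to `df`, checkpointing periodically.

    `df` is expected to be a PySpark DataFrame and each step a function
    taking and returning a DataFrame (both are assumptions of this sketch).
    """
    for i, step in enumerate(steps, start=1):
        df = step(df)
        if i % every == 0:
            # DataFrame.checkpoint materialises the data and returns a new
            # DataFrame whose logical plan no longer carries the steps so far.
            df = df.checkpoint(eager=True)
    return df


# Usage sketch (add_flags and join_patients are hypothetical steps):
# spark.sparkContext.setCheckpointDir("/tmp/checkpoints")
# result = checkpoint_every(source_df, [add_flags, join_patients], every=1)
```

Whether this helps depends on the workload: checkpointing writes data out, so it trades extra I/O for shorter plans. DataFrame.localCheckpoint() is a lighter alternative that keeps the data on the executors, at the cost of fault tolerance.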
Hi @jasminziegler,
Not a problem at all.
Perhaps we should have a call some time - I would love to hear more about what you are doing, and it might help you save some time solving these problems. Send me an email at [email protected] if you are interested.