Deadlock in case of concurrent users

Open poojgupt opened this issue 4 years ago • 12 comments

Hello,

I recently started using the HAPI FHIR server and tried ingesting bundles of 10 resources each through JMeter (number of users = 5, loop count = 50). I started seeing deadlock exceptions on the console.

On analyzing further, the deadlock was happening on the HFJ_FORCED_ID table, where the server checks whether the ResourceType + Forced_ID combination already exists. If it does not, the resource is marked for creation.

All the resources in my bundle use the PUT method with their own IDs (logical identifiers). I have gone through the Google Group as well, where it was suggested to use the READ_UNCOMMITTED isolation level. Is there any other way out?

Any help, suggestions, or feedback would be appreciated.

  • Windows
  • Java 11
  • Hapi FHIR JPA starter
  • SQL Server 2014

I have already gone through this thread: https://groups.google.com/forum/#!msg/hapi-fhir/lLuRnQZoaU8/0hl9Qg7WAgAJ

Is there any other solution?

poojgupt avatar Apr 28 '20 13:04 poojgupt

Can you try this with current snapshot builds of HAPI FHIR 5.0.0 to see if this is resolved?

jamesagnew avatar Apr 28 '20 13:04 jamesagnew

Yes, I tried with 5.0.0-SNAPSHOT today and there was no relief. I am still getting the exception below:

Failed to call access method: org.springframework.dao.CannotAcquireLockException: could not execute query; SQL [select forcedid0_.RESOURCE_PID as col_0_0_ from HFJ_FORCED_ID forcedid0_ where forcedid0_.RESOURCE_TYPE=? and (forcedid0_.FORCED_ID in (?))]; nested exception is org.hibernate.exception.LockAcquisitionException: could not execute query...

I was trying to insert 2500 resources in total. With 5.0.0-SNAPSHOT, 710 resources were ingested into SQL Server; with 4.2.0, 750 resources were ingested.

poojgupt avatar Apr 30 '20 04:04 poojgupt

Are you able to work out what is the minimal set of input data required in order to reproduce this?

jamesagnew avatar Apr 30 '20 12:04 jamesagnew

I am attaching the JMeter script to reproduce this. I am simply running the script on my Windows laptop with SQL Server 2014, using the hapi-fhir-jpaserver-starter module [4.2.0].

jmeter_fhir-10Resources-PUT-generateIDs.zip https://github.com/jamesagnew/hapi-fhir/files/4558328/jmeter_fhir-10Resources-PUT-generateIDs.zip

The script simulates 5 concurrent users with a loop count of 50 per user. With 10 resources per bundle, I should see 2500 resource entries at the end of the run.

The configuration in hapi.properties to connect to SQL Server:

datasource.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
datasource.url=jdbc:sqlserver://localhost;databaseName=fhir
datasource.username=sa
datasource.password=admin
hibernate.dialect=org.hibernate.dialect.SQLServer2012Dialect

Lucene is disabled and the connection pool uses the default settings. No other changes were made to the code.

I am sure we should be able to reproduce this easily.

poojgupt avatar Apr 30 '20 12:04 poojgupt

FYI, I don't actually use JMeter and don't have an environment to run this in. If you're able to break this down to "uploading a bundle containing X at the same time as a second bundle containing Y will trigger this", that definitely improves the chances of this being looked at in the foreseeable future.

jamesagnew avatar Apr 30 '20 14:04 jamesagnew

Yes, your understanding of the issue is right. Uploading a bundle with X entries (using the HTTP request method PUT; I have not tried POST) at the same time as a second bundle with Y entries causes the deadlock with concurrent threads.

poojgupt avatar Apr 30 '20 15:04 poojgupt

I'm sorry, maybe I'm not being clear. I don't have any understanding, that's what I'm saying.

I am sure that some permutation of X+Y causes an issue, but I'm asking if you can figure out what is the minimum thing that X and Y need to be in order to reproduce this.

We have loads of unit and integration tests that fire tons of data concurrently in all kinds of ways, so there is no fundamental issue where HAPI can't handle concurrent operations. I would need to know what is specific to your use case that triggers this.

jamesagnew avatar Apr 30 '20 20:04 jamesagnew

Sorry about the confusion. This is the data [10 resources in a bundle] that I am uploading via HTTP POST from the client.

request.data.zip https://github.com/jamesagnew/hapi-fhir/files/4562452/request.data.zip

To reproduce this issue, upload the attached data with 5 concurrent users, each ingesting the data 50 times. With 10 resources per bundle, 1 user = 500 records/resources in the DB, and 5 users = 500 x 5 = 2500 records.
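For readers without JMeter, the load shape described above can be approximated with a short Python sketch. This is not the attached JMeter plan: `post_bundle`, `run_load`, and the `make_bundle` callable are hypothetical names, and the server base URL is something you would supply for your own deployment. It assumes the bundle is a transaction bundle POSTed to the server base URL as XML.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

USERS = 5                  # concurrent users, as in the thread
LOOPS = 50                 # bundle uploads per user
RESOURCES_PER_BUNDLE = 10  # resources in each bundle


def post_bundle(base_url, bundle_xml):
    """POST one bundle to base_url (an assumption: your server's FHIR base).

    Returns True on an HTTP 2xx response, False on any HTTP or network error.
    """
    req = urllib.request.Request(
        base_url,
        data=bundle_xml.encode("utf-8"),
        headers={"Content-Type": "application/fhir+xml"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return 200 <= resp.status < 300
    except urllib.error.URLError:
        return False


def run_load(send_bundle, users=USERS, loops=LOOPS):
    """Call send_bundle() `loops` times from each of `users` threads.

    send_bundle is a caller-supplied callable that uploads one bundle
    (with fresh IDs) and returns True on success.
    Returns (attempted, succeeded) bundle counts.
    """
    def one_user(_user_index):
        return sum(1 for _ in range(loops) if send_bundle())

    with ThreadPoolExecutor(max_workers=users) as pool:
        succeeded = sum(pool.map(one_user, range(users)))
    return users * loops, succeeded


def expected_resources(users=USERS, loops=LOOPS, per_bundle=RESOURCES_PER_BUNDLE):
    """Resources expected in the database if every upload succeeds."""
    return users * loops * per_bundle
```

A run against a real server might look like `run_load(lambda: post_bundle(base_url, make_bundle()))`, where `make_bundle` returns the request XML with freshly generated IDs. Comparing the succeeded count multiplied by 10 against `expected_resources()` shows how many uploads were lost to the lock errors.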

Some extra information that may help in understanding the request data:

If we open the data XML, nearly all of the data stays the same between runs except the logical identifiers of the resources, which are generated dynamically for each run.

<Patient xmlns="http://hl7.org/fhir">
			<id value="${PATIENT_ID}" />

In the snippet above, PATIENT_ID is generated dynamically at run time, and the same ID is concatenated into the IDs of the other resources in the bundle. For example, the Observation ID will be 39156-<generated PATIENT_ID>:

		<id value="39156-${PATIENT_ID}" />

This is the logic we use to generate the identifiers, so only the PATIENT_ID needs to be generated dynamically.
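The ID scheme described above can be sketched outside JMeter as follows. The use of `uuid4` is an assumption (the thread does not say how JMeter generates PATIENT_ID), and `make_ids` and `fill_template` are hypothetical helper names. The regular expression reflects the FHIR `id` datatype constraint of 1 to 64 characters drawn from letters, digits, `-` and `.`.

```python
import re
import uuid

# FHIR logical ids: 1-64 characters from A-Z, a-z, 0-9, '-' and '.'
FHIR_ID = re.compile(r"[A-Za-z0-9\-.]{1,64}")


def make_ids():
    """Generate one PATIENT_ID and derive the Observation id from it."""
    patient_id = str(uuid.uuid4())  # assumption: any unique generator works
    return {
        "Patient": patient_id,
        "Observation": f"39156-{patient_id}",  # prefix taken from the thread
    }


def fill_template(template, patient_id):
    """Substitute the JMeter-style ${PATIENT_ID} placeholder in the request XML."""
    return template.replace("${PATIENT_ID}", patient_id)
```

Because every derived ID embeds the PATIENT_ID, two bundles collide on a forced ID only if their PATIENT_ID values collide, which is why the uniqueness of that single value matters for narrowing down the deadlock.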

poojgupt avatar May 01 '20 04:05 poojgupt

Are those patient IDs guaranteed to be unique for each individual upload, or is there a chance that multiple concurrent uploads are uploading the same Patient ID?

Also, does removing all of the ifNoneExist sections from the elements have any effect?
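One way to try this without hand-editing the request file is sketched below. It assumes the bundle is FHIR XML in the `http://hl7.org/fhir` namespace, where `ifNoneExist` appears as a child of each entry's `request` element. `strip_if_none_exist` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

FHIR_NS = "http://hl7.org/fhir"


def strip_if_none_exist(bundle_xml):
    """Remove every <ifNoneExist> child from the bundle's <request> elements.

    Returns (new_xml, removed_count).
    """
    # Keep the FHIR namespace as the default one when serializing.
    ET.register_namespace("", FHIR_NS)
    root = ET.fromstring(bundle_xml)
    removed = 0
    # Snapshot the matches first so the tree is not mutated mid-iteration.
    for request in list(root.iter(f"{{{FHIR_NS}}}request")):
        for child in request.findall(f"{{{FHIR_NS}}}ifNoneExist"):
            request.remove(child)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Running the same load with the stripped bundle and comparing the results isolates whether the conditional-create lookup contributes to the lock contention.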

jamesagnew avatar May 01 '20 10:05 jamesagnew

Yes, the patient IDs are definitely unique, as we are using JMeter and it does its job correctly. I have also checked the logs for duplicates.

I can try removing ifNoneExist and report back on whether it has any impact.

poojgupt avatar May 01 '20 10:05 poojgupt

Apparently, removing ifNoneExist did not have any effect.

poojgupt avatar May 01 '20 10:05 poojgupt

Same case here: HAPI FHIR Server 6.4.0, Postgres 13.

realizm avatar Jan 31 '24 00:01 realizm