mysterious E275 error message repeatedly showing up in KG2 logfiles of services under heavy load
We frequently see error messages like this in the /tmp/RTX_OpenAPI_kg2.elog and /tmp/RTX_OpenAPI_kg2beta.elog files:
*******ERROR: self.response has no job_id attr! E275
2023-09-22T09:15:00.336429 ERROR: (3095) [OverLimit] Query could not be run due to exceeded limits
I think it is coming from this line of code: https://github.com/RTXteam/RTX/blob/8c435f02a1162982bbe4b0d13e1c5dd6d084ecc8/code/ARAX/ARAXQuery/ARAX_query.py#L268
I think this condition is actually expected whenever we respond to a client request with an HTTP status code of 429, since in that case we are not processing the query and thus there is no job_id. If that is true, could we please update the error message so it doesn't look like something anomalous that we need to debug? Maybe change it to an INFO message or something.
Perhaps I'm wrong and this is truly an unexpected (or not easily explainable) condition, in which case, we should dig deeper. :-)
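To make the suggestion concrete, here is a minimal sketch of the kind of handling I have in mind. It is not the actual code in ARAX_query.py: the `Response` class, the `http_status` attribute, the logger name, and the `log_job_cleanup` function are all hypothetical stand-ins, and it assumes that a query rejected with a 429 never gets a job_id.

```python
import logging

logger = logging.getLogger("arax_query_sketch")  # hypothetical logger name


class Response:
    """Hypothetical stand-in for the real response object."""

    def __init__(self, http_status=200):
        self.http_status = http_status


def log_job_cleanup(response):
    """Log the end of a query and return the log level that was used.

    Assumption: a query rejected with HTTP 429 is never assigned a job_id,
    so a missing job_id is expected in that case and only worth an INFO.
    """
    job_id = getattr(response, "job_id", None)
    if job_id is not None:
        # Normal case: the query was processed and has a job to close out.
        logger.debug("Closing out job_id=%s", job_id)
        return logging.DEBUG
    if response.http_status == 429:
        # Expected case: the query was rejected before a job was created.
        logger.info("No job_id because the query was rejected with a 429")
        return logging.INFO
    # Anything else is still unexplained and worth investigating.
    logger.error("Response has no job_id and status is %s", response.http_status)
    return logging.ERROR
```

With something like this, the over-limit case would stop producing an alarming ERROR line, while a missing job_id for any other reason would still be flagged.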
I'll check @saramsey
Yes, the E275 is my fault. I did not understand why this condition was happening and thus put in this lame message. I was thinking it was not because of OverLimit, because I think this is the relevant code for over limits: https://github.com/RTXteam/RTX/blob/8c435f02a1162982bbe4b0d13e1c5dd6d084ecc8/code/ARAX/ARAXQuery/ARAX_query.py#L329-L333
and I don't think it should be hitting the E275 code during over limits, but I'm not certain.
It would be good if we could understand what is happening and handle it more gracefully. I did not get to it.
Thank you @edeutsch for giving the historical perspective here.
Can we close out this issue now, @edeutsch ?
I was planning on labeling this [technical debt] and investigating a bit better and fixing it. I think it is not resolved.
Removed error message, closing.