qos fails with `Attribute is not defined: QOS_POLICY`
Found in log files of prod system.
java.lang.IllegalStateException: Attribute is not defined: QOS_POLICY
at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:335)
at org.dcache.vehicles.FileAttributes.getQosPolicy(FileAttributes.java:777)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:136)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:129)
at org.dcache.qos.local.clients.LocalQoSRequirementsClient.fileQoSRequirementsRequested(LocalQoSRequirementsClient.java:81)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.fileQoSStatusChanged(FileQoSStatusHandler.java:470)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.lambda$handleAddCacheLocation$0(FileQoSStatusHandler.java:195)
at org.dcache.util.BoundedExecutor$Worker.run(BoundedExecutor.java:247)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Cause:
    FileAttributes attributes = descriptor.getAttributes();
    if (attributes.isDefined(FileAttribute.QOS_POLICY) && attributes.getQosPolicy() == null) {
        /*
         * This is a lazily discovered change, so
         * as a matter of consistency it calls for removal
         * of the pnfsid from the engine's tracking tables.
         */
        engineDao.delete(update.getPnfsId());
        return super.fetchRequirements(update, descriptor);
    }
    return fetchRequirements(update, descriptor);
}

@Override
public FileQoSRequirements fetchRequirements(FileQoSUpdate update, FileQoSRequirements descriptor)
      throws QoSException {
    FileAttributes attributes = descriptor.getAttributes();
    String name = attributes.getQosPolicy();
When the policy is not defined, `fetchRequirements` is called, which invokes `attributes.getQosPolicy()`.
The `String name = attributes.getQosPolicy();` called when
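
To illustrate the failure path described above, here is a minimal stand-alone sketch. `GuardedAttributes`, `Attr` and `describe` are hypothetical stand-ins, not dCache classes; the sketch only assumes that the getter throws when the attribute was never set, as the stack trace suggests.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for FileAttributes; models only the "defined" guard.
class GuardedAttributes {
    enum Attr { QOS_POLICY }

    private final Set<Attr> defined = EnumSet.noneOf(Attr.class);
    private String qosPolicy;

    void setQosPolicy(String policy) {
        defined.add(Attr.QOS_POLICY);
        qosPolicy = policy;
    }

    boolean isDefined(Attr attr) {
        return defined.contains(attr);
    }

    // Assumed behaviour: throws if the attribute was never set.
    String getQosPolicy() {
        if (!isDefined(Attr.QOS_POLICY)) {
            throw new IllegalStateException("Attribute is not defined: " + Attr.QOS_POLICY);
        }
        return qosPolicy;
    }

    // Mirrors the shape of the quoted condition: only "defined and null" is handled.
    static String describe(GuardedAttributes attributes) {
        if (attributes.isDefined(Attr.QOS_POLICY) && attributes.getQosPolicy() == null) {
            return "defined-null";
        }
        // Reached for a non-null policy, but also when the attribute is undefined.
        return attributes.getQosPolicy();
    }

    public static void main(String[] args) {
        try {
            describe(new GuardedAttributes());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the undefined case falls through the `if` and reaches the unguarded getter, which is the pattern the stack trace points at.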
reviewing
@kofemann did you mean to finish that sentence? Otherwise, I don't think it makes sense.
Closed by 8b9dfb399767db4c4483ff107a24eed47f633a61
Though the original error is not there, the issue still pops up:
29 Jan 2025 14:46:00 (qos-engine) [] Thread Thread[pool-357-thread-1,5,qos-engine-threads] died
java.lang.NullPointerException: null
at java.base/java.util.Objects.requireNonNull(Objects.java:209)
at java.base/java.util.Optional.of(Optional.java:113)
at org.dcache.vehicles.FileAttributes.toOptional(FileAttributes.java:834)
at org.dcache.vehicles.FileAttributes.getQosPolicyIfPresent(FileAttributes.java:773)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:124)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:116)
at org.dcache.qos.local.clients.LocalQoSRequirementsClient.fileQoSRequirementsRequested(LocalQoSRequirementsClient.java:81)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.fileQoSStatusChanged(FileQoSStatusHandler.java:470)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.lambda$handleAddCacheLocation$0(FileQoSStatusHandler.java:195)
at org.dcache.util.BoundedExecutor$Worker.run(BoundedExecutor.java:247)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Hello @kofemann, as reported in today's Tier 1 meeting, I am observing a similar error. The dCache version is dcache-9.2.32-1.noarch and the Java version is Red_Hat-17.0.13.0.11-1.
In the qos-engine:
29 Jan 2025 09:05:59 [pool-7-thread-294] [] Uncaught exception in thread pool-7-thread-294
java.lang.NullPointerException: null
at java.base/java.util.Objects.requireNonNull(Objects.java:209)
at java.base/java.util.Optional.of(Optional.java:113)
at org.dcache.vehicles.FileAttributes.toOptional(FileAttributes.java:834)
at org.dcache.vehicles.FileAttributes.getQosPolicyIfPresent(FileAttributes.java:773)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:127)
at org.dcache.qos.services.engine.provider.PolicyBasedQoSProvider.fetchRequirements(PolicyBasedQoSProvider.java:119)
at org.dcache.qos.local.clients.LocalQoSRequirementsClient.fileQoSRequirementsRequested(LocalQoSRequirementsClient.java:81)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.fileQoSStatusChanged(FileQoSStatusHandler.java:470)
at org.dcache.qos.services.engine.handler.FileQoSStatusHandler.lambda$handleAddCacheLocation$0(FileQoSStatusHandler.java:195)
at org.dcache.util.BoundedExecutor$Worker.run(BoundedExecutor.java:247)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
And in the qos-verifier:
(qos-verifier@dcdncore05qosDomain) admin > verify history
2025/01/29 07:42:14 (0000E0C85F1E79A144609C00F07E63854477 POOL_STATUS_UP)(last adjustment: VOID)(parent dcdn007_1, retried 0) CacheException: Processing for 0000E0C85F1E79A144609C00F07E63854477 failed during verify. NullPointerException: null
2025/01/29 07:42:14 (000068DE798C227E41FABB771202AD93BD9C POOL_STATUS_UP)(last adjustment: VOID)(parent dcdn007_1, retried 0) CacheException: Processing for 000068DE798C227E41FABB771202AD93BD9C failed during verify. NullPointerException: null
I do not see this on the integration instance, which runs Java 17 and dcache-9.2.20-1.noarch.
Hope this information helps. Carlos
The problem here is that null is a valid value for FileAttribute.QOS_POLICY. Writing attributes.setQosPolicy(null); is valid. However, null is not a valid value to be encapsulated within Optional<String>.
Philosophically, there are two kinds of "unknown value" for QOS_POLICY: the value is simply unknown (FileAttributes#setQosPolicy has not been called) or it is known that the file doesn't have a policy (FileAttributes#setQosPolicy has been called, with a null argument).
Therefore, the method FileAttributes.getQosPolicyIfPresent cannot work if it returns an Optional<String>.
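
The `Optional` behaviour behind the NullPointerException can be reproduced in isolation. This is a stand-alone sketch; `OptionalNullDemo`, `wrapStrict` and `wrapLenient` are hypothetical names, not dCache code.

```java
import java.util.Optional;

// Hypothetical demo class; only java.util.Optional is real API here.
class OptionalNullDemo {

    // Wraps the value with Optional.of, which rejects null.
    static Optional<String> wrapStrict(String policy) {
        return Optional.of(policy); // throws NullPointerException when policy is null
    }

    // Null-tolerant variant: null collapses to Optional.empty().
    static Optional<String> wrapLenient(String policy) {
        return Optional.ofNullable(policy);
    }

    public static void main(String[] args) {
        System.out.println(wrapLenient(null).isPresent());
        try {
            wrapStrict(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from Optional.of(null)");
        }
    }
}
```

Note that `Optional.ofNullable` avoids the exception but cannot tell "never set" apart from "set to null", which is the distinction discussed above.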
There are three solutions (I can think of):
1. The two unknowns (above) are combined. The `getQosPolicyIfPresent` method is updated so it returns `Optional.empty()` if `FileAttributes#setQosPolicy` has not been called or if `FileAttributes#setQosPolicy` was called with a `null` argument.
2. The signature is updated to return `Optional<Optional<String>>`. The `getQosPolicyIfPresent` method returns `Optional.empty()` if `FileAttributes#setQosPolicy` has not been called. It returns `Optional.of(Optional.empty())` if `FileAttributes#setQosPolicy` was called with a `null` argument, otherwise it returns `Optional.of(Optional.of(policy))` if `FileAttributes#setQosPolicy` was called with the non-null argument `policy`.
3. Remove `null` as a valid value for QoS policy. Instead, a standard placeholder value would be used; for example, the string `DEFAULT`.
I'd find 1. dangerous, as missing information is treated as if the file has the default policy.
In the short-term, 2. might be the best approach, but perhaps 3. is worth considering.
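
For reference, option 2 could look roughly like the following. This is only a sketch: `PolicyAttributes` is a hypothetical stand-in for `FileAttributes`, and the `defined` flag is an assumed way of tracking whether the setter was called.

```java
import java.util.Optional;

// Hypothetical stand-in for FileAttributes; models only the QoS policy attribute.
class PolicyAttributes {
    private boolean defined;   // true once setQosPolicy has been called
    private String qosPolicy;  // may legitimately be null: the file has no policy

    void setQosPolicy(String policy) {
        defined = true;
        qosPolicy = policy;
    }

    // Outer Optional: is the attribute known? Inner Optional: does the file have a policy?
    Optional<Optional<String>> getQosPolicyIfPresent() {
        return defined
              ? Optional.of(Optional.ofNullable(qosPolicy))
              : Optional.empty();
    }

    public static void main(String[] args) {
        PolicyAttributes attributes = new PolicyAttributes();
        System.out.println(attributes.getQosPolicyIfPresent().isPresent());
        attributes.setQosPolicy(null);
        System.out.println(attributes.getQosPolicyIfPresent().get().isPresent());
    }
}
```

A caller would then have to handle three cases explicitly (unknown, known with no policy, known with a policy), which is the cost of keeping the two kinds of "unknown" separate.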
Reviewing
@khys95 this is fixed now, isn't it? Can we close the issue?