lakeFS icon indicating copy to clipboard operation
lakeFS copied to clipboard

API ObjectStats.size_bytes is always returned by lakeFS but is optional

Open arielshaqed opened this issue 2 weeks ago • 0 comments

[Reported by a user]

API field ObjectStats.size_bytes is not required. However it is always returned by lakeFS calls.[^1] So using it is more cumbersome; we added documentation in #7923 to say as much, but obviously code trumps documentation.

See below for why changing this will kinda break some 1.0 SDK compatibility guarantees, even though it is in favour of the clients.

Alternatives:

  • Add new variants of all API calls, that return a type that guarantees this field;
  • Do it in 2.0, which will be allowed to break compatibility - especially for something so minor;
  • Do it now because we skirt the compatibility guarantees but never actually break them.

1.0 SDK compatibility?

Obviously guaranteeing a field keeps the JSON wire format compatible - the API is not affected.

The Python generated SDK (clients/python) will have this diff:

@@ -32,7 +32,7 @@ class ObjectStats(BaseModel):
     checksum: StrictStr = Field(...)
-    size_bytes: Optional[StrictInt] = Field(None, description="The number of bytes in the object.  lakeFS always populates this field when returning ObjectStats.  This field is optional _for the client_ to supply, for instance on upload. ")
+    size_bytes: StrictInt = Field(..., description="The number of bytes in the object.  lakeFS always populates this field when returning ObjectStats.  This field is optional _for the client_ to supply, for instance on upload. ")
     mtime: StrictInt = Field(..., description="Unix Epoch in seconds")

This is a type change. It should be invisible to Python standard typing tools. However tools that look for unnecessary comparisons may complain about code that needlessly checks whether size_bytes is present.

The Java generated SDK (clients/java) will have this diff:

@@ -255,7 +255,7 @@ public class ObjectStats {
    * The number of bytes in the object.  lakeFS always populates this field when returning ObjectStats.  This field is optional _for the client_ to supply, for instance on upload. 
    * @return sizeBytes
   **/
-  @javax.annotation.Nullable
+  @javax.annotation.Nonnull
   public Long getSizeBytes() {
     return sizeBytes;
   }
@@ -462,6 +462,7 @@ public class ObjectStats {
     openapiRequiredFields.add("checksum");
+    openapiRequiredFields.add("size_bytes");
     openapiRequiredFields.add("mtime");
   }

Again, the return type of ObjectStats.getSizeBytes() will change from @Nullable to @Nonnull. This will cause FindBugs (and therefore some IDEs) to start warning about this value. However code will continue to compile.

So nothing actually breaks, but some correct code could now be flagged by some toolchains.

[^1]: Many moons ago this message was also used to upload things to the server - and then this field might be missing. Those days are long gone.

arielshaqed avatar Jun 26 '24 12:06 arielshaqed