
bug: behavior tests try to create huge files with large `write_total_max_size`

Open erickguan opened this issue 9 months ago • 7 comments

Describe the bug

When setting a large write_total_max_size in a native_capability, the behavior tests attempt to allocate extremely large memory regions.

See the functions in core/tests/behavior/utils.rs.
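The allocation happens because the test data generators derive their size range from the advertised capability. A minimal sketch of the pattern, assuming helpers shaped roughly like the ones in utils.rs (the function names and the 4 MiB default below are illustrative, not the actual code):

    // Sketch only: the real helpers live in core/tests/behavior/utils.rs and may
    // be named or shaped differently. The point is that the upper bound of the
    // generated content comes straight from the advertised capability.
    use rand::prelude::*;

    fn gen_bytes_with_range(range: std::ops::Range<usize>) -> Vec<u8> {
        let mut rng = thread_rng();
        let size = rng.gen_range(range);
        // With a 100 GiB write_total_max_size this single allocation can already
        // exceed the machine's memory, which is the failure shown below.
        let mut content = vec![0u8; size];
        rng.fill_bytes(&mut content);
        content
    }

    fn gen_bytes(write_total_max_size: Option<usize>) -> Vec<u8> {
        // The capability value is used directly as the upper bound, so a huge
        // value makes the test try to build a buffer of that order of magnitude.
        let max = write_total_max_size.unwrap_or(4 * 1024 * 1024);
        gen_bytes_with_range(1..max)
    }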

Steps to Reproduce

  1. Add write_total_max_size: Some(100*1024*1024*1024) to a service's capability, for example in core/src/services/fs/backend.rs (a rough sketch of this change appears after the test output below).
  2. Run the behavior tests:
    OPENDAL_FS_ROOT=/tmp/a OPENDAL_TEST=fs cargo test behavior::test \
      --features tests,services-fs \
      -- \
      --show-output
    
  3. Observe the test output:
running 146 tests
memory allocation of 83268712952 bytes failed
memory allocation of 83385328405 bytes failed
memory allocation of 86869557384 bytes failed
memory allocation of 71047600905 bytes failed
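
For reference, step 1 amounts to adding one field where the fs backend builds its Capability. A rough sketch, assuming the capability is set through set_native_capability with a struct literal (the surrounding code in core/src/services/fs/backend.rs may look different):

    // Sketch only: the other fields and the exact builder call are illustrative.
    info.set_native_capability(Capability {
        read: true,
        write: true,
        // Advertise a 100 GiB total write limit; this is what drives the tests
        // to generate enormous random payloads.
        write_total_max_size: Some(100 * 1024 * 1024 * 1024),
        ..Default::default()
    });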

Expected Behavior

Behavior tests should run normally. They should not attempt to allocate huge memory regions or create excessively large files.

Additional Context

No response

Are you willing to submit a PR to fix this bug?

  • [ ] Yes, I would like to submit a PR.

erickguan avatar Mar 23 '25 20:03 erickguan

It seems that even S3 doesn't set this value.

I feel like it doesn't make sense to provide such a capability directly in services. It also overlaps somewhat with write_multi_max_size.

We need to re-think this capability.

Xuanwo avatar Mar 24 '25 02:03 Xuanwo

For now only the d1 and nebula services use it. Maybe we can change it on their side, but I do not think nebula really needs it ...

yihong0618 avatar Mar 24 '25 02:03 yihong0618

I would go for dropping write_multi_max_size and making a special case for d1, because when I searched the code base it seems to be used only in the behavior tests?

yihong0618 avatar Mar 24 '25 03:03 yihong0618

This issue connects to https://github.com/apache/opendal/issues/5846.

Maybe what we need is simply to control our test cases instead of relying on the public capability API. It would be good to have something like OPENDAL_TEST_MAX_SIZE.
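
To make the idea concrete, a minimal sketch of how a proposed OPENDAL_TEST_MAX_SIZE could cap the generated sizes (the variable and function names are assumptions; nothing like this exists yet):

    use std::env;

    // Sketch of the proposed knob: cap whatever the capability reports with an
    // env-provided limit, so CI never tries to allocate gigabytes of test data.
    fn test_max_size(cap_limit: Option<usize>) -> usize {
        const DEFAULT_CAP: usize = 4 * 1024 * 1024; // 4 MiB keeps runners safe

        let env_cap = env::var("OPENDAL_TEST_MAX_SIZE")
            .ok()
            .and_then(|v| v.parse::<usize>().ok())
            .unwrap_or(DEFAULT_CAP);

        cap_limit.unwrap_or(env_cap).min(env_cap)
    }

The test generators would then draw sizes from 1..test_max_size(cap.write_total_max_size) instead of using the capability value directly.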

Xuanwo avatar Mar 24 '25 03:03 Xuanwo

cool

yihong0618 avatar Mar 24 '25 03:03 yihong0618

This can be done by dropping write_multi_max_size, since #5859 is not a good fix.

After dropping write_multi_max_size, there are two ways to fix the flaky tests (for d1 and nebula):

  1. add a special case for them in gen_bytes (see the sketch below)
  2. add an OPENDAL_TEST_MAX_SIZE environment variable

I am not quite sure which one is better: OPENDAL_TEST_MAX_SIZE introduces a new environment variable, while a special case is not easy to change later.
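
A rough sketch of what option 1 could look like (the service names as strings and the size caps are illustrative assumptions, not the real gen_bytes code):

    // Sketch of option 1: special-case the services whose real write limits are
    // tiny, instead of trusting whatever write_total_max_size the capability
    // advertises. Names and numbers are made up for illustration.
    fn max_test_size(service: &str, cap_limit: Option<usize>) -> usize {
        match service {
            // d1 / nebula effectively accept only small writes, so never ask the
            // generator for more than a couple of MiB.
            "d1" | "nebula" => 2 * 1024 * 1024,
            _ => cap_limit.unwrap_or(4 * 1024 * 1024),
        }
    }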

yihong0618 avatar Mar 25 '25 00:03 yihong0618

I tested Cloudflare’s D1 database, and I found that the maximum size is just over 2 MB. However, Cloudflare’s documentation does not accurately reflect this limit.

Considerations on the Use Cases

Users pay for their storage, so someone will eventually need to look into the maximum limit. Service providers set caps for various reasons, such as sensible defaults or implementation limitations, and the service will return errors if the limit is exceeded. When working directly with a service, it is easy to discover these limits; OpenDAL could surface an error if someone wants that.

That said, it is frustrating that the maximum size limit can be inconvenient when copying data across services.

Tests

I agree with using an environment variable. Additionally, GitHub’s runners have limited memory, so we cannot use all the available memory at once without risking an out-of-memory (OOM) error.

erickguan avatar Mar 26 '25 08:03 erickguan