
[BUG] Fluid native mount points won't be synced in DataLoad

Open TrafalgarZZZ opened this issue 4 years ago • 1 comments

What is your environment (Kubernetes version, Fluid version, etc.)

Describe the bug Users can set the property loadMetadata: true to ask a DataLoad to load metadata before it loads the actual data. For Fluid-native mount points such as pvc:// and local://, a POSIX-compliant metadata sync (e.g. running du -sh) is additionally needed before the DataLoad loads their metadata in the ddc engine.
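For readers unfamiliar with what a "POSIX-compliant metadata sync" amounts to, here is a minimal Python sketch of a `du`-style traversal. It is an illustration only, not Fluid's implementation: the function name `sync_metadata` is hypothetical, and unlike `du` it sums apparent file sizes rather than disk blocks. The point is that every entry under the mounted folder gets stat-ed, which only helps if the correct folder is mounted.

```python
import os
import stat


def sync_metadata(root):
    """Walk `root` and lstat every entry, similar in spirit to `du -sh`.

    Hypothetical helper for illustration. Returns (entry_count, file_bytes),
    where file_bytes is the sum of apparent sizes of regular files.
    """
    entries = 0
    file_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat touches the entry's metadata without following symlinks.
            st = os.lstat(os.path.join(dirpath, name))
            entries += 1
            if stat.S_ISREG(st.st_mode):
                file_bytes += st.st_size
    return entries, file_bytes
```

If the DataLoad Pod mounts a folder other than the real local one, a traversal like this walks the wrong tree, so the files under the actual mount point are never visited.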

However, the current implementation of DataLoad does not sync the metadata of these Fluid-native mount points, because the Pod of the DataLoad job mounts the wrong folder. You can see the chart here. The .path used in the chart is actually the absolute path in the ddc engine, so in most cases the real local folder that needs to be synced is not mounted.

What you expect to happen: To ensure the ddc engine can see the files and folders under Fluid-native mount points, a POSIX-compliant sync operation is needed on those mount points.

How to reproduce it Here is a quick example:

# dataset.yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: test
spec:
  mounts:
    - mountPoint: local:///mnt/test1
      name: test1
    - mountPoint: local:///mnt/test2
      name: test2
---
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: test
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: SSD
        path: /var/lib/docker/alluxio
        quota: 2Gi
        high: "0.95"
        low: "0.7"
# dataload.yaml
apiVersion: data.fluid.io/v1alpha1
kind: DataLoad
metadata:
  name: ya-dataload
spec:
  dataset:
    name: test
    namespace: default
  loadMetadata: true
  target:
    - path: /test1
      replicas: 1
    - path: /test2
      replicas: 1
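
To make the mismatch in the manifests above concrete, here is a rough Python sketch of the path resolution that seems to be missing. It assumes each mount is exposed in the ddc engine at /<name> (so /test1 in the DataLoad maps to the mount named test1); that layout, the `MOUNTS` dictionary and the helper `resolve_native_path` are assumptions for illustration, not Fluid code. Only local:// is handled here; pvc:// would need its own treatment.

```python
import posixpath
from urllib.parse import urlparse

# Mounts copied from the example Dataset: mount name -> mountPoint.
MOUNTS = {
    "test1": "local:///mnt/test1",
    "test2": "local:///mnt/test2",
}


def resolve_native_path(target_path, mounts):
    """Map a DataLoad target path (an engine path) to the local folder behind it.

    Hypothetical helper. Returns None when the path does not belong to a
    local:// mount. Assumes each mount appears in the engine at /<name>.
    """
    parts = target_path.strip("/").split("/", 1)
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    mount_point = mounts.get(name)
    if mount_point is None:
        return None
    parsed = urlparse(mount_point)
    if parsed.scheme != "local":
        return None
    # For local:///mnt/test1 the parsed path is /mnt/test1.
    return posixpath.join(parsed.path, rest) if rest else parsed.path


print(resolve_native_path("/test1", MOUNTS))
print(resolve_native_path("/test2/sub/file.txt", MOUNTS))
```

Under these assumptions the folder that needs the sync for target /test1 is /mnt/test1, whereas the chart currently mounts the engine path /test1 itself.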

Additional Information

TrafalgarZZZ, Jun 07 '21 08:06

Any progress?

xieydd, Aug 17 '21 03:08