SMB module is pulling all storage implementations

Open RustedBones opened this issue 3 years ago • 0 comments

Depending on scio-smb transitively pulls in the dependencies of every storage implementation:

  • parquet
  • json
  • avro
  • tensorflow

The TensorFlow dependencies alone are ~200 MB.

Users should only have the storage implementation they actually use on their classpath.
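Until the module is split, a user-side workaround might look like the sbt sketch below. It is only an illustration, not something the project documents: `scioVersion` is a placeholder, and the exclusion assumes the TensorFlow artifacts are published under the `org.tensorflow` organization.

```scala
// build.sbt (sketch): keep scio-smb but drop the TensorFlow artifacts it pulls in.
// scioVersion is a placeholder; set it to the Scio version your project uses.
val scioVersion = "<your-scio-version>"

libraryDependencies += ("com.spotify" %% "scio-smb" % scioVersion)
  // Assumption: all TensorFlow artifacts live under the org.tensorflow organization.
  .excludeAll(ExclusionRule(organization = "org.tensorflow"))
```

With this exclusion in place, any code path that touches the TensorFlow-backed SMB classes would fail at runtime with a missing-class error, so it only fits projects that use one of the other formats.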

SMB should either:

  • be available in the storage implementations themselves
  • mark storage impl dependencies as provided
  • have specific impl modules like

RustedBones, Apr 25 '22 16:04