
Add support for single-file dataset representation in memory storage client

Open vdusek opened this issue 1 year ago • 1 comment

  • In the memory storage client, represent the dataset as a single file (or at least add an option for it) instead of a directory containing one file per item.
  • The existing export_to functions can already produce a single file, but that does not change how datasets are stored: as directories with one file per item. So I would like to at least open this for discussion.
  • Further discussion should definitely precede any implementation.
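
To make the two layouts concrete, here is a minimal sketch using only the Python standard library. It is not crawlee's implementation: the function names, the zero-padded file names, and the choice of JSON Lines for the single-file variant are all assumptions made for illustration.

```python
import json
from pathlib import Path


def write_file_per_item(directory: Path, items: list[dict]) -> None:
    """Hypothetical file-per-item layout: one JSON file for each item."""
    directory.mkdir(parents=True, exist_ok=True)
    for index, item in enumerate(items, start=1):
        # Zero-padded names keep the files in insertion order when sorted.
        path = directory / f"{index:09d}.json"
        path.write_text(json.dumps(item), encoding="utf-8")


def append_to_single_file(path: Path, items: list[dict]) -> None:
    """Hypothetical single-file layout: one JSON document per line (JSON Lines)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item) + "\n")


def read_single_file(path: Path) -> list[dict]:
    """Read every item back from the JSON Lines file."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

JSON Lines is used here only because it allows appending an item without rewriting the file; a single JSON array would need a full rewrite on every push. Which format fits best is part of the discussion this issue asks for.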

vdusek • Nov 15 '24 12:11

Also, this should probably be done after #92.

One more point: this makes sense for storage backends such as S3, where the client pays for each write.
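
A rough way to see the difference in write counts is sketched below. `CountingObjectStore` is a hypothetical in-memory stand-in for a pay-per-write backend, not a real S3 client, and the object keys are made up for illustration.

```python
import json


class CountingObjectStore:
    """Hypothetical in-memory object store that counts billable writes."""

    def __init__(self) -> None:
        self.objects: dict[str, str] = {}
        self.write_count = 0

    def put(self, key: str, body: str) -> None:
        self.objects[key] = body
        self.write_count += 1


def push_file_per_item(store: CountingObjectStore, items: list[dict]) -> None:
    # One write request per item.
    for index, item in enumerate(items, start=1):
        store.put(f"dataset/{index:09d}.json", json.dumps(item))


def push_single_object(store: CountingObjectStore, items: list[dict]) -> None:
    # All items are serialized into one JSON Lines body and written once.
    body = "".join(json.dumps(item) + "\n" for item in items)
    store.put("dataset/items.jsonl", body)


items = [{"id": i} for i in range(1000)]

per_item_store = CountingObjectStore()
push_file_per_item(per_item_store, items)

single_store = CountingObjectStore()
push_single_object(single_store, items)
```

With 1000 buffered items, the first store records 1000 writes and the second records 1. The saving depends on batching: on a store without an append operation, every flush of the single object rewrites the whole body, so pushing items one at a time would still cost one write each.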

janbuchar • Nov 15 '24 12:11