django-cassandra-engine

Batch Save using Dataframe

Open oneandonlyonebutyou opened this issue 5 years ago • 0 comments

Let's assume I receive thousands of items as JSON, and I am trying to validate their format and save them all.

I did the following:

        df = pd.DataFrame(request.data)
        df["device_uuid"] = device_uuid
        df["serializer"] = None
        df["serializer"] = df.apply(
            lambda row: RawLogSerializer(data=row.to_dict()), axis=1
        )
        df.apply(
            lambda row: row["serializer"].is_valid(raise_exception=True), axis=1
        )
        logs = df.apply(lambda row: row["serializer"].save(), axis=1).tolist()

Then I save them one by one, which I know is wrong (not optimal):

        [l.save() for l in logs]

How should I do this correctly?

I asked this question before and the response was:

        with BatchQuery() as b:
            YourModel.batch(b).create(...)

I do not know how it would look in my example...
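For reference, here is a rough sketch of how the suggested pattern might map onto a list of validated rows. It is an assumption-laden illustration, not a confirmed answer: `chunked` and `save_in_batches` are hypothetical helper names, `rows` is assumed to be a list of dicts whose keys match the model's columns, and the batch size of 100 is arbitrary. The only library calls used are `BatchQuery` from `cassandra.cqlengine.query` and the `Model.batch(b).create(...)` form quoted above. Splitting into chunks is included because Cassandra warns about (and can reject) very large batches.

```python
from itertools import islice


def chunked(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def save_in_batches(model, rows, batch_size=100):
    """Create one `model` row per dict in `rows`, using one BatchQuery per chunk.

    `model` is assumed to be a cqlengine-style model class; `rows` is assumed
    to be an iterable of dicts of column values. Returns the number of rows
    queued for writing.
    """
    # Imported lazily so that `chunked` can be used without the driver installed.
    from cassandra.cqlengine.query import BatchQuery

    written = 0
    for chunk in chunked(rows, batch_size):
        # The batch is executed when the `with` block exits without an error.
        with BatchQuery() as b:
            for row in chunk:
                model.batch(b).create(**row)
        written += len(chunk)
    return written


# Hypothetical usage inside the view (RawLog is an assumed model name):
#   serializer = RawLogSerializer(data=request.data, many=True)
#   serializer.is_valid(raise_exception=True)
#   save_in_batches(RawLog, serializer.validated_data)
```

In this sketch the rows are taken from `validated_data` rather than from `serializer.save()`, on the assumption that `save()` would already write each row individually and defeat the purpose of batching.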

Thanks again

oneandonlyonebutyou, Dec 27 '18 22:12