scrapy-djangoitem

Should be possible to update an existing model instance

Open craiga opened this issue 6 years ago • 2 comments

I'm scraping a listing of events and storing them in an Event model. This model inherits from model_utils.models.TimeStampedModel, which provides auto-populated created and modified fields that I find useful.

Unfortunately, there's no way for me to reuse an existing, populated instance of Event in my pipeline without manually copying each field from the DjangoItem to my model, so created is overwritten every time the event is saved again.
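For reference, the manual copy mentioned above might look roughly like the sketch below. It is not part of scrapy-djangoitem: `copy_item_to_instance` and its `skip` parameter are hypothetical, and the sketch assumes only that the item behaves like a mapping of field names to values and that the target object has matching attributes. A `SimpleNamespace` stands in for the stored Event row so the example runs without Django.

```python
from types import SimpleNamespace


def copy_item_to_instance(item, instance, skip=("id", "created")):
    """Copy scraped values onto an existing object, leaving `skip` fields alone.

    Hypothetical helper: `item` is any mapping of field name -> value and
    `instance` is any object with matching attributes.
    """
    changed = []
    for name, value in item.items():
        if name in skip:
            continue  # never overwrite identity or auto-populated fields
        setattr(instance, name, value)
        changed.append(name)
    return changed  # names that were set, in the item's iteration order


# Stand-in for a row that already exists in the database.
event = SimpleNamespace(id=7, url="https://example.com/e/7",
                        title="Old title", created="2019-01-01")
scraped = {"url": "https://example.com/e/7", "title": "New title",
           "created": "2019-07-02"}
changed = copy_item_to_instance(scraped, event)
```

In a real pipeline this would be followed by `event.save()`; passing the returned names as `update_fields` is one way to limit the UPDATE to those columns.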

There should be a way to do this, perhaps by allowing .instance to be set to my model instance and then moving attribute setting into .save().

craiga • Jul 02 '19 20:07

FWIW, my pipeline looks like this:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
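            # Look up the previously stored event by URL.
            event = models.Event.objects.get(url=item["url"])
            event_id = event.id
            # Build an unsaved instance from the item, then point it at the
            # existing row so the final save() updates it instead of inserting.
            event = item.save(commit=False)
            event.id = event_id

        except models.Event.DoesNotExist:
            # First time this URL is seen: let save() insert a new row.
            pass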

        item.save()
        return item

Ideally, I'd like to be able to do something like:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
            event = models.Event.objects.get(url=item["url"])
            # Proposed API: hand the existing row to the item.
            item.instance = event

        except models.Event.DoesNotExist:
            pass

        item.save()
        return item
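One way the requested behaviour could work is sketched below with plain-Python stand-ins rather than real Scrapy or Django classes, so everything in it is an assumption: `SketchItem`, `FakeEvent`, and the settable `instance` attribute are hypothetical and do not exist in scrapy-djangoitem. The point is only that field values are copied onto the instance inside `save()`, so a pre-assigned instance keeps any attribute the item does not carry, such as created.

```python
class FakeEvent:
    """Stand-in for a Django model; save() just counts calls."""

    def __init__(self, **fields):
        self.__dict__.update(fields)
        self.saved = 0

    def save(self):
        self.saved += 1


class SketchItem(dict):
    """Stand-in for a DjangoItem with a settable `instance` (hypothetical)."""

    model = FakeEvent

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.instance = None  # a pipeline may assign an existing row here

    def save(self, commit=True):
        if self.instance is None:
            self.instance = self.model()  # no existing row: start fresh
        # Attribute setting happens here, at save time, not at construction.
        for name, value in self.items():
            setattr(self.instance, name, value)
        if commit:
            self.instance.save()
        return self.instance


existing = FakeEvent(id=1, url="https://example.com/e/1",
                     title="Old", created="2019-01-01")
item = SketchItem(url="https://example.com/e/1", title="New")
item.instance = existing  # reuse the stored row
saved = item.save()
```

Because the item never holds a created value here, the stored one survives the save untouched.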

craiga • Jul 02 '19 20:07

A workaround is to set created on the item in the pipeline:

class EventDjangoPipeline(object):
    def process_item(self, item, spider):
        try:
            event = models.Event.objects.get(url=item["url"])
            # Copy the stored timestamp onto the item so the rebuilt instance keeps it.
            item["created"] = event.created
            event_id = event.id
            event = item.save(commit=False)
            event.id = event_id

        except models.Event.DoesNotExist:
            pass

        item.save()
        return item
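Outside of scrapy-djangoitem, Django's own `update_or_create()` manager method offers another route: values passed in `defaults` are applied to the matched row, so a field left out of `defaults` is not reassigned. The sketch below is an illustration under stated assumptions, not the approach used in this thread: `upsert_by_url` is a hypothetical helper that takes the manager as a parameter (in a real project, something like `models.Event.objects`), and `RecordingManager` is a fake used only so the example runs without Django.

```python
def upsert_by_url(manager, item):
    """Update the row whose url matches the item, or create it."""
    # Everything except the lookup key goes into defaults.
    defaults = {k: v for k, v in item.items() if k != "url"}
    return manager.update_or_create(url=item["url"], defaults=defaults)


class RecordingManager:
    """Fake manager: records the call and mimics the (obj, created) return shape."""

    def __init__(self):
        self.calls = []

    def update_or_create(self, defaults=None, **lookup):
        self.calls.append((lookup, defaults))
        return {**lookup, **(defaults or {})}, False


manager = RecordingManager()
obj, was_created = upsert_by_url(
    manager, {"url": "https://example.com/e/1", "title": "New"}
)
```

With a real manager, the first element of the returned tuple is the model instance and the second tells the pipeline whether a new row was inserted.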

craiga • Jul 02 '19 20:07