couchdb-python
AttachmentField for schema
From [email protected] on January 23, 2010 00:29:44
It would be nice to be able to have an AttachmentField type available in the schema that stores a Python string as a document attachment. This would be great for more efficient storage of large text snippets.
I tried figuring out how to implement this myself, but my head is still reeling from trying to understand how the schema stuff works :)
Original issue: http://code.google.com/p/couchdb-python/issues/detail?id=112
From [email protected] on January 23, 2010 00:43:38
I recently did something similar on a project of mine where I used a UserDict object to wrap the json object and implemented lazy fetching on access to any attachments.
Seems like something I should take a crack at here (not the UserDict, but lazily fetched attachment objects).
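A minimal sketch of the lazily fetched attachment idea, not an actual couchdb-python implementation. `FakeDatabase` and `LazyAttachment` are hypothetical names, and the in-memory `get_attachment` method is only assumed to behave like a call that returns a file-like object:

```python
import io


class FakeDatabase:
    """In-memory stand-in for a database (hypothetical, for illustration only)."""

    def __init__(self):
        self.attachments = {}  # maps (doc_id, filename) to bytes
        self.fetch_count = 0   # counts how often an attachment was requested

    def get_attachment(self, doc_id, filename):
        # Assumption: the real call would issue an HTTP GET and
        # return a file-like object with the attachment body.
        self.fetch_count += 1
        return io.BytesIO(self.attachments[(doc_id, filename)])


class LazyAttachment:
    """Fetches the attachment body on first access, then caches it."""

    _UNSET = object()

    def __init__(self, db, doc_id, filename):
        self._db = db
        self._doc_id = doc_id
        self._filename = filename
        self._data = self._UNSET

    @property
    def data(self):
        # Nothing is requested until the data is actually needed.
        if self._data is self._UNSET:
            body = self._db.get_attachment(self._doc_id, self._filename)
            self._data = body.read()
        return self._data
```

Creating a `LazyAttachment` costs no request; only the first read of `.data` touches the database, and later reads reuse the cached bytes.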
From [email protected] on January 27, 2010 15:01:11
I've just spent a lot of time reading and understanding the code, especially the metaclass and descriptor bits, and I think I'm starting to get a handle on it now.
One thing about lazy loading that throws a wrench into the works is that instances of Document would need to retain a reference to the database they were last loaded from. In that case, I'm not sure how to handle the situation where the database can no longer be reached.
From [email protected] on January 28, 2010 02:51:15
By "database could not be reached" do you mean the Python reference or the actual HTTP resource? Tracking the Python ref should be achievable using a weakref. The HTTP resource could be unavailable at any time so best to just fail in the usual way.
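A rough sketch of the weakref approach, under the assumption that the document only needs to find its database again, not keep it alive. `TrackedDocument`, `FakeDatabase` and `DatabaseGone` are made-up names for illustration and are not part of couchdb-python:

```python
import weakref


class DatabaseGone(Exception):
    """Raised when the document's database object no longer exists."""


class FakeDatabase:
    """Stand-in for a database object (hypothetical)."""

    def __init__(self, name):
        self.name = name


class TrackedDocument:
    """Remembers the database it was loaded from without keeping it alive."""

    def __init__(self, doc_id):
        self.id = doc_id
        self._db_ref = None

    def bind(self, db):
        # A weak reference does not prevent the database object
        # from being garbage collected.
        self._db_ref = weakref.ref(db)

    @property
    def database(self):
        db = self._db_ref() if self._db_ref is not None else None
        if db is None:
            raise DatabaseGone("document %r has no live database" % self.id)
        return db
```

If the Python object has gone away, the failure is explicit; an unreachable HTTP resource would still fail in the usual way when the request is made.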
Perhaps more importantly, keeping hold of a database reference for attachment fields may affect the current API.
At the moment, you save a schema document by calling Document.store(database) but lazy attachment retrieval would implicitly use the database the document was loaded from. Probably ok; I can't think of any reason to do something else.
What about storing the document's new/updated attachments? When does that happen - when the attachment is set or when Document.store(db) is called? Should Document.store(database) then hang on to this database arg's reference for future attachment gets? Do attachment updates get sent as inline attachments or does Document.store make multiple HTTP requests?
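For reference, the inline option means embedding the base64-encoded body under the document's `_attachments` key, so everything travels in one request. A small sketch, where the helper name `with_inline_attachment` is hypothetical:

```python
import base64


def with_inline_attachment(doc, name, text, content_type="text/plain"):
    """Return a copy of doc with text embedded as an inline attachment."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    new_doc = dict(doc)
    # Copy any existing attachments so the input document is not mutated.
    attachments = dict(new_doc.get("_attachments", {}))
    attachments[name] = {"content_type": content_type, "data": encoded}
    new_doc["_attachments"] = attachments
    return new_doc
```

Note that base64 inflates the payload by roughly a third, and the whole encoded string is held in memory alongside the rest of the document.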
From [email protected] on January 28, 2010 03:08:14
As soon as I posted the above I realised that saving attachments when Document.store is called is not a good idea for a couple of reasons:
- Sending the attachment inline may unexpectedly affect memory usage if it turns out to be a big chunk of text.
- Making multiple requests could mean that the initial document update works but an attachment update fails. Document.store would have to raise an exception, but it would be up to the application to work out what failed, how many requests had succeeded, etc.
Perhaps if the field type was called SmallChunkOfTextStoreAsAnAttachmentField the inline attachment option might work out ok. ;-)
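To illustrate the multiple-request concern, here is one way a store routine could report partial success. This is only a sketch: `PartialStoreError`, `store_with_attachments` and `FlakyDatabase` are invented for the example, and the `save` / `put_attachment` methods on the stand-in are assumptions about what a database object would offer:

```python
class PartialStoreError(Exception):
    """The document was saved, but an attachment upload failed."""

    def __init__(self, stored, failed, cause):
        super().__init__(
            "stored %r, failed on %r: %s" % (stored, failed, cause))
        self.stored = stored  # attachment names uploaded before the failure
        self.failed = failed  # attachment name that could not be uploaded
        self.cause = cause    # the underlying exception


def store_with_attachments(db, doc, attachments):
    """Save doc, then upload each attachment in a separate request."""
    db.save(doc)
    stored = []
    for name, content in attachments.items():
        try:
            db.put_attachment(doc, content, name)
        except Exception as exc:
            # Tell the caller exactly how far the sequence got.
            raise PartialStoreError(stored, name, exc) from exc
        stored.append(name)
    return stored


class FlakyDatabase:
    """Stand-in database that fails on one chosen attachment (hypothetical)."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on
        self.saved = []
        self.uploaded = []

    def save(self, doc):
        self.saved.append(doc)

    def put_attachment(self, doc, content, filename):
        if filename == self.fail_on:
            raise IOError("connection lost")
        self.uploaded.append(filename)
```

Even with this bookkeeping, the application still has to decide whether to retry, roll back, or leave the document half updated, which is the awkward part.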
From kxepal on February 05, 2012 19:14:13
@Matt,
Attachments could be streamed[1] when saving and loading a document if the Content-Type: multipart/related header is specified, so I suppose we could use this trick. However, it looks like loading a document together with all of its attachments this way is not recommended, and I'd agree with that. How to fix this case? Replace the attachment data with some read(db) function that retrieves it, implicitly using the document id, rev and target attachment name? Surely some tests are needed, but it looks like nothing here is impossible (:
[1] http://wiki.apache.org/couchdb/HTTP_Document_API#Multiple_Attachments
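The read(db) idea could look roughly like this. It is a sketch only: `AttachmentStub`, `FakeDatabase` and its `fetch_attachment` method are hypothetical names, not existing couchdb-python API. Unlike a stored database reference, the database is passed in explicitly at read time:

```python
import io


class FakeDatabase:
    """In-memory stand-in keyed by (doc_id, rev, name) (hypothetical)."""

    def __init__(self):
        self.bodies = {}

    def fetch_attachment(self, doc_id, name, rev):
        # Assumption: a real implementation would request the attachment
        # for this specific document revision over HTTP.
        return io.BytesIO(self.bodies[(doc_id, rev, name)])


class AttachmentStub:
    """Takes the place of attachment data; keeps only what identifies it."""

    def __init__(self, doc_id, rev, name):
        self.doc_id = doc_id
        self.rev = rev
        self.name = name

    def read(self, db):
        """Fetch the attachment body from the given database."""
        return db.fetch_attachment(self.doc_id, self.name, self.rev).read()
```

Because the stub holds no database reference, the document stays a plain value object, at the cost of the caller having to supply the database on every read.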