ZODB icon indicating copy to clipboard operation
ZODB copied to clipboard

ValueError: could not convert string to int, better type checks for OOBTree keys under Python 3?

Open zopyx opened this issue 5 years ago • 14 comments

Plone 5.2RC3, ZODB 5.5.1, Python 3.7

During a Plone migration over plone.restapi I received this random ZODB error, not sure how to reproduce:

2019-06-18 00:02:33,247 WARNING [waitress:354][waitress] application-written content was ignored due to HTTP response that may not contain a message-body: (204 No Content)
2019-06-18 00:02:35,161 ERROR   [ZODB.Connection:809][waitress] Couldn't load state for BTrees.OOBTree.OOBTree 0x52b7
Traceback (most recent call last):
  File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/Connection.py", line 795, in setstate
    self._reader.setGhostState(obj, p)
  File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 633, in setGhostState
    state = self.getState(pickle)
  File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 626, in getState
    return unpickler.load()
ValueError: could not convert string to int
2019-06-18 00:02:35,281 ERROR   [Zope.SiteErrorLog:251][waitress] 1560808955.19451550.22340673761002505 http://localhost:30081/plone_portal/lw/geschiedenis/en/POST_application_json_
Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 142, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 295, in publish_module
  Module ZPublisher.WSGIPublisher, line 229, in publish
  Module ZPublisher.mapply, line 85, in mapply
  Module ZPublisher.WSGIPublisher, line 57, in call_object
  Module plone.rest.service, line 23, in __call__
  Module plone.restapi.services, line 19, in render
  Module plone.restapi.services.content.add, line 93, in reply
  Module plone.restapi.serializer.dxcontent, line 138, in __call__
  Module Products.CMFPlone.CatalogTool, line 429, in searchResults
  Module Products.CMFCore.indexing, line 95, in processQueue
  Module Products.CMFCore.indexing, line 222, in process
  Module Products.CMFCore.indexing, line 50, in reindex
  Module Products.CMFCore.CatalogTool, line 367, in _reindexObject
  Module Products.CMFPlone.CatalogTool, line 351, in catalog_object
  Module Products.ZCatalog.ZCatalog, line 498, in catalog_object
  Module Products.ZCatalog.Catalog, line 369, in catalogObject
  Module Products.PluginIndexes.unindex, line 240, in index_object
  Module Products.PluginIndexes.unindex, line 276, in _index_object
  Module Products.PluginIndexes.unindex, line 216, in insertForwardIndexEntry
  Module ZODB.Connection, line 795, in setstate
  Module ZODB.serialize, line 633, in setGhostState
  Module ZODB.serialize, line 626, in getState
ValueError: could not convert string to int

zopyx avatar Jun 18 '19 04:06 zopyx

Andreas Jung wrote at 2019-6-17 21:28 -0700:

Plone 5.2RC3, ZODB 5.5.1, Python 3.7 During a Plone migration over plone.restapi I received this random ZODB error: ... 2019-06-18 00:02:35,161 ERROR [ZODB.Connection:809][waitress] Couldn't load state for BTrees.OOBTree.OOBTree 0x52b7 Traceback (most recent call last): File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/Connection.py", line 795, in setstate self.reader.setGhostState(obj, p) File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 633, in setGhostState state = self.getState(pickle) File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 626, in getState return unpickler.load() ValueError: could not convert string to int 2019-06-18 00:02:35,281 ERROR [Zope.SiteErrorLog:251][waitress] 1560808955.19451550.22340673761002505 http://localhost:30081/plone_portal/lw/geschiedenis/en/POST_application_json Traceback (innermost last): ... Module Products.PluginIndexes.unindex, line 240, in index_object Module Products.PluginIndexes.unindex, line 276, in _index_object Module Products.PluginIndexes.unindex, line 216, in insertForwardIndexEntry Module ZODB.Connection, line 795, in setstate Module ZODB.serialize, line 633, in setGhostState Module ZODB.serialize, line 626, in getState ValueError: could not convert string to int

The index likely contains both "int" and "string" values. Under Python 3, BTrees can no longer have keys of different types: BTrees are based on an order and Python 3 (unlike Python 2) does not order objects of different types.

d-maurer avatar Jun 18 '19 05:06 d-maurer

The index likely contains both "int" and "string" values. Under Python 3, BTrees can no longer have keys of different types: BTrees are based on an order and Python 3 (unlike Python 2) does not order objects of different types.

That does not explain how this corruption of self._index can occur.

zopyx avatar Jun 18 '19 05:06 zopyx

Andreas Jung wrote at 2019-6-17 22:26 -0700:

The index likely contains both "int" and "string" values. Under Python 3, BTrees can no longer have keys of different types: BTrees are based on an order and Python 3 (unlike Python 2) does not order objects of different types.

That does not explain how this corruption of self._index can occur.

I assume that there are indexed objects which provide index values of different types (at least "int" and "str"). Unter Python 2, this was no problem - and therefore might have been undetected. Up to you to verify the assumption.

The problem might come from the migration instead. The migration uses heuristics to promote Python 2's str to Python 3's str. I do not know what it does if the heuritics fail, maybe it converts Python 2's str to Python 3 bytes, instead. Important operations on Python 3 bytes return int which might explain a convertion to int (even though this does not seem likely).

I would approach the problem as follows: Load the pickles (!) stored for OID 0x52b7. Use "pickletools.dis" to analyse the pickles. Watch out especially for different types related to the BTrees keys.

d-maurer avatar Jun 18 '19 05:06 d-maurer

Andreas Jung wrote at 2019-6-17 23:05 -0700:

Load the pickles (!) stored for OID 0x52b7.

How do you load the raw pickle?

I look at how the ZODB does it. ZODB.Connection: ... def get(self, oid): ... p, _ = self._storage.load(oid) obj = self._reader.getGhost(p)

Note that the pickle returned by self._storage.load actually consists of 2 adjacent pickles - the first describing the class, the second the state.

You will likely want to use ZODB.utils.p64 to convert the number "0x52b7" into a (true) OID.

d-maurer avatar Jun 18 '19 06:06 d-maurer

(Pdb) pickletools.dis(pickle)
    0: \x80 PROTO      3
    2: c    GLOBAL     'BTrees.OOBTree OOBTree'
   26: q    BINPUT     0
   28: .    STOP
highest protocol among opcodes = 2
(Pdb) for x in pickletools.genops(pickle): print(x)
(<pickletools.OpcodeInfo object at 0x7f1b86dad048>, 3, 0)
(<pickletools.OpcodeInfo object at 0x7f1b86da7d08>, 'BTrees.OOBTree OOBTree', 2)
(<pickletools.OpcodeInfo object at 0x7f1b86da7ac8>, 0, 26)
(<pickletools.OpcodeInfo object at 0x7f1b86dad0a8>, None, 28)

zopyx avatar Jun 18 '19 06:06 zopyx

The analysis is like this:

The issue is clearly an index corruption that happened within a Plone migration after a series of plone.restapi calls.

The affected field index is department which is defined in two content-types as a Choice field connected to a vocabulary. However the affected Python pickle of the underlaying index structure indicates that a content object was indexed which has a parent folder with ID department.

So the problem is caused clearly caused on the application level through acquistion. However I raise the question how the ZODB - OOBTrees under Python 3 in particular - could be improved in order to avoid this kind of corruption. The OOBTree implementation would have to perform type checks...

zopyx avatar Jun 18 '19 06:06 zopyx

Andreas Jung wrote at 2019-6-17 23:38 -0700:

(Pdb) pickletools.dis(pickle)
   0: \x80 PROTO      3
   2: c    GLOBAL     'BTrees.OOBTree OOBTree'
  26: q    BINPUT     0
  28: .    STOP
highest protocol among opcodes = 2
(Pdb) for x in pickletools.genops(pickle): print(x)
(<pickletools.OpcodeInfo object at 0x7f1b86dad048>, 3, 0)
(<pickletools.OpcodeInfo object at 0x7f1b86da7d08>, 'BTrees.OOBTree OOBTree', 2)
(<pickletools.OpcodeInfo object at 0x7f1b86da7ac8>, 0, 26)
(<pickletools.OpcodeInfo object at 0x7f1b86dad0a8>, None, 28)

I have told you that the returned pickle consists of 2 adjacent pickles (first class, second state). Above you show the not very interesting first pickle. You must look at the second one.

Apparently, the first pickle is 29 bytes long. This would mean that the second pickle starts at byte 30.

Alternatively, you could wrap your pickle into a file like object (likely io.BytesIO) and call "dis" 2 times on it.

d-maurer avatar Jun 18 '19 06:06 d-maurer

I have told you that the returned pickle consists of 2 adjacent pickles (first class, second state). Above you show the not very interesting first pickle. You must look at the second one.

Apparently, the first pickle is 29 bytes long. This would mean that the second pickle starts at byte 30.

Alternatively, you could wrap your pickle into a file like object (likely io.BytesIO) and call "dis" 2 times on it.

No further analysis needed since the reason for the corruption appears to be clear now.

zopyx avatar Jun 18 '19 07:06 zopyx

Andreas Jung wrote at 2019-6-17 23:54 -0700:

... However I raise the question how the ZODB - OOBTrees under Python 3 in particular - could be improved in order to avoid this kind of corruption. The OOBTree implementation would have to perform type checks...

With a complete Python 3 history, the corruption would have been prevented: you would not have been able to mix different key types in the same BTree. The corruption could only happen because Python 2 has been less strict than Python 3 and you are now interpreting a Python 2 generated BTree under Python 3.

Should you really want to support Python 2 created BTrees under Python 3, then BTrees would need to implement the old Python 2 comparison for objects of different types. I doubt however, that there is general interest.

d-maurer avatar Jun 18 '19 07:06 d-maurer

There is no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3

zopyx avatar Jun 18 '19 07:06 zopyx

Andreas Jung wrote at 2019-6-18 00:27 -0700:

There was no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3

That is very strange. In order to insert a key into a tree, the new key must be compared with (some of) the current keys (to find out where the new key is stored). Python 3 cannot compare objects of different types:

>>> from BTrees.OOBTree import OOBTree
>>> t=OOBTree()
>>> t[1] = 1
>>> t[''] = ''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'

Maybe, you can use dm.historical to find out which transaction has introduced the inconsistency and the states before and after this transaction. This should allow to analyse further how the inconsistency could happen.

d-maurer avatar Jun 18 '19 07:06 d-maurer

Andreas Jung wrote at 2019-6-18 00:27 -0700:

There was no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3

You might have been hit by the issue described in "https://github.com/zopefoundation/zodbpickle/pull/47": "... manifests in a random byte replacing a correct one in a pickle, ..."

d-maurer avatar Jun 18 '19 17:06 d-maurer

I stumbled on a similar issue here. I have a field index myIndex and depending on the portal_type I sometimes add a list of strings to it and sometimes just a string. This seems not to work anymore. This happened while migrating a project from Plone 4 to Plone 5.2.2 with Python 3.8. According to your responses so far it seems that I can no longer mix these types within one index, am I right? I now have to use different indexes for each portal_type. But this is a bit of an issue because the index directly accesses a field from the schema. So I have to rename the field in the schema and in every other place in my code?

NicolasGoeddel avatar Nov 27 '20 17:11 NicolasGoeddel

NicolasGoeddel wrote at 2020-11-27 09:08 -0800:

I stumbled on a similar issue here. I have a field index myIndex and depending on the portal_type I sometimes add a list of strings to it and sometimes just a string. This seems not to work anymore. This happened while migrating a project from Plone 4 to Plone 5.2.2 with Python 3.8. According to your responses so far it seems that I can no longer mix these types within one index, am I right?

You are right but this is not a ZODB issue but one of BTrees (and Python 3).

BTrees maintain their key/value pairs essentially in a key ordered structure. Therefore, it is important that all keys can be (completely) ordered.

With Python 2, objects of unrelated types could be (artificiantly) ordered. Python 3 has abolished this hack. Therefore, all keys in a BTree must now have order compatible types. As an exception, None can cohabitate with other keys (even it Python 3 does not define such an ordering).

d-maurer avatar Nov 27 '20 18:11 d-maurer