pydio-sync-java icon indicating copy to clipboard operation
pydio-sync-java copied to clipboard

Client sometimes resyncs although no changes done

Open segdy opened this issue 10 years ago • 2 comments

Hi,

My client installation is fully synchronized with the server. Still, suddenly(?) it wants to retransfer everything although everything is in sync and exists on both ends. The log says:

8546 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: 26018 [PydioScheduler_Worker-1] INFO root - Parsed 503 nodes from stream 26018 [PydioScheduler_Worker-1] INFO root - Loaded part: 503 26018 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Fotos/... 35049 [PydioScheduler_Worker-1] INFO root - Parsed 511 nodes from stream 35049 [PydioScheduler_Worker-1] INFO root - Loaded part: 511 35065 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Fotos/... 39065 [PydioScheduler_Worker-1] INFO root - Parsed 102 nodes from stream 39065 [PydioScheduler_Worker-1] INFO root - Loaded part: 102 39065 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Fotos/... 42394 [PydioScheduler_Worker-1] INFO root - Parsed 16 nodes from stream 42394 [PydioScheduler_Worker-1] INFO root - Loaded part: 16 42394 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Fotos/... 47363 [PydioScheduler_Worker-1] INFO root - Parsed 117 nodes from stream 47363 [PydioScheduler_Worker-1] INFO root - Loaded part: 117 47363 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Office/... 50769 [PydioScheduler_Worker-1] INFO root - Parsed 10 nodes from stream 50769 [PydioScheduler_Worker-1] INFO root - Loaded part: 10 50769 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Office/... 54738 [PydioScheduler_Worker-1] INFO root - Parsed 4 nodes from stream 54738 [PydioScheduler_Worker-1] INFO root - Loaded part: 4 54738 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Office/... 58004 [PydioScheduler_Worker-1] INFO root - Parsed 6 nodes from stream 58004 [PydioScheduler_Worker-1] INFO root - Loaded part: 6 58004 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Shared/... 61535 [PydioScheduler_Worker-1] INFO root - Parsed 6 nodes from stream 61535 [PydioScheduler_Worker-1] INFO root - Loaded part: 6 61566 [PydioScheduler_Worker-1] INFO root - Saving nodes 61879 [PydioScheduler_Worker-1] INFO root - Nodes saved: 53333 ms 62223 [PydioScheduler_Worker-1] INFO root - LOCAL DIFFS: 1317 62223 [PydioScheduler_Worker-1] INFO root - REMOTE DIFFS: 1267 66556 [PydioScheduler_Worker-1] INFO root - Downloading 1 bytes 66556 [PydioScheduler_Worker-1] INFO root - Uploading 3636489 bytes

LOCAL DIFFS and REMOTE DIFFS - huge number! How does the client at all decice when to transfer files? According to checksums, timestamps or something different? Ideally I would expect if I have the same files on each side and THEN setup pydio sync that not everything is retransferred!

I have also installed the PECL rsync extension. How can I check if it is indeed active? In the repository settings, the "File Hasher" plugin was already enabled (with "Cache full tree" = false). If I set it to true, the client ends in an infinite loop: The log has about 1000s entries of:

1937418 [PydioScheduler_Worker-1] INFO root - Taking remote content for node: /Fotos/Fotos1/Fotos2 1940543 [PydioScheduler_Worker-1] INFO root - Parsed 503 nodes from stream 1940543 [PydioScheduler_Worker-1] INFO root - Loaded part: 503

and eventually - after 1hr of 100% CPU usage running out of steam:

1806043 [PydioScheduler_Worker-1] ERROR org.quartz.core.JobRunShell - Job sync.2 threw an unhandled Exception: java.lang.OutOfMemoryError: GC overhead limit exceeded at com.sun.org.apache.xpath.internal.VariableStack.reset(VariableStack.java:141) at com.sun.org.apache.xpath.internal.VariableStack.(VariableStack.java:51) at com.sun.org.apache.xpath.internal.jaxp.JAXPVariableStack.(JAXPVariableStack.java:46) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.eval(XPathExpressionImpl.java:120) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.eval(XPathExpressionImpl.java:105) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:187) at com.scireum.open.xml.XMLNodeImpl.queryXMLNode(XMLNodeImpl.java:61) at io.pyd.synchro.SyncJob$7.process(SyncJob.java:1636) at com.scireum.open.xml.SAX2DOMHandler.nodeUp(SAX2DOMHandler.java:56) at com.scireum.open.xml.SAX2DOMHandler.endElement(SAX2DOMHandler.java:107) at com.scireum.open.xml.XMLReader.endElement(XMLReader.java:95) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609) at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:183) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1342) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2770) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213) at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649) at com.scireum.open.xml.XMLReader.parse(XMLReader.java:196) at io.pyd.synchro.SyncJob.parseNodesFromStream(SyncJob.java:1652) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1571) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) 1806043 [PydioScheduler_Worker-1] ERROR org.quartz.core.ErrorLogger - Job (sync.2 threw an exception. org.quartz.SchedulerException: Job threw an unhandled exception. [See nested exception: java.lang.OutOfMemoryError: GC overhead limit exceeded] at org.quartz.core.JobRunShell.run(JobRunShell.java:224) at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded at com.sun.org.apache.xpath.internal.VariableStack.reset(VariableStack.java:141) at com.sun.org.apache.xpath.internal.VariableStack.(VariableStack.java:51) at com.sun.org.apache.xpath.internal.jaxp.JAXPVariableStack.(JAXPVariableStack.java:46) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.eval(XPathExpressionImpl.java:120) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.eval(XPathExpressionImpl.java:105) at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:187) at com.scireum.open.xml.XMLNodeImpl.queryXMLNode(XMLNodeImpl.java:61) at io.pyd.synchro.SyncJob$7.process(SyncJob.java:1636) at com.scireum.open.xml.SAX2DOMHandler.nodeUp(SAX2DOMHandler.java:56) at com.scireum.open.xml.SAX2DOMHandler.endElement(SAX2DOMHandler.java:107) at com.scireum.open.xml.XMLReader.endElement(XMLReader.java:95) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609) at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:183) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1342) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2770) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213) at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649) at com.scireum.open.xml.XMLReader.parse(XMLReader.java:196) at io.pyd.synchro.SyncJob.parseNodesFromStream(SyncJob.java:1652) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1571) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586) at io.pyd.synchro.SyncJob.takeRemoteSnapshot(SyncJob.java:1586)

What can be the problem here? Is the rsync extension working as it should?

Thank you!

segdy avatar May 12 '14 08:05 segdy

Hi segdy, seems like the initial db may have been corrupted. Can you maybe try to delete this task and recreate it? It will restart synchro from scratch, and probably make a better reconciliation of the local and remote files (will be ignore if they are found on both sides). FYI we are rewriting a new sync client and this java-based one will soon be abandonned. -c

cdujeu avatar May 29 '14 09:05 cdujeu

I tried that already, it then transfers all files again although they exist already at the client (see the log above).

Just to understand the situation: How does pydio sync/reconcile and how to debug that? Does it work on timestamps, checksums etc.?

Furthermore, I am so hard waiting got the new sync client :) may I ask what is the current ETA? (I think I read end of May some time ago which is passed already ...)

segdy avatar Jun 26 '14 03:06 segdy