IndexedCollection is not Serializable
What steps will reproduce the problem?
1. Try to serialize IndexedCollection
What is the expected output? What do you see instead?
Serializable IndexedCollection
What version of the product are you using? On what operating system?
1.0.3 on Mac OS X
Please provide any additional information below.
Support for serialization would be great. A user could then set up some
indexes, serialize the collection, and later retrieve it with its indexes
already in place.
Original issue reported on code.google.com by [email protected] on 5 Feb 2013 at 2:47
Thanks mitja for the request. I can see how this could be useful.
Most of the in-memory indexes could probably be rebuilt in memory faster
than they could be deserialized from disk. So only the objects in
the collection would need to be serialized/deserialized, and then the indexes
re-added. It sounds like a case for adding a readObject() method.
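To illustrate the readObject() idea, here is a minimal sketch using only the JDK. NameIndexedBag and RoundTrip are hypothetical stand-ins invented for this example, not CQEngine classes, and the first-letter index is an assumption chosen to keep the code short. Only the stored objects are written to the stream; the index is transient and is rebuilt when the object is read back.

```java
import java.io.*;
import java.util.*;

// Hypothetical stand-in for an indexed collection; NOT part of CQEngine.
class NameIndexedBag implements Serializable {
    private static final long serialVersionUID = 1L;

    // The stored objects are the only state written to the stream.
    private final List<String> names = new ArrayList<>();

    // Derived state: transient, so default serialization skips it.
    private transient Map<Character, List<String>> byFirstLetter = new HashMap<>();

    void add(String name) {
        names.add(name);
        index(name);
    }

    List<String> startingWith(char c) {
        return byFirstLetter.getOrDefault(c, Collections.emptyList());
    }

    private void index(String name) {
        if (name.isEmpty()) return;
        byFirstLetter.computeIfAbsent(name.charAt(0), k -> new ArrayList<>()).add(name);
    }

    // Invoked by Java serialization: restore the objects, then rebuild the index in memory.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        byFirstLetter = new HashMap<>();
        for (String name : names) {
            index(name);
        }
    }
}

// Hypothetical helper that serializes to a byte array and reads the copy back.
class RoundTrip {
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this approach the caller does not have to re-add anything after deserialization, at the cost of rebuilding the index on every read.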
In the meantime, say with version 1.0.3, you could use the
IndexedCollectionSerializer class below to serialize an indexed collection.
The main catch is that after deserializing, you *need to re-add the indexes!* See
the SerializerDemo class below for an example.
I'll think about better serialization support for the next release. Thanks!
--------------------------------------------------------------------------------
package com.googlecode.cqengine;

import com.googlecode.cqengine.index.radix.RadixTreeIndex;
import java.io.File;

public class SerializerDemo {

    public static void main(String[] args) {
        // *************** Build some collection... ***************
        IndexedCollection<Foo> myCollection = CQEngine.newInstance();
        addIndexesToMyCollection(myCollection);

        // Add some objects...
        myCollection.add(new Foo("bar"));
        myCollection.add(new Foo("baz"));

        // *************** Serialize the collection... ***************
        IndexedCollectionSerializer.serialize(myCollection, new File("foo.dat"));

        // *************** Deserialize the collection... ***************
        IndexedCollection<Foo> myDeserializedCollection = IndexedCollectionSerializer.deserialize(new File("foo.dat"));

        // Need to add indexes again to the deserialized collection!!...
        addIndexesToMyCollection(myDeserializedCollection);

        // ************ myDeserializedCollection should now have the same state as myCollection *******
    }

    static void addIndexesToMyCollection(IndexedCollection<Foo> indexedCollection) {
        indexedCollection.addIndex(RadixTreeIndex.onAttribute(Foo.NAME));
    }
}
--------------------------------------------------------------------------------
package com.googlecode.cqengine;

import com.googlecode.cqengine.attribute.Attribute;
import com.googlecode.cqengine.attribute.ReflectiveAttribute;
import java.io.Serializable;

public class Foo implements Serializable {

    public final String name;

    Foo(String name) {
        this.name = name;
    }

    public static final Attribute<Foo, String> NAME = ReflectiveAttribute.forField(Foo.class, String.class, "name");
}
--------------------------------------------------------------------------------
package com.googlecode.cqengine;

import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class IndexedCollectionSerializer {

    public static <O> void serialize(IndexedCollection<O> indexedCollection, File destination) {
        OutputStream os = null;
        try {
            os = new BufferedOutputStream(new FileOutputStream(destination));
            List<O> objectsList = new ArrayList<O>(indexedCollection);
            ObjectOutputStream oos = new ObjectOutputStream(os);
            oos.writeObject(objectsList);
            oos.flush();
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            if (os != null) {
                try { os.close(); } catch (Exception ignore) {}
            }
        }
    }

    public static <O> IndexedCollection<O> deserialize(File source) {
        ObjectInputStream ois = null;
        try {
            ois = new ObjectInputStream(new BufferedInputStream(new FileInputStream(source)));
            @SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
            List<O> objectsList = (List<O>) ois.readObject();
            return CQEngine.copyFrom(objectsList);
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            if (ois != null) {
                try { ois.close(); } catch (Exception ignore) {}
            }
        }
    }
}
--------------------------------------------------------------------------------
Original comment by [email protected] on 6 Feb 2013 at 6:55
- Changed state: Accepted
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
Niall,
first of all, thank you for the very fast response. Unfortunately this is not my use
case. I would like to store the IndexedCollection in a Memcache engine (which requires
it to implement Serializable), not in a file. Since you stated that
serializing the indexes would take too much time, it would be helpful to avoid
the "CQEngine.copyFrom" step, so that the IndexedCollection could be "stored" directly
without copying and retrieved again without copying, even if the indexes must be
added again.
Otherwise, I have to say: great project!!
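For a cache that stores byte values rather than files, the file-based workaround can be adapted to produce and consume a byte array. The sketch below uses only the JDK; CollectionBytes and its method names are hypothetical, and the cache client itself is left out. Note that it still copies the objects into a list, so it does not remove the copying step asked about above, and the indexes would still have to be re-added after retrieval.

```java
import java.io.*;
import java.util.*;

// Hypothetical helper; NOT part of CQEngine or of any memcache client.
class CollectionBytes {

    // Serializes the objects of a collection (not its indexes) into a byte array.
    static <O extends Serializable> byte[] toBytes(Collection<O> objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new ArrayList<O>(objects));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Restores the objects; the caller would add them to a new collection and re-add indexes.
    @SuppressWarnings("unchecked")
    static <O extends Serializable> List<O> fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (List<O>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting byte array could be passed to whatever put/get calls the cache client provides, and the restored list handed to CQEngine.copyFrom as in the workaround above.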
Original comment by [email protected] on 6 Feb 2013 at 7:18
Usually memcache is used to serialize a single object per key, whereas in this
case you would store an entire collection against a single key? Will this be
retrieved on application startup, or are you planning to do this for each
request?
Memcache is basically "remote RAM", which is indeed usually faster than local
disk, but slower than local RAM. It might be worth looking at a distributed
cache which supports local RAM with distributed eviction, instead of going
across the network every time. Also take a look at Kryo as an alternative to
Java serialization. It is much faster and does not require classes to implement
the Serializable interface. I've not tested it with IndexedCollection, but I've
had good results from it in the past.
Nonetheless, even if Kryo works with IndexedCollection right now, there are
still a few optimizations to CQEngine which could improve serialization. I will
add support to serialize the indexed collection without copyFrom in the next
release.
Original comment by [email protected] on 8 Feb 2013 at 12:54
Not exactly what was requested, but IndexedCollection can now be persisted to
off-heap memory or to a file on disk. It does not rely on Java serialization.
Original comment by [email protected] on 20 Apr 2015 at 8:25
- Changed state: Fixed
Reopening this issue, as the current situation is not a complete fix, and could probably be improved.
I'm not sure of the status of this issue, but I would like to contribute my use case.
I just need to snapshot a collection, along with its already constructed indexes, to disk in order to survive an application restart.
My collection is fed and indexed with 200,000 records from SQL that mostly never change. Not a huge figure, but the SQL source could be slow in some scenarios. So I would simply like to:
- Load all SQL records for the first time
- Snapshot them to an opaque binary file
- Restart/kill the application
- Have the application check if the file is up to date with database (SELECT MAX (last_update)...)
- Load the collection and the indexes from disk to memory
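The steps above could be sketched roughly as follows, using only the JDK. SnapshotCache, the file layout (a timestamp followed by the records), and the idea of passing the database's last-update value as a long are all assumptions made for illustration. The sketch snapshots the records only; indexes would still be added to the collection after loading.

```java
import java.io.*;
import java.util.*;
import java.util.function.Supplier;

// Hypothetical helper; NOT part of CQEngine.
class SnapshotCache {

    // Returns the snapshot contents if the file matches sourceLastUpdate,
    // otherwise calls slowLoader (e.g. the SQL query) and rewrites the snapshot.
    static <O extends Serializable> List<O> loadOrRebuild(File snapshot, long sourceLastUpdate,
                                                          Supplier<List<O>> slowLoader) {
        List<O> cached = readIfFresh(snapshot, sourceLastUpdate);
        if (cached != null) {
            return cached;
        }
        List<O> fresh = new ArrayList<>(slowLoader.get());
        write(snapshot, sourceLastUpdate, fresh);
        return fresh;
    }

    @SuppressWarnings("unchecked")
    private static <O> List<O> readIfFresh(File snapshot, long sourceLastUpdate) {
        if (!snapshot.isFile()) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(snapshot)))) {
            long snapshotLastUpdate = in.readLong();
            if (snapshotLastUpdate != sourceLastUpdate) {
                return null; // stale: the source changed since the snapshot was taken
            }
            return (List<O>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            return null; // unreadable snapshot: treat it as missing
        }
    }

    private static void write(File snapshot, long sourceLastUpdate, List<?> records) {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(snapshot)))) {
            out.writeLong(sourceLastUpdate);
            out.writeObject(records);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether rebuilding the indexes from the loaded records is fast enough for 200,000 records would need to be measured.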
I'm not sure whether the disk store satisfies my scenario. Peeking around the code, it looks like the disk store internally uses SQLite from a disk source, which I don't think loads the data store into memory. What I want to achieve, for superfast queries, is to snapshot the indices along with the data if possible. Once the indices are built, there is no need to recompute them when the underlying data has not changed.
Many thanks
FWIW - I use the following and it survives restarts without having to do any additional management
new ConcurrentIndexedCollection<Job>(DiskPersistence.onPrimaryKeyInFile(Job.ID, new File(dbdir.toString(), "jobs.dat")));
If you only use disk persistence, and you only add disk indexes, then the collection and indexes will persist between restarts.
You will need to programmatically "add" the disk indexes to the collection again at startup, but they will detect that they were persisted previously so they won't be rebuilt.
However if you add non-disk indexes to the collection, then the state of those indexes will be completely lost after a restart. So when you add those indexes to the collection again after a restart, they will be completely rebuilt.
Hope that helps, Niall
On Fri, 12 Apr 2019, 17:46 Jayaram Sreevalsan, [email protected] wrote:
FWIW - I use the following and it survives restarts without having to do any additional management
ConcurrentIndexedCollection<Job>(DiskPersistence.onPrimaryKeyInFile(Job.ID, new File(dbdir.toString(), "jobs.dat")));