
java client doesn't work after network split

Open umatomba opened this issue 10 years ago • 1 comments

Hi Robert,

I'm trying to test the 5-node cluster again.

Space configuration used for YCSB:

space usertable
key k
attributes field0, field1, field2, field3, field4,
           field5, field6, field7, field8, field9
tolerate 2 failure

Workload used:

recordcount=1000000
operationcount=500000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.1
updateproportion=0.9
scanproportion=0
insertproportion=0
requestdistribution=zipfian
threadcount=1

After cluster setup, the configuration looks like this:

server 1998750160347642936 192.168.100.3:2012 AVAILABLE
server 3667604389049738774 192.168.100.5:2012 AVAILABLE
server 6903549772667245983 192.168.100.2:2012 AVAILABLE
server 12441010121126734978 192.168.100.1:2012 AVAILABLE
server 12691374307851476699 192.168.100.4:2012 AVAILABLE

All 1,000,000 records are visible in usertable, as expected:

>>> c = hyperdex.client.Client('192.168.100.1', 1982)
>>> c.count('usertable', {'k': hyperdex.client.Regex('.*')})
1000000L

Let's start the test:

java -Djava.library.path=/usr/local/lib/ com.yahoo.ycsb.Client -t -db org.hyperdex.ycsb.HyperDex -P /hyperdex_build/ycsb-0.1.4/workloads/workloada -p "hyperdex.host=192.168.100.1" -p "hyperdex.port=1982" -s

When nodes 4 and 5 are split from the network for 10 seconds every 30 seconds, we see this log:

 10 sec: 23924 operations; 2388.82 current ops/sec; [UPDATE AverageLatency(us)=440.79] [READ AverageLatency(us)=107.13]
 20 sec: 49520 operations; 2559.34 current ops/sec; [UPDATE AverageLatency(us)=415.98] [READ AverageLatency(us)=97]
 30 sec: 72777 operations; 2325.7 current ops/sec; [UPDATE AverageLatency(us)=466.64] [READ AverageLatency(us)=91.18]
 40 sec: 93333 operations; 2055.39 current ops/sec; [UPDATE AverageLatency(us)=524.7] [READ AverageLatency(us)=96.18]
 50 sec: 130622 operations; 3728.9 current ops/sec; [UPDATE AverageLatency(us)=275.01] [READ AverageLatency(us)=72.35]
 60 sec: 146083 operations; 1546.1 current ops/sec; [UPDATE AverageLatency(us)=595.3] [READ AverageLatency(us)=72.74]
 70 sec: 146148 operations; 6.5 current ops/sec; [UPDATE AverageLatency(us)=193092.31] [READ AverageLatency(us)=19839.25]
 80 sec: 146286 operations; 13.8 current ops/sec; [UPDATE AverageLatency(us)=78855.65] [READ AverageLatency(us)=757.45]
 90 sec: 146352 operations; 6.6 current ops/sec; [UPDATE AverageLatency(us)=14269.27] [READ AverageLatency(us)=119.57]
 100 sec: 146437 operations; 8.5 current ops/sec; [UPDATE AverageLatency(us)=239552.85] [READ AverageLatency(us)=3672.67]
 110 sec: 146688 operations; 25.1 current ops/sec; [UPDATE AverageLatency(us)=41942.16] [READ AverageLatency(us)=2646.36]
 120 sec: 146783 operations; 9.5 current ops/sec; [UPDATE AverageLatency(us)=79886.51] [READ AverageLatency(us)=995.67]
 130 sec: 146783 operations; 0 current ops/sec;
 140 sec: 146783 operations; 0 current ops/sec;
 150 sec: 146783 operations; 0 current ops/sec;
 160 sec: 146783 operations; 0 current ops/sec;
 170 sec: 146783 operations; 0 current ops/sec;
 180 sec: 146783 operations; 0 current ops/sec;
 190 sec: 146783 operations; 0 current ops/sec;
 200 sec: 146783 operations; 0 current ops/sec;
 210 sec: 146783 operations; 0 current ops/sec;
 220 sec: 146783 operations; 0 current ops/sec;
 230 sec: 146783 operations; 0 current ops/sec;
 240 sec: 146783 operations; 0 current ops/sec;
 250 sec: 146783 operations; 0 current ops/sec;
 260 sec: 146783 operations; 0 current ops/sec;
 270 sec: 146783 operations; 0 current ops/sec;
 280 sec: 146783 operations; 0 current ops/sec;
 290 sec: 146783 operations; 0 current ops/sec;
 300 sec: 146783 operations; 0 current ops/sec;
 310 sec: 146783 operations; 0 current ops/sec;
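To pin down when throughput stops in a log like the one above, the status lines can be parsed with a short script. This is only a minimal sketch: the helper names `parse_status` and `first_stall` are hypothetical, and the regular expression assumes the status-line format shown above (elapsed seconds, total operations, current ops/sec).

```python
import re

# Matches YCSB status lines such as
# " 70 sec: 146148 operations; 6.5 current ops/sec; [UPDATE ...]"
# (format assumed from the log above; lines without an ops/sec field are skipped)
STATUS_RE = re.compile(
    r"^\s*(\d+) sec: (\d+) operations; ([\d.]+) current ops/sec;"
)


def parse_status(lines):
    """Return (elapsed_sec, total_ops, ops_per_sec) for each status line."""
    samples = []
    for line in lines:
        m = STATUS_RE.match(line)
        if m:
            samples.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    return samples


def first_stall(samples, threshold=0.0):
    """Return the elapsed time of the first sample from which throughput
    stays at or below `threshold` until the end of the log, or None."""
    stall_start = None
    for elapsed, _total, rate in samples:
        if rate <= threshold:
            if stall_start is None:
                stall_start = elapsed
        else:
            # Throughput recovered, so any earlier stall was temporary.
            stall_start = None
    return stall_start
```

Feeding the saved YCSB output through `first_stall(parse_status(open("out").readlines()))` (the file name is a placeholder) gives the point from which the client never recovers, which makes it easier to line the stall up with the partition schedule and the server logs.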

Now, even after restarting the cluster and the Java client VM (it is a separate VM), we can't run the YCSB test against this cluster:

root@dbtest0:/hyperdex_build/tests/5nodes/teste# java -Djava.library.path=/usr/local/lib/ com.yahoo.ycsb.Client -t -db org.hyperdex.ycsb.HyperDex -P /hyperdex_build/ycsb-0.1.4/workloads/workloada -p "hyperdex.host=192.168.100.5" -p "hyperdex.port=1982" -s > out5
Loading workload...
Starting test.
host: 192.168.100.1
port: 1982
 0 sec: 0 operations;
m_client: org.hyperdex.client.Client@6342f024

 10 sec: 204 operations; 20.37 current ops/sec; [UPDATE AverageLatency(us)=536.47] [READ AverageLatency(us)=131.33]
 20 sec: 204 operations; 0 current ops/sec;
 30 sec: 204 operations; 0 current ops/sec;
 40 sec: 204 operations; 0 current ops/sec;
 50 sec: 204 operations; 0 current ops/sec;
 60 sec: 204 operations; 0 current ops/sec;
 70 sec: 204 operations; 0 current ops/sec;

It looks like a Java client problem, because the Python client works fine.

Any ideas?

Thank you, Vladimir

umatomba • Jan 30 '15 13:01

That's really annoying, especially as we don't have much interest in the Java client apart from the YCSB tests. Are there any suggestions as to which clients are the most robust to use?

hwinkel • Jan 30 '15 14:01