seldon-server

No result for Movielens 100K Worked Example

Open ghost opened this issue 8 years ago • 8 comments

I am trying to run the ml100k example. Everything appears to work, but when I run seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4 I get this result:

connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
response code 200
{"size":0,"requested":4,"list":[]}

which is different from what is mentioned in the tutorial.

Could you help me please? Nadia

Jun 30 '16 17:06 ghost

I also have another question, if I may: why does Seldon use both ZooKeeper and etcd when it could rely on etcd alone? Thank you in advance for your answer.

Jun 30 '16 17:06 ghost

Can you rerun the ml100k job:

cd kubernetes/conf/examples/ml100k
kubectl create -f ml100k-import.json

And provide the logs?

Regarding ZooKeeper and etcd: we only use ZooKeeper in Seldon. etcd is part of Kubernetes.

Jul 01 '16 07:07 ukclivecox

I'm not getting the official tutorial output either. I'm running Kubernetes locally via Minikube.

$ kubectl create -f ml100k-import.json
job "ml100k-import" created

$ kubectl get jobs -l name=ml100k-import
NAME            DESIRED   SUCCESSFUL   AGE
ml100k-import   1         1            4m

Then, when I run seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4, I get this error:

raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='seldon-server', port=80): Max retries exceeded with url: /js/recommendations?type=1&item=50&limit=4&user=1&consumer_key=3BOJZ988Y840JTK5JE6C&jsonpCallback=j (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fbdec89bf90>: Failed to establish a new connection: [Errno 110] Connection timed out',))
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

Here is the full output:

$ bin/seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4
connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
Traceback (most recent call last):
  File "/opt/conda/bin/seldon-cli", line 4, in <module>
    __import__('pkg_resources').run_script('seldon==2.0.2', 'seldon-cli')
  File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 742, in run_script
  File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 1667, in run_script
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/EGG-INFO/scripts/seldon-cli", line 5, in <module>
    seldon.cli.start_seldoncli()
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/__init__.py", line 3, in start_seldoncli
    cli_main.main()
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cli_main.py", line 351, in main
    cmds[cmd](opts,command_data, command_args)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 206, in cmd_api
    actions["default"](gopts,command_data, opts)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 191, in action_call
    call_js(gopts,command_data,opts,auth)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_api.py", line 127, in call_js
    r = requests.get(url,params=params)
  File "/opt/conda/lib/python2.7/site-packages/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/adapters.py", line 423, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='seldon-server', port=80): Max retries exceeded with url: /js/recommendations?type=1&item=50&limit=4&user=1&consumer_key=3BOJZ988Y840JTK5JE6C&jsonpCallback=j (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fbdec89bf90>: Failed to establish a new connection: [Errno 110] Connection timed out',))
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

When I run kubectl get pods, I see seldon-server-553220162-23o53 at 0/2 Pending.

$ kubectl describe pod seldon-server-553220162-23o53
FirstSeen   LastSeen   Count   From                   SubobjectPath   Type      Reason             Message
12m         6s         48      {default-scheduler }                   Warning   FailedScheduling   pod (seldon-server-553220162-23o53) failed to fit in any node
fit failure on node (minikubevm): Insufficient Memory

I started Minikube with minikube start --memory=6000. When I describe the node, I get:

Name:               minikubevm
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=minikubevm
Taints:
CreationTimestamp:  Tue, 02 Aug 2016 15:43:49 +0900
Phase:
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Tue, 02 Aug 2016 18:13:33 +0900  Tue, 02 Aug 2016 15:43:49 +0900  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Tue, 02 Aug 2016 18:13:33 +0900  Tue, 02 Aug 2016 15:43:49 +0900  KubeletHasSufficientMemory  kubelet has sufficient memory available
  Ready           True    Tue, 02 Aug 2016 18:13:33 +0900  Tue, 02 Aug 2016 15:43:49 +0900  KubeletReady                kubelet is posting ready status
Addresses:          10.0.2.15,10.0.2.15
Capacity:
  alpha.kubernetes.io/nvidia-gpu:  0
  cpu:                             1
  memory:                          5958432Ki
  pods:                            110
Allocatable:
  alpha.kubernetes.io/nvidia-gpu:  0
  cpu:                             1
  memory:                          5958432Ki
  pods:                            110
System Info:
  Machine ID:
  System UUID:                3A273620-856B-4504-80F4-1792529E648D
  Boot ID:                    22e7f8e6-3524-47ff-a47d-411a7a829c4a
  Kernel Version:             4.4.14-boot2docker
  OS Image:                   Boot2Docker 1.11.1 (TCL 7.1); master : 901340f - Fri Jul 1 22:52:19 UTC 2016
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://1.11.1
  Kubelet Version:            v1.3.3
  Kube-Proxy Version:         v1.3.3
ExternalID:         minikubevm
Non-terminated Pods: (17 in total)
  Namespace    Name                           CPU Requests  CPU Limits  Memory Requests  Memory Limits
  default      influxdb-grafana-xl5ec         0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      kafka-controller-lxjw3         0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      kafka-stream-impressions       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      kafka-stream-predictions       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      memcached1-9wl8j               0 (0%)        0 (0%)      260Mi (4%)       0 (0%)
  default      memcached2-bck0z               0 (0%)        0 (0%)      260Mi (4%)       0 (0%)
  default      mysql                          0 (0%)        0 (0%)      3Gi (52%)        0 (0%)
  default      seldon-control                 0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      spark-master-controller-x6cqg  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      spark-worker-controller-87px0  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      spark-worker-controller-e5ba6  0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      td-agent-server                0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      zookeeper-1                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      zookeeper-2                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      zookeeper-3                    0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system  kube-addon-manager-minikubevm  5m (0%)       0 (0%)      50Mi (0%)        0 (0%)
  kube-system  kubernetes-dashboard-lnr8r     0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  5m (0%)       0 (0%)      3642Mi (62%)     0 (0%)
No events.

So does that mean I can't run this demo on Minikube? Thank you.

Aug 02 '16 01:08 Moonba

It does look like a memory issue. Can you try increasing the memory, e.g. minikube start --memory=10000?

Aug 02 '16 10:08 ukclivecox
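
A quick sanity check of the figures in the node description above, written as a small Python sketch. The numbers are copied from that output; the pod request sizes passed to `fits` are hypothetical, since the thread does not show how much memory seldon-server requests. The scheduler places pods according to declared memory requests rather than actual usage, which would explain why the node reports no memory pressure while the pod still does not fit.

```python
# Values copied from the `kubectl describe nodes` output in this thread.
ALLOCATABLE_KI = 5958432
MEMORY_REQUESTS_MI = {
    "memcached1": 260,
    "memcached2": 260,
    "mysql": 3 * 1024,  # listed as 3Gi
    "kube-addon-manager": 50,
}

allocatable_mi = ALLOCATABLE_KI / 1024
requested_mi = sum(MEMORY_REQUESTS_MI.values())
free_mi = allocatable_mi - requested_mi
requested_pct = int(100 * requested_mi / allocatable_mi)


def fits(pod_request_mi):
    """Return True if a pod requesting this much memory still fits on the node."""
    return pod_request_mi <= free_mi


print(f"requested: {requested_mi}Mi ({requested_pct}%)")
print(f"unreserved: {free_mi:.0f}Mi")
# Hypothetical request sizes, for illustration only.
print(fits(2000), fits(3000))
```

The sum reproduces the 3642Mi (62%) shown under "Allocated resources", leaving a little over 2Gi unreserved on a 6000 MB VM.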

Yes, it worked with a 10 GB memory allocation. I got the expected output after running seldon-cli api --client-name ml100k --endpoint /js/recommendations --item 50 --limit 4

But when I went through the tutorial step by step and reached the step that sets up the schema using JSON, seldon-cli attr --action apply --client-name ml100k --json attr.json, I got this error:

IOError: [Errno 2] No such file or directory: 'attr.json'
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

I get the same error even though the file does exist in the seldon-server/docker/examples/ml100k directory. I retried with the full path in the command, but still got the same "No such file or directory" message.

The following steps fail in the same way, except that the message is "Invalid file[items.csv]":

connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
Invalid file[items.csv]
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

My machine is a Mac running OS X El Capitan with 16 GB of memory. To "ensure we have a nameserver for external DNS (seems to be required for local Docker running of Kubernetes)", I couldn't use sudo echo "nameserver 8.8.8.8" >> /etc/resolv.conf (permission denied), so I manually added the 8.8.8.8 nameserver to /etc/hosts and ran sudo networksetup -setdnsservers Wi-Fi 8.8.8.8, which added nameserver 8.8.8.8 to /etc/resolv.conf automatically.

I also tried sudo echo "nameserver 8.8.8.8" >> /etc/resolv.conf on the Minikube VM.

Here's the whole log for the JSON schema:

connecting to zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 [SUCCEEDED]
Traceback (most recent call last):
  File "/opt/conda/bin/seldon-cli", line 4, in <module>
    __import__('pkg_resources').run_script('seldon==2.0.2', 'seldon-cli')
  File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 742, in run_script
  File "/opt/conda/lib/python2.7/site-packages/setuptools-18.5-py2.7.egg/pkg_resources/__init__.py", line 1667, in run_script
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/EGG-INFO/scripts/seldon-cli", line 5, in <module>
    seldon.cli.start_seldoncli()
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/__init__.py", line 3, in start_seldoncli
    cli_main.main()
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cli_main.py", line 351, in main
    cmds[cmd](opts,command_data, command_args)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 234, in cmd_attr
    actions[action](command_data, opts)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 172, in action_apply
    store_json(command_data,opts)
  File "/opt/conda/lib/python2.7/site-packages/seldon-2.0.2-py2.7.egg/seldon/cli/cmd_attr.py", line 106, in store_json
    f = open(opts.json)
IOError: [Errno 2] No such file or directory: './attr.json'
error: error executing remote command: error executing command in container: Error executing in Docker Container: 1

I can't spot a problem anywhere else:

$ kubectl get services --all-namespaces
NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
default       kafka-service          10.0.0.87                  9092/TCP                     6h
default       kubernetes             10.0.0.1                   443/TCP                      1d
default       memcached1             10.0.0.20                  11211/TCP                    6h
default       memcached2             10.0.0.35                  11211/TCP                    6h
default       monitoring-grafana     10.0.0.31                  80/TCP                       6h
default       monitoring-influxdb    10.0.0.151                 8083/TCP,8086/TCP            6h
default       mysql                  10.0.0.161                 3306/TCP                     6h
default       seldon-server          10.0.0.84                  80/TCP                       6h
default       spark-master           10.0.0.119                 7077/TCP                     6h
default       spark-webui            10.0.0.166                 8080/TCP                     6h
default       td-agent-server        10.0.0.177                 24224/TCP,24224/UDP          6h
default       zookeeper-1            10.0.0.244                 2181/TCP,2888/TCP,3888/TCP   6h
default       zookeeper-2            10.0.0.2                   2181/TCP,2888/TCP,3888/TCP   6h
default       zookeeper-3            10.0.0.86                  2181/TCP,2888/TCP,3888/TCP   6h
kube-system   kube-dns               10.0.0.10                  53/UDP,53/TCP                1d
kube-system   kubernetes-dashboard   10.0.0.207                 80/TCP                       1d

I get the same error with the MovieLens 10M demo as well.

Thank you in advance .

Aug 03 '16 04:08 Moonba

I think this is because you are trying to run a command in the seldon-control container, and it does not have access to the local files on your system. One way around this would be to move the required files to a location that seldon-control can access, such as /seldon-data if you are running with hostPath.

We should make this clearer in the documentation.

Aug 03 '16 09:08 ukclivecox
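
To illustrate the point about file visibility: according to the reply above, the command runs inside the seldon-control container, so a path such as ./attr.json is resolved against that container's filesystem and not against the machine where the command was typed. Below is a minimal Python sketch of that rule, assuming /seldon-data is the only location shared with the container (as in the hostPath setup discussed in this thread). The helper name `visible_to_seldon_control` is hypothetical and is not part of seldon-cli.

```python
from pathlib import PurePosixPath

# Assumption: the hostPath volume is mounted at /seldon-data inside the container.
SHARED_MOUNT = PurePosixPath("/seldon-data")


def visible_to_seldon_control(path):
    """Return True if `path` is an absolute path located under the shared mount."""
    p = PurePosixPath(path)
    # Relative paths resolve inside the container, not on the caller's machine.
    if not p.is_absolute() or ".." in p.parts:
        return False
    return p.parts[:len(SHARED_MOUNT.parts)] == SHARED_MOUNT.parts


for candidate in ("./attr.json",
                  "seldon-server/docker/examples/ml100k/attr.json",
                  "/seldon-data/ml100k/attrs.json"):
    print(candidate, visible_to_seldon_control(candidate))
```

Only the last path passes the check, which matches the errors reported earlier for ./attr.json and items.csv.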

I'm running the commands from my own terminal as an admin, not from a bash shell inside the seldon-control container (and when I tried that, with kubectl exec -ti seldon-control -- /bin/bash, I ended up with the same errors). Yes, I'm using HostPath for persistent storage: DATA_VOLUME="hostPath": {"path": "/seldon-data"}

To create the default HostPath kubernetes conf files set for /seldon-data do the following:

cd kubernetes/conf
make clean conf

Note: HostPath only makes sense for demo/testing where you have a Kubernetes cluster with a single minion where all containers can share the location on the host.

You will need to create the host path folder on your single kubernetes minion.

I'm sorry, I didn't understand that last line: when I try to create the /seldon-data directory on my Minikube VM, it tells me "File already exists".

I understand that /seldon-data is a volume shared between the seldon-server container and the seldon-control container:

seldon-server me$ kubectl exec seldon-server-553220162-63xiz ls /seldon-data
conf grafana influxdb logs mysql seldon-models

seldon-server me$ kubectl exec seldon-control ls /seldon-data
conf grafana influxdb logs mysql seldon-models

Also, users.csv is empty when created, while items.csv is fine. Creating actions.csv fails with an error:

cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 --output-delimiter=,) > actions.csv
cut: illegal option -- -
usage: cut -b list [-n] [file ...]
       cut -c list [file ...]
       cut -f list [-s] [-d delim] [file ...]

So I changed it to

cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 -d ,) > actions.csv

and it ran; the resulting file isn't empty.

Thank you in advance .

Aug 04 '16 05:08 Moonba
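
The "cut: illegal option" error above is consistent with the BSD cut shipped with macOS, which has no --output-delimiter option (that is a GNU extension). Note also that -d sets the input delimiter, not the output one, so if ua.base is tab-separated, as the MovieLens 100K files normally are, the edited command would likely pass each row through still separated by tabs. A portable alternative is a short Python sketch that uses only the standard library. The function name `delimited_to_csv` and the sample rows are made up for illustration; the headers and column choices follow the commands quoted in this thread.

```python
import csv
import io


def delimited_to_csv(src, dst, header, delimiter, columns):
    """Copy the chosen 0-based columns of a delimited file to CSV with a header row."""
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(header)
    # QUOTE_NONE treats quote characters in the input as ordinary text.
    for row in csv.reader(src, delimiter=delimiter, quoting=csv.QUOTE_NONE):
        if row:  # skip blank lines
            writer.writerow([row[i] for i in columns])


# Made-up sample rows in the tab-separated layout of ml-100k/ua.base.
sample_actions = io.StringIO("1\t1\t5\t874965758\n1\t2\t3\t876893171\n")
actions_csv = io.StringIO()
delimited_to_csv(sample_actions, actions_csv,
                 ["user_id", "item_id", "value", "time"], "\t", [0, 1, 2, 3])
print(actions_csv.getvalue())

# Made-up sample rows in the pipe-separated layout of ml-100k/u.user.
sample_users = io.StringIO("1|24|M|technician|85711\n2|53|F|other|94043\n")
users_csv = io.StringIO()
delimited_to_csv(sample_users, users_csv, ["id"], "|", [0])
print(users_csv.getvalue())
```

With real data, the in-memory buffers would be replaced by files opened with open(path, newline=""), writing the results somewhere under /seldon-data.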

The main thing is that you need to create the files in /seldon-data, where seldon-control can see them. If you are using hostPath with your local folder /seldon-data, you can run commands like the ones below, which I tested:

Create a folder /seldon-data/ml100k. Create attrs.json in /seldon-data/ml100k with the values described in the docs. Then, assuming you have downloaded and unzipped the raw data and run iconv on it as described in the docs:

cat <(echo 'id,title,release,url') <(cat ml-100k/u.item.utf8 | awk -F '|' '{printf("%d,\"%s\",\"%s\",\"%s\"\n",$1,$2,$3,$5)}') > /seldon-data/ml100k/items.csv
cat <(echo "id") <(cat ml-100k/u.user | cut -d'|' -f1) > /seldon-data/ml100k/users.csv
cat <(echo "user_id,item_id,value,time") <(cat ml-100k/ua.base | cut -f1,2,3,4 --output-delimiter=,) > /seldon-data/ml100k/actions.csv

and:

seldon-cli attr --action apply --client-name ml100k --json /seldon-data/ml100k/attrs.json
seldon-cli import --action items --client-name ml100k --file-path /seldon-data/ml100k/items.csv
seldon-cli import --action users --client-name ml100k --file-path /seldon-data/ml100k/users.csv
seldon-cli import --action actions --client-name ml100k --file-path /seldon-data/ml100k/actions.csv

You can then run MF and set up the algs as described.

Aug 08 '16 08:08 ukclivecox