copyFromLocal not implemented?

Open interskh opened this issue 10 years ago • 50 comments

I notice copyFromLocal exists in commandlineparser.py but not in client.py. Is it not implemented yet?

Thanks!

interskh avatar Dec 04 '13 00:12 interskh

Yes, that shouldn't be there. Put was commented out, but I forgot copyFromLocal. I'll submit a patch this week, because this is confusing.

wouterdebie avatar Dec 04 '13 22:12 wouterdebie

Thanks.

interskh avatar Dec 05 '13 07:12 interskh

So, this means that copyFromLocal/put is not implemented? Do we use 'hadoop fs -copyFromLocal' instead?

I note that in the spotify blog [http://labs.spotify.com/2013/05/07/snakebite/], it states: there are plans to also implement actions that also involve interaction with the DataNode

In addition, the documentation [http://spotify.github.io/snakebite/] has a 'To Do' section where it states: put [paths] dst copy sources from local file system to destination

What is the timeline for this 'put'/'copyFromLocal' feature?

BlondAngel avatar Dec 18 '13 18:12 BlondAngel

Sorry for the late reply, but we haven't prioritized this. Would be nice to have (just like full YARN support).

wouterdebie avatar Mar 04 '14 11:03 wouterdebie

+1. I want to use snakebite to replace several slow steps in our deployment automation; unfortunately we use copyFromLocal a lot. So this is definitely a must-have feature for a lot of people.

Thanks for the good work.

sodul avatar Jun 05 '14 18:06 sodul

seconding sodul's comment

carolinux avatar Sep 17 '14 08:09 carolinux

Thanks for an excellent and straightforward client -- just throwing in a makeshift vote for the ability to use put/copyFromLocal to speed up a few data ingress scripts.

briancline avatar Sep 29 '14 04:09 briancline

Great work, keep it up. Would also like to see put/copyFromLocal in the future.

ptrxyz avatar Dec 13 '14 14:12 ptrxyz

Still no word on this? If communicating through protobuf makes it hard to implement features that require direct access to datanodes (such as the put and append operations), it would be wise to have a look at WebHDFS. Using WebHDFS in Snakebite, instead of Protobuf would make it trivial to implement copyFromLocal/put, and other file write operations.
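For reference, here is a rough sketch of what a WebHDFS-based put could look like, using only the Python standard library. It follows the two-step `op=CREATE` flow of the WebHDFS REST API: the NameNode answers the first PUT with a redirect, and the file bytes then go to the DataNode URL given in the `Location` header. The function names, host, port, and user are placeholders for illustration, not anything that exists in snakebite, and error handling is minimal.

```python
import http.client
import os
from urllib.parse import quote, urlencode, urlsplit


def webhdfs_create_path(hdfs_path, user=None, overwrite=False):
    """Build the request path and query string for a WebHDFS op=CREATE call."""
    params = {"op": "CREATE", "overwrite": "true" if overwrite else "false"}
    if user:
        params["user.name"] = user
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + urlencode(params)


def webhdfs_put(host, port, local_path, hdfs_path, user=None, overwrite=False):
    """Upload local_path to hdfs_path through WebHDFS (hypothetical helper)."""
    # Step 1: ask the NameNode where to write; no file data is sent yet.
    conn = http.client.HTTPConnection(host, port)
    conn.request("PUT", webhdfs_create_path(hdfs_path, user, overwrite))
    resp = conn.getresponse()
    resp.read()
    conn.close()
    location = resp.getheader("Location")
    if resp.status != 307 or not location:
        raise RuntimeError("expected a 307 redirect, got %d" % resp.status)

    # Step 2: send the file contents to the DataNode named in the redirect.
    target = urlsplit(location)
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": str(os.path.getsize(local_path)),
    }
    with open(local_path, "rb") as f:
        data_conn = http.client.HTTPConnection(target.hostname, target.port)
        data_conn.request("PUT", target.path + "?" + target.query,
                          body=f, headers=headers)
        data_resp = data_conn.getresponse()
        data_resp.read()
        data_conn.close()
    if data_resp.status != 201:
        raise RuntimeError("upload failed with HTTP %d" % data_resp.status)
```

This assumes WebHDFS is enabled on the cluster and that the client can reach the DataNodes directly over HTTP, which is the extra moving part compared with talking to the NameNode alone.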

I think it's a shame that such a promising project gets stuck on something that is really needed, like copyFromLocal.

DonDebonair avatar Jan 31 '15 11:01 DonDebonair

@ravwojdyla and I have been discussing this and currently there doesn't seem to be much time to implement this, so it's very hard to give any ETA on this feature. I don't think we want to add WebHDFS support, since that sort of defeats the purpose of snakebite and requires additional infrastructure.

wouterdebie avatar Jan 31 '15 11:01 wouterdebie

I agree with @wouterdebie: WebHDFS wouldn't have the speed of snakebite. I'm working on implementing put in RPC at the moment; if anyone has any thoughts or progress they can share to accelerate it, it would be great to work together.

simonellistonball avatar Jan 31 '15 11:01 simonellistonball

Where can I find the RPC documentation?

DonDebonair avatar Jan 31 '15 11:01 DonDebonair

Has there been progress toward implementing put? I was going to take a crack at it for a project I'm working on, and was considering contributing it upstream, but don't want to duplicate effort if someone already has a handle on this.

zachmullen avatar Mar 04 '15 23:03 zachmullen

I'm pretty sure it has not, maybe @ravwojdyla can confirm.

Tarrasch avatar Mar 05 '15 21:03 Tarrasch

I have started working on this feature some time ago - can probably upload what I have right now (it's far from complete). That said if anyone feels like working on this problem please create issues you plan to work on, and if you need help - please ping me/us. Thanks!

ravwojdyla avatar Mar 09 '15 12:03 ravwojdyla

@ravwojdyla I'd love to help, I started to do it but the problem that ended up blocking me was that I couldn't find documentation on what RPCs I should even call to do something like an append, and the ones I tried didn't return what they claimed in the auto-generated protobuf spec... I might be able to help with this effort if you could point me to good documentation about the protocol, but I was unable to find any in sufficient detail.

zachmullen avatar Mar 09 '15 18:03 zachmullen

The problem with Hadoop is that protocols are pretty badly documented. When I started snakebite, I spent a lot of time reading Hadoop code and tcpdumping to figure out what was going on...

wouterdebie avatar Mar 09 '15 18:03 wouterdebie

Is there any ETA on when copyFromLocal/put support will be available?

aman572 avatar May 01 '15 10:05 aman572

+1

tothandor avatar Aug 06 '15 11:08 tothandor

+1

ligao101 avatar Aug 31 '15 22:08 ligao101

+1 :)

mbultrow avatar Sep 18 '15 09:09 mbultrow

In the meantime:

import subprocess

subprocess.check_call(['hdfs', 'dfs', '-put', '/path/to/src', 'path/to/dst'], shell=False)
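A slightly more reusable take on the same workaround, as a sketch only: it still shells out to the `hdfs` CLI, so it needs a working Hadoop client on the machine. The helper names are made up for illustration; `-f` is the `-put` flag that overwrites an existing destination.

```python
import subprocess


def build_put_command(src, dst, overwrite=False, hdfs_bin="hdfs"):
    """Return the argv list for `hdfs dfs -put` (helper name is made up)."""
    cmd = [hdfs_bin, "dfs", "-put"]
    if overwrite:
        cmd.append("-f")  # -f overwrites dst if it already exists
    cmd.extend([src, dst])
    return cmd


def hdfs_put(src, dst, overwrite=False):
    """Copy a local file to HDFS; raises CalledProcessError on failure."""
    subprocess.check_call(build_put_command(src, dst, overwrite))
```

Passing the command as a list (rather than a single string with `shell=True`) keeps paths with spaces or shell metacharacters from being reinterpreted.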

ctimmins avatar Oct 09 '15 22:10 ctimmins

+1

jtaryma avatar Oct 14 '15 12:10 jtaryma

@ravwojdyla - is there a separate branch for that issue? Did you have a chance to push what you had already done? Thanks!

jwszolek avatar Oct 28 '15 12:10 jwszolek

It looks like a go library similar to snakebite has started making progress on writing to hdfs: https://github.com/colinmarc/hdfs/pull/12

aeroevan avatar Nov 07 '15 12:11 aeroevan

+1

Condla avatar Dec 22 '15 08:12 Condla

An alternative that is relatively snappy is to use HttpFS, a service that provides an HTTP interface to HDFS. We actually ended up writing our own REST API in Groovy to access HDFS and the HBase shell (which has no API).

https://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/index.html

sodul avatar Dec 22 '15 08:12 sodul

+1

tworec avatar Jan 07 '16 12:01 tworec

:+1:

crorella avatar Feb 17 '16 22:02 crorella

Because it was never implemented.

wouterdebie avatar Feb 17 '16 23:02 wouterdebie