[QUESTION] How to deploy a Pathway Airbyte streaming ETL microservice to Google Cloud Run?

Open vikas-velora opened this issue 1 year ago • 9 comments

Hi,

Is there a way to deploy a Pathway Airbyte streaming ETL microservice to Google Cloud Run? If so, how should we go about it?

Thanks.

vikas-velora avatar May 17 '24 12:05 vikas-velora

Hi @vikas-velora, apologies for the slow turnaround on your question. Our team is verifying whether this is possible.

As a general rule, we advocate deployment from source (in this spirit: https://cloud.google.com/run/docs/deploying-source-code), and will provide the easiest recipe that works in this direction. The intended experience is something like this one with Render: https://pathway.com/developers/user-guide/deployment/render-deploy/.

dxtrous avatar May 21 '24 01:05 dxtrous

Got it, thanks. Will await your response.

vikas-velora avatar May 21 '24 02:05 vikas-velora

Adrian,

One question: if we want to dynamically deploy a service based on user inputs, would it be better to deploy on Google Cloud Functions? Right now, Pathway is configured through a YAML file; we want to parameterize it and take inputs such as an OAuth token and a GitHub repo name from users.
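To make the parameterization idea concrete, here is a minimal sketch of rendering a per-user YAML config from a template before the pipeline starts. Everything in it is an assumption for illustration: the `render_config` helper, the output file name, and the field names in the template (loosely modelled on an Airbyte GitHub source config) are hypothetical and should be checked against the actual connector spec.

```python
import json
from pathlib import Path
from string import Template

# Hypothetical template: the field names below are assumptions, not a
# verified Airbyte connector schema.
CONFIG_TEMPLATE = Template("""\
source:
  docker_image: "airbyte/source-github"
  config:
    credentials:
      personal_access_token: $token
    repositories:
      - $repository
  streams: commits
""")


def render_config(token: str, repository: str, out_dir: Path) -> Path:
    """Write a per-user config file and return its path."""
    # json.dumps produces a double-quoted, escaped string, which YAML also
    # accepts as a scalar, so user input cannot break the file structure.
    text = CONFIG_TEMPLATE.substitute(
        token=json.dumps(token),
        repository=json.dumps(repository),
    )
    path = out_dir / "github-config.yaml"
    path.write_text(text, encoding="utf-8")
    return path
```

The rendered file could then be passed to the pipeline the same way as a hand-written config; whether that runs on Cloud Functions or elsewhere is a separate question from how the config is produced.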

Thanks, Vikas

vikas-velora avatar May 21 '24 03:05 vikas-velora

Hi Vikas,

To give a small technical heads-up: there is a way to dockerize the Airbyte connector code for local tests. Specifically, you would need to install Docker inside your image by adding the following to your Dockerfile:

RUN apt update && apt install docker.io -y

Then run the container with two volumes mounted, as follows:

docker run -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp <your_image_name>

The first volume mounts the host's Docker socket into the container, which is what enables the DinD-style setup, while the second is needed because /tmp is currently used to store the temporary artifacts of the Airbyte connector. Note that this would not be easy to deploy (and I suppose it is impossible on Google Cloud) because it requires giving the container access to the Docker socket.
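For local tests driven from a script, the same command can be assembled in Python. This is only a sketch of the invocation above: the helper names `docker_run_command` and `run_locally` are made up for illustration, and the image name is whatever you built.

```python
import shlex
import subprocess


def docker_run_command(image: str) -> list[str]:
    """Build the `docker run` argv with the two mounts described above."""
    return [
        "docker", "run",
        # Host Docker socket, so the connector can start containers.
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        # /tmp holds the connector's temporary artifacts.
        "-v", "/tmp:/tmp",
        image,
    ]


def run_locally(image: str) -> None:
    """Print the command, then run it; raises if docker exits non-zero."""
    cmd = docker_run_command(image)
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_locally("<your_image_name>")` is equivalent to typing the command by hand, and it carries the same caveat about exposing the Docker socket.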

I also tried using the docker:dind image as a base, but it is unusable in our case: docker:dind is built on Alpine Linux, which Pathway does not support yet. So I think we need to do something different and implement a way to run the Airbyte connector in GCP without depending on Docker. This would need to be done in the Pathway framework itself.

So, to wrap up, the way to go will be to run the Airbyte connector in GCP, a feature that must be added to Pathway. I am currently checking this possibility and will get back to you today or in the next few days.

zxqfd555 avatar May 21 '24 09:05 zxqfd555

Thanks so much @zxqfd555-pw. We tried multiple approaches and were unable to deploy; at least this confirms that the problem was not a gap in our knowledge 😊. Will wait for your update.

vikas-velora avatar May 21 '24 09:05 vikas-velora

Hi Vikas!

A quick heads-up: we can eliminate the need for the DinD technique for Airbyte connectors by introducing a mode where they run as GCP jobs. I am implementing it now, and we can release the corresponding update next week.

zxqfd555 avatar May 24 '24 10:05 zxqfd555

Thanks for the update.

vikas-velora avatar May 24 '24 11:05 vikas-velora

Hi Vikas!

Please note that you can now run Airbyte data extraction jobs on Google Cloud Run, which eliminates the need for DinD. Please refer to the Airbyte connector docs (https://pathway.com/developers/api-docs/pathway-io/airbyte) for details.
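As a rough illustration, a remote run might be wired up along these lines. Treat it as a sketch under assumptions: the parameter names (`config_file_path`, `streams`, `execution_type`, `service_user_credentials_file`, `gcp_region`) and the default region are given as I recall them from the connector docs and should be verified there, and both helper functions are hypothetical.

```python
def remote_airbyte_kwargs(config_path, streams, credentials_file,
                          region="europe-west1"):
    """Collect the arguments for a remote (Google Cloud) connector run.

    The parameter names are assumptions; check them against the docs.
    """
    return {
        "config_file_path": config_path,
        "streams": list(streams),
        # Run the connector in Google Cloud instead of local Docker.
        "execution_type": "remote",
        # JSON key file of a GCP service account (hypothetical path).
        "service_user_credentials_file": credentials_file,
        "gcp_region": region,
    }


def read_remote(config_path, streams, credentials_file):
    """Create the Airbyte input table; needs the pathway package."""
    # Imported lazily so the helper above works without pathway installed.
    import pathway as pw

    kwargs = remote_airbyte_kwargs(config_path, streams, credentials_file)
    return pw.io.airbyte.read(**kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to inspect or log what will be sent before any cloud job is started.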

zxqfd555 avatar Jun 10 '24 13:06 zxqfd555

Thanks Sergey.

vikas-velora avatar Jun 10 '24 17:06 vikas-velora