pupflare
pupflare copied to clipboard
Pass through incoming cookies
This adds support for copying cookies from the incoming request to the underlying browser. This is useful when using a cookie jar copied from another browser for scraping.
Also adds a couple of environment variables for customizing browser behavior: you can set a proxy and disable headless mode.
Don't worry I haven't abandon your PR. It's just that I don't have the time nor the hardware to test your PR right now.
@vuori I tested your PR and it works.
But I still don't understand why you want to delegate the work of storing the cookies to Chromium instead of handling that into your program. I built pupflare as a way for just doing requests through a proxy but by having a browser doing the internal request. I think your PR changes the basic behavior which is just proxying the requests. I know Chromium stores by default the cookies that a website send but the Chromium instance is for me just temporary, it doesn't have any persistent storage compared to the program that you would develop.
An example is that someone could misunderstand the behavior added in this PR and thinks that his program just have to send the cookies once and then it wouldn't have to deal with the cookies anymore.
In order to fix this issue I would appreciate if you could like introduce a way to enable this behavior through an environment variable instead of having it by default.
Apart from that, the added environments variables are a useful new feature.
Thanks for the comments. The background is that I'm dealing with a rather paranoid site, and for whatever reason when I try to have pupflare directly pass through the Cookie header, I'm shown as not logged in. But when the cookies are delegated to Chromium, they work as expected. Sticking a console.dir(headers) into the code shows that the Cookie header is being set to the exact value that my real browser sends.
Have you had success with directly passing through the Cookie header?
In the meantime, should I separate the headless/proxy setting to a separate PR so we can deal with only the Cookie question here?
Sorry again for the late reply.
Have you had success with directly passing through the Cookie header?
Yes I did, actually it was initially designed for https://github.com/JimmyLaurent/torrent-search-api which pass the cookies using the HTTP headers.
In the meantime, should I separate the headless/proxy setting to a separate PR so we can deal with only the Cookie question here?
Sure that's a good idea!
I separated the environment variable bits into a separate PR. I'll try to investigate the cookie problem in the near future. One possibility is that the target site is redirecting and the custom header gets lost at that point, but I didn't check yet.
Thank you for splitting the PR into two PRs!
I'll try to investigate the cookie problem in the near future. One possibility is that the target site is redirecting and the custom header gets lost at that point, but I didn't check yet.
Let me know if you need any help about that. I look forward to your investigations.