automatic URL validation

Open collinbarrett opened this issue 7 years ago • 11 comments

All URLs in the Directory should be periodically validated for the following:

  • [ ] Is valid URI syntax
  • [ ] Availability (i.e., HTTP 200 vs 404)
  • [ ] Prefer direct URL if 301/302 redirected
  • [ ] Prefer HTTPS if available
  • [ ] Malware (using VirusTotal API or similar)
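A minimal sketch of how the first four checks might look, written in Python for illustration only (it is not the project's implementation). The function names `is_valid_uri`, `https_variant`, and `classify` are hypothetical, and the status code and `Location` header are assumed to come from whatever HTTP client performs the request. The malware check is left out because it depends on an external API.

```python
from typing import Optional
from urllib.parse import urlsplit


def is_valid_uri(url: str) -> bool:
    """Syntax check only: require an http(s) scheme and a host."""
    try:
        parts = urlsplit(url)
        return parts.scheme in ("http", "https") and bool(parts.hostname)
    except ValueError:
        # urlsplit raises ValueError on malformed input such as "http://[bad"
        return False


def https_variant(url: str) -> str:
    """Return the same URL with the scheme switched to https.

    Whether the HTTPS version actually responds still has to be
    verified with a real request.
    """
    return urlsplit(url)._replace(scheme="https").geturl()


def classify(status: int, location: Optional[str] = None) -> str:
    """Map an HTTP status (and optional Location header) to an action."""
    if status == 200:
        return "ok"
    if status in (301, 302) and location:
        # Prefer the direct URL that the redirect points to.
        return "redirect:" + location
    if status == 404:
        return "missing"
    return "other"
```

Keeping the checks as pure functions means they can be tested without network access; the actual HTTP requests would sit in a thin layer around them.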

collinbarrett avatar Jan 31 '17 15:01 collinbarrett

Partially being done now by the Agent BOT over in #836

collinbarrett avatar Sep 03 '19 14:09 collinbarrett

blocked by #940

collinbarrett avatar Feb 17 '20 02:02 collinbarrett

Maybe integrate https://github.com/funilrys/PyFunceble . Great tip from @DandelionSprout

collinbarrett avatar Aug 11 '20 15:08 collinbarrett

PyFunceble is slightly difficult to install and keep updated, but it works very well once it has been set up. A simple command like PyFunceble -uf [File containing full URLs] should work pretty well.
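As a rough sketch of that workflow, the snippet below writes a list of URLs to a file and builds the matching command line. Only the `-uf` flag comes from the command above; the helper name `build_pyfunceble_command` and the file name `urls.txt` are hypothetical, and it assumes the `PyFunceble` executable is on the PATH.

```python
from pathlib import Path
from typing import List


def build_pyfunceble_command(urls: List[str], directory: str) -> List[str]:
    """Write one URL per line to a file and return the command to test it."""
    url_file = Path(directory) / "urls.txt"  # hypothetical file name
    url_file.write_text("\n".join(urls) + "\n", encoding="utf-8")
    # "-uf" is the flag quoted in the comment above; nothing else is assumed.
    return ["PyFunceble", "-uf", str(url_file)]
```

The returned list could then be handed to `subprocess.run(command, check=True)` on a machine where PyFunceble is installed.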

DandelionSprout avatar Aug 11 '20 15:08 DandelionSprout

Hi there, maintainer of PyFunceble here! Thanks for the shout-out, @collinbarrett and @DandelionSprout!

I was looking for discussions about PyFunceble and found this issue. How can I help reduce the difficulties of using, installing, and keeping PyFunceble up-to-date?

Have you tested the docker images?

Let me know if I can help reduce complexity around PyFunceble!

Have a nice day/night. Stay safe and healthy. Nissar

funilrys avatar Aug 15 '20 23:08 funilrys

On my end, I had big update problems this spring: I struggled to update existing builds and to get PowerShell and/or Cygwin to recognise the PyFunceble command. My unqualified guess is that this came from my trying multiple methods to install and update PyFunceble on my PC (e.g. PIP and pure Python), which left parallel builds in various folders that conflicted with each other.

However, in early July I decided to delete every single file and folder on my PC that had the word PyFunceble in its name, and after a reinstall with the pure Python method it worked far better.

DandelionSprout avatar Aug 16 '20 03:08 DandelionSprout

Just a notice for you guys :-) Would you be interested in PyFunceble as a web API? 🤔

I'm starting to think about what to prioritize next, and this idea is in my box of ideas. I just need some input on how many people or infrastructures would be interested in it.

funilrys avatar Dec 23 '20 14:12 funilrys

@funilrys YES!!!!! That'd be awesome. It'd be incredibly helpful to keep the URLs that are on FilterLists clean and valid. I had thought about trying to get an instance of PyFunceble running on the FilterLists server, but just haven't had the time to learn how it works. If there was a simple Web API we could call, that'd be much easier to integrate.

collinbarrett avatar Dec 23 '20 15:12 collinbarrett

Hi @collinbarrett @DandelionSprout,

I just released a beta at https://github.com/PyFunceble/web-worker. It's a beta because I want to get as much feedback as possible before releasing it as an official stable version.

Feel free to test, report issues if you find some, or create new discussions if something is not clear or if you have some idea that may be useful to others in the future.

Stay safe and healthy.

funilrys avatar Mar 07 '21 23:03 funilrys

@funilrys nice! would love to test it out. I've been pretty short on time, lately, though, so I'm not sure I'll get around to it real soon.

collinbarrett avatar Mar 08 '21 00:03 collinbarrett

@collinbarrett Take the time you need! Rome was not built in a day, after all. 😉

funilrys avatar Mar 08 '21 10:03 funilrys