ProxyBroker
                                
                        Check if the proxy modifies traffic
A neat feature would be to check whether the proxy is injecting ads/JavaScript (or modifying traffic in any other way).
Yes, this feature is in the TODO. I think the only way to implement this is to compare a checksum. If anyone has other options, I'm ready to discuss =)
> Yes, this feature is in the TODO.
Well I completely missed it, sorry about that.
> I think the only way to implement this is to compare a checksum.
You might also be able to compare the size and number of files returned. Sending data through the proxy and seeing whether it was changed on its way to its destination would be another way to tell if the proxy was messing with the traffic.
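A minimal sketch of the "send data and see if it changed" idea, assuming you control some echo endpoint that returns the request body verbatim (that endpoint, and the names `make_probe` and `payload_intact`, are hypothetical and not part of ProxyBroker). Only the comparison step is shown; the actual request through the proxy is left out.

```python
import hashlib
import secrets
from typing import Tuple


def make_probe(n_bytes: int = 32) -> Tuple[bytes, str]:
    """Return a random ASCII payload and its SHA-256 hex digest."""
    payload = secrets.token_hex(n_bytes).encode("ascii")
    return payload, hashlib.sha256(payload).hexdigest()


def payload_intact(expected_digest: str, echoed: bytes) -> bool:
    """True if the echoed body hashes to the digest of what was sent."""
    return hashlib.sha256(echoed).hexdigest() == expected_digest
```

The payload is random so that a proxy cannot serve a cached or pre-computed answer; any injected or stripped byte changes the digest.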
There's no real way to compare websites 100%, because the HTML will be rendered differently on each request. A checksum will fail very often.
One idea would be to save a list of web elements (the HTML DOM tag names, not their content) and compare the main structure. You would notice changes such as ads or divs created between the direct and the proxied request.
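The structure comparison described above could look roughly like this. It is only a sketch using the standard library's `html.parser`; the names `TagCollector`, `dom_skeleton` and `added_tags` are made up for illustration, and a real page may still differ structurally between two honest requests.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names in document order; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def dom_skeleton(html: str) -> list:
    """Return the list of tag names that make up the page structure."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


def added_tags(direct_html: str, proxied_html: str) -> dict:
    """Count tags that occur more often via the proxy than in the direct response."""
    extra = Counter(dom_skeleton(proxied_html)) - Counter(dom_skeleton(direct_html))
    return dict(extra)
```

Extra `script`, `iframe` or `div` entries in the result of `added_tags` would be the kind of change to look at more closely; changed text content alone produces an empty result.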
@erm3nda or you could just request a static web page.
@erm3nda I partially agree with this comment as checksum doesn't have to be index.html specific. Another method would be to do a checksum on the website's resources; i.e., its main .css file, js files, etc. Whatever the case -- whether it is checksum or specific HTML -- there are only two ways this can be achieved:
- Statistically, by verifying that more than ~10% of proxies return the same DOM/checksum for the same request.
- Request from the local machine that saves either checksum or DOM state to be verified against all proxies. This method, however, exposes the IP of the user using ProxyBroker.
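The statistical option could be sketched like this, assuming the response bodies for one URL have already been collected per proxy. The function name `consensus_check` and the `min_share` parameter are illustrative only; the 10% default just mirrors the rough figure mentioned above.

```python
import hashlib
from collections import Counter


def consensus_check(bodies: dict, min_share: float = 0.1):
    """bodies maps a proxy label to the raw bytes it returned for one URL.

    Returns (reference_digest, suspects). reference_digest is None when no
    single digest is shared by more than min_share of the proxies.
    """
    digests = {proxy: hashlib.sha256(body).hexdigest() for proxy, body in bodies.items()}
    if not digests:
        return None, []
    top_digest, count = Counter(digests.values()).most_common(1)[0]
    if count / len(digests) <= min_share:
        return None, []
    # Proxies whose body differs from the most common one are only suspects,
    # not proven modifiers: dynamic pages can differ for honest reasons.
    suspects = sorted(p for p, d in digests.items() if d != top_digest)
    return top_digest, suspects
```

This avoids a direct request from the user's own IP, at the cost of trusting that the most common response is the unmodified one.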
Another thing to mention is the fact that not all websites are rendered the same depending on the language. Take NYTimes.com, for example:
- nytimes.com
- cn.nytimes.com
Both websites look fairly similar, but it seems that their HTML is rendered differently. Therefore, I believe it would be preferable for the client to send a proper Accept-Language header and verify the Content-Language of the response when checking the page, on top of whatever other mechanism is used to verify the page.
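A rough sketch of the header handling, using only the standard library. `build_request` and `language_matches` are hypothetical helper names, and treating a missing Content-Language header as a mismatch is an assumption made here, not something the server is required to honour (many sites omit the header).

```python
import urllib.request


def build_request(url: str, language: str = "en") -> urllib.request.Request:
    """Build a request that states the language we expect the page in."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


def language_matches(requested: str, content_language) -> bool:
    """True if the Content-Language value lists the requested primary language.

    A missing or empty header gives no evidence either way, so this sketch
    reports it as a mismatch.
    """
    if not content_language:
        return False
    want = requested.split("-")[0].strip().lower()
    served = [tag.split("-")[0].strip().lower() for tag in content_language.split(",")]
    return want in served
```

For example, `language_matches("zh", "en, zh-CN")` is true, while `language_matches("en", "zh-CN")` is false, which would mean the direct and proxied pages should not be compared against each other.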