Questions about Google safebrowsing in China

Open 075KG opened this issue 9 years ago • 7 comments

I am using safebrowsing in a project. However, the server hosting the database and the safebrowsing lists is not accessible in China due to government regulation. I have tried to access the server via public proxy servers, but they are blocked with an HTTP 403 response. Do you know how I can get the database and the safebrowsing lists, and update the database periodically, from within the Chinese internet environment?

075KG avatar Jul 25 '16 12:07 075KG

/CC @alexwoz

dsnet avatar Jul 25 '16 20:07 dsnet

The question has been solved. I specified a Hong Kong IP in Google Cloud Platform.

But there is another question. The sbserver logs show "database loaded is stale" every time I start sbserver, even though it updates automatically every half hour. And the file is always 6.27MB. Here is my server log.

server log:
safebrowsing: 2016/08/16 01:00:20 database.go:108: database loaded is stale
safebrowsing: 2016/08/16 01:00:25 database.go:280: database is now healthy
Starting server at localhost:8080
safebrowsing: 2016/08/16 01:30:25 safebrowser.go:503: background threat list update
safebrowsing: 2016/08/16 02:00:25 safebrowser.go:503: background threat list update
safebrowsing: 2016/08/16 02:30:25 safebrowser.go:503: background threat list update
safebrowsing: 2016/08/16 03:00:25 safebrowser.go:503: background threat list update

075KG avatar Aug 16 '16 07:08 075KG

Glad to hear you're able to connect properly.

The warning you are getting is entirely normal behavior. I regret using the terminology "database", since it is really just a local filter. IIRC, the filter is only valid for about 30 minutes or so and then needs to be refreshed. The warning at database.go:108 indicates that this is the case, while the message at database.go:280 indicates that the filter has now been synced with the SafeBrowsing API servers.

Saving the filter state to disk was intended for a use case where a user starts and shuts down SafeBrowsing fairly often (within a time span shorter than the filter expiration time) and wants the filter state to persist across restarts.

dsnet avatar Aug 16 '16 17:08 dsnet

Here are some server logs I do not understand.

server log:
safebrowsing: 2016/08/18 18:56:15 safebrowser.go:503: background threat list update
safebrowsing: 2016/08/18 18:56:15 safebrowser.go:366: inconsistent database: safebrowsing: threat list is stale
safebrowsing: 2016/08/18 18:56:15 safebrowser.go:366: inconsistent database: safebrowsing: threat list is stale
safebrowsing: 2016/08/18 18:56:15 safebrowser.go:366: inconsistent database: safebrowsing: threat list is stale
...
safebrowsing: 2016/08/18 18:56:20 safebrowser.go:366: inconsistent database: safebrowsing: threat list is stale
safebrowsing: 2016/08/18 18:56:42 cache.go:132: hash 9d57cec427573574b34bbd5bc1c637afa8c1227537820d1fd8d970a5d2d7a47c: unsafe for 1 threat(s)
safebrowsing: 2016/08/18 18:56:56 cache.go:132: hash 16ff03965e5d0acc45e289633f831ccbfb597a8c57e6f389227ca7e804ef7a5b: unsafe for 1 threat(s)
safebrowsing: 2016/08/18 19:18:50 cache.go:125: hash 621677deabc79c4931b6d454516416cc07c577e88869a5fcca8f59d0641953c0: expired PTTL
safebrowsing: 2016/08/18 19:19:13 cache.go:132: hash 5fec9c9efa9e56bd909a7e7f3fe808d15394bbb33bc415b54694dace101530f7: unsafe for 1 threat(s)
safebrowsing: 2016/08/18 19:19:35 safebrowser.go:455: HashLookup failure: safebrowsing: unexpected server response code: 504
safebrowsing: 2016/08/18 19:21:10 cache.go:132: hash 0890de3b9734caa6408969c4c2914ec7266fe1b21698a6c1d38edc07ef3fd19f: unsafe for 1 threat(s)

I have run my program, which sends POST requests to sbserver, for 10 hours, and I got the logs above. I understand that "unsafe for 1 threat(s)" means a threat was detected, and that "unexpected server response code: 504" may mean I am sending requests too frequently. But what does "inconsistent database: safebrowsing: threat list is stale" mean? Does it mean that I cannot send a request during the 5s while sbserver is updating? Secondly, what does "expired PTTL" mean? Does it return an unsafe response? Thirdly, will sbserver return "{}" if it finds an unknown URL?

Thx

075KG avatar Aug 19 '16 09:08 075KG

It seems that there is a slight race in the database update process, since the update period is exactly equal to the period after which we consider the database to be stale. Instead, we need to change it so that the update period is slightly shorter than the staleness period. A stale database is a transient error. Just retry the Lookup in a few seconds (it seems to succeed around 20s later according to the log).

All of the messages from cache.go are entirely normal and are due to a log.Print that has been left there since development of the package. They essentially mean that cache eviction is happening as you would expect in a cache. We should probably remove those print statements since they are very normal.

The 504 error is not fatal anyway. Just retry the request later. There's a TODO in the code to do batching for Lookup requests, which is something we will hopefully implement soon. That should drastically reduce the appearance of 504 errors assuming you are already sending in batches of URLs.
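For callers who are not yet sending URLs in batches, a rough sketch of building one request body for several URLs follows. The JSON shape (`threatInfo` containing `threatEntries` with `url` fields) is an assumption based on the Safe Browsing v4 `threatMatches:find` request format and should be checked against the sbserver documentation. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below mirror the assumed request shape; they are not
// taken from the safebrowsing package.
type threatEntry struct {
	URL string `json:"url"`
}

type threatInfo struct {
	ThreatEntries []threatEntry `json:"threatEntries"`
}

type findRequest struct {
	ThreatInfo threatInfo `json:"threatInfo"`
}

// batchBody encodes all urls into a single request body, so that one
// POST can check many URLs instead of one POST per URL.
func batchBody(urls []string) ([]byte, error) {
	entries := make([]threatEntry, 0, len(urls))
	for _, u := range urls {
		entries = append(entries, threatEntry{URL: u})
	}
	return json.Marshal(findRequest{ThreatInfo: threatInfo{ThreatEntries: entries}})
}

func main() {
	body, err := batchBody([]string{"http://example.com/a", "http://example.org/b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Sending fewer, larger requests reduces the number of round trips that can time out, which is the motivation given above for batching.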

Action items from your report:

Hopefully, I or @alexwoz can get to these in the near future. I don't actually work on this project. My primary responsibility is actually towards the Go project itself.

dsnet avatar Aug 19 '16 22:08 dsnet

@075KG Some of the recent pull requests should have fixed some of these issues.

dsnet avatar Aug 29 '16 22:08 dsnet

Hi @dsnet ,

Is this action item already done?

Still proceed with the lookup process even if the database is stale. Partial results are better than no results.

thanks

smoya avatar Nov 04 '16 12:11 smoya