`incomplete_visits` table does not contain `browser_id`

Open birdsarah opened this issue 4 years ago • 4 comments

I was surprised to see that the incomplete_visits table doesn't contain browser_id.

If this is unexpected, let me know.

If it is expected, can we add browser_id as we do for all other tables? While the chance of a collision is small, presumably we still want to capture it. And if we don't, should we get rid of it everywhere?

In my case, I ran 700 very small but different crawls and put them all into one data directory so that I could read all the data in at once, so the browser_id was meaningful for me.
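A quick way to see which tables lack the column is to ask the database itself. The following is a minimal sketch, assuming the crawl data is stored in SQLite; the helper name `tables_missing_column` and the two demo table schemas are hypothetical simplifications made up for illustration, not OpenWPM's actual schema.

```python
import sqlite3


def tables_missing_column(conn, column):
    """Return the names of user tables in `conn` that have no column named `column`."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    missing = []
    for table in tables:
        # Quote the identifier so unusual table names do not break the PRAGMA.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = {row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")}
        if column not in columns:
            missing.append(table)
    return missing


# Demo on an in-memory database with hypothetical, simplified schemas.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE site_visits (visit_id INTEGER, browser_id INTEGER, site_url TEXT);
    CREATE TABLE incomplete_visits (visit_id INTEGER);
    """
)
result = tables_missing_column(conn, "browser_id")
print(result)
```

To run this against real crawl output, replace `":memory:"` with the path to the SQLite file and drop the `executescript` call.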

birdsarah avatar May 30 '20 07:05 birdsarah

@vringar @birdsarah I would love to work on this issue. Please let me know if I may...

ankushduacodes avatar Nov 12 '20 18:11 ankushduacodes

I think we need to close this issue, as our current design makes it impossible to reliably capture the browser_id (which was previously called crawl_id). We can capture the task_id, but I don't know if that is useful. It would help differentiate between crawls. The radical option is to flip this issue on its head and talk about removing browser_id from everything but site_visits.

vringar avatar Nov 13 '20 14:11 vringar

@vringar I was just looking through the Road to 1.0 to-do list, which is where I found this issue. Is it okay if I pick another issue from that list? I will also keep an eye on the discussion in this thread.

ankushduacodes avatar Nov 13 '20 14:11 ankushduacodes

Thanks for trying to find issues on your own, but I don't think that the Road to 1.0 is a good source of issues to work on right now. Instead, please have a look at the issues we consider to be high-priority.

vringar avatar Nov 13 '20 15:11 vringar