Bookie
strip common URL tracking params
When I bookmark a link, I don't really want the "&utm_medium=email" URL params to always be there. Would be nice to have them auto-removed so I don't always have to remember to do it before saving.
Yea, I've definitely pondered this. I've stayed away so far because it involves parsing and altering the URL. Possibly the best way is to use a query param parsing library to remove just those keys, but then I've got to completely take apart and rebuild the URL. The other option is to run a well-tested set of regexes on the URL as part of saving it, but then it would mess up the URL hash and would need to be implemented on both the client and the server side so that they end up with the same URL. Only if the URLs match can the hashes match, and only then can it tell you that you've stored this bookmark before, etc.
Seems there is a URL parsing lib out there where you can get the params as an array, remove the ones you don't want, then join them back together into a string and use that. I see what you're saying with regard to messing up hashes on existing bookmarks; I guess my thought was just doing it at save time. Cleaning up existing bookmarks could be another task.
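For illustration, here is a rough sketch of that parse, filter, and rejoin approach using only Python's standard `urllib.parse`. The function name `strip_tracking_params` and the `TRACKING_PREFIXES` list are made up for this example and are not part of Bookie; a real blocklist would need more thought.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical blocklist for this sketch; not taken from Bookie.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url):
    """Return url with query params whose keys start with a tracking prefix removed."""
    parts = urlsplit(url)
    # parse_qsl keeps the original parameter order; keep_blank_values
    # preserves params such as "?flag=" that would otherwise be dropped.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with the filtered query; the other parts are untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking_params("http://example.com/a?id=1&utm_medium=email#top"))
```

Note that `urlencode` re-encodes the remaining params, so the rebuilt string may not be byte-for-byte identical to the original query (for example, a space encoded as `%20` comes back as `+`). That is exactly the kind of difference that would have to match between the JS and Python sides.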
Yea, it's possible but requires being very careful and implementing the same parsing/building in JS and Python so that the front/back ends match up.
FWIW: We labored over this and things like it for months at Delicious. If I recall, we eventually decided to never alter or destroy user-supplied data, even if the source was a semi-automatic bookmarklet.
Instead, we added a new "canonical URL" column to the tables. It was never visible to end users, but that was where we did things like remove common tracking params, the "www." prefixes, etc. Then we used that for indexing for search and for collating the "# of other users who bookmarked this" views.
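A minimal sketch of what that might look like in Python, assuming the stored URL is left exactly as the user supplied it and a separate canonical form is derived only for matching. The names `canonical_url` and `canonical_key`, the `utm_` prefix list, and the choice of SHA-256 are all assumptions for illustration; this is not how Delicious or Bookie actually does it.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical blocklist for this sketch.
TRACKING_PREFIXES = ("utm_",)


def canonical_url(url):
    """Derive a canonical form for matching; the original url is not modified."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Drop a leading "www." so both host spellings collate together.
    if host.startswith("www."):
        host = host[len("www."):]
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), parts.fragment))


def canonical_key(url):
    """Hash of the canonical form, usable for "already bookmarked" lookups."""
    return hashlib.sha256(canonical_url(url).encode("utf-8")).hexdigest()


original = "http://www.Example.com/a?utm_source=x&id=1"
row = {"url": original, "canonical_url": canonical_url(original)}
print(row)
```

With this split, the user always sees the URL they saved, and only the hidden canonical value (or its hash) is used for search indexing and duplicate detection.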