CustomBangSearch
Import DuckDuckGo Bangs / Presets
Would it be possible for the extension to have a bang preset feature that allows us to easily select, for instance, DuckDuckGo's bangs without having to add them all manually?
Are there any problems with using these files / links to allow updating the presets by pulling them from DuckDuckGo directly? https://duckduckgo.com/bv2.js https://duckduckgo.com/bang.v260.js
If there aren't any problems with this, I'd be happy to work on a PR for this.
Uh, this looks really nice and helpful! I did a quick search, but couldn't find anything about the license or terms of use for these files. I could imagine that one is not allowed to use these files/mappings...
Where did you find the links to those files?
I found this link while searching for a list of !bangs. After finding that, I went to the DuckDuckGo website to see where it was getting the file from and found the additional file which references the version number.
This Firefox extension (Bangs for Google) downloads this file: https://duckduckgo.com/bang.js, which looks identical, and uses it, so it appears to be okay.
Should I proceed with adding this?
Those could be transformed (and filtered for relevance) with this jq query. However, I can import the result, but not save it.
The first part filters the array for relevance to get fewer entries and constructs a new array; the second part transforms it into the format that can be imported:
jq '[.[] | select(.r > 200) ]|reduce .[] as $i ({}; .[$i.t] = {"id": $i.t,"url": $i.u,"pos": $i.r}) ' > result.json
edit: sorted version:
cat bang.js | jq '[[.[] | select(.r > 100) ]| sort_by(.r) | reverse | to_entries[] | .value.r=.key | .value ]|reduce .[] as $i ({}; .[$i.t] = {"id": $i.t,"url": $i.u,"pos": $i.r}) ' |sed 's/{{{s}}}/%s/g' > result.json
(Select the entries with the highest r, sort by r with the highest first, decompose with index, replace r with the index of the element, transform, replace the parameter string.) It still doesn't work. Can I see the error somewhere?
@thigg Unfortunately the error is Quota exceeded: ItemBytes. This is because the result is too big to save as a single value in the sync storage (maximum item size).
If you reduce that to just the top 30 items it saves fine (you'll have to experiment to find how many you can get).
It's possible to solve this by breaking up the bangs and saving them as several objects, but I'm unsure what exactly would be the best approach.
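Roughly, the multi-item approach could look something like this (just a sketch using the WebExtensions storage.sync API, untested, and not what the extension currently does):

```js
// Sketch: split the stringified bangs across several sync items so that no
// single item exceeds the per-item quota (8192 bytes). Sliced by characters,
// so leave plenty of headroom for multi-byte characters and the key names.
const CHUNK_CHARS = 6000;

async function saveBangsChunked(bangs) {
  const json = JSON.stringify(bangs);
  const items = { chunkCount: Math.ceil(json.length / CHUNK_CHARS) };
  for (let i = 0; i < items.chunkCount; i += 1) {
    items[`bangsChunk${i}`] = json.slice(i * CHUNK_CHARS, (i + 1) * CHUNK_CHARS);
  }
  await browser.storage.sync.set(items);
}

async function loadBangsChunked() {
  const { chunkCount } = await browser.storage.sync.get('chunkCount');
  const keys = Array.from({ length: chunkCount }, (_, i) => `bangsChunk${i}`);
  const stored = await browser.storage.sync.get(keys);
  return JSON.parse(keys.map((k) => stored[k]).join(''));
}
```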
I'm happy to include these bangs as some sort of default in custombangsearch, perhaps the top 25?
Thanks for the detailed answer @psidex
I am actually sure I only need 10 or so... so I could certainly solve my issue by just adding them manually.
But it would be really nice if a user could just import all or most DDG bangs. I originally installed the extension because I thought it was stupid that my browser goes to DDG first, but I liked the bangs so much that I didn't want to configure them in Firefox directly.
The sync quota is 102kb if I read that correctly. If I do curl https://duckduckgo.com/bang.js | gzip --best >> out I get a resulting 450kb. That should be around 2500 entries. Would it be easy to compress the data in the sync storage first?
Another option would be to add a checkbox which enables DDG bangs and downloads and converts them to some other storage.
Maybe we could separate preset-based bangs from custom bangs and store the preset ones in just local or session storage to ignore this?
Another approach would be to let the user choose which bangs get synced, with a checkbox or something similar (and limit that to an approximate number of entries?).
> The sync quota is 102kb if I read that correctly. If I do curl https://duckduckgo.com/bang.js | gzip --best >> out I get a resulting 450kb. That should be around 2500 entries. Would it be easy to compress the data in the sync storage first?
102kb is the cumulative storage quota; cbs uses a single "item" within the sync storage, so we can currently only store a max of 8kb, which is calculated after the bangs object is passed through JSON.stringify. That's what I meant by breaking the settings up into multiple objects: probably do-able, but right now we just use a single item.
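As far as I understand the docs, the per-item limit is measured on the key plus the JSON stringification of the value, so a rough pre-flight check would look something like this (sketch only):

```js
// Rough sketch: would a single key/value pair fit in one sync storage item?
// QUOTA_BYTES_PER_ITEM is 8192 in both Firefox and Chrome at the time of writing.
function fitsInOneSyncItem(key, value) {
  const itemSize = key.length + JSON.stringify(value).length;
  return itemSize <= browser.storage.sync.QUOTA_BYTES_PER_ITEM;
}

// e.g. fitsInOneSyncItem('bangs', allBangs) becomes false once the bangs object
// stringifies to more than ~8kb, which is what triggers the ItemBytes error.
```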
> Maybe we could separate preset-based bangs from custom bangs and store the preset ones in just local or session storage to ignore this?
Yeah I like this idea, having the DDG bangs locally and then maybe just having a toggle, "Enable DDG bangs" or something like that. Then the toggle could be synced as well, so no loss of sync either!
> Yeah I like this idea, having the DDG bangs locally and then maybe just having a toggle, "Enable DDG bangs" or something like that. Then the toggle could be synced as well, so no loss of sync either!
This sounds very sensible. Looking forward to it :)
In the scenario where a DDG bang conflicts with a custom bang, I imagine we would use the custom one and discard the DDG one?
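Something like this is what I have in mind (a hypothetical sketch with made-up names, not the extension's actual code): the DDG presets live in storage.local, only the toggle and the user's own bangs are synced, and on conflict the custom bang wins.

```js
// Hypothetical sketch of the "local presets + synced toggle" idea.
async function getActiveBangs() {
  // Only the small, user-specific data goes through sync storage.
  const { ddgBangsEnabled, customBangs } = await browser.storage.sync.get({
    ddgBangsEnabled: false,
    customBangs: {},
  });

  // The large DDG preset list stays in local storage, so it never hits the sync quotas.
  const { ddgBangs } = await browser.storage.local.get({ ddgBangs: {} });

  // Spreading customBangs last means a custom bang overrides a DDG bang with
  // the same trigger, i.e. the custom one wins on conflict.
  return ddgBangsEnabled ? { ...ddgBangs, ...customBangs } : customBangs;
}
```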
Hey all,
Just to leave an update for people: I'm currently working on a rewrite of custombangsearch, so I have decided to put this feature request on hold for now.
The rewrite probably won't change the user experience much (although it should address some significant bugs), but will greatly improve the development experience for myself and any future contributors, and will also make it easier to maintain in the future.
I have no timeline for this; it might take weeks, months, or longer. It will most likely be released as version 0.10.0, and in the meantime, if any smaller bugs are found or people request new search engines (that are easy to add), I'll happily support and release under 0.9.x.
I am planning to add, among other things, support for some level of duckduckgo bang importing, which is why I'm leaving this comment on this thread.
Thanks for being patient.
I've just pushed the code ready for v0.10.0 to master 🎉 I will create a release and upload to ff / chrome over the next day or so.
Here's the work I've done with regard to DDG bangs; hopefully this is what people wanted. I might make a tighter integration in the future, but for now I think this should work fine for most people :)
Nice work, thank you! For fun, I downloaded the 220 json file and compressed it with gzip and lz4 in the shell, and got about half the size of what you got in storage. I wonder why that is.
| name | size (bytes) |
|---|---|
| gzip | 3451 |
| lz4 | 5859 |
| brotli | 3047 |
raw results:
$ ls -l ddg-top-220.json*
25339 ddg-top-220.json
3047 ddg-top-220.json.br
3451 ddg-top-220.json.gz
5859 ddg-top-220.json.lz4
Oh, looking at this... is the storage UTF-16 encoded, so it doesn't compress as well as binary would?
Thanks!
Yeah, so I'm using the lz-string library; there's a bit of an explanation on that page as to exactly what it does. I basically did what you have done, but with the different compression methods, and I get these results for the 220 file:
raw : 16086
compress : 7711
compressToUTF16 : 8100
compressToBase64 : 7032
compressToEncodedURIComponent : 7030
compressToUint8Array : 18402
As you can see, the UTF-16 version is only slightly bigger than compress (which is usually the best, though maybe not for this particular file), and I couldn't find any documentation on how browsers handle invalid strings across sync storage, so I went with compressToUTF16 (I imagine there's a lot of shuffling of the data when it syncs, so I probably don't want to risk undefined behavior).
The number is 8100 here instead of the 8140 on the ddg page because the extension also stores an options object that controls the enabled/disabled search engines; this isn't exported when you export your settings and isn't importable, so it is left out of those ddg files.
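Saving and loading the compressed string then looks roughly like this (a simplified sketch, not the exact code in the extension):

```js
import lz from 'lz-string';

// Compress the bangs object to a UTF-16-safe string before it goes into sync
// storage, and decompress it again on the way out.
async function saveBangs(bangs) {
  const compressed = lz.compressToUTF16(JSON.stringify(bangs));
  await browser.storage.sync.set({ bangs: compressed });
}

async function loadBangs() {
  const { bangs } = await browser.storage.sync.get('bangs');
  return bangs ? JSON.parse(lz.decompressFromUTF16(bangs)) : {};
}
```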
If you're interested, this is the script I used to get those numbers:
```js
import fs from 'node:fs';
import lz from 'lz-string';

// Parse and re-stringify the exported bangs file so that formatting/whitespace
// doesn't affect the size comparison.
const d = JSON.stringify(JSON.parse(fs.readFileSync('./ddg-top-220.json')));

// Byte length of a string when encoded as UTF-8.
const len = (s) => (new TextEncoder().encode(s)).length;

console.log(' raw : ' + len(d));
console.log(' compress : ' + len(lz.compress(d)));
console.log(' compressToUTF16 : ' + len(lz.compressToUTF16(d)));
console.log(' compressToBase64 : ' + len(lz.compressToBase64(d)));
console.log(' compressToEncodedURIComponent : ' + len(lz.compressToEncodedURIComponent(d)));
console.log(' compressToUint8Array : ' + len(lz.compressToUint8Array(d)));
```
The len function came from here - https://stackoverflow.com/a/34332105/6396652