obsidian-stack-overflow
obsidian-stack-overflow copied to clipboard
Refactor extractAnswerId to make use of the URL parser?
Just had a brief skim through the code and noticed that you're treating the url as a string in extractAnswerId, then doing string manipulation to parse out the various bits.
Not sure if you're aware of the URL API, but it might be helpful to simply/make your extraction more robust:
- https://developer.mozilla.org/en-US/docs/Web/API/URL
We could also make the checks a bit more robust using string.match() or similar:
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match
While I haven't directly tested this, the code would likely end up looking something like:
+ const shortUrlRegex = /\/a\/\d+\/(\d+)/;
extractAnswerId(url: string) {
try {
if (url.length === 0) {
new Notice("[Stack Overflow Answers] - url is empty");
return "";
}
+ const parsedUrl = new URL(url);
+
- if (url.includes("#"))
// in form of https://stackoverflow.com/questions/14122919/how-to-hide-show-objects-using-three-js-release-54/14123978#14123978
- return url.split("#").pop();
+ if (parsedUrl.hash) return parsedUrl.hash.slice(1);
+
- const urlPopped = url.split("/"); // in form of https://stackoverflow.com/a/32232028/3952024
- urlPopped.pop();
- return urlPopped.pop(); // second to last
+ // in form of https://stackoverflow.com/a/32232028/3952024
+ const shortUrlMatch = parsedUrl.pathname.match(shortUrlRegex);
+ if (shortUrlMatch) return shortUrlMatch[1];
+
+ // None of our URL extractions matched
+ new Notice("[Stack Overflow Answers] - Could not extract answer id from url");
+ return "";
} catch (err) {
new Notice("[Stack Overflow Answers] - Could not extract answer id from url");
return "";
}
}
I'm not 100% sure, but this may also end up being a fix for https://github.com/bramses/obsidian-stack-overflow/issues/3, as I believe the current code would return pine-script-optimize-with-lower-timeframe-backtest from the URL due to assuming that if there is no hash, that it will be in the /a/123/456 format.
Great ideas! I really like the idea of string.match() for more URL robustness. What types of urls would be captured that aren't currently?
The plugin currently captures answers in long form (https://stackoverflow.com/questions/61907922/pine-script-optimize-with-lower-timeframe-backtest/72722933#72722933) (the answer hash is at the end) and short form (https://stackoverflow.com/a/72722933/3952024)
I think entire questions may be out of scope, but open to suggestions!
I think entire questions is the main other URL i've run into myself so far. I definitely understand if processing/extracting the questions is out of scope, but at the very least detecting those URLs and giving a user friendlier 'not going to work' sort of error would improve the UX I feel.