regex sub-selection
I often have to do a bunch of regexes to get the text I actually need. If I could put it in my query, that'd be even better.
Here is an example:
I have a lil query that grabs some data from the Google Play store. I do my pre-processing of input via JS template strings in the query, and I'd like to do my post-processing in the query itself:
const details = (id, country='US', lang='en') => graphql(schema, `{
page(url: "https://play.google.com/store/apps/details?id=${id}&hl=${lang}_${country}"){
title: text(selector: "[itemprop='name']")
icon: attr(selector: "[itemprop='image']", name: "src")
developerName: text(selector: "a[href^='/store/apps/dev']")
developerUrl: attr(selector: "a[href^='/store/apps/dev']", name: "href")
developerId: attr(selector: "a[href^='/store/apps/dev']", name: "href", search: "/store/apps/dev[?]id=(.+)")
}
}`)
In this example, I am pulling developerId from the same place I get developerUrl, extracting the id from the regex search. I'm not quite sure how to handle multiple matches, but it would be pretty useful, even if it just returned the 1st match:
{
"data": {
"page": {
"title": "Hello Neighbor",
"icon": "https://lh3.googleusercontent.com/r1wx-kmI9I_zxv8UIF_0_YvmhoLOx25mjT23GCO4bse6H-pgqfjZ5Tvz3HRJ0i2HdEoQ=s100",
"developerName": "tinyBuild",
"developerUrl": "/store/apps/dev?id=4988311280735374056",
"developerId": "4988311280735374056"
}
}
}
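A minimal sketch of the "just return the 1st match" behaviour, in plain JavaScript. The helper name firstMatch is hypothetical and not part of the library; it only illustrates what a search argument could do with the string returned by attr or text. Note that the literal ? in the URL is escaped in the pattern, since a bare ? is a regex quantifier.

```javascript
// Hypothetical helper: illustrates the proposed `search` behaviour, not an existing API.
function firstMatch(value, pattern) {
  if (value == null) return null;
  const m = new RegExp(pattern).exec(value);
  if (!m) return null;
  // Prefer the first capture group; fall back to the whole match when no group was captured.
  return m[1] !== undefined ? m[1] : m[0];
}

const href = '/store/apps/dev?id=4988311280735374056';
const developerId = firstMatch(href, '/store/apps/dev\\?id=(.+)');
console.log(developerId);
```

Returning null on no match keeps the field a nullable String, so the existing signature would not have to change.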
Is there interest in this? Should I make a PR?
Hey, sorry I'm just seeing this (been busy with a lot of non open source stuff ha) - this sounds like a really cool idea! You could have an argument called match (& possibly also test?) which returns the matched result. Probably makes sense to keep it close to the JS regex method names. If you're still keen on trying this out, I'd definitely accept a PR :)
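For reference, this is roughly how the two JS regex methods mentioned here behave, which is what the argument names would mirror. The variable names are only for illustration.

```javascript
const developerUrl = '/store/apps/dev?id=4988311280735374056';
const idPattern = /dev\?id=(\d+)/;

// String.prototype.match without the g flag returns [whole match, ...capture groups], or null.
const matched = developerUrl.match(idPattern);
const matchedId = matched ? matched[1] : null;

// RegExp.prototype.test only reports whether the pattern matches at all.
const hasId = idPattern.test(developerUrl);

console.log(matchedId, hasId);
```

So match naturally yields a String (or a list of them), while test yields a Boolean.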
You could have an argument called match
You mean like rename search above? Sounds good.
(& possibly also test?)
Like as a subfield? Not sure how to make the String types return Boolean and still be compatible. I am also really trying to think about how to do multiple matches, which I think would be important for a lot of use-cases. Any ideas? It's sort of the same problem (don't want to change the signature from String to [String].)
Maybe I should start with just search and single returns per parent (as param to text, attr, etc.)
Or maybe I should make a new kind of parent element, like search or match, that returns [String] and another, maybe called test, that returns Boolean.
Maybe both use-cases could be met (sort of) by something that works like queryAll/query but runs a regex match on the matched HTML, something like this:
page(url: "https://play.google.com/store/apps/details?id=com.tinybuildgames.helloneighbor&hl=en_US"){
developer: regex(selector: "a[href^='/store/apps/dev']") {
id: match(regex: "/store/apps/dev[?]id=(.+)\"")
}
}
It's not perfect, as it confusingly mixes regex and attrib-grabbing in a weird way. I'm going to keep thinking on it.
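A rough sketch of what the two resolvers behind a separate regex type might do, assuming they receive the HTML of the selected element as a string. The function names matchAllGroups and testPattern are hypothetical; nothing here is the library's actual API.

```javascript
// Hypothetical resolver for `match`: returns [String], one entry per match.
function matchAllGroups(html, pattern) {
  // matchAll requires the g flag; each match contributes its first capture group,
  // or the whole match when no group was captured.
  const re = new RegExp(pattern, 'g');
  return Array.from(html.matchAll(re), (m) => (m[1] !== undefined ? m[1] : m[0]));
}

// Hypothetical resolver for `test`: returns Boolean.
function testPattern(html, pattern) {
  return new RegExp(pattern).test(html);
}

const html = '<a href="/store/apps/dev?id=4988311280735374056">tinyBuild</a>';
const ids = matchAllGroups(html, '/store/apps/dev\\?id=([^"]+)');
const found = testPattern(html, '/store/apps/developer');
console.log(ids, found);
```

Keeping these on their own type means text and attr can stay plain String fields, while match is free to return a list.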
I think a completely separate type for these definitely makes sense. Let me know once you've found a schema for this that you're happy with! :)