Feature/no answer pipeline
Added a new noAnswer key and updated the generic aggregation pipeline to include all responses without an answer.
This is a good start! But it's missing a key feature, which is that the no_answer key should be added to buckets, not just facets.
What I mean is that currently this gives us data like this (in this case, "years of experience" with the "gender" facet):
facets: [
{
type: 'gender',
id: 'noAnswer',
buckets: [
{ id: 'range_5_10', count: 66 },
{ id: 'range_10_20', count: 49 },
{ id: 'range_less_than_1', count: 20 },
]
},
{
type: 'gender',
id: 'not_listed',
buckets: [
{ id: 'range_2_5', count: 35 },
{ id: 'range_5_10', count: 34 },
{ id: 'range_10_20', count: 39 },
]
},
{
type: 'gender',
id: 'male',
buckets: [
{ id: 'range_2_5', count: 7970 },
{ id: 'range_10_20', count: 5470 },
{ id: 'range_5_10', count: 7362 },
]
},
So you've added the "years of experience" breakdown for people who didn't answer the "gender" question.
But within each "years of experience" array of buckets, we also want to know how many people didn't answer the "years of experience" question. So the data we actually want would be more like this:
facets: [
{
type: 'gender',
id: 'noAnswer',
buckets: [
{ id: 'range_5_10', count: 66 },
{ id: 'range_10_20', count: 49 },
{ id: 'range_less_than_1', count: 20 },
{ id: 'no_answer', count: 123 }, // people who answered neither "gender" nor "years of experience"
]
},
{
type: 'gender',
id: 'not_listed',
buckets: [
{ id: 'range_2_5', count: 35 },
{ id: 'range_5_10', count: 34 },
{ id: 'range_10_20', count: 39 },
{ id: 'no_answer', count: 123 }, // people who picked "not_listed" as gender but didn't answer "years of experience"
]
},
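As a rough illustration of the shape requested above, here is a minimal sketch in plain JavaScript, outside the aggregation pipeline. The function name `countByFacet`, the flat response shape, and the field names are assumptions made for this example, not the project's actual code. The two "no answer" ids mirror the spelling used in the samples (`noAnswer` for the facet, `no_answer` for the bucket).

```javascript
// Hypothetical helper: names and the response shape are assumptions for illustration.
const FACET_NO_ANSWER = 'noAnswer';   // facet id as it appears in the sample above
const BUCKET_NO_ANSWER = 'no_answer'; // bucket id requested in this review

function isMissing(value) {
  return value === undefined || value === null || value === '';
}

// Count single-choice answers to `field`, optionally broken down by `facetField`.
// Missing answers get their own id at both the facet level and the bucket level.
function countByFacet(responses, field, facetField) {
  const facets = new Map(); // facet id -> Map(bucket id -> count)
  for (const response of responses) {
    let facetId = 'default'; // used when no facet is selected
    if (facetField !== undefined) {
      facetId = isMissing(response[facetField]) ? FACET_NO_ANSWER : response[facetField];
    }
    const bucketId = isMissing(response[field]) ? BUCKET_NO_ANSWER : response[field];
    if (!facets.has(facetId)) facets.set(facetId, new Map());
    const buckets = facets.get(facetId);
    buckets.set(bucketId, (buckets.get(bucketId) ?? 0) + 1);
  }
  return [...facets].map(([id, buckets]) => ({
    type: facetField ?? 'default',
    id,
    buckets: [...buckets].map(([bucketId, count]) => ({ id: bucketId, count })),
  }));
}
```

Calling `countByFacet(responses, 'years_of_experience', 'gender')` would give one facet per gender value plus a `noAnswer` facet, each with its own `no_answer` bucket. Omitting the third argument gives a single `default` facet.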
Additionally we want this no_answer bucket to appear even when people don't select any facet. So we also want this:
"facets": [
{
"id": "default", // this is what we get when no facet is selected
"buckets": [
{
"id": "range_less_than_1",
"count": 1272,
},
{
"id": "range_1_2",
"count": 4177,
},
{
"id": "range_2_5",
"count": 8710,
},
{
"id": "no_answer",
"count": 123,
},
By the way, that no_answer bucket already appears in the survey results, but currently it's manually calculated in the chart itself (number of total respondents - sum of respondents in the other columns). I think it would be cleaner to do it at the API level.
(Also I guess it wouldn't be too hard to do it outside the aggregation pipeline in the rest of the JS code if the pipeline can't easily do it)
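For reference, the subtraction the chart currently does could be moved into the API code roughly like this. This is only a sketch: `withNoAnswerBucket` is a hypothetical name, and it assumes a single-choice question, where each respondent lands in at most one bucket. For questions that allow multiple selections, the bucket counts can add up to more than the number of respondents, so this subtraction would not be valid there.

```javascript
// Hypothetical helper, not the project's actual code.
// Assumes a single-choice question: each respondent is counted in at most one bucket.
function withNoAnswerBucket(buckets, totalRespondents) {
  const answered = buckets.reduce((sum, bucket) => sum + bucket.count, 0);
  // Respondents not accounted for by any bucket did not answer the question.
  return [...buckets, { id: 'no_answer', count: totalRespondents - answered }];
}

// Made-up numbers, for illustration only.
const example = withNoAnswerBucket(
  [
    { id: 'range_1_2', count: 3 },
    { id: 'range_2_5', count: 5 },
  ],
  10
);
console.log(example[example.length - 1]); // { id: 'no_answer', count: 2 }
```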
Yes, of course. It's better to do the calculations inside the API.
Good progress! But now I'm running into a different issue. It doesn't work when querying for a field where people can pick multiple options at the same time.
For example with the following GraphQL query:
query raceEthnicityQuery {
survey(survey: state_of_js) {
demographics {
race_ethnicity: race_ethnicity(filters: {}, options: {}) {
keys
year(year: 2022) {
year
completion {
total
percentage_survey
count
}
facets {
id
type
completion {
total
percentage_question
percentage_survey
count
}
buckets {
id
count
percentage_question
percentage_survey
}
}
}
}
}
}
}
I get this:
results: [
{
facets: [
{
type: 'default',
id: 'default',
buckets: [
{ id: [ 'multiracial', 'white_european' ], count: 33 },
{
id: [
'black_african',
'east_asian',
'hispanic_latin',
'middle_eastern',
'multiracial',
'native_american_islander_australian',
'south_asian',
'south_east_asian'
],
count: 1
},
{
id: [ 'multiracial', 'hispanic_latin', 'white_european' ],
count: 2
},
{
id: [ 'multiracial', 'white_european', 'middle_eastern' ],
count: 2
},
{ id: [ 'east_asian', 'multiracial' ], count: 1 },
{
id: [ 'south_east_asian', 'south_asian', 'east_asian' ],
count: 3
},
{
id: [
'black_african',
'east_asian',
'hispanic_latin',
'middle_eastern',
'native_american_islander_australian',
'multiracial',
'south_asian',
'south_east_asian',
'white_european',
'not_listed'
],
count: 1
},
{ id: [ 'south_east_asian' ], count: 1000 },
{ id: [ 'multiracial', 'south_east_asian' ], count: 1 },
{
id: [
'east_asian',
'native_american_islander_australian',
'south_asian',
'white_european'
],
count: 1
},
etc.
As you can see it's using every existing combination of answers as a unique id key instead of aggregating them. The correct output (from main branch) would be:
results: [
{
facets: [
{
type: 'default',
id: 'default',
buckets: [
{ id: 'multiracial', count: 727 },
{ id: 'east_asian', count: 1710 },
{ id: 'white_european', count: 19790 },
{ id: 'middle_eastern', count: 1158 },
{ id: 'hispanic_latin', count: 2795 },
{ id: 'south_asian', count: 1731 },
{ id: 'native_american_islander_australian', count: 142 },
{ id: 'not_listed', count: 795 },
{ id: 'south_east_asian', count: 1221 },
{ id: 'black_african', count: 1074 }
]
}
],
year: 2022
}
]
}
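To make the difference between the two outputs concrete, here is a small plain-JavaScript sketch of the expected behavior: each selected option is counted on its own instead of the whole array being used as the bucket id. `countMultiChoice` and the response shape are assumptions made for this example. Treating an empty or missing answer as `no_answer` is likewise an assumption, based on the earlier part of this review.

```javascript
// Hypothetical helper, for illustration only.
// Counts each selected option once per respondent instead of grouping by the whole array.
function countMultiChoice(responses, field) {
  const counts = new Map();
  const bump = (id) => counts.set(id, (counts.get(id) ?? 0) + 1);
  for (const response of responses) {
    const value = response[field];
    // Normalize: array stays as is, a single value becomes a one-item array,
    // null/undefined becomes an empty array.
    const selected = Array.isArray(value) ? value : value == null ? [] : [value];
    if (selected.length === 0) {
      bump('no_answer');
      continue;
    }
    // A Set guards against the same option being listed twice in one response.
    for (const id of new Set(selected)) bump(id);
  }
  return [...counts].map(([id, count]) => ({ id, count }));
}
```

With this approach a respondent who picked both `multiracial` and `white_european` adds one to each of those two buckets, rather than creating a `[ 'multiracial', 'white_european' ]` bucket.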
I have added back the unwind operator, with a specific option so that it does not skip null or empty fields. It seems that we cannot remove the unwind operator. I tested your case and it works now. I also tested the previous cases locally and they seem to work. It is difficult for me to know and test every case, so let me know if something is wrong.
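For readers following along, the approach described here presumably resembles the stages below. This is a sketch under assumptions, not the code from this PR: `bucketStages` and the field path are hypothetical. The MongoDB operators themselves are standard: `$unwind` accepts a `preserveNullAndEmptyArrays` option that keeps documents whose field is null, missing, or an empty array, and `$ifNull` can then map those documents to a `no_answer` id in `$group`.

```javascript
// Hypothetical stage builder; the field path passed in is an assumption.
function bucketStages(field) {
  const path = '$' + field;
  return [
    // One document per selected option. With preserveNullAndEmptyArrays,
    // respondents with no answer are kept instead of being dropped.
    { $unwind: { path, preserveNullAndEmptyArrays: true } },
    // Group by option. Null or missing values fall into a no_answer bucket.
    { $group: { _id: { $ifNull: [path, 'no_answer'] }, count: { $sum: 1 } } },
    // Rename _id to id to match the bucket shape used in this thread.
    { $project: { _id: 0, id: '$_id', count: 1 } },
  ];
}

console.log(JSON.stringify(bucketStages('race_ethnicity'), null, 2));
```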