whoisit

Return merged response when `raw=True`

Open thomas-weiss-42 opened this issue 2 months ago • 2 comments

Right now, when `raw=True`, the `_domain()` function yields the raw response before following the `related` or `registration` links. That means I can’t get the final merged response when using `raw`.

def _domain(domain_name, raw=False, session=None, follow_related=True, async_client=None, is_async=False):
    is_async = async_client is not None
    method, url, _ = build_query(query_type='domain', query_value=domain_name)

    if is_async:
        q = QueryAsync(async_client, method, url)
    else:
        q = Query(session, method, url)

    response = yield q

    if raw:
        yield response
    if follow_related:
        # Attempt to follow the 'related' or 'registration' links if the TLD has
        # an upstream RDAP endpoint that may have more information
        relresponse = None
        for link in response.get('links', []):
            rel = link.get('rel', '')
            if rel in ('related', 'registration'):
                relhref = link.get('href', '')
                if relhref:
                    if is_async:
                        relq = QueryAsync(async_client, method, relhref)
                    else:
                        relq = Query(session, method, relhref)
                    yield
                    relresponse = yield relq
                    break
        if relresponse:
            # Overlay the related response over the original response
            recursive_merge(response, relresponse)
    yield parse(_bootstrap, 'domain', domain_name, response)

I suggest moving the `raw` check so that it comes after the `follow_related` block:

if raw:
    yield response
else:
    yield parse(_bootstrap, 'domain', domain_name, response)

What do you think of this change?

thomas-weiss-42, Oct 20 '25 12:10

The raw response is returned early by design. You would use the raw response to perform the parsing yourself, most often because the response is non-standard or can't otherwise be parsed automatically. Overlaying potentially multiple non-standard responses would mean the raw response no longer contained the raw data, and the merged result could itself be a bit of a mess or lose information through replacement.
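To make the information-loss point concrete, here is a minimal sketch of an overlay-style merge. The `overlay` helper and the sample dictionaries are hypothetical and written only for illustration; whoisit's own `recursive_merge` may well behave differently.

```python
import copy


def overlay(base, extra):
    """Hypothetical overlay merge: values in `extra` win over values in `base`.

    Nested dicts are merged key by key; everything else (strings, lists,
    numbers) is replaced outright. This is NOT whoisit's recursive_merge.
    """
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Made-up fragments standing in for a registry and a registrar response.
registry = {"handle": "REG-1", "status": ["active"], "port43": "whois.registry.example"}
registrar = {"handle": "RAR-9", "status": ["client transfer prohibited"]}

merged = overlay(registry, registrar)
# The registry's own handle and status can no longer be recovered from `merged`;
# only keys the registrar did not send (here, port43) survive unchanged.
print(merged)
```

With a merge like this, a caller holding only the merged result cannot tell which server supplied which value.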

The other reason to use raw is if one of the related RDAP endpoints isn't functional or is returning invalid results and you don't want the related endpoints to be followed.

To do what you want, you could request a raw result, extract any related endpoints manually, then follow those with further raw requests.
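A rough sketch of that workaround, assuming the raw response is a plain dict with a top-level `links` list, as in the `_domain()` code quoted above. The names `related_hrefs`, `collect_raw` and `fetch` are hypothetical and not part of whoisit; `fetch` stands in for whatever performs the follow-up raw request.

```python
def related_hrefs(raw_response, rels=("related", "registration")):
    """Return the href of every link whose rel marks an upstream RDAP endpoint."""
    hrefs = []
    for link in raw_response.get("links", []):
        href = link.get("href", "")
        if link.get("rel", "") in rels and href:
            hrefs.append(href)
    return hrefs


def collect_raw(raw_response, fetch):
    """Return [original, upstream, ...] without merging anything.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the decoded JSON response as a dict.
    Unlike `_domain()`, which stops at the first matching link, this
    follows every matching link. A failing endpoint is skipped so that
    one broken upstream server does not discard the registry data.
    """
    responses = [raw_response]
    for href in related_hrefs(raw_response):
        try:
            responses.append(fetch(href))
        except Exception:
            continue
    return responses
```

Here `raw_response` would be the result of a domain lookup made with `raw=True`. Keeping each response separate leaves every server's data untouched, so the caller decides how, or whether, to combine them.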

Your suggested change could be implemented, but it would need to be a separate `merged_related_raw` flag or similar, in addition to the standard `raw` flag. I'm not opposed to adding the flag as a parameter if people would find it useful.

meeb, Oct 20 '25 13:10

Thanks for the quick reply. Your comment is very helpful.

I would appreciate it if you could include the merged_related_raw flag.

Yes, you're right that the merged version might not be optimal for the reasons you mentioned. I want information such as the registrant and technical roles included in the response. I might end up implementing the logic to follow the related links myself.

It would be helpful to have the option to retrieve not only the merged raw response, but also an array with the raw responses from each queried RDAP server.

thomas-weiss-42, Oct 21 '25 11:10

Thanks to the PR from @KolevVelyan in #55 this should be far easier to work around. I'll close this for now but feel free to open a new issue if you want to discuss additional features.

meeb, Nov 14 '25 08:11

The PR looks very good. Thanks 🥳

thomas-weiss-42, Nov 14 '25 08:11