jsonref

30GB of RAM used then SIGKILL

Open SamuelMarks opened this issue 3 years ago • 2 comments

$ git clone --depth=1 https://github.com/stripe/openapi stripe-openapi
$ python main.py

Where main.py is:

#!/usr/bin/env python

from os import path
from json import load, dump

from jsonref import replace_refs

with open(path.join("stripe-openapi", "openapi", "spec3.sdk.json"), "rt") as f:
    doc = load(f)

derefed = replace_refs(doc)
with open('derefed_spec3.sdk.json', 'wt') as f:
    dump(derefed, f, indent=4)

I've got a new laptop with 32GB of RAM and an AMD Ryzen™ 9 6900HX with Radeon™ Graphics × 16. This code uses all of that RAM, then SIGKILL is sent and it's game over. Ubuntu 22.10 on an SSD.
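A guess at why memory explodes here (not verified against the Stripe spec or against jsonref internals): when definitions reference each other, fully inlining every $ref copies the same subtree over and over, so the expanded document can grow exponentially even though the source stays small. A toy sketch of that effect, where `build_doc`, `inline` and `sizes` are made-up names for illustration:

```python
import json

def build_doc(depth):
    """Toy document: each definition references the previous one twice."""
    defs = {"d0": {"type": "string"}}
    for i in range(1, depth + 1):
        ref = {"$ref": f"#/defs/d{i - 1}"}
        defs[f"d{i}"] = {"properties": {"a": dict(ref), "b": dict(ref)}}
    return {"defs": defs, "root": {"$ref": f"#/defs/d{depth}"}}

def inline(node, doc):
    """Naively copy every local $ref target into place (no sharing, no cycle check)."""
    if isinstance(node, dict):
        if "$ref" in node:
            target = doc
            # Assumes refs of the form "#/a/b/c"; drop the leading "#/".
            for part in node["$ref"][2:].split("/"):
                target = target[part]
            return inline(target, doc)
        return {key: inline(value, doc) for key, value in node.items()}
    if isinstance(node, list):
        return [inline(value, doc) for value in node]
    return node

def sizes(depth):
    """Return (serialized source size, serialized fully-inlined size) in characters."""
    doc = build_doc(depth)
    return len(json.dumps(doc)), len(json.dumps(inline(doc["root"], doc)))

if __name__ == "__main__":
    for depth in (5, 10, 15):
        print(depth, sizes(depth))
```

The source grows by one small definition per level, while the inlined output more than doubles per level.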

SamuelMarks, Dec 24 '22 17:12

Same problem with the (large) Stripe OpenAPI spec. I chose a more "low tech" solution and parse it manually/recursively to replace $refs:

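def flat_json(self, json_object, indent=0):
    # Recursively inline "$ref" entries; indent only tracks the recursion depth.
    if isinstance(json_object, dict):
        new_obj = {}
        for key, value in json_object.items():
            if key == '$ref':
                # Replace the whole object with the (flattened) referenced value.
                new_obj = self.flat_json(self.find_ref(value), indent + 1)
            else:
                new_obj[key] = self.flat_json(value, indent + 1)
        return new_obj
    else:
        # Lists and scalars are returned unchanged.
        return json_object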

"find_ref" uses jsonpath_ng

from jsonpath_ng import parse  # module-level import needed by find_ref

def find_ref(self, n):
    # Turn a local ref like "#/a/b" into the JSONPath "$.a.b".
    n = n.replace('#', '$')
    path = n.split('/')
    last_element = path.pop()
    if '.' in last_element:
        # Quote a final key that contains dots so it is not split into fields.
        n = '.'.join(path) + '."' + last_element + '"'
    else:
        n = '.'.join(path) + '.' + last_element
    jsonpath_expr = parse(n)
    result = [match.value for match in jsonpath_expr.find(self.dict)]
    if result:
        return result[0]
    else:
        print("ERROR FINDING Ref:", n)
        return None

Won't process "anyOf"... yet
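For comparison, a minimal stdlib-only sketch of the same idea that also walks lists (so "anyOf" entries get visited). This is not the code above and not how jsonref does it; `resolve_pointer` and `deref` are hypothetical names. Assumptions: only local refs ("#/...") are handled, keys that sit next to a "$ref" are dropped, and a ref that points back into itself is left in place instead of being expanded. Each ref is resolved once and the result is shared, so in-memory size stays roughly proportional to the input. Serializing the result with json.dump still writes every shared object out in full, though.

```python
def resolve_pointer(doc, ref):
    """Follow a local JSON Pointer such as '#/components/schemas/foo.bar'."""
    if not ref.startswith("#"):
        raise ValueError(f"only local refs are handled in this sketch: {ref}")
    node = doc
    for token in ref[1:].split("/")[1:]:
        # RFC 6901 escapes: "~1" means "/", "~0" means "~" (decode in that order).
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

def deref(node, doc, cache=None, active=None):
    """Return a copy of node with local $refs replaced by their shared targets."""
    cache = {} if cache is None else cache        # ref string -> resolved value
    active = set() if active is None else active  # refs currently being resolved
    if isinstance(node, list):
        return [deref(item, doc, cache, active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str):
        if ref in active:
            return node  # cycle detected: keep the $ref rather than recurse forever
        if ref not in cache:
            active.add(ref)
            cache[ref] = deref(resolve_pointer(doc, ref), doc, cache, active)
            active.discard(ref)
        return cache[ref]
    return {key: deref(value, doc, cache, active) for key, value in node.items()}
```

Usage would be something like `flat = deref(doc, doc)` after loading the spec with json.load.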

plog, Aug 03 '23 02:08

Yeah, for months I've had this solution (https://github.com/offscale/cdd-python#gen):

python -c 'import sys,json,os; f=open(sys.argv[1], "rt"); d=json.load(f); f.close(); [(lambda f: json.dump(sc,f) or f.close())(open(os.path.join(os.path.dirname(sys.argv[1]), sc["$id"].rpartition("/")[2]), "wt")) for sc in d["schemas"]]' <path_to_json_file>
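Spelled out, that one-liner does roughly the following. This is a readable sketch, not code from cdd-python: `split_schemas` is a made-up name, and like the one-liner it assumes the file has a top-level "schemas" list whose entries each carry a "$id".

```python
import json
import os

def split_schemas(json_path):
    """Write each entry of the top-level "schemas" list to its own file."""
    with open(json_path, "rt") as f:
        doc = json.load(f)
    out_dir = os.path.dirname(json_path)
    written = []
    for schema in doc["schemas"]:
        # The file name is the last path segment of the schema's "$id".
        name = schema["$id"].rpartition("/")[2]
        out_path = os.path.join(out_dir, name)
        with open(out_path, "wt") as out:
            json.dump(schema, out)
        written.append(out_path)
    return written
```

The output files land next to the input file, same as with the one-liner.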

Still would be good if this library was modified to handle large files.

SamuelMarks, Sep 17 '23 01:09