multiplier
multiplier copied to clipboard
Create a database diagnostic tool
The tool should look for hints of various possible issues for further investigation. For example:
- [ ] Fragments sharing the same
file_token_id. - [ ] Fragments sharing the same
hash. - [ ] Repeats of the same symbol name in the
mangled_name, where the data is first split by spaces, and the first element is taken (symbol name). Then looking that they are all in a reference chain.
Get the base name of a mangled name:
CREATE VIEW IF NOT EXISTS base_mangled_name AS
SELECT m.entity_id AS entity_id,
substr(m.data||' ', 0, instr(m.data||' ',' ')) AS data
FROM mangled_name AS m
Get all entity IDs sharing the same base name of a mangled name:
SELECT data, GROUP_CONCAT(entity_id,',') AS entity_ids FROM base_mangled_name GROUP BY data;
Find fragments sharing the same file_token_id, and also having the same length of serialized data. If two fragments are at the same location and have identical (compressed) serialized sizes then this is a strong indicator of redundancy.
SELECT file_token_id,
GROUP_CONCAT(fragment_id, ',') AS fragment_ids
GROUP_CONCAT(data_length, ',') AS data_lengths
FROM (SELECT f.fragment_id,
length(f.data) AS data_length,
h.file_token_id FROM fragment AS f
JOIN fragment_hash_0 AS h
ON (f.fragment_id & 16777215) = h.fragment_index)
GROUP BY file_token_id;