Overhaul Hardware Intake System
Replace the hardware intake system. The current one depends on XML output from LLDP and LSHW. Unfortunately, the schema changes somewhat frequently, which makes the intake brittle; LSHW/LLDP may not be available on the host platform; and, most importantly, we have to add support for every hardware type in collins. The goal is to provide an intake system flexible enough to support hardware that collins doesn't know about right now. The current thinking is that we'll introduce a JSON format to replace the XML formats in use, and provide converters from LLDP/LSHW to the JSON format. Since collins supports API versioning, we'll likely peg the old endpoints at version 1.1 and the new endpoints at version 1.2. We'll open up discussion on the list about the JSON format before we start coding.
Notes from Dan from internal ticket:
Problem
The current flat key/value store for assets and meta tags appears to be inadequate.
Example: Meta tags can't properly store VLan information for an asset. An asset may have multiple VLans, and each VLan has an id and a name. Group id is used to distinguish the VLan tags from other tags, but the correlation between ids and names is lost.
Proposed Solutions
Sub-assets, asset pointers, Lucene indexing
Allow the creation of assets which serve as the values for asset meta tags. The meta value itself will be a pointer to the id of the sub-asset.
For example, to solve the issue with VLans, each VLan would be a sub-asset. The same principle could be applied to most other groups of tags, such as disks, CPUs, and NICs.
The asset/sub-asset tree would be limited to a depth of 1, i.e., sub-assets could not have sub-assets of their own. In order to handle searching these values, we would use Lucene to flatten and index all top-level assets. Values of sub-assets could be squashed into multi-value keys.
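As a rough illustration of the squashing step, the sketch below flattens an asset with depth-1 sub-assets into multi-value keys of the kind a search document could hold. The dictionary layout, the sample values, and the function name `squash_sub_assets` are assumptions made for this example; none of it is collins code.

```python
from collections import defaultdict


def squash_sub_assets(asset):
    """Flatten a top-level asset and its depth-1 sub-assets into multi-value keys."""
    flat = defaultdict(list)
    for key, value in asset.items():
        if isinstance(value, list):
            # Assumed convention: a list of dicts is a list of sub-assets.
            for sub in value:
                for sub_key, sub_value in sub.items():
                    flat[f"{key}_{sub_key}"].append(sub_value)
        else:
            # Plain attributes of the top-level asset become single-value keys.
            flat[key].append(value)
    return dict(flat)


# Hypothetical asset with two VLan sub-assets.
asset = {
    "HOSTNAME": "web-01",
    "VLAN": [{"ID": 10, "NAME": "prod"}, {"ID": 20, "NAME": "mgmt"}],
}
squashed = squash_sub_assets(asset)
print(squashed)
```

The squashed form is only meant for searching. The pairing of each id with its name would still live in the sub-asset itself.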
Add a subgroup_id column to asset meta values
Currently group_id is used to add a dimension to asset meta values to group related values together. We could simply add another subgroup_id to address situations like the VLan issue.
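A minimal sketch of how a second dimension would restore the pairing, assuming meta values are rows of (name, value, group_id, subgroup_id). The row layout, the sample values, and the helper `regroup` are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical rows: (tag name, value, group_id, subgroup_id).
# group_id 1 marks "VLan tags"; subgroup_id says which VLan a value belongs to.
rows = [
    ("VLAN_ID", "10", 1, 0),
    ("VLAN_NAME", "prod", 1, 0),
    ("VLAN_ID", "20", 1, 1),
    ("VLAN_NAME", "mgmt", 1, 1),
]


def regroup(rows):
    """Rebuild one record per (group_id, subgroup_id) pair."""
    grouped = defaultdict(dict)
    for name, value, group_id, subgroup_id in rows:
        grouped[(group_id, subgroup_id)][name] = value
    # Sort the keys so records come back in a stable order.
    return [grouped[key] for key in sorted(grouped)]


vlans = regroup(rows)
print(vlans)
```

With group_id alone, all four rows would share a single key and the id/name pairs could not be recovered.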
Deal with it
Make no DB-level changes; address the issues in the app.
Now that we have Solr in place, allowing the import of arbitrary JSON should be OK, since we can pretty easily index whatever data we want into Solr without having to worry about normalizing the data for MySQL table insertion.
In order to keep asset hardware data exposed as key/values, we'd adopt a translation scheme from keys to XPath-like traversals of JSON objects. For example, "hardware/cpu_speed_ghz" or "network_interface/vlan/0/name".
Outputting the stored JSON in the API is trivial, but we'd need a little refactoring in the web app to neatly display the data.
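One way such a translation could work, sketched under the assumption that object keys and list indices are simply joined with slashes. The function name `flatten` and the sample document are made up for illustration.

```python
def flatten(node, prefix=""):
    """Map every leaf of a JSON-like structure to a slash-separated path key."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        # List positions become path segments, e.g. "vlan/0".
        items = enumerate(node)
    else:
        return {prefix: node}
    flat = {}
    for key, child in items:
        path = f"{prefix}/{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Hypothetical document shaped to match the example keys in the text.
doc = {
    "hardware": {"cpu_speed_ghz": 2.4},
    "network_interface": {"vlan": [{"name": "prod"}, {"name": "mgmt"}]},
}
flat = flatten(doc)
print(flat)
```

Note that in this sketch empty objects and empty lists produce no keys at all, so a real scheme would need to decide whether they should be preserved.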
We definitely need to keep the data normalized, since search isn't the only thing that uses the values. That being said, I think we can accept a format that will work with the underlying schema.
I don't think we've fully figured this out, but I'm going to start prototyping two features that I think will help solve this issue:
- Store "asset_meta_group" in the database. This will allow us to easily categorize all hardware-related tags into a single group, so we know exactly which data to clear out when importing a new profile. We can also dynamically create new meta tags as they appear in imported profiles and not mix them in with user-created attributes.
- Create a flexible JSON import action, where the JSON x-path of a value is translated into an asset_meta name.
For example if a JSON document looks something like
{
  "DISK": [
    {
      "SIZE_BYTES": 1234567890
    }
  ]
}
This will get translated into an asset meta with a name of "DISK_SIZE_BYTES" and a value of "1234567890", using the asset_meta_value.group_id to handle multiple disks. We still have that dimensionality issue, but however that gets solved I think these two features will still be useful.
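The translation described above might look roughly like the following. This is a sketch, not the planned import action: the function name `to_asset_meta`, the (name, group_id, value) row shape, and the second disk in the sample profile are assumptions added for illustration.

```python
import json


def to_asset_meta(profile):
    """Turn {"DISK": [{"SIZE_BYTES": ...}]} into (name, group_id, value) rows."""
    rows = []
    for section, entries in profile.items():
        # The position of each entry in its list stands in for group_id.
        for group_id, entry in enumerate(entries):
            for field, value in entry.items():
                rows.append((f"{section}_{field}", group_id, str(value)))
    return rows


# The first disk is the document from the text; the second is hypothetical.
profile = json.loads(
    '{"DISK": [{"SIZE_BYTES": 1234567890}, {"SIZE_BYTES": 500}]}'
)
meta_rows = to_asset_meta(profile)
print(meta_rows)
```

Because group_id is used up distinguishing one disk from another, anything nested one level deeper has no dimension left, which is the dimensionality issue mentioned above.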
I will also re-investigate my "sub-asset" idea above and see how easy it would be to implement now that we have Solr searching fully in place.