Creating a MISPGalaxyCluster

Open imidoriya opened this issue 3 years ago • 14 comments

@tomking2 would you happen to have an example on how to create and add a galaxy cluster?

I started out with something like this, but it didn't turn out right. The tag contained the UUID and such:

cluster = MISPGalaxyCluster()
cluster["description"] = "UNK:ursu"
cluster["value"] = "ursu"
cluster["tag_name"] = 'misp-galaxy:avclass="ursu"'
cluster["distribution"] = 3
cluster["type"] = "avclass"
cluster["source"] = "AvClass"
pymisp.add_galaxy_cluster(galaxy, cluster)

imidoriya avatar Apr 08 '21 21:04 imidoriya

Hi @imidoriya,

Here's a modified code block I use to add a galaxy cluster. Looks very similar to yours so I expect yours is also adding the cluster to MISP, is that right?

cluster = MISPGalaxyCluster()
cluster.description = "UNK:ursu"
cluster.value = "UNK:ursu"
cluster.authors = ["imidoriya"]
cluster.source = "AvClass"
cluster.distribution = 3
# Example of adding a cluster element
cluster.add_cluster_element("type-of-incident", "Espionage")

cluster = pymisp.add_galaxy_cluster(galaxy, cluster)
pymisp.publish_galaxy_cluster(cluster)

As for picking the 'tag_name', unfortunately that isn't possible. For Galaxy 2.0 clusters, the tag name is auto-selected and has to be the UUID of the cluster, it can't be a freely decided name such as 'ursu'. I reckon this is to ensure a custom galaxy cluster doesn't clash with any of the public clusters.

tomking2 avatar Apr 09 '21 07:04 tomking2

what is the "galaxy" variable in your code examples?

chrisinmtown avatar Apr 09 '21 12:04 chrisinmtown

@chrisinmtown the galaxy variable was the galaxy UUID. It added the cluster to the galaxy fine, I was just really confused about the tag name. When I tag something for a galaxy, I do so via:

pymisp.tag(event, 'misp-galaxy:avclass="ursu"')

So what happens when my tag is now a UUID? That makes no sense to me.

imidoriya avatar Apr 09 '21 16:04 imidoriya

I guess one thing to also note, the type of cluster is fixed to the galaxy you are extending. Galaxy 2.0 doesn't allow you to create a custom galaxy, but it does allow you to extend a galaxy to introduce new clusters.

So if an existing galaxy called avclass exists, you can add a new cluster through the above code (where galaxy is the uuid of the avclass galaxy).

And the tag name will be fixed for the uuid. Thus, if you create a new cluster, and the UUID becomes c46d4db8-4f66-412d-8240-c991adbe743e

You can tag it with:

pymisp.tag(event, 'misp-galaxy:avclass="c46d4db8-4f66-412d-8240-c991adbe743e"')

@mokaddem might be able to confirm the exact functionality of Galaxy 2.0 and whether my assumption about a fixed UUID-based tag name is accurate.
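
To make the tag format concrete, here is a minimal sketch of how such a tag name is put together from the galaxy type and the cluster UUID. The helper name `galaxy_tag_name` is made up for illustration and is not part of PyMISP; the format simply follows the example above.

```python
def galaxy_tag_name(galaxy_type: str, cluster_uuid: str) -> str:
    """Build a galaxy tag of the form misp-galaxy:<type>="<uuid>"."""
    return f'misp-galaxy:{galaxy_type}="{cluster_uuid}"'


# UUID taken from the example above.
tag = galaxy_tag_name("avclass", "c46d4db8-4f66-412d-8240-c991adbe743e")
# The resulting string could then be passed to pymisp.tag(event, tag).
```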

tomking2 avatar Apr 11 '21 09:04 tomking2

It makes no sense to me why they would do that. Is the hierarchy of misp-galaxy:avclass insufficient to satisfy uniqueness? That might work fine for the GUI, but it sucks for the API. If I'm doing this via the API, generating tags based on AV classification for example, why would I have a UUID? Why would anyone have a UUID? I guess I now have to go through some process of getting all the clusters and looking through them for the name to get a UUID, if it even exists? This is just bizarre. I hate Galaxy 2.0. How can I turn off that "feature"? lol. Heck, or at least hide the complexity behind the scenes so we can use it like normal.

imidoriya avatar Apr 11 '21 14:04 imidoriya

I've also had a system sync with me that doesn't have the AvClass Galaxy installed (as it's still in beta) and the tags just come across as normal tags. If they're UUIDs, not only will they be super long, they'll make no sense.

imidoriya avatar Apr 12 '21 03:04 imidoriya

Certainly worth further discussions with @mokaddem, could be a Galaxy 2.1 update to improve how tags are defined and/or shared.

I reckon this has been done intentionally for a few reasons:

  1. If multiple organisations share their own clusters, UUID ensures uniqueness. So if one org creates one as 'ursu' and another creates their own also called 'ursu', there will be clashes if they are synced to the same MISP instance. It's the same reason events use UUID instead of the event info for sharing
  2. If a galaxy cluster is locked down to a sharing group, you might not want to expose the tag name, as this gives insight into the galaxy being attached, even if the cluster details themselves are not visible.

How I assume this could be improved in future would be handling receiving a tag with the value as 'ursu', and figuring out/auto-mapping to the UUID counterpart. We could probably add a helper into PyMISP quite easily which would handle this for us perhaps, raising an Exception if more than one cluster shares the same name.
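
A rough sketch of what such a helper might look like, assuming the clusters of the galaxy have already been fetched and reduced to a list of dicts with `value` and `uuid` keys. The function name `resolve_cluster_uuid` and the choice of exception types are assumptions for illustration, not existing PyMISP API, and the second cluster in the sample data is invented.

```python
from typing import Dict, List


def resolve_cluster_uuid(clusters: List[Dict[str, str]], name: str) -> str:
    """Return the UUID of the single cluster whose value equals `name`."""
    matches = [c["uuid"] for c in clusters if c.get("value") == name]
    if not matches:
        raise KeyError(f"no cluster named {name!r}")
    if len(matches) > 1:
        # Two orgs may have created clusters with the same human-readable name.
        raise ValueError(f"{len(matches)} clusters share the name {name!r}")
    return matches[0]


# Sample data for illustration only.
clusters = [
    {"value": "ursu", "uuid": "c46d4db8-4f66-412d-8240-c991adbe743e"},
    {"value": "example", "uuid": "566709cd-d415-4dbb-8ca5-0ad930d4378a"},
]
uuid = resolve_cluster_uuid(clusters, "ursu")
```

Raising on ambiguity rather than picking the first match keeps the caller from silently tagging with the wrong organisation's cluster.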

tomking2 avatar Apr 12 '21 08:04 tomking2

Each cluster has a UUID separate from the tag name, so I wouldn't expect a clash. The uuid is the machine name, the tag is the human name. There would be an odd situation of having two tags with the same name though, which would make future tagging odd. But it might make more sense to address the edge case. I would think that if you give permission to a galaxy in a sharing group, then anyone inside the sharing group could see the tag names and anyone outside the sharing group could not.

MISP can work fine with the old system (all my other avclass tags use names), so I think the uuid format should be optional. If I supply the cluster["tag_name"], use what I specify, don't overwrite it with a uuid. Or give me a flag to specify the legacy format=True to force it. Direct SQL update is a possible option.

On the MISP end, a helper function would also work. I don't know that it would fix the example above of those orgs that don't have the galaxy installed seeing useless tags which is a big concern for me, but it would certainly make tagging easier. It would also probably tie into another feature, which is just.. does this galaxy tag exist? Another helper that could be added in is "Is this tag a synonym for another tag? Give me the main tag name..", which is what I'd essentially consider the uuid (or now the main name and the main name is now a synonym).

imidoriya avatar Apr 12 '21 12:04 imidoriya

Indeed, certainly worth a follow-up to see how Galaxy 2.0 could be improved with more helpful tag names. Maybe raise a feature request in the MISP codebase to start those discussions, tagging @mokaddem (original 2.0 creator).

I think the odd situation of two tags with the same name would need to be addressed along with improvements to the distribution settings (currently, if a user is not part of the distribution they'll see the tag but it'll just be the UUID, instead of the tag being hidden like you'd probably want to occur).

Yeah, I think I spotted you talking about the 'find me a tag with synonym X', I might be able to look into that and push a PR for PyMISP, however I expect we'll need some improvements MISP side otherwise it'll be too slow to search and iterate through the possible options (I don't think there is a galaxy search that has the granularity of searching a specific cluster element). It's now on my list of things to look into.

tomking2 avatar Apr 12 '21 12:04 tomking2

Here is what I'm working on thus far on the synonym check and tag creation:

    from typing import Dict

    from pymisp import MISPGalaxyCluster, PyMISP


    class AvClassTagger:  # wrapper name is illustrative; the original snippet showed only the class body
        avclass_galaxy_uuid = 'xxxxxxxxxxxxxxxxxxxxx'

        def __init__(self, client: PyMISP) -> None:
            self.client = client
            # Maps each cluster value and each synonym to the cluster's main value
            self.galaxy_synonyms: Dict[str, str] = {}

        def build_synonyms(self) -> None:
            galaxy = self.client.get_galaxy(
                self.avclass_galaxy_uuid, withCluster=True, pythonify=True
            )
            if galaxy and "GalaxyCluster" in galaxy:
                for cluster in galaxy.get("GalaxyCluster"):
                    if "GalaxyElement" in cluster:
                        for element in cluster.get("GalaxyElement"):
                            if element.get("key") == "synonyms":
                                self.galaxy_synonyms[element.get("value")] = cluster.get(
                                    "value"
                                )
                    # The main value always resolves to itself
                    self.galaxy_synonyms[cluster.get("value")] = cluster.get("value")

        def check_galaxy_synonym(self, tag: str) -> str:
            if tag in self.galaxy_synonyms:
                return self.galaxy_synonyms[tag]
            else:
                cluster = MISPGalaxyCluster()
                # build a new cluster for the galaxy
                self.client.add_galaxy_cluster(self.avclass_galaxy_uuid, cluster)
                return tag

imidoriya avatar Apr 12 '21 19:04 imidoriya

I even tried an update_galaxy_cluster afterward and it wouldn't let me change the tag_name. @mokaddem is this really necessary to prevent?

imidoriya avatar Apr 13 '21 13:04 imidoriya

Hey,

The following analysis by @tomking2 is on point

I reckon this has been done intentionally for a few reasons:

  • If multiple organisations share their own clusters, UUID ensures uniqueness. So if one org creates one as 'ursu' and another creates their own also called 'ursu', there will be clashes if they are synced to the same MISP instance. It's the same reason events use UUID instead of the event info for sharing
  • If a galaxy cluster is locked down to a sharing group, you might not want to expose the tag name, as this gives insight into the galaxy being attached, even if the cluster details themselves are not visible.

Another aspect that was not mentioned so far is the synchronization of clusters. If you are sharing an event tagged with a custom cluster with a restricted distribution setting, you still want to propagate the tag to other MISP instances.

Based on how synchronization of events works, if you later decide that the custom cluster can be public (i.e. distribution = all community), events tagged with the cluster will not be re-synchronized. However, you still want remote instances to have the mapping between the tag and the cluster since they have it. Using the UUID for the tag name gives us multiple benefits (as pointed out by @tomking2), but this time for the sync:

  • Ensures uniqueness
  • Ensures some kind of anonymity
  • Allows an easier mapping between tag and cluster

But indeed, at the cost of user experience.

For @imidoriya's issue, what you can do is define these clusters in the misp-galaxy folder and import them into MISP. They'll be marked as default clusters and thus will have the expected tag name, but the clusters themselves will not be synced. So if you want other connected instances to view the cluster, you'll have to open a PR in the misp-galaxy repo so that any up-to-date MISP will have it.
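
As a rough illustration of what defining a cluster in the misp-galaxy folder involves, the sketch below builds a cluster file as JSON. It assumes the general layout used by cluster files in the misp-galaxy repository (a top-level object with a `values` list); the helper name `cluster_value` is hypothetical, the field list is incomplete, and the repository's schema should be checked for the fields it actually requires.

```python
import json
from typing import Dict, List, Optional


def cluster_value(value: str, description: str, uuid: str,
                  synonyms: Optional[List[str]] = None) -> Dict:
    """Build one entry for the "values" list of a cluster file."""
    entry: Dict = {"value": value, "description": description, "uuid": uuid}
    if synonyms:
        entry["meta"] = {"synonyms": synonyms}
    return entry


cluster_file = {
    "type": "avclass",
    "source": "AvClass",
    "authors": ["imidoriya"],
    # Placeholder UUID for the cluster collection, as elsewhere in this thread.
    "uuid": "xxxxxxxxxxxxxxxxxxxxx",
    "values": [
        cluster_value("ursu", "UNK:ursu", "c46d4db8-4f66-412d-8240-c991adbe743e"),
    ],
}

# Serialized form, ready to be written to a .json file.
text = json.dumps(cluster_file, indent=2)
```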

Galaxy2.0 has a 0 for a reason ;) . We are slowly getting feedback from the community and the feature will obviously be improved over time.

Here are some ideas that could be implemented for Galaxy2.1. Feel free to give your opinion on these or propose other ideas:

  • UI should hide tags which don't resolve to a cluster (either it doesn't exist or the user does not have permission to see it)
  • API should allow the creation of default clusters. This one seems good on paper, but my fear is that users will do it for a use case similar to @imidoriya's and also expect the cluster to be synced. It could be a double-edged sword.
  • Allow tagging by cluster name / synonyms if the provided tag resolves to only one cluster
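
The first idea could be approximated client-side along these lines. This is only a sketch under stated assumptions: the function name `displayable_tags` is hypothetical, and `resolvable` stands for the set of tag names the current user can map to a visible cluster, however that set is obtained.

```python
import re
from typing import Iterable, List, Set

# Matches custom-cluster tags such as misp-galaxy:avclass="<uuid>"
UUID_TAG = re.compile(
    r'^misp-galaxy:[^=]+="[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"$'
)


def displayable_tags(tags: Iterable[str], resolvable: Set[str]) -> List[str]:
    """Drop UUID-style galaxy tags that do not resolve to a visible cluster."""
    return [t for t in tags if not UUID_TAG.match(t) or t in resolvable]


tags = [
    "tlp:white",
    'misp-galaxy:avclass="ursu"',
    'misp-galaxy:avclass="c46d4db8-4f66-412d-8240-c991adbe743e"',
    'misp-galaxy:avclass="566709cd-d415-4dbb-8ca5-0ad930d4378a"',
]
visible = displayable_tags(
    tags, {'misp-galaxy:avclass="c46d4db8-4f66-412d-8240-c991adbe743e"'}
)
```

Only the last tag is dropped here: it looks like a custom-cluster tag but is not in the resolvable set, while ordinary tags and name-based galaxy tags pass through untouched.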

mokaddem avatar Apr 14 '21 14:04 mokaddem

Thank you @tomking2 and @mokaddem for the discussion and development. Don't take my small gripes as a lack of appreciation. I realize there are many complexities in development that can often be confusing to end users.

With regard to defining the clusters in the misp-galaxy folder, that is already being done for 95% of the tags on AvClass and I'm adding more where it makes sense and we can classify. For my own instance, what I'm trying to add is the remaining tags, which are not classified and can be dynamic in nature as I want them to show up as a galaxy tag, not a regular tag.

I like the idea of hiding tags you don't have permission to see - that makes sense to me. It may allow some instances that do sharing groups on their galaxy to hide it. Allowing tagging by cluster name seems like a must-have to me if the provided tag resolves to only one cluster, which I expect is the vast majority of cases. It is a query that would need to be done anyway if the user doesn't know the UUID. In my case, I'm using another application to generate plain English tags that we want to place in a galaxy structure. So they would all require lookups.

Having a custom galaxy sync is important if they have permission to view the galaxy. I'm trying to understand how a default tag prevents that. So a few thoughts on that..

  • I would expect that only conflicts would not be synced (when there is a conflict). Why block the entire galaxy when there are no conflicts?
  • You could have an option to merge / overwrite conflicts. If something in a galaxy has the same value / tag_name, but different uuid.. good chance they should be merged, not uniquely duplicated.
  • Couldn't you add a cluster_id (or uuid) to the tags table, which would help determine if that tag syncs based on the sharing of the cluster? That would seem to work better than having a bunch of unintelligible tag names and syncing them all (making the remote site a, pun intended, cluster of uuid tags).

It seems to me that the cluster "tag_name" is being used as a foreign key for the tags table, which is why we're trying to make a human tag into a machine tag. When you have the right permission and cluster, the user will see the cluster value instead of the machine tag. But otherwise, presenting a machine tag in a GUI for human use / API interaction is just .. well, not great. As mentioned above, seems using the actual id of the cluster to link the tag table could avoid repurposing the name field. You may not even need the tag_name field in the clusters table.
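
To illustrate the idea of linking tags to clusters by id instead of by name, here is a toy sketch using an in-memory SQLite database. The table and column names are invented for this example and are not MISP's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: a tag optionally points at a cluster by id,
# so the tag name itself can stay human-readable.
conn.executescript("""
    CREATE TABLE galaxy_clusters (
        id INTEGER PRIMARY KEY,
        uuid TEXT UNIQUE,
        value TEXT
    );
    CREATE TABLE tags (
        id INTEGER PRIMARY KEY,
        name TEXT,
        galaxy_cluster_id INTEGER REFERENCES galaxy_clusters(id)
    );
""")
conn.execute(
    "INSERT INTO galaxy_clusters VALUES (?, ?, ?)",
    (1, "566709cd-d415-4dbb-8ca5-0ad930d4378a", "ursu"),
)
conn.execute(
    "INSERT INTO tags VALUES (?, ?, ?)", (1, 'misp-galaxy:avclass="ursu"', 1)
)
conn.execute("INSERT INTO tags VALUES (?, ?, ?)", (2, "tlp:white", None))

# Each tag with the UUID of its cluster, or None for ordinary tags.
rows = conn.execute("""
    SELECT t.name, c.uuid
    FROM tags AS t
    LEFT JOIN galaxy_clusters AS c ON c.id = t.galaxy_cluster_id
    ORDER BY t.id
""").fetchall()
```

With a link like this, whether a tag syncs or is shown could be decided from the cluster row, while the name stays free for human use.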

imidoriya avatar Apr 14 '21 17:04 imidoriya

As another note, I'm trying to galaxify tags that already exist. I already have the tag misp-galaxy:avclass="ursu". The issue is that it doesn't yet belong to the galaxy, so it presents as a normal tag. But if I try to add misp-galaxy:avclass="ursu" to the galaxy, it comes out as misp-galaxy:avclass="566709cd-d415-4dbb-8ca5-0ad930d4378a" which doesn't solve the problem. I still have misp-galaxy:avclass="ursu" not in the galaxy.

imidoriya avatar Apr 15 '21 19:04 imidoriya