
nodeinfo.json Yggdrasil support

Open makew0rld opened this issue 6 years ago • 14 comments

The nodeinfo file on nodes should also provide information on Yggdrasil: node ID, keys, etc. Does Yggdrasil have its own replacement for nodeinfo? There should be one unified system on our nodes.

makew0rld avatar Dec 01 '18 21:12 makew0rld

Yggdrasil currently doesn't have a spec for nodeinfo. The CJDNS spec is here, and we need to figure out a way to combine Ygg info with that already existing spec. Once that is worked out it can be submitted to the yggdrasil-specs repo.

I'd note that the CJDNS spec is not a gold standard or anything, so we can definitely mess around with it a bit.

makew0rld avatar Jan 30 '19 23:01 makew0rld

Whatever is worked out here, possibly with @neilalexander's help, should be served over http and possibly through the Yggdrasil config file as well. At the moment, we have a name value setup, but that will be sort of unclear when put into the http nodeinfo, and so maybe we can change that later.

makew0rld avatar Jan 31 '19 20:01 makew0rld

My idea for a spec:

{
  "hostname": "your.node",
  "name": "your.node",  // for ygg
  "contact": {
    "name": "Your name, handle, nick, whatever",
    "email": "[email protected]",
    "xmpp": "[email protected]",
    "other cool communication service": "whatever identifier they use"
  },
  "last_modified": "2014-11-30T00:35:48+00:00",
  "pgp": {
    "fingerprint": "FULL40CHARFINGERPRINT",
    "full": "http://link.to.download/full.key",
    "keyserver": "hkp://pgp.mit.edu"
  },
  "location": {
    "longitude": 0,
    "latitude": 0,
    "altitude": 0,
    "continent": "NA",
    "country": "Canada",
    "region": "Ontario",
    "municipality": "Toronto",
    "uri": "https://tomesh.net/"
  },
  "software": {
    "repo": "__REPO__",
    "branch": "__BRANCH__",
    "commit": "__COMMIT__",
    "installed": "__INSTALLED__",
    "uri": "https://github.com/tomeshnet/prototype-cjdns-pi/"
  },
  "org": "__ORG__",
  "services": {
    "ipfs": {
      "version": <version>,
      "ID": <id>
    },
    "cjdns": {
      "key": <publickey.k>,
      "address": <address>,
      "uri": "https://github.com/cjdelisle/cjdns"
    },
    "yggdrasil": {  // Can be obtained through `yggdrasilctl -v getSelf`
      "key": <publickey>,
      "address": <address>,
      "subnet": <subnet>, // ::/64
      "uri": "https://github.com/yggdrasil-network/yggdrasil-go"
    }
  }
}

This is basically our current tomesh spec, which is based on the CJDNS spec. I added some stuff from the original CJDNS spec, like contact info, that we removed for tomesh, but it doesn't really matter; it was just for show. The main thing is that I've listed CJDNS and Yggdrasil as services, instead of putting CJDNS stuff at the top level of the JSON.
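A minimal sketch of that layout in Python, assuming a hypothetical helper `build_nodeinfo` (not part of any existing tomesh script): each mesh protocol goes under "services", and nothing protocol-specific sits at the top level. The key and address values are placeholders, as in the spec above.

```python
import json

def build_nodeinfo(hostname, org, cjdns=None, yggdrasil=None):
    """Assemble a nodeinfo dict with each mesh protocol listed under "services"."""
    services = {}
    if cjdns is not None:
        # Copy the caller's fields and attach the reference URI from the spec.
        services["cjdns"] = dict(cjdns, uri="https://github.com/cjdelisle/cjdns")
    if yggdrasil is not None:
        services["yggdrasil"] = dict(
            yggdrasil, uri="https://github.com/yggdrasil-network/yggdrasil-go"
        )
    # "name" mirrors "hostname" so Yggdrasil tooling that reads "name" still works.
    return {"hostname": hostname, "name": hostname, "org": org, "services": services}

# Placeholder values only; real keys and addresses come from the node itself.
example = build_nodeinfo(
    "your.node",
    "__ORG__",
    cjdns={"key": "<publickey.k>", "address": "<address>"},
    yggdrasil={"key": "<publickey>", "address": "<address>", "subnet": "<subnet>"},
)
print(json.dumps(example, indent=2))
```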

Once this gets fixed, our full nodeinfo setup should be documented in the docs folder, with links to the CJDNS spec and everything.

@darkdrgn2k @benhylau @neilalexander @arceliar What do you guys think?

makew0rld avatar Feb 02 '19 00:02 makew0rld

So as you identified we use "name" instead of "hostname". We also have built-ins that are different from "software" in the form of "buildarch", "buildname", "buildplatform" and "buildversion":

{
  "nodeinfo": {
    "buildarch": "amd64",
    "buildname": "yggdrasil-develop",
    "buildplatform": "darwin",
    "buildversion": "0.3.2-0080",
    "name": "imac.y.neilalexander.eu"
  }
}

I don't particularly like the "services" part either because the URI bit is either difficult to specify or inapplicable for many protocols. Something closer to DNS-SD (RFC 6763) would be preferable, e.g.

"somehttpservice": ["_http", "_tcp", ttl, class, priority, weight, "1.2.3.4", 80]

... or similar, given that it makes it possible to either broadcast mDNS packets locally for those services or to populate SRV records in DNS zones.
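To illustrate the SRV side of that idea, here is a rough Python sketch that turns one such array into a zone-file style SRV line. The function name `to_srv_record`, the domain `example.mesh`, and the TTL, class, priority, and weight values are all assumptions made for the example; the field order simply follows the array above. Note that a real SRV record expects a hostname as its target, so the address from the example is passed through unchanged only for illustration.

```python
def to_srv_record(name, entry, domain="example.mesh"):
    """Format a service entry as a zone-file style SRV line.

    `entry` follows the array layout sketched in the thread:
    [service, proto, ttl, class, priority, weight, target, port]
    """
    service, proto, ttl, rr_class, priority, weight, target, port = entry
    # DNS-SD names an instance as <instance>.<service>.<proto>.<domain>.
    owner = f"{name}.{service}.{proto}.{domain}."
    # SRV RDATA order is: priority, weight, port, target.
    return f"{owner} {ttl} {rr_class} SRV {priority} {weight} {port} {target}"

# Hypothetical values for ttl, class, priority and weight.
entry = ["_http", "_tcp", 120, "IN", 0, 0, "1.2.3.4", 80]
record = to_srv_record("somehttpservice", entry)
print(record)
```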

I don't particularly like that the "pgp" section makes the assumption that a node will also have Internet access to reach SKS, because we have no guarantee of that, and also SKS is extremely broken anyway.

The "contact" section is probably fine, as long as we specify the key names and value formats, e.g. "email", "xmpp", "matrix".

"location" is another difficult one - in principle I am fine with the "longitude", "latitude", "altitude" approach but for those we should also have an "accuracy" to specify the order of magnitude accuracy for those who don't want to pin-point their exact node locations.
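One possible reading of "accuracy", sketched in Python: treat it as the number of decimal places kept in the coordinates, so a node can publish a deliberately coarse position. The helper name `coarsen_location` and this interpretation of the field are assumptions, not something the spec defines. For scale, 0.1 degree of latitude is roughly 11 km.

```python
def coarsen_location(latitude, longitude, accuracy):
    """Round coordinates to `accuracy` decimal places and record that accuracy."""
    return {
        "latitude": round(latitude, accuracy),
        "longitude": round(longitude, accuracy),
        "accuracy": accuracy,  # assumed meaning: decimal places retained
    }

# Example coordinates near Toronto, published to one decimal place only.
coarse = coarsen_location(43.6532, -79.3832, 1)
print(coarse)
```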

The address formatting stuff is also difficult because "municipality" might not apply in some places given different address formats across the world, and to be honest, may be altogether unnecessary anyway.

We also have a hard 16KB limit on the entire "NodeInfo" section once converted into JSON.
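A quick way to sanity-check a candidate nodeinfo against that limit, sketched in Python. This assumes 16KB means 16384 bytes and measures compact UTF-8 JSON; Yggdrasil's own serialization may produce a slightly different byte count, so treat the result as an estimate. The names `NODEINFO_LIMIT`, `nodeinfo_size`, and `fits_limit` are made up for this example.

```python
import json

NODEINFO_LIMIT = 16 * 1024  # assumed: 16 KB = 16384 bytes

def nodeinfo_size(nodeinfo):
    """Return the size in bytes of the nodeinfo as compact UTF-8 JSON."""
    return len(json.dumps(nodeinfo, separators=(",", ":")).encode("utf-8"))

def fits_limit(nodeinfo, limit=NODEINFO_LIMIT):
    """True if the serialized nodeinfo is within the size limit."""
    return nodeinfo_size(nodeinfo) <= limit

small = {"name": "your.node", "services": {"ipfs": {"version": "<version>"}}}
print(nodeinfo_size(small), fits_limit(small))
```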

neilalexander avatar Feb 02 '19 10:02 neilalexander

To address each of your concerns @neilalexander:

"software"

The software section is tomesh-specific; we use it to identify which version, branch, etc. of our stack our nodes are running. It's not related to the version of Yggdrasil or CJDNS, although maybe the versions of various services should be included, as is done with IPFS above.

"uri"

Having URIs for services was taken from the CJDNS spec, and I think it's a decent idea, especially because this should be human readable to some degree. If I see Yggdrasil listed as a service on a node but have no idea what it is, the URI helps me find it online. The URI is not intended to direct people to where the service actually runs on the node, or how to use it or something. It's just a reference for what the service even is.

"pgp"

I agree with this, I don't think a PGP section is necessary either, I just copied it from CJDNS. If anything, it can be under contact, at the node owner's discretion. It can be listed as an optional part of the spec, or not at all.

"location"

I should have mentioned: the lat and lon stuff is not in tomesh at the moment. Addressing is difficult, but specifying things like "municipality" makes it much more human readable. Longitude and latitude can cover any places that "municipality" won't, but an "accuracy" param is a good idea as well, as an optional param.

We also have a hard 16KB limit on the entire "NodeInfo" section once converted into JSON.

That should be ok, I think, I doubt we'll hit that.

makew0rld avatar Feb 02 '19 17:02 makew0rld

Isn't exposing too much information a possible security risk?

darkdrgn2k avatar Feb 02 '19 18:02 darkdrgn2k

@darkdrgn2k I mean it really depends how much we expose. Having a location setting is still a good idea, but it's always optional of course. Having basic services and software info is important I think though.

makew0rld avatar Feb 02 '19 18:02 makew0rld

There's no technical reason why we need to use "name" instead of "hostname" for Yggdrasil, that's just the convention we happened to take. We could probably check them both and use "hostname" for the map in the case where they're both set. So I'm OK with either, or maybe something like "alias", just whatever people think makes the most sense.
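The lookup described here could look something like the following Python sketch. The function name `display_name` is hypothetical and this is not the map's actual code; it only shows "hostname" taking precedence over "name" when both are set.

```python
def display_name(nodeinfo):
    """Pick a label for a node, preferring "hostname" over "name"."""
    for key in ("hostname", "name"):
        value = nodeinfo.get(key)
        # Ignore missing, empty, or non-string values.
        if isinstance(value, str) and value:
            return value
    return None  # the node published neither field

print(display_name({"hostname": "your.node", "name": "other.label"}))
print(display_name({"name": "imac.y.neilalexander.eu"}))
```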

Arceliar avatar Feb 06 '19 01:02 Arceliar

@Arceliar Personally, "hostname" makes more sense to me, so I would like it if Yggdrasil checked for both. What are your opinions on the spec overall though, and @neilalexander's comments and my reply?

makew0rld avatar Feb 06 '19 01:02 makew0rld

Just to be clear. What "nodeinfo.json" are we talking about

Are we talking about the nodeinfo that Yggdrasil returns or port nodeip:80/nodeinfo.json of the node as defined by CJD and implemented in or this one https://github.com/tomeshnet/prototype-cjdns-pi/issues/173

Also goes back to a security thing

How much information do we WANT to reveal about the nodes (both in nodexporter and in nodeinfo.json capacity)? One of the large attack surfaces in infosec is revealing too much about the devices (version numbers and what services are running).

I guess another question is how is this going to be used? To crawl the entire network looking for ipfs servers to bootstrap to?

darkdrgn2k avatar Feb 06 '19 02:02 darkdrgn2k

Just to be clear. What "nodeinfo.json" are we talking about

Both, they serve the same purpose.

How much information do we WANT to reveal about the nodes

Since this is a prototype, and we have a firewall, I feel okay about exposing these things. I think having the data is important. If you have issues with specific parameters we can talk about them.

I guess another question is how is this going to be used? To crawl the entire network looking for ipfs servers to bootstrap to?

I think having the information is cool and important, to help people understand the nodes on the network. It also helps for our own tomesh surveying. As for IPFS ID, that is used to peer with peers already.

makew0rld avatar Feb 06 '19 02:02 makew0rld

Both, they serve the same purpose.

Do they? I'd like to hear from @neilalexander and @Arceliar what purpose they intended their "nodeinfo.json" to have.

Since this is a prototype, and we have a firewall, I feel okay about exposing these things. I think having the data is important. If you have issues with specific parameters we can talk about them.

A firewall is not some magical beast that fixes all problems. As an example of how too much info could be bad: NodeExporter returns the up-to-the-second core temperature of the CPU. There was a paper that talked about how recording these fluctuations in temperature can help one figure out what instructions the CPU was running.

As for IPFS ID, that is used to peer with peers already.

I'm not happy about this solution, but there was not much else that could be done. My question is: are you going to start iterating through the WHOLE mesh looking for IPFS servers?

https://github.com/tomeshnet/prototype-cjdns-pi/blob/develop/scripts/ipfs/ipfs-swarm.sh#L34

And how smart is that at scale?

So my point is:

WHAT do we want nodeinfo.json to actually do (what problem does it solve)?

darkdrgn2k avatar Feb 06 '19 02:02 darkdrgn2k

The idea of NodeInfo in Yggdrasil is for people to be able to publish public metadata. We designed it to be entirely voluntary, and it is also acceptable for a node to publish no metadata whatsoever. Anything that gets published into the Yggdrasil NodeInfo is public to the entire network, and can easily be found by crawling the DHT (which is exactly what @Arceliar's map and @yakamok's API do).

I do see some merit in agreeing on some common-sense standards for some of the more common entries, but I am very cautious of over-specifying this - we have it as free-form JSON for a reason. I think I speak for both myself and @Arceliar when I say that we don't really want the responsibility of us telling you what that should or shouldn't be.

Similarly I don't think we should be overloading it with information either. If there is specific information that you anticipate every one of your nodes advertising through NodeInfo, then sure - come up with a standard and stick with it, but it's difficult to encourage everyone else on the Yggdrasil Network to do the same.

We typically publish Yggdrasil version numbers, build names and platforms ("buildname", "buildversion", "buildarch" and "buildplatform") as built-ins because it helps us to understand whether users running old versions/not upgrading may impact any future changes we make, or whether we should be focusing on specific platforms that are rising in popularity. We also encourage the "name" field because there is an element of convenience there in showing names on the map without people having to submit their data to a nodelist repository (like we did before).
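As a rough illustration of that kind of analysis, the Python sketch below tallies nodes by platform and version from a list of crawled NodeInfo dicts. The function name `tally_builds` and the sample data are hypothetical (only the first entry reuses values from earlier in this thread), and this is not how the map or the API actually process their data.

```python
from collections import Counter

def tally_builds(nodeinfos):
    """Count nodes per (buildplatform, buildversion) pair."""
    return Counter(
        (info.get("buildplatform", "unknown"), info.get("buildversion", "unknown"))
        for info in nodeinfos
    )

# Hypothetical crawl results; only the first entry comes from this thread.
crawled = [
    {"buildplatform": "darwin", "buildversion": "0.3.2-0080",
     "name": "imac.y.neilalexander.eu"},
    {"buildplatform": "linux", "buildversion": "0.3.2"},
    {"buildplatform": "linux", "buildversion": "0.3.2"},
    {},  # a node that publishes no metadata at all
]

builds = tally_builds(crawled)
for (platform, version), count in builds.most_common():
    print(platform, version, count)
```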

We've looked into the idea of publishing services so that we can dynamically generate a public services list instead of curating it by hand, but so far we don't have a spec for that. It's just something that I tinkered with, and I did it using DNS-SD because there's a nice element of "zero configuration" to that from a user perspective.

Something else to consider is that Yggdrasil NodeInfo is a protocol-level feature and won't be filtered by firewalls, either at operating system-level or the built-in session firewall. You should be sure that when you publish something, you are happy for it to be public and potentially indexed forever, because it probably will be by someone out there.

neilalexander avatar Feb 06 '19 10:02 neilalexander

I think maybe I over-specified above. But for tomesh at least, "hostname", "services", "org", and "software" should all be there.

makew0rld avatar Feb 07 '19 02:02 makew0rld