activitystreams
Activity Streams should allow stating that activities should not be tracked (robots.txt)
Please Indicate One:
- [ ] Editorial
- [ ] Question
- [ ] Feedback
- [ ] Blocking Issue
- [X] Non-Blocking Issue
Please Describe the Issue:
Mastodon implemented a feature that sets the robots meta tag on HTML representations of objects (https://github.com/tootsuite/mastodon/issues/1599). That controls the behavior of robots on the Web.
However, Mastodon is also an ActivityPub application, and robots could exist in the federation as well. Those bots have no way to learn of that intent.
To solve this, Activity Streams should allow stating that activities should not be tracked by robots. My suggestion is to extend the Activity Vocabulary by adding a robots property to objects. The value could be the same as, or similar to, the content of the HTML robots meta tag.
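As a rough illustration of the suggestion, an object carrying such a property might look like the sketch below. The `robots` property and the `make_note` helper are hypothetical; nothing here is defined by the Activity Vocabulary today.

```python
import json

# The normative Activity Streams 2.0 JSON-LD context.
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_note(content: str, robots: str) -> dict:
    """Build an AS2 Note that carries a hypothetical `robots` property."""
    return {
        "@context": AS2_CONTEXT,
        "type": "Note",
        "content": content,
        # Hypothetical property: mirrors the value of an HTML robots meta tag.
        "robots": robots,
    }


note = make_note("Hello, fediverse", "noindex")
print(json.dumps(note, indent=2))
```

A robot in the federation could then read `note["robots"]` the same way a Web crawler reads the meta tag.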
This has been discussed before in the ActivityPub issue tracker. I believe https://github.com/w3c/activitypub/issues/221#issuecomment-300205759 represents the consensus of the working group, although it could be just Evan. Either way I suspect his answer will be identical here.
Sorry, I missed that issue. That is exactly the problem I want to address. However, I have some arguments for this idea over using audience, and because of them I thought Activity Streams rather than ActivityPub should be extended, so I opened this issue.
- `audience` cannot represent the partial restrictions of the `robots` meta tag and `robots.txt`.
The standard shows the following restrictions:
- `noindex` in the `meta` tag: the page should not be indexed.
- `nofollow` in the `meta` tag: the links in the page should not be followed.
- `Disallow` in `robots.txt`: the content of the page should not be scraped.
These are different restrictions, and a page administrator can express partial restrictions by choosing which directives to include in the meta tag or robots.txt.
For example, specifying only noindex means robots may still follow links in the page. That is exactly what Mastodon does (see https://github.com/tootsuite/mastodon/pull/4199).
In such cases, robots are still in the audience of the page.
- Compatibility with the `robots` meta tag
We can have better compatibility by having a robots property whose content is similar to that of the robots meta tag. Compatibility matters because Activity Streams applications are often Web applications as well.
- `robots` is suited for the standard, while `audience` is more dependent on implementations.
Activity Streams does not define the content of audience, so its meaning depends more on implementations. A robots property, however, could be standardized, since robots.txt is already a de facto standard.
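To make the point about partial restrictions concrete, here is a minimal sketch of how a consumer might interpret a value in the style of the robots meta tag. The function names are hypothetical, and the comma-separated format is an assumption borrowed from the meta tag; only the `noindex` and `nofollow` directives discussed above are handled.

```python
def parse_robots(value: str) -> frozenset:
    """Split a value such as "noindex, nofollow" into lowercase directives."""
    return frozenset(
        part.strip().lower() for part in value.split(",") if part.strip()
    )


def may_index(directives: frozenset) -> bool:
    """Indexing is allowed unless `noindex` is present."""
    return "noindex" not in directives


def may_follow(directives: frozenset) -> bool:
    """Following links is allowed unless `nofollow` is present."""
    return "nofollow" not in directives


# Only `noindex`: the object must not be indexed, but links may be followed.
directives = parse_robots("noindex")
print(may_index(directives), may_follow(directives))  # prints: False True
```

A single yes/no membership in audience has no way to express the `False True` case, which is the partial restriction Mastodon uses.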
This is a cool idea, but I don't know if it should be a long-standing open issue here.
If I were you and still needed this, I'd write a short document explaining it (copy-paste?) and host it at https://mastodon.social/activitystreams-extensions/robots .
Anyone can then add 'robots' to their JSON objects by defining it in the @context.
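A minimal sketch of that approach, assuming the term is mapped to the URL suggested above (which is only a proposal and may not resolve to a real document). The `with_robots` helper is hypothetical, and it assumes the input object uses nothing but the standard AS2 context, since it overwrites any existing `@context`.

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"
# Hypothetical term IRI, taken from the hosting suggestion in this thread.
ROBOTS_TERM_IRI = "https://mastodon.social/activitystreams-extensions/robots"


def with_robots(obj: dict, directives: str) -> dict:
    """Return a copy of `obj` with `robots` set and defined in `@context`."""
    extended = dict(obj)  # shallow copy so the caller's object is untouched
    # The AS2 context comes first, then an inline context defining the new term.
    extended["@context"] = [AS2_CONTEXT, {"robots": ROBOTS_TERM_IRI}]
    extended["robots"] = directives
    return extended


original = {"type": "Note", "content": "Hello, fediverse"}
note = with_robots(original, "noindex")
```

Consumers that do not know the extension can ignore the extra property, while JSON-LD-aware ones can expand `robots` to the full IRI.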
This is an interesting idea. It's also an area of a lot of conversation in the fediverse. It's not currently part of AS2, so it would need to be an extension. That's something well-documented in the AS2 core document:
https://www.w3.org/TR/activitystreams-core/#extensibility
We do have a list of well-known extensions, so if this is widely used, we should probably include it.
For now, I'm going to close this issue, with the recommendation that a new extension vocabulary be added.