Add metadata to indicate if an app uses a proprietary web service

Open pwithnall opened this issue 3 years ago • 7 comments

There are a number of desktop apps which exist as clients for proprietary web services, and hence aren’t useful without those services. That might not be obvious to users before they install the apps, though. I think it would be useful if appstream had an element which could express that an app is significantly tied to a web service, and to express what the license of that web service is.

This could allow gnome-software to give users more notice that by using certain apps, they are potentially tying themselves into a proprietary web service which they can’t contribute to / export their data from / run their own instance of / etc. On the flipside, we could promote apps which integrate with FOSS web services where the user can do those things if they want.

I’m not sure of the best markup to recommend putting in the specification, as this is a bit of an odd crossover of <requires>/<recommends> and <project_license>.

How about a <web_service> element which goes within <requires>/<recommends>, which specifies:

  • A human-readable name for the web service
  • A URI for its home page
  • A URI for its privacy policy/license
  • A URI for downloading/installing/self-hosting it
  • Its license

Example markup:

<requires>
  <web_service name="Dropbox" license="Proprietary">
    <url type="homepage">https://dropbox.com</url>
    <url type="privacy_policy">https://dropbox.com/privacy</url>
  </web_service> 
</requires>
<requires>
  <web_service name="Nextcloud" license="AGPL-3.0-or-later">
    <url type="signup">https://nextcloud.com/signup/</url>
    <url type="download">https://nextcloud.com/install/</url>
  </web_service>
</requires>

Should the name be localisable? I’m not 100% sure about the URL types for self-hosted applications, but they might fit with some work.

/cc @wjt, because you’re interested in this kind of thingy

pwithnall, Aug 10 '21 08:08

In general, I see that something like this would make sense. It is not what requires/recommends is traditionally used for, but it feels logical to put something like this into a requires block - possibly generalized as remote_service, but not sure if that makes sense. With the amount of detail packed into this tag however, this will make it really hard to implement with the current API libappstream uses for relation elements - but that's something to be dealt with later :-)

The name would have to be localizable, and for things like licenses, AppStream tries to reuse the same tag for the same thing, so the license would get its own tag. So it may look like this:

<requires>
  <web_service>
    <name>Nextcloud</name>
    <project_license>AGPL-3.0-or-later</project_license>
    <url type="signup">https://nextcloud.com/signup/</url>
    <url type="download">https://nextcloud.com/install/</url>
  </web_service>
</requires>
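
To make the localisation point concrete, here is a sketch of how translated names could be attached. It assumes the hypothetical web_service element from this proposal and reuses the xml:lang convention that metainfo files already apply to translated name elements. The service name, the German string, and the URL are made up for illustration.

```xml
<requires>
  <!-- web_service is the element proposed in this issue, not part of AppStream -->
  <web_service>
    <name>Example Cloud</name>
    <!-- translated variant, marked with xml:lang like other metainfo names -->
    <name xml:lang="de">Beispiel-Cloud</name>
    <project_license>AGPL-3.0-or-later</project_license>
    <url type="homepage">https://example.com</url>
  </web_service>
</requires>
```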

However, I wonder if this is truly the right approach, because implementing something like this duplicates so much of what a component already defines in AppStream - in the long run, people may even want to add the GDPR agreement block for the web service to this...

AppStream already defines a web-application component type for precisely web apps themselves. The intent for this component is primarily progressive webapps and similar stuff, but I wonder whether an application that is a client to a web service could just define a matching web-application component and then reference that as requirement in its requires block (a feature we already support).

This will need some policy too, as currently this would mean there couldn't be two clients for the same webapps, as e.g. two nextcloud clients would both install a com.nextcloud.metainfo.xml file into /usr/share/metainfo, leading to a file conflict. This could be resolved by e.g. referencing the metainfo file of the web application via a web URL (so the webapp would need to host it on its server somehow, which does make some sense to me - with the drawback of a software center potentially pinging an external service if the project's page is viewed, which may be a privacy concern), allowing multiple components to be defined in special metainfo files (not a fan of this), installing the webapp file into a different location, giving it a unique name per app, or something else.
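
A rough sketch of the web-application idea, assuming the service gets its own component and the client refers to it by ID. The ID com.nextcloud follows the file name mentioned above. The client ID org.example.NextcloudClient, both summaries, the client license, and the stock icon are placeholders. Other tags a real metainfo file would carry are omitted, and the sketch does not settle where the first file would be installed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- com.nextcloud.metainfo.xml: describes the web service itself -->
<component type="web-application">
  <id>com.nextcloud</id>
  <metadata_license>FSFAP</metadata_license>
  <project_license>AGPL-3.0-or-later</project_license>
  <name>Nextcloud</name>
  <!-- placeholder summary, not taken from the project -->
  <summary>Self-hosted file sync and collaboration service</summary>
  <!-- placeholder icon; the licensing of real service icons is an open question -->
  <icon type="stock">applications-internet</icon>
  <categories>
    <category>Network</category>
  </categories>
  <launchable type="url">https://nextcloud.com</launchable>
</component>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- org.example.NextcloudClient.metainfo.xml: a hypothetical desktop client -->
<component type="desktop-application">
  <id>org.example.NextcloudClient</id>
  <metadata_license>FSFAP</metadata_license>
  <project_license>GPL-2.0-or-later</project_license>
  <name>Example Nextcloud Client</name>
  <summary>Sync files with a Nextcloud server</summary>
  <requires>
    <!-- existing relation item: depend on another component by its ID -->
    <id>com.nextcloud</id>
  </requires>
</component>
```

A software center could then resolve com.nextcloud, read its project_license, and decide what to tell the user, without any new relation type.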

I'll have to think about this some more, but at least that's my very early thoughts about this. Feedback/opinions on this are very welcome!

ximion, Aug 11 '21 00:08

AppStream already defines a web-application component type for precisely web apps themselves. The intent for this component is primarily progressive webapps and similar stuff, but I wonder whether an application that is a client to a web service could just define a matching web-application component and then reference that as requirement in its requires block (a feature we already support).

That sounds excellent. Much better than my suggestion.

This will need some policy too, as currently this would mean there couldn't be two clients for the same webapps, as e.g. two nextcloud clients would both install a com.nextcloud.metainfo.xml file into /usr/share/metainfo, leading to a file conflict.

One suggestion for handling this, at least to get things started, would be to install them with libappstream itself. There are unlikely to be many webapps people want to describe, at least to begin with, so storing the metainfo for them in one centralised repository doesn’t seem too bad. This could be changed in future if it becomes burdensome.

This could be resolved by e.g. referencing the metainfo file of the web application via a web URL (so the webapp would need to host it on its server somehow, which does make some sense to me - with the drawback of a software center potentially pinging an external service if the project's page is viewed, which may be a privacy concern),

Pinging an external service would be quite a privacy concern, and would also slow down displaying the details of an app in gnome-software (and it would break if the user was offline).

Additionally, this could never work for proprietary services who couldn’t be made to care about metainfo files. Any of the Google services, for example (Drive, YouTube, etc.), Spotify, etc. Those are services which are likely to have desktop clients.

allowing multiple components to be defined in special metainfo files (not a fan of this),

That feels like it would require a lot of changes across a lot of metainfo parsing code.

installing the webapp file into a different location, giving it a unique name per app, or something else.

Giving it a unique name per app could work, but feels like a bit of a workaround (and one which isn’t easily undone if we think of something better in future).

pwithnall, Sep 08 '21 12:09

One suggestion for handling this, at least to get things started, would be to install them with libappstream itself. There are unlikely to be many webapps people want to describe, at least to begin with, so storing the metainfo for them in one centralised repository doesn’t seem too bad. This could be changed in future if it becomes burdensome.

It seems I actually already have a /usr/share/app-info/xmls/webapps.xml file on my system, which comes from the appstream-data package. So perhaps that’s the solution.

pwithnall, Sep 08 '21 13:09

It seems I actually already have a /usr/share/app-info/xmls/webapps.xml file on my system, which comes from the appstream-data package. So perhaps that’s the solution.

That file is Fedora-specific; Debian has something similar with different apps (only free-software ones). So people cannot rely on these to be present or even to have the same name...

AppStream already defines a web-application component type for precisely web apps themselves. The intent for this component is primarily progressive webapps and similar stuff, but I wonder whether an application that is a client to a web service could just define a matching web-application component and then reference that as requirement in its requires block (a feature we already support).

That sounds excellent. Much better than my suggestion.

This will need some policy too, as currently this would mean there couldn't be two clients for the same webapps, as e.g. two nextcloud clients would both install a com.nextcloud.metainfo.xml file into /usr/share/metainfo, leading to a file conflict.

One suggestion for handling this, at least to get things started, would be to install them with libappstream itself. There are unlikely to be many webapps people want to describe, at least to begin with, so storing the metainfo for them in one centralised repository doesn’t seem too bad. This could be changed in future if it becomes burdensome.

Problem with that is twofold: We would need to ship the service icons with this, which for certain won't all be under licenses that permit free or combined redistribution. Defining icons as "remote" is an escape hatch, but then we will ping an external service again. In order for this webapp registry to be useful, we would need to make distributions actually ship the stuff, which means making the data freely distributable at least. On the Debian side, there will also be a debate whether this will be advertising of non-free services, depending on how software centers choose to display all the new webapps, but we can ignore that issue at least initially, I think.

Second issue is that AppStream was designed to be decentralized/distributed, so we would not have a central registry that people have to go to in order to make apps known, saving a lot of time that I or a team would otherwise have to spend on name-gatekeeping, so creating one for webapps would kind of go contrary to that goal. Interestingly though, while the design of AppStream allows people to do their own thing, this actually often leads to worse metadata and duplicate effort, especially in the realm of tagging firmware and drivers. Since webapps will indeed be limited in scope (I think/hope), it may actually not be unreasonable to expect people to file a PR on a repository in GitHub for a new webapp (I may move all the AppStream stuff to a new organization on GitHub and create an extra repo for that). This will also allow us to centrally translate all the things...

I am really unsure about the legal stuff though, especially on any graphics (webapps have to have at least an icon, for good reason).

This could be resolved by e.g. referencing the metainfo file of the web application via a web URL (so the webapp would need to host it on its server somehow, which does make some sense to me - with the drawback of a software center potentially pinging an external service if the project's page is viewed, which may be a privacy concern),

Pinging an external service would be quite a privacy concern, and would also slow down displaying the details of an app in gnome-software (and it would break if the user was offline).

Indeed. Usually AppStream metadata is processed through a tool like appstream-generator, which will cache data (so Debian's software center will only ping appstream.debian.org) and which could embed the referenced metadata. But things like Flathub don't do that, and this introduces a pretty good chance that the software center makes a user's existence known to an unwanted service. (we will run into that issue as well with the intended release data split for AppStream 1.0, and have it already for remote icons...)

Additionally, this could never work for proprietary services who couldn’t be made to care about metainfo files. Any of the Google services, for example (Drive, YouTube, etc.), Spotify, etc. Those are services which are likely to have desktop clients.

True - they could reference a file hosted somewhere on GitHub or a different random location though.

allowing multiple components to be defined in special metainfo files (not a fan of this),

That feels like it would require a lot of changes across a lot of metainfo parsing code.

Ideally only in libappstream, but I'm not a fan of a "screw you for the audacity to parse AppStream data yourself!" attitude - I'd rather not break parsing at all, unless there is an extremely good reason to do so (it will not only be a pain for external parsers but also for older versions of libappstream). TBH, I don't think this is a good reason for such an invasive change (it will break all assumptions made about metainfo files).

installing the webapp file into a different location, giving it a unique name per app, or something else.

Giving it a unique name per app could work, but feels like a bit of a workaround (and one which isn’t easily undone if we think of something better in future).

Yes - the central registry approach is a bit easier to undo once we have a better idea, but it'll still be a pain. At the moment I can't think of anything better though, if the legal question can be answered and if we can get a distribution like Debian to distribute the data as well in the main repository, so apps and software centers can rely on the component IDs to actually be there (I don't see an issue with that, but if we go that route I'd like to get feedback from others as well, ideally without causing a thousand-replies-long flamewar ^^).

ximion, Sep 08 '21 14:09

It seems I actually already have a /usr/share/app-info/xmls/webapps.xml file on my system, which comes from the appstream-data package. So perhaps that’s the solution.

That file is Fedora-specific; Debian has something similar with different apps (only free-software ones). So people cannot rely on these to be present or even to have the same name...

I guess that’s not the solution then.

Problem with that is twofold: We would need to ship the service icons with this, which for certain won't all be under licenses that permit free or combined redistribution. Defining icons as "remote" is an escape hatch, but then we will ping an external service again. In order for this webapp registry to be useful, we would need to make distributions actually ship the stuff, which means making the data freely distributable at least. On the Debian side, there will also be a debate whether this will be advertising of non-free services, depending on how software centers choose to display all the new webapps, but we can ignore that issue at least initially, I think.

At least from the gnome-software side, I think we could quite happily not display an icon for web services when they’re being listed as a dependency of a desktop app. The UI I was thinking about would be something integrated into the context bar on the details page in gnome-software. (There’s a screenshot of it here.) It would likely just be some text saying “This app requires you to use Google Drive, which is proprietary” or words to that effect.

gnome-software also doesn’t currently list webapps in its UI at all. They aren’t listed in categories and don’t turn up in search results. So adding more web app metainfos won’t suddenly fill gnome-software up with new ‘apps’.

That doesn’t solve the icon issue, but does mean it’s a bit less important to solve. We could, for example, point all these web apps at some generic web app icon in order to fulfil the requirement that a webapp has an icon and it wouldn’t affect the user experience.
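
As a sketch of the placeholder approach, a web service entry could point at a generic themed icon instead of the service's own logo. The component ID, name, and URL below are hypothetical. applications-internet is just one plausible generic name from the freedesktop icon naming specification. The remote icon appears only as a commented-out alternative, since it brings back the external request discussed earlier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<component type="web-application">
  <!-- hypothetical ID for a proprietary service -->
  <id>com.example.drive</id>
  <metadata_license>FSFAP</metadata_license>
  <name>Example Drive</name>
  <summary>Placeholder entry for a web service</summary>
  <!-- generic icon from the icon theme: nothing to license or download -->
  <icon type="stock">applications-internet</icon>
  <!-- alternative with the privacy cost of a network fetch:
       <icon type="remote" width="64" height="64">https://example.com/icon.png</icon> -->
  <categories>
    <category>Network</category>
  </categories>
  <launchable type="url">https://example.com</launchable>
</component>
```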

I don’t know what impact this would have on other consumers of appstream data, though. I can only speak from the gnome-software point of view.

Since webapps will indeed be limited in scope (I think/hope), it may actually not be unreasonable to expect people to file a PR on a repository in GitHub for a new webapp (I may move all the AppStream stuff to a new organization on GitHub and create an extra repo for that). This will also allow us to centrally translate all the things...

:+1:

Giving it a unique name per app could work, but feels like a bit of a workaround (and one which isn’t easily undone if we think of something better in future).

Yes - the central registry approach is a bit easier to undo once we have a better idea, but it'll still be a pain. At the moment I can't think of anything better though, if the legal question can be answered and if we can get a distribution like Debian to distribute the data as well in the main repository, so apps and software centers can rely on the component IDs to actually be there (I don't see an issue with that, but if we go that route I'd like to get feedback from others as well, ideally without causing a thousand-replies-long flamewar ^^).

Sure, feedback from distro people would be helpful. Is there a venue where this kind of stuff has been discussed before for previous appstream questions?

pwithnall, Sep 09 '21 10:09

/cc @wjt, because you’re interested in this kind of thingy

Looking just at the apps I personally use (which is not representative etc) it's rather obvious already which ones require proprietary online services because they have the same name as that online service, and (with few exceptions) are themselves non-free. It's not clear to me that defining each web service and then the many-to-many relationship between desktop app and web service would buy very much.

A privacy policy URL type might be interesting though!
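
For illustration, such a URL could sit next to the existing URL entries of a component. The type name privacy_policy is borrowed from the example markup at the top of this issue and should be treated as hypothetical rather than as an established AppStream URL type. The component ID and URLs are placeholders.

```xml
<component type="desktop-application">
  <id>org.example.App</id>
  <url type="homepage">https://example.org</url>
  <!-- hypothetical URL type, as suggested in this issue -->
  <url type="privacy_policy">https://example.org/privacy</url>
</component>
```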

wjt, Sep 27 '21 21:09

FWIW, a few examples that I know of:

Dialect and Mousai could be interesting from a privacy perspective because they send user text or audio input to web services.

sophie-h, Sep 27 '21 22:09