Defining property "hasSourceCode" to link SoftwareApplication and SourceCode

Open dgarijo opened this issue 5 years ago • 20 comments

As we discussed in #198 (and the discussions in the Scientific Software Registry Collaboration Workshop), a link between SoftwareApplication and SoftwareSourceCode is needed for several reasons:

  1. Potential adopters from software registries would like to use Codemeta, and their starting point is a software entry (SoftwareApplication) rather than source code.
  2. Right now the codemeta spec mixes properties from SoftwareSourceCode and SoftwareApplication. I have seen potential adopters confused about the right use of these properties. Making this link explicit helps separate SoftwareApplication from SoftwareSourceCode.

In this pull request I have:

  • Extended the properties_description.csv with a definition of the relationship proposed.
  • Modified codemeta.jsonld with the appropriate domain/range for the property.
  • Added an example (in examples/codemeta-software.json) on how to use the property using a SoftwareApplication and Codemeta terms. The example has been validated with https://json-ld.org/playground/

Note: I haven't added an extra line to each of the codemeta mappings (it's the inverse of targetProduct), but will be happy to do so.

dgarijo avatar Nov 30 '19 05:11 dgarijo

I agree that the usage of both classes might be confusing. On the other hand, I also find it confusing to have both the codeRepository and hasSourceCode terms. Even if hasSourceCode is more generic, I would like a more informative distinction between the two.

moranegg avatar Dec 04 '19 02:12 moranegg

@moranegg, I think it is ok to have both terms, as long as they are well defined and with examples. However, I am open to discussion.

'schema:codeRepository' is directly from schema.org, so I don't think we can change it. 'hasSourceCode' is the property I propose to link a schema:SoftwareApplication to a schema:SoftwareSourceCode object (inverse of schema:targetProduct).
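To make "inverse of schema:targetProduct" concrete, here is a minimal sketch in plain Python. The record, the example.org identifiers, and the helper name invert_target_product are all made up for illustration; only the property and class names come from this thread and schema.org.

```python
import json

# Illustrative record: source code that points at its application
# through schema:targetProduct. All identifiers are made up.
source_code = {
    "@type": "SoftwareSourceCode",
    "@id": "https://example.org/code/my-tool",
    "programmingLanguage": "Python",
    "targetProduct": {
        "@type": "SoftwareApplication",
        "@id": "https://example.org/apps/my-tool",
        "name": "My Tool",
    },
}


def invert_target_product(code):
    """Return the application-centred view of a code-centred record.

    The application found under targetProduct becomes the root, and the
    rest of the source code description is attached via hasSourceCode.
    (Hypothetical helper, not part of codemeta.)
    """
    application = dict(code["targetProduct"])
    code_only = {k: v for k, v in code.items() if k != "targetProduct"}
    application["hasSourceCode"] = code_only
    return application


application = invert_target_product(source_code)
print(json.dumps(application, indent=2))
```

Both records state the same relationship; the only difference is which object sits at the root.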

Do you have another proposal that better illustrates the term? I don't mind if we call it differently, as long as there is a term to link schema:SoftwareApplication to schema:SoftwareSourceCode.

dgarijo avatar Dec 04 '19 02:12 dgarijo

I had a discussion with @dgarijo at the above meeting about whether it made more sense for codemeta to have SoftwareSourceCode as the base with a SoftwareApplication targetProduct or the inverse, codemeta as a SoftwareApplication with a SoftwareSourceCode hasSourceCode.

From our perspective as maintainers of a digital repository that wants to capture source code as well as accompanying descriptive metadata, documentation, and dependencies, we lean more towards codemeta as SoftwareSourceCode with a SoftwareApplication targetProduct, but obviously it could go both ways.

alee avatar Dec 04 '19 05:12 alee

@dgarijo @alee this is a very interesting discussion! I looked at the added CodeMeta terms; all of them are added to the source code class, codemeta:SoftwareSourceCode.

As I understand it, using CodeMeta and placing a codemeta.json file in the code implicitly says that CodeMeta is about source code, which can be compiled and executed and so can refer to a software product.

We should try to detail in #198 the use cases for hasSourceCode where you don't use codeRepository. I would love to see your use cases, so that we can better document the creation of a new property.

We are also working in Force11's SCIWG CodeMeta task force on aligning the CodeMeta terms with schema.org, so maybe this term should be handled similarly. See the SCIWG issue on this.

moranegg avatar Dec 04 '19 10:12 moranegg

@moranegg, @alee thanks for your answers. I am a little confused, @moranegg. Codemeta is already aligned to schema.org. Why does it need to be realigned?

Codemeta uses codemeta:SoftwareSourceCode, which is an extension of schema:SoftwareSourceCode, to add new properties, and that makes sense to me. What I think is confusing (and my impression is that other people find it confusing too) is that codemeta also uses SoftwareApplication properties, which do not describe code, but that distinction is not clear in the spec and examples. For example, things like "memoryRequirements", "operatingSystem", "downloadUrl", "installUrl", etc. describe software applications, not code. This is not a problem of codemeta itself; it's a problem of the documentation.
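As a rough illustration of that distinction, the sketch below splits a flat, mixed record into an application part and a code part. It is plain Python, the property set contains only the four terms named above (in their schema.org spellings), and sending every other property to the code side is a simplifying assumption, not a rule from the spec. The helper name split_record and the sample values are made up.

```python
# Properties named above as describing an application rather than code
# (schema.org spellings). Illustrative, not exhaustive.
APPLICATION_PROPERTIES = {
    "memoryRequirements",
    "operatingSystem",
    "downloadUrl",
    "installUrl",
}


def split_record(flat):
    """Split a flat, mixed record into an application with nested code.

    Assumption for this sketch: anything not listed in
    APPLICATION_PROPERTIES is treated as describing the source code.
    """
    application = {"@type": "SoftwareApplication"}
    code = {"@type": "SoftwareSourceCode"}
    for key, value in flat.items():
        if key == "@type":
            continue  # each part gets its own type above
        target = application if key in APPLICATION_PROPERTIES else code
        target[key] = value
    application["hasSourceCode"] = code
    return application


# Made-up record that mixes both kinds of property.
mixed = {
    "@type": "SoftwareSourceCode",
    "name": "my-tool",
    "programmingLanguage": "Python",
    "operatingSystem": "Linux",
    "memoryRequirements": "512 MB",
}
separated = split_record(mixed)
```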

I think Codemeta should be able to describe both SoftwareApplications and SoftwareSourceCode, I am very happy with its scope. From my point of view, I have a software registry where I describe SoftwareApplications, and I want to point to the SourceCode used for it. It makes sense to use hasSourceCode and codeRepository together, because hasSourceCode points to the identifier of the code and describes metadata such as the language it's written in, while codeRepository points to the code repository where the code lives (github, or my server). I can only think of a case where you would use hasSourceCode but not codeRepository, which is when you want to state that an application is written in a programming language, but you don't have the pointer to the source code.
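A small sketch of the two situations, in plain Python. The helper describe_application, its parameters, and the example.org URLs are hypothetical. Placing codeRepository on the SoftwareSourceCode object follows schema.org, where it is a property of that class; the rest is an assumption about how a registry entry might look.

```python
def describe_application(name, language, code_id=None, repository=None):
    """Build a registry entry that links an application to its code.

    code_id and repository are optional, to cover the case where only
    the programming language is known. (Hypothetical helper.)
    """
    code = {"@type": "SoftwareSourceCode", "programmingLanguage": language}
    if code_id is not None:
        code["@id"] = code_id
    if repository is not None:
        code["codeRepository"] = repository
    return {
        "@type": "SoftwareApplication",
        "name": name,
        "hasSourceCode": code,
    }


# hasSourceCode and codeRepository used together.
with_repo = describe_application(
    "My Tool",
    "Python",
    code_id="https://example.org/code/my-tool",
    repository="https://example.org/git/my-tool",
)

# Only the language is known; there is no pointer to the code itself.
language_only = describe_application("Other Tool", "Fortran")
```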

dgarijo avatar Dec 04 '19 21:12 dgarijo

The realignment is to see if any of the extension terms can fit under newer releases of schema.org. For example, if we can replace codemeta:SoftwareSourceCode,funding with schema:SoftwareSourceCode,funding as in #160 we'd make codemeta even more sustainable.

Adding codemeta:SoftwareApplication,hasSourceCode makes a lot of sense to me, as do the documentation changes. We can then try to get it adopted by schema.org.

tmorrell avatar Dec 04 '19 21:12 tmorrell

Ah, I understand now. Thanks! Please let me know if you need some other changes to this PR.

dgarijo avatar Dec 04 '19 21:12 dgarijo

@mbjones Any objections to merging? It looks good to me, though of course, if we adopt the proposal that codemeta inherits/extends schema.org instead of subsetting it, we get this for free...

https://github.com/codemeta/codemeta/pull/229/files#diff-677f920ff0ca91c4f00013b699382a47R79

As this change impacts the JSON-LD context file, we will want to think about plans for cutting a new release and a new DOI (though that might await some possible further changes, e.g. re schema:Grant and other terms in schema.org that might also be declared explicitly into codemeta).

cboettig avatar Dec 08 '19 21:12 cboettig

I noted a syntax error above. Other than fixing that up, I think the general idea of hasSourceCode and the specific implementation and definition are all great. So, let's get the context file to validate and then we can merge.

mbjones avatar Dec 17 '19 03:12 mbjones

I validated it: https://json-ld.org/playground/#startTab=tab-expanded&json-ld=https%3A%2F%2Fraw.githubusercontent.com%2Fcodemeta%2Fcodemeta%2Fc6bf60ca58418b7e8fe56b68165b287aebf64822%2Fcodemeta.jsonld I wrongly added the "schema" prefix, but since it's a new property it should be from codemeta. Sorry about that.

dgarijo avatar Dec 17 '19 03:12 dgarijo

@cboettig Given that this is a schema change, we should probably merge into a branch for the time being as the master branch is used for validation in the tests.  So, if we create a branch for the next schema release version and merge this PR there, all should be ok until we are ready to cut a new release.

mbjones avatar Dec 18 '19 08:12 mbjones

I agree with that. The master branch is used in many projects just as something to be synchronized with the latest release. You could have a "develop" branch and suggest contributions there for testing.

dgarijo avatar Dec 18 '19 09:12 dgarijo

Sounds good. Looking into why the tests in Travis are failing, I think the PR has a mismatch between some of the documentation files and the new changes in the jsonld file. I fixed the test issues on master, so now it looks like just a problem in the PR with the new property not matching. You can see the error by running python3 scripts/aggregate.py from the root of the branch directory, which gives the error:

Error in codemeta-V1.csv: property names hasSourceCode and should be the same

@cboettig is the issue obvious to you?

mbjones avatar Dec 18 '19 19:12 mbjones

Please let me know if I need to change anything else. Thanks!

dgarijo avatar Dec 18 '19 20:12 dgarijo

@mbjones yeah, apologies about the Travis checks. I think this should be resolved in https://github.com/codemeta/codemeta/pull/231.

cboettig avatar Dec 19 '19 02:12 cboettig

@cboettig @mbjones Any updates on this PR?

dgarijo avatar Mar 04 '20 21:03 dgarijo

@dgarijo I think this PR is waiting for a CodeMeta v3 release ;-)

moranegg avatar Apr 29 '20 15:04 moranegg

@moranegg The tests are still failing on the PR, and so we can't merge until those are passing. Plus, the PR seems to have conflicts that need to get resolved as well. Once those are taken care of, we can merge, and then include it in a 3.0 release.

mbjones avatar Apr 29 '20 21:04 mbjones

@mbjones by "taken care of" does that mean me? When I submitted the PR, there were no merge conflicts. The Travis build was failing due to an issue that was not obvious, and that was claimed to be fixed in #231.

dgarijo avatar Apr 29 '20 21:04 dgarijo

@dgarijo That's up to you, but would be appreciated. Eventually either Carl or I would get to it in the process of preparing a 3.0 release, but I am fully swamped for the next several weeks or more and don't anticipate having any cycles to spend on this. So I was just pointing out that those were the barriers to merging this particular PR.

mbjones avatar Apr 29 '20 23:04 mbjones

This PR is a candidate for discussion: it changes the spec, so we should label it #discussion. Discussion will be open until March 15th, and we will then proceed to a vote from March 15th to March 25th. @dgarijo: there is a fix to be made, which @progval is commenting on directly in this PR.

moranegg avatar Feb 16 '23 13:02 moranegg

Any reason it shouldn't be this?

"hasSourceCode": { "@id": "codemeta:hasSourceCode", "@type": "schema:SoftwareSourceCode"}

This allows linking to the SoftwareSourceCode object, in addition to allowing a URI to the object, similar to other properties like author.
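For illustration, the two value shapes this is meant to allow could look like the records below. This is a plain Python sketch with made-up example.org identifiers and a hypothetical helper, source_code_id; it does not run a JSON-LD processor, so it does not check how the context entry above behaves when a document is expanded.

```python
def source_code_id(application):
    """Return the identifier of the linked source code, whichever
    of the two value shapes was used. (Hypothetical helper.)"""
    value = application["hasSourceCode"]
    if isinstance(value, str):
        return value  # bare URI
    return value.get("@id")  # embedded object


# Shape 1: hasSourceCode holds only a URI.
by_reference = {
    "@type": "SoftwareApplication",
    "name": "My Tool",
    "hasSourceCode": "https://example.org/code/my-tool",
}

# Shape 2: hasSourceCode holds an embedded SoftwareSourceCode object.
embedded = {
    "@type": "SoftwareApplication",
    "name": "My Tool",
    "hasSourceCode": {
        "@type": "SoftwareSourceCode",
        "@id": "https://example.org/code/my-tool",
        "programmingLanguage": "Python",
    },
}
```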

progval avatar Feb 16 '23 13:02 progval

That makes sense to me. Do you want me to edit the PR?

dgarijo avatar Feb 16 '23 19:02 dgarijo

Sure, thanks!

Also, could you rebase on the current develop branch? There are some merge conflicts.

progval avatar Feb 16 '23 19:02 progval

This PR will be deleted when #300 is integrated into the develop branch.

moranegg avatar May 26 '23 07:05 moranegg

By deleted you mean closed?

dgarijo avatar Jun 12 '23 18:06 dgarijo

Closing in favor of #300 (which resolves merge conflicts, and adds the reverse property isSourceCodeOf)

progval avatar Jul 03 '23 13:07 progval