Child content types without "builtInAssociations" considered matching for specific file names

Open iloveeclipse opened this issue 1 year ago • 3 comments

Follow up on https://github.com/eclipse/tm4e/issues/703#issuecomment-1944801545.

It looks like we have strange content type handling that results in https://github.com/eclipse/tm4e/issues/703 (as one example). Here is what @sebthom found:

Ok, I could reproduce it. As I suspected, the issue is in the Eclipse Platform. When querying content types for a file, eventually ContentTypeCatalog#selectMatchingByName will be invoked.

When this method is called with the file extension "txt", it returns the directly mapped content type org.eclipse.core.runtime.text and all child content types that have no "builtInAssociations", i.e. that have no file-extensions, file-patterns, or file-names defined. So it will return these additional content types if the language pack is installed:

  • org.eclipse.tm4e.language_pack.basetype
  • org.eclipse.tm4e.language_pack.markdown-math
  • org.eclipse.tm4e.language_pack.markdown_latex_combined
  • org.eclipse.tm4e.language_pack.cpp_embedded_latex

I don't think this behavior is correct.

Originally posted by @sebthom in https://github.com/eclipse/tm4e/issues/703#issuecomment-1944801545

iloveeclipse avatar Feb 15 '24 13:02 iloveeclipse

I don't know what use case was behind the originally implemented code. Maybe there is a test that would fail, let's see: https://github.com/eclipse-platform/eclipse.platform/pull/1209. The current code comment doesn't make much sense to me...

iloveeclipse avatar Feb 15 '24 14:02 iloveeclipse

Some more info:

  • the current behaviour of ContentTypeCatalog#selectMatchingByName has not been changed in the last 16 years
  • I found this in documentation at help.eclipse.org / Content Types:
    • base-type: the fully qualified identifier of this content type's base type. This content type will inherit its base type's file associations, content describer and default charset, unless they are redefined. → I am pretty sure not many people are aware of this and its implications

With the current behaviour, if I define a content type like this:

    <content-type
       id="contenttype.example"
       base-type="org.eclipse.core.runtime.text"
       name="Example" 
       file-extensions="example" />

Platform.getContentTypeManager().findContentTypesFor("txt") will not return it.

However, if I configure the content type like this:

    <content-type
       id="contenttype.example"
       base-type="org.eclipse.core.runtime.text"
       name="Example" />
    <file-association
      content-type="contenttype.example"
      file-extensions="example" />

Platform.getContentTypeManager().findContentTypesFor("txt") will also return contenttype.example.

I think this strongly violates the Principle of Least Surprise.
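
To make the difference between the two configurations concrete, here is a minimal Java sketch of the selection rule as it is described in this thread. It is a simplified model, not the actual ContentTypeCatalog code: the class ContentTypeMatchSketch, the Type helper, and the ids contenttype.inline and contenttype.viaFileAssociation are all hypothetical. It also assumes that only attributes set directly on the <content-type> element count as "built-in associations", which is what the two examples above suggest.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ContentTypeMatchSketch {

    /** Hypothetical, simplified stand-in for a content type (not the Eclipse class). */
    static final class Type {
        final String id;
        // Assumption: true only when file-extensions, file-names or file-patterns
        // are set directly on the <content-type> element.
        final boolean builtInAssociations;
        final Set<String> extensions = new LinkedHashSet<>();
        final List<Type> children = new ArrayList<>();

        Type(String id, Type base, boolean builtInAssociations, String... extensions) {
            this.id = id;
            this.builtInAssociations = builtInAssociations;
            this.extensions.addAll(List.of(extensions));
            if (base != null) {
                base.children.add(this);
            }
        }
    }

    /** Models the rule from this thread: matching roots plus children without built-in associations. */
    static List<String> matchByExtension(List<Type> allTypes, String extension) {
        Set<String> result = new LinkedHashSet<>();
        for (Type root : allTypes) {
            if (root.extensions.contains(extension)) {
                result.add(root.id);
                addChildrenWithoutBuiltIns(root, result);
            }
        }
        return new ArrayList<>(result);
    }

    private static void addChildrenWithoutBuiltIns(Type parent, Set<String> result) {
        for (Type child : parent.children) {
            if (child.builtInAssociations) {
                continue; // matched only through its own associations, subtree skipped
            }
            result.add(child.id);
            addChildrenWithoutBuiltIns(child, result);
        }
    }

    /** The two configurations from this comment, both based on the text content type. */
    static List<Type> demoTypes() {
        Type text = new Type("org.eclipse.core.runtime.text", null, true, "txt");
        // file-extensions set on the <content-type> element itself
        Type inline = new Type("contenttype.inline", text, true, "example");
        // file-extensions set through a separate <file-association> element
        Type viaAssociation = new Type("contenttype.viaFileAssociation", text, false, "example");
        return List.of(text, inline, viaAssociation);
    }

    public static void main(String[] args) {
        System.out.println(matchByExtension(demoTypes(), "txt"));
        System.out.println(matchByExtension(demoTypes(), "example"));
    }
}
```

In this model, a lookup for "txt" yields the text content type plus contenttype.viaFileAssociation, while contenttype.inline only shows up for "example", which mirrors the asymmetry described above.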


The big question is how I can define a content type that does not inherit its base type's file associations without setting new file associations, e.g. an intermediate content type that has no file associations, or one whose type is determined from the content itself.

The ugly workaround at the moment is:

    <content-type
       id="contenttype.example"
       base-type="org.eclipse.core.runtime.text"
       name="Example" 
       file-extensions="PUT_SOME_VALUE_THAT_HOPEFULLY_NO_ONE_ELSE_USES" />

sebthom avatar Feb 15 '24 20:02 sebthom