
Various Data Template generation issues on Windows

Open mars-lan opened this issue 4 years ago • 13 comments

The generateDataTemplate task fails with the following error when run on Windows.

Caused by: java.lang.IllegalArgumentException: 'other' has different root

The same Pegasus models build fine on Mac & Linux. See https://github.com/linkedin/datahub/issues/1640 for more details.

mars-lan avatar May 04 '20 23:05 mars-lan

(Adding to this issue as it falls under "Various" Data Template generation issues, but I'm not seeing the exact same error; let me know if you'd prefer me to split this out as a separate issue).

I'm seeing errors of the following form on Windows when building the linkedin/datahub project, for every single PDL file:

[main] ERROR com.linkedin.restli.tools.data.SchemaFormatTranslator - Parsed top-level schema does not match the schema file name. File: C:\Work\datahub\li-utils\src\main\pegasus\com\linkedin\avro2pegasus\events\common\datamonitor\PlatformName.pdl

The earlier debug log outputs the following:

[main] DEBUG com.linkedin.restli.tools.data.SchemaFormatTranslator - Loaded source schema: com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName, from location: C:\Work\datahub\li-utils\src\main\pegasus\com\linkedin\avro2pegasus\events\common\datamonitor\PlatformName.pdl

This reveals the problem. At this point, schemaFullname is com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName, but it should be com.linkedin.avro2pegasus.events.common.datamonitor.PlatformName, since this is the value that is later compared against the parsed top-level schema.

The root cause is the line that replaces File.separatorChar with '.'. At this point the code is working with URIs (which always use forward slashes, by definition) rather than file paths. So I think it should replace '/' with '.' on all platforms, rather than replacing the platform-specific file path separator with '.'.
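
A minimal sketch of the difference, not the actual SchemaFormatTranslator code: toSchemaFullName is a hypothetical helper, and the separator is passed in as a parameter so that the Windows behavior can be reproduced on any platform.

```java
class SchemaNameFromUri {
    // Hypothetical helper (not from rest.li): strips the file extension from a
    // URI-style relative path and replaces `separator` with '.'.
    static String toSchemaFullName(String uriPath, char separator) {
        int extensionStart = uriPath.lastIndexOf('.');
        String withoutExtension =
            extensionStart < 0 ? uriPath : uriPath.substring(0, extensionStart);
        return withoutExtension.replace(separator, '.');
    }

    public static void main(String[] args) {
        String uriPath =
            "com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName.pdl";

        // '\\' is the value of File.separatorChar on Windows; nothing in the URI matches it.
        System.out.println(toSchemaFullName(uriPath, '\\'));
        // prints com/linkedin/avro2pegasus/events/common/datamonitor/PlatformName

        // '/' is the URI path separator on every platform.
        System.out.println(toSchemaFullName(uriPath, '/'));
        // prints com.linkedin.avro2pegasus.events.common.datamonitor.PlatformName
    }
}
```

With the backslash separator the name comes back unchanged, which matches the slash-separated value in the debug log above.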

hgcummings avatar Jul 02 '20 19:07 hgcummings

@evanw555 seems like @hgcummings has found the root cause. Could you look into a fix?

mars-lan avatar Jul 21 '20 13:07 mars-lan

Thanks Mars, we already have an internal task tracking this issue.

junchuanwang avatar Sep 15 '20 17:09 junchuanwang

Any update @junchuanwang? Another customer is encountering a similar issue when building DataHub.

mars-lan avatar Oct 23 '20 16:10 mars-lan

@mars-lan Last time our team reviewed the task, we de-prioritized it.

But since more customers are now asking for the fix, we will review it again to see if we have the resources to prioritize it.

junchuanwang avatar Oct 23 '20 16:10 junchuanwang

The specific error mentioned by @hgcummings looks like it may have been solved in https://github.com/linkedin/rest.li/pull/448/files#diff-1ccaa495740d8d078ae73b8dcaff97f17d465f1403ed2c2b8a2342472db7e372L218, which was committed in the last day or two.

evanw555 avatar Oct 23 '20 23:10 evanw555

@evanw555 When I updated rest.li to version 29.7.15, I came across a different error.

This is most likely related to how Windows treats paths in the file system.

> Task :metadata-models:generateDataTemplate
There are 133 data schema input files. Using input root folder: C:\Users\nkanamar\Desktop\git-public\main-datahub\datahub\metadata-models\src\main\pegasus
[main] INFO com.linkedin.pegasus.generator.TemplateSpecGenerator - Class name: com.linkedin.data.template.StringArray, bound to schema:{ "type" : "array", "items" : "string" }, instead of schema: { "type" : "array", "items" : { "type" : "typeref", "name" : "SchemaFieldPath", "namespace" : "com.linkedin.dataset", "doc" : "Schema field path as described by schema normalizations rules: http://go/tms-schema", "ref" : "string" } }
[main] INFO com.linkedin.pegasus.generator.TemplateSpecGenerator - Class name: com.linkedin.data.template.StringArray, bound to schema:{ "type" : "array", "items" : "string" }, instead of schema: { "type" : "array", "items" : { "type" : "typeref", "name" : "SchemaFieldPath", "namespace" : "com.linkedin.dataset", "doc" : "Schema field path as described by schema normalizations rules: http://go/tms-schema", "ref" : "string" } }
Exception in thread "main" java.nio.file.InvalidPathException: Illegal char <:> at index 104: C:\Users\nkanamar\Desktop\git-public\main-datahub\datahub\li-utils\build\libs\li-utils-data-template.jar:pegasus/com/linkedin/common/FabricType.pdl
        at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
        at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
        at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
        at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
        at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
        at java.nio.file.Paths.get(Paths.java:84)
        at com.linkedin.pegasus.generator.JavaCodeUtil.annotate(JavaCodeUtil.java:89)
        at com.linkedin.pegasus.generator.JavaDataTemplateGenerator.populateClassContent(JavaDataTemplateGenerator.java:1266)
        at com.linkedin.pegasus.generator.JavaDataTemplateGenerator.generate(JavaDataTemplateGenerator.java:274)
        at com.linkedin.pegasus.generator.PegasusDataTemplateGenerator.run(PegasusDataTemplateGenerator.java:138)
        at com.linkedin.pegasus.generator.PegasusDataTemplateGenerator.main(PegasusDataTemplateGenerator.java:110)

> Task :metadata-models:generateDataTemplate FAILED

FAILURE: Build failed with an exception.

* What went wrong:
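
In the trace above, index 104 is the colon that follows ".jar": the string passed to Paths.get is a jar file path joined to an entry path with ':', and a Windows path does not accept ':' outside the drive prefix. The sketch below only illustrates splitting such a string before any part of it is handed to Paths.get. JarEntryLocation and its parse method are hypothetical names, and this is not how rest.li addresses the problem.

```java
class JarEntryLocation {
    final String jarPath;   // file system path of the jar
    final String entryPath; // path of the entry inside the jar, or null if there is none

    JarEntryLocation(String jarPath, String entryPath) {
        this.jarPath = jarPath;
        this.entryPath = entryPath;
    }

    // Hypothetical helper: splits "<jar path>:<entry path>" at the colon after ".jar".
    static JarEntryLocation parse(String location) {
        String marker = ".jar:";
        int markerStart = location.indexOf(marker);
        if (markerStart < 0) {
            // No jar entry suffix; treat the whole string as a plain path.
            return new JarEntryLocation(location, null);
        }
        int colon = markerStart + marker.length() - 1;
        return new JarEntryLocation(
            location.substring(0, colon), location.substring(colon + 1));
    }

    public static void main(String[] args) {
        String location = "C:\\Users\\nkanamar\\Desktop\\git-public\\main-datahub\\datahub"
            + "\\li-utils\\build\\libs\\li-utils-data-template.jar"
            + ":pegasus/com/linkedin/common/FabricType.pdl";

        // Position of the colon reported in the exception message.
        System.out.println(location.indexOf(".jar:") + 4); // prints 104

        JarEntryLocation parsed = parse(location);
        System.out.println(parsed.jarPath);   // the jar path, ending in li-utils-data-template.jar
        System.out.println(parsed.entryPath); // prints pegasus/com/linkedin/common/FabricType.pdl
    }
}
```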

nagarjunakanamarlapudi avatar Oct 28 '20 15:10 nagarjunakanamarlapudi

We acknowledge the issue. We do not officially support Windows, and we don't have plans to support it at the moment. If this becomes urgent, please feel free to discuss the priority.

nickibi avatar Oct 28 '20 20:10 nickibi

I'd also like to voice a preference for getting data template generation working on Windows, as this is preventing us from building DataHub natively on Windows.

xdl avatar Aug 12 '21 10:08 xdl

On the point of support for Windows, I think there's some value in template generation working, even if DataHub itself isn't expected to run on Windows.

It's fine to run DataHub itself in a container, but being able to populate it from a Windows machine can be quite useful in some scenarios, and the ingestion scripts depend on the data templates. (I have previously resorted to generating the templates in a container then copying them back to the Windows host.)

hgcummings avatar Aug 12 '21 11:08 hgcummings

The generateDataTemplate task failed with the following error when running on Windows.

Caused by: java.lang.IllegalArgumentException: 'other' has different root

The same Pegasus models build fine on Mac & Linux. See linkedin/datahub#1640 for more details.

Can this problem be solved in the future?

gy20121221 avatar Nov 11 '21 03:11 gy20121221

Having the same problem on Windows. I hope compiling and building on Windows will be supported.

jkl0898 avatar Dec 13 '22 08:12 jkl0898

Having the same problem on Windows. I hope compiling and building on Windows will be supported.

ssyue avatar Mar 02 '23 06:03 ssyue