"-i" option doesn't adjust Unicode BOM

Open belm0 opened this issue 7 years ago • 4 comments

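When using the "-i" option on a file that has an existing Unicode BOM, the BOM appears to be left in place on the original first line, so it ends up after the inserted text instead of at the start of the file. autogen should insert the new text following the BOM.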

Example diff after applying "autogen -i" follows. The string <U+FEFF> shows BOM location.

@@ -1,3 +1,17 @@
+// Copyright 2017 Google Inc.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//      http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
 <U+FEFF>using UnityEngine;
 using UnityEngine.Rendering;
 using System.Collections;

belm0 • Aug 17 '17 18:08

Thanks for the bug report, @belm0! Just to clarify, are you looking for the output to look as follows?

@@ -1,3 +1,17 @@
+<U+FEFF>// Copyright 2017 Google Inc.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//      http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
-<U+FEFF>using UnityEngine;
+using UnityEngine;
 using UnityEngine.Rendering;
 using System.Collections;

mbrukman • Aug 26 '17 01:08

Yes, that's it. Thank you

belm0 • Aug 26 '17 03:08

Thinking through this a little bit, I think this feature is complex enough to warrant a rewrite of autogen from the shell script that it is today into an actual programming language: for example Python, so that it can be run without compiling, or perhaps Go.

I've also been thinking of adding a configuration mechanism (see issue #23), and that almost caused me to embed an inline Python code snippet into a shell script, which would be quite a hack. I think this makes a stronger case that we should use a real programming language instead.
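
For illustration only, here is a minimal Python sketch of the BOM-aware insertion requested above. It is not autogen's code: the function name `insert_after_bom` is hypothetical, and it assumes the file is UTF-8 and handles only the UTF-8 BOM (a UTF-16 or UTF-32 file would also need the header encoded to match).

```python
import codecs


def insert_after_bom(data: bytes, header: bytes) -> bytes:
    """Return `data` with `header` prepended, keeping a leading UTF-8 BOM first.

    Hypothetical helper for illustration; assumes `header` is already
    UTF-8 encoded and that only the UTF-8 BOM needs to be recognized.
    """
    bom = codecs.BOM_UTF8  # the three bytes EF BB BF
    if data.startswith(bom):
        # Keep the BOM at the very start, then the header, then the rest.
        return bom + header + data[len(bom):]
    # No BOM present: plain prepend.
    return header + data


if __name__ == "__main__":
    original = codecs.BOM_UTF8 + b"using UnityEngine;\n"
    license_header = b"// Copyright 2017 Google Inc.\n\n"
    print(insert_after_bom(original, license_header))
```

Working on bytes rather than decoded text keeps the BOM check explicit and avoids depending on how a particular text codec treats a leading U+FEFF.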

mbrukman • Sep 12 '17 22:09

@belm0 — I just came across a similar project: https://github.com/google/addlicense (which was released 2 years after Autogen) — I'm not sure if it already handles Unicode BOM correctly, but the tool is written in Go, so even if it doesn't, it should be easy to add this functionality there.

mbrukman • Sep 20 '17 02:09