Open-XML-SDK
Why are Office 365 xlsx parts encoded as UTF-8 without BOM while the Open XML SDK writes UTF-8 with BOM?
Is this a:
- [X] Issue with the OpenXml library
- [ ] Question on library usage
Information
- .NET Target: .NET Core
- DocumentFormat.OpenXml Version: 2.14.0
Repro
using System;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public class Program
{
    public static void Main(string[] args)
    {
        // Build a timestamped output path in the current directory.
        var directoryInfo = new DirectoryInfo(Directory.GetCurrentDirectory());
        var fileName = $@"PracticePart1-{DateTime.Now:yyyyMMddHHmmss}.xlsx";
        var filepath = Path.Combine(directoryInfo.ToString(), fileName);
        Console.WriteLine($"FilePath: {filepath}");

        // Create the package with an empty workbook and one worksheet part.
        var spreadsheetDocument = SpreadsheetDocument.Create(filepath, SpreadsheetDocumentType.Workbook);
        var workbookPart = spreadsheetDocument.AddWorkbookPart();
        workbookPart.Workbook = new Workbook();
        var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
        worksheetPart.Worksheet = new Worksheet(new SheetData());

        // Register the worksheet in the workbook's sheet list.
        var sheets = spreadsheetDocument.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());
        var sheet = new Sheet()
        {
            Id = spreadsheetDocument.WorkbookPart.GetIdOfPart(worksheetPart),
            SheetId = 1,
            Name = "myFirstSheet"
        };
        sheets.Append(sheet);

        // Save the workbook part and close the package.
        workbookPart.Workbook.Save();
        spreadsheetDocument.Close();
    }
}
Observed
In an xlsx saved by Office 365, every part is encoded as UTF-8 without a BOM. In the file produced by the Open XML SDK, some parts are UTF-8 with a BOM and some are not.
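
One way to check which parts carry a BOM is to open the .xlsx as a ZIP archive and look at the first three bytes of each entry. The following is a minimal sketch and not part of the SDK: the `BomInspector` class and its `HasUtf8Bom` helper are hypothetical names, and it assumes the path to the package is passed as the first command-line argument.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class BomInspector
{
    // Returns true when the buffer starts with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(ReadOnlySpan<byte> head) =>
        head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;

    public static void Main(string[] args)
    {
        // An .xlsx file is a ZIP package; every part is one archive entry.
        using var archive = ZipFile.OpenRead(args[0]);
        foreach (var entry in archive.Entries)
        {
            using var stream = entry.Open();

            // Read up to three bytes; a single Read call may return fewer.
            var head = new byte[3];
            int read = 0, n;
            while (read < 3 && (n = stream.Read(head, read, 3 - read)) > 0)
            {
                read += n;
            }

            var label = HasUtf8Bom(head.AsSpan(0, read)) ? "BOM" : "no BOM";
            Console.WriteLine($"{entry.FullName}: {label}");
        }
    }
}
```

Running it against both an Office 365 file and an SDK-generated file gives a per-part comparison. Non-XML entries, if any, are simply reported as "no BOM".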

Expected
Should the SDK follow the Office 365 encoding convention? (The image below shows an Office 365 xlsx.)
We changed it to that because the previous encoding was causing problems for some renderers (see https://github.com/OfficeDev/Open-XML-SDK/issues/309).
I'm not certain whether the spec requires a specific encoding, but we could potentially enable it to be configurable rather than relying on a specific default.
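
For context, in .NET the decision to write a BOM usually comes from the `Encoding` instance handed to the writer. The sketch below uses `XmlWriter` directly to show the mechanism. It is not the SDK's own save path (which this thread does not show), and `BomDemo` and `WriteXml` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Writes a tiny XML document to memory, with or without a UTF-8 BOM.
    public static byte[] WriteXml(bool emitBom)
    {
        var settings = new XmlWriterSettings
        {
            // The constructor argument controls whether the preamble (BOM) is emitted.
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: emitBom)
        };

        using var stream = new MemoryStream();
        using (var writer = XmlWriter.Create(stream, settings))
        {
            writer.WriteStartElement("root");
            writer.WriteEndElement();
        }
        return stream.ToArray();
    }

    public static void Main()
    {
        foreach (var emitBom in new[] { true, false })
        {
            var bytes = WriteXml(emitBom);
            Console.WriteLine($"emitBom={emitBom}: first byte 0x{bytes[0]:X2}");
        }
    }
}
```

With the BOM enabled the output starts with the bytes EF BB BF. With it disabled the first byte is `<` (0x3C), the start of the XML declaration. A configurable option in the SDK could plausibly reduce to choosing between these two `UTF8Encoding` instances, though that is an assumption about the implementation.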
> but we could potentially enable it to be configurable rather than relying on a specific default.
@twsouthwick Thanks! That would be a helpful feature.
Happy to accept PRs. It would probably be best to add it to the OpenSettings object.
@shps951023 is there a good reason why the SDK should emit non-BOM UTF-8? From our Office apps team, it appears that we don't have any requirement either way, i.e. Office apps will read UTF-8 BOM parts just fine. Does your code depend on non-BOM UTF-8?
@tomjebo Sorry it took me so long to see the notification and reply! Some Chinese users need a custom encoding in order to read non-UTF-8 content.
@twsouthwick Thanks, I'll try it
Happy New Year! Wishing everyone a great new year.
