EPPlus icon indicating copy to clipboard operation
EPPlus copied to clipboard

Performance issue with rich text

Open StevenQuickGS1 opened this issue 3 years ago • 3 comments

I'm inserting text with superscript footnote ids into cells and have found and usage of rich text is ~15x slower than plain text

Attached is a benchmark comparing epplus to npoi.

I am inserting 50k rows with two columns, the first column just an int id, the second column:

  • for plain text: [id] This is the test for id (ie: [1] This is the text for 1)
  • for rich text: instead of [id] above, just the int id as superscript.

rich text

epplus

Is there a better way to apply rich text than this that I'm missing?

Code for the benchmark is just a .net 6 console app using benchmark .net

    [SimpleJob(RuntimeMoniker.Net60)]
    [MemoryDiagnoser]
    //[RPlotExporter]
    public class ExcelLibraryCreateTests
    {
        private readonly Dictionary<int, string> items = new();

        [GlobalSetup]
        public void Setup()
        {
            ExcelPackage.LicenseContext = LicenseContext.NonCommercial;

            for (int i = 0; i < 50000; i++)
                items.Add(i, "This is text for " + i);
        }

        [Benchmark]
        public byte[] EpplusPlainText()
        {
            using (var file = new ExcelPackage())
            {
                var rowIndex = 1;
                var worksheet = file.Workbook.Worksheets.Add("Data");
                worksheet.Cells[rowIndex, 1].Value = "Key";
                worksheet.Cells[rowIndex, 2].Value = "Value";
                foreach (var item in items)
                {
                    rowIndex++;
                    worksheet.Cells[rowIndex, 1].Value = item.Key;
                    worksheet.Cells[rowIndex, 2].Value = $"[{item.Key}] {item.Value}";
                }
                return file.GetAsByteArray();
            }
        }

        [Benchmark]
        public byte[] EpplusRichText()
        {
            using (var file = new ExcelPackage())
            {
                var rowIndex = 1;
                var worksheet = file.Workbook.Worksheets.Add("Data");
                worksheet.Cells[rowIndex, 1].Value = "Key";
                worksheet.Cells[rowIndex, 2].Value = "Value";
                foreach (var item in items)
                {
                    rowIndex++;
                    worksheet.Cells[rowIndex, 1].Value = item.Key;

                    var richTextCollection = worksheet.Cells[rowIndex, 2].RichText;
                    var richText = richTextCollection.Add(item.Key.ToString());
                    richText.VerticalAlign = ExcelVerticalAlignmentFont.Superscript;
                    richTextCollection.Add(item.Value);
                }
                return file.GetAsByteArray();
            }
        }

        [Benchmark]
        public byte[] NpoiPlainText()
        {
            var workbook = new XSSFWorkbook();
            var sheet = workbook.CreateSheet();
            var rowIndex = 0;
            var header = sheet.CreateRow(0);
            header.CreateCell(0).SetCellValue("Key");
            header.CreateCell(1).SetCellValue("Value");

            foreach (var item in items)
            {
                rowIndex++;

                var row = sheet.CreateRow(rowIndex);
                row.CreateCell(0).SetCellValue(item.Key);
                row.CreateCell(1).SetCellValue($"[{item.Key}]{item.Value}");
            }

            using var ms = new MemoryStream();
            workbook.Write(ms);
            return ms.ToArray();
        }

        [Benchmark]
        public byte[] NpoiRichText()
        {
            var workbook = new XSSFWorkbook();
            var sheet = workbook.CreateSheet();
            var rowIndex = 0;
            var header = sheet.CreateRow(0);
            header.CreateCell(0).SetCellValue("Key");
            header.CreateCell(1).SetCellValue("Value");

            var superscript = workbook.CreateFont();
            superscript.TypeOffset = NPOI.SS.UserModel.FontSuperScript.Super;

            foreach (var item in items)
            {
                rowIndex++;

                var row = sheet.CreateRow(rowIndex);
                row.CreateCell(0).SetCellValue(item.Key);

                var cell = row.CreateCell(1);
                cell.SetCellValue($"{item.Key}{item.Value}");
                cell.RichStringCellValue.ApplyFont(0, 1, superscript);
            }

            using var ms = new MemoryStream();
            workbook.Write(ms);
            return ms.ToArray();
        }

StevenQuickGS1 avatar Jun 06 '22 22:06 StevenQuickGS1

RichText creates a xml document for each cell, so the performance might not be that good, if you have a lot of data. I'll have a look to see if we can improve it somehow.

JanKallman avatar Jun 08 '22 07:06 JanKallman

Yes, this has bad performance if you have many cells with rich text. We'll need to rewrite this and remove the dependence on the XML DOM. I'll add this as an enhancement to a future version.

JanKallman avatar Jun 08 '22 08:06 JanKallman

Great thanks!

StevenQuickGS1 avatar Jun 09 '22 21:06 StevenQuickGS1