excelize icon indicating copy to clipboard operation
excelize copied to clipboard

Add calculation cache to improve CalcCellValue performance

Open DengY11 opened this issue 7 months ago • 3 comments

PR Details

Add calculation cache to improve CalcCellValue performance

Description

Implements a calculation cache mechanism for CalcCellValue to significantly improve performance when calculating formulas repeatedly. The cache stores calculated results and automatically invalidates when cells are modified.

Related Issue

Fixes #2057 - Performance enhancement - cache calculated values

Motivation and Context

As described in #2057, Excelize doesn't cache calculated values from CalcCellValue, causing expensive formulas to be recalculated every time they are referenced by other cells. This leads to poor performance when dealing with complex formula dependencies.

Example scenario:

A1: 40
A2: =VERYEXPENSIVEFUNCTION()  // takes 100ms
A3: =A1+A2                    // takes 100ms (recalculates A2)

Solution implemented:

  • Cache calculated results in thread-safe sync.Map
  • Clear entire cache when any SetCellX() function is called
  • Cache key format: "sheet!cell"

Performance improvement:

  • First calculation: ~150µs (calculated and cached)
  • Second calculation: ~600ns (from cache, 250x faster)

How Has This Been Tested

go test -v -run "TestCalc.*Cache|TestSetFunctionsClearCache"
go test ./...

Test coverage includes:

  • ☑️ Basic cache hit/miss scenarios
  • ☑️ Cache invalidation on cell modifications
  • ☑️ Multi-cell formula dependencies
  • ☑️ All 12 Set functions properly clear cache
  • ☑️ Thread safety
  • ☑️ 3 new comprehensive test functions added

Types of changes

☑️ New feature (non-breaking change which adds functionality) ☐ Docs change / refactoring / dependency upgrade ☐ Bug fix (non-breaking change which fixes an issue) ☐ Breaking change (fix or feature that would cause existing functionality to change)

Checklist

☑️ My code follows the code style of this project. ☐ My change requires a change to the documentation. ☐ I have updated the documentation accordingly. ☑️ I have read the CONTRIBUTING document. ☑️ I have added tests to cover my changes. ☑️ All new and existing tests passed.

Implementation Details

Modified Files:

  • excelize.go: Added cache fields (calcCache sync.Map, calcCacheMu sync.RWMutex) to File struct
  • calc.go: Implemented caching logic in CalcCellValue() + added clearCalcCache() helper
  • cell.go: Added cache clearing to all 12 Set functions
  • calc_test.go: Added 3 comprehensive cache test functions

DengY11 avatar May 25 '25 13:05 DengY11

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 99.47%. Comparing base (8a99deb) to head (83aeae4). :warning: Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2144   +/-   ##
=======================================
  Coverage   99.47%   99.47%           
=======================================
  Files          32       32           
  Lines       25704    25740   +36     
=======================================
+ Hits        25569    25605   +36     
  Misses         72       72           
  Partials       63       63           
Flag Coverage Δ
unittests 99.47% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 25 '25 14:05 codecov[bot]

Thanks for your PR. We need add any cache cautiously, after these changes we also need remove cache in following scenario:

  • RemoveCol
  • RemoveRow
  • DuplicateRow
  • DuplicateRowTo
  • SetSheetCol

And update or delete cache when:

  • SetSheetName
  • DeleteSheet

Consider if other scenario needed.

Thanks for your quick respond! sorry for missing a few spots where we needed to clear the CalcCellValue cache in my first commit. After taking another look and with some great input, I think we've got all a_bases covered now. Here’s a quick rundown of where we’ve added the f.clearCalcCache() calls: These are the functions we sorted out earlier, based on what was initially asked for and our first round of checks: RemoveCol ✅ RemoveRow ✅ DuplicateRow ✅ DuplicateRowTo ✅ SetSheetCol ✅ SetSheetName ✅ DeleteSheet ✅ SetCellValue ✅ SetCellInt ✅ SetCellUint ✅ SetCellBool ✅ SetCellFloat ✅ SetCellStr ✅ SetCellDefault ✅ SetCellFormula ✅ SetCellRichText ✅ SetCellHyperLink ✅

New additions in this commit: While double-checking things more recently, we found a few more functions that also mess with the sheet in ways that would need the cache to be cleared. So, I've added the cache clearing to these in the latest changes: MergeCell ✅ UnmergeCell ✅ InsertCols ✅ InsertRows ✅ CopySheet ✅ MoveSheet ✅

DengY11 avatar May 27 '25 13:05 DengY11

Code conflicts need resolve.

so sorry, the conflict was caused by my local auto-formatting. I've resolved it by keeping the original content from master

DengY11 avatar Jun 02 '25 12:06 DengY11

This is really exciting! I have some very large workbooks this should help out on a ton.

ar090 avatar Jun 17 '25 19:06 ar090

Hi @DengY11 and @xuri thanks so much for working on this! What is left to get this merged and released?

ar090 avatar Jun 26 '25 03:06 ar090

Ping @xuri

paolobarbolini avatar Oct 13 '25 23:10 paolobarbolini

Will take review for this.

xuri avatar Oct 14 '25 00:10 xuri

Notice that, we needn't add calcCacheMu in sync.RWMutex data type around calcCache, because calcCache is a sync.Map, which already concurrency-safe by design. We need to create a separate PR to fix that.

xuri avatar Nov 27 '25 13:11 xuri