Add calculation cache to improve CalcCellValue performance
PR Details
Add calculation cache to improve CalcCellValue performance
Description
Implements a calculation cache mechanism for CalcCellValue to significantly improve performance when calculating formulas repeatedly. The cache stores calculated results and automatically invalidates when cells are modified.
Related Issue
Fixes #2057 - Performance enhancement - cache calculated values
Motivation and Context
As described in #2057, Excelize doesn't cache calculated values from CalcCellValue, causing expensive formulas to be recalculated every time they are referenced by other cells. This leads to poor performance when dealing with complex formula dependencies.
Example scenario:
A1: 40
A2: =VERYEXPENSIVEFUNCTION() // takes 100ms
A3: =A1+A2 // takes 100ms (recalculates A2)
Solution implemented:
- Cache calculated results in thread-safe
sync.Map - Clear entire cache when any
SetCellX()function is called - Cache key format:
"sheet!cell"
Performance improvement:
- First calculation: ~150µs (calculated and cached)
- Second calculation: ~600ns (from cache, 250x faster)
How Has This Been Tested
go test -v -run "TestCalc.*Cache|TestSetFunctionsClearCache"
go test ./...
Test coverage includes:
- ☑️ Basic cache hit/miss scenarios
- ☑️ Cache invalidation on cell modifications
- ☑️ Multi-cell formula dependencies
- ☑️ All 12 Set functions properly clear cache
- ☑️ Thread safety
- ☑️ 3 new comprehensive test functions added
Types of changes
☑️ New feature (non-breaking change which adds functionality) ☐ Docs change / refactoring / dependency upgrade ☐ Bug fix (non-breaking change which fixes an issue) ☐ Breaking change (fix or feature that would cause existing functionality to change)
Checklist
☑️ My code follows the code style of this project. ☐ My change requires a change to the documentation. ☐ I have updated the documentation accordingly. ☑️ I have read the CONTRIBUTING document. ☑️ I have added tests to cover my changes. ☑️ All new and existing tests passed.
Implementation Details
Modified Files:
excelize.go: Added cache fields (calcCache sync.Map,calcCacheMu sync.RWMutex) to File structcalc.go: Implemented caching logic inCalcCellValue()+ addedclearCalcCache()helpercell.go: Added cache clearing to all 12 Set functionscalc_test.go: Added 3 comprehensive cache test functions
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 99.47%. Comparing base (8a99deb) to head (83aeae4).
:warning: Report is 1 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #2144 +/- ##
=======================================
Coverage 99.47% 99.47%
=======================================
Files 32 32
Lines 25704 25740 +36
=======================================
+ Hits 25569 25605 +36
Misses 72 72
Partials 63 63
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 99.47% <100.00%> (+<0.01%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Thanks for your PR. We need add any cache cautiously, after these changes we also need remove cache in following scenario:
- RemoveCol
- RemoveRow
- DuplicateRow
- DuplicateRowTo
- SetSheetCol
And update or delete cache when:
- SetSheetName
- DeleteSheet
Consider if other scenario needed.
Thanks for your quick respond! sorry for missing a few spots where we needed to clear the CalcCellValue cache in my first commit. After taking another look and with some great input, I think we've got all a_bases covered now. Here’s a quick rundown of where we’ve added the f.clearCalcCache() calls: These are the functions we sorted out earlier, based on what was initially asked for and our first round of checks: RemoveCol ✅ RemoveRow ✅ DuplicateRow ✅ DuplicateRowTo ✅ SetSheetCol ✅ SetSheetName ✅ DeleteSheet ✅ SetCellValue ✅ SetCellInt ✅ SetCellUint ✅ SetCellBool ✅ SetCellFloat ✅ SetCellStr ✅ SetCellDefault ✅ SetCellFormula ✅ SetCellRichText ✅ SetCellHyperLink ✅
New additions in this commit: While double-checking things more recently, we found a few more functions that also mess with the sheet in ways that would need the cache to be cleared. So, I've added the cache clearing to these in the latest changes: MergeCell ✅ UnmergeCell ✅ InsertCols ✅ InsertRows ✅ CopySheet ✅ MoveSheet ✅
Code conflicts need resolve.
so sorry, the conflict was caused by my local auto-formatting. I've resolved it by keeping the original content from master
This is really exciting! I have some very large workbooks this should help out on a ton.
Hi @DengY11 and @xuri thanks so much for working on this! What is left to get this merged and released?
Ping @xuri
Will take review for this.
Notice that, we needn't add calcCacheMu in sync.RWMutex data type around calcCache, because calcCache is a sync.Map, which already concurrency-safe by design. We need to create a separate PR to fix that.