incubator-gluten [VL] CI: Enable GHA dependency cache on static Velox build

To speed up the CI static build when the code of Velox and vcpkg has not changed.

Ref: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows

The dynamic build is not affected by this patch, so the overall Velox CI duration will not change.

Some links to inspect the cache:

https://api.github.com/repos/apache/incubator-gluten/actions/caches

https://api.github.com/repos/apache/incubator-gluten/actions/cache/usage
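For anyone who prefers to read these endpoints from a script, here is a minimal Python sketch. It assumes the list endpoint returns an `actions_caches` array whose entries carry `key`, `ref` and `size_in_bytes`, as the GitHub REST documentation describes. The helper names `fetch_json` and `summarize_caches` are made up for illustration, and an access token may be needed depending on the repository's visibility.

```python
import json
import urllib.request

# The endpoints linked above share this prefix.
API_ROOT = "https://api.github.com/repos/apache/incubator-gluten/actions"


def fetch_json(url):
    """GET a GitHub REST endpoint and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_caches(payload):
    """Return (key, ref, size in MiB) tuples, largest cache first."""
    entries = payload.get("actions_caches", [])
    rows = [
        (entry["key"], entry["ref"], round(entry["size_in_bytes"] / 2**20, 1))
        for entry in entries
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Calling `summarize_caches(fetch_json(API_ROOT + "/caches"))` would then list the cached entries by size, which makes it easy to see which key and ref each static build cache belongs to.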

Mar 27 '24 06:03 zhztheplayer

Run Gluten Clickhouse CI

Mar 27 '24 06:03 github-actions[bot]

@zhouyuan

Mar 28 '24 02:03 zhztheplayer

I can think of one problem: when we update the Velox branch for some reason without changing Gluten's code, the cache has to be invalidated manually; otherwise the stale cache will still be restored.

Delete cache: https://docs.github.com/en/rest/actions/cache?apiVersion=2022-11-28#delete-github-actions-caches-for-a-repository-using-a-cache-key

But that approach doesn't seem very developer-friendly, so we had better create a new Velox branch whenever we append changes.
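To give an idea of what manual invalidation involves, here is a rough Python sketch of the delete-by-key call from the link above. It only builds the request; the cache key and the token are placeholders, and the helper name `build_delete_request` is made up for illustration.

```python
import urllib.parse
import urllib.request


def build_delete_request(owner, repo, cache_key, token, ref=None):
    """Build (but do not send) a DELETE request for caches matching a key."""
    query = {"key": cache_key}
    if ref is not None:
        # Optional: restrict the deletion to caches created for one Git ref.
        query["ref"] = ref
    url = (
        f"https://api.github.com/repos/{owner}/{repo}/actions/caches?"
        + urllib.parse.urlencode(query)
    )
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The token needs write access to the repository, which is part of why this route is awkward for contributors.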

Mar 28 '24 02:03 zhztheplayer

Yes, this will happen when we do a rebase in Velox, then find that some Spark UTs fail in Gluten, and then push some fixes. Can we filter on the pull request title, e.g. check whether it contains a keyword (like forcebuildvelox)?
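A minimal sketch of what such a title check could look like, in Python. The function name and the case-insensitive matching are assumptions for illustration; how the CI would obtain the title and skip the cache restore is left open.

```python
# Keyword suggested in this thread; matching it case-insensitively is an assumption.
FORCE_KEYWORD = "forcebuildvelox"


def should_force_velox_build(pr_title, keyword=FORCE_KEYWORD):
    """Return True when the PR title asks for a fresh Velox build."""
    return keyword.lower() in pr_title.lower()
```

A workflow step could call this with the pull request title and skip restoring the cache whenever it returns True.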

Mar 28 '24 03:03 zhouyuan

@zhouyuan A way to trigger rebuilding is needed anyway. I will raise a new PR for that. Thanks for the suggestion.

Mar 28 '24 04:03 zhztheplayer

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query	log/native_5145_time.csv	log/native_master_03_31_2024_0bfee3a98_time.csv	difference	percentage
q1	35.61	38.72	3.104	108.72%
q2	26.25	23.80	-2.449	90.67%
q3	37.11	37.26	0.151	100.41%
q4	40.53	38.23	-2.301	94.32%
q5	70.18	69.53	-0.653	99.07%
q6	7.44	7.39	-0.050	99.32%
q7	84.96	86.10	1.144	101.35%
q8	85.25	85.99	0.736	100.86%
q9	121.04	123.88	2.840	102.35%
q10	43.86	44.86	1.006	102.29%
q11	20.35	20.75	0.405	101.99%
q12	26.14	28.42	2.272	108.69%
q13	46.05	46.88	0.826	101.79%
q14	18.78	19.81	1.029	105.48%
q15	30.01	30.54	0.530	101.77%
q16	13.37	14.13	0.758	105.67%
q17	100.84	102.65	1.813	101.80%
q18	143.86	142.95	-0.912	99.37%
q19	13.70	13.59	-0.114	99.17%
q20	27.06	28.84	1.777	106.57%
q21	228.93	225.54	-3.387	98.52%
q22	14.16	16.63	2.466	117.41%
total	1235.47	1246.46	10.991	100.89%

Apr 01 '24 03:04 GlutenPerfBot

Update: This cache doesn't seem to be shareable among PRs. See https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

That's OK for now, though it may need further improvement.
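As a rough model of the restriction described in the linked section, here is a simplified Python sketch. It assumes a run can restore caches created for its own ref, for the default branch, and, for pull requests, for the base branch. The function name and the default ref `refs/heads/main` are illustrative only, and the real rules have more detail than this.

```python
def can_restore(cache_ref, run_ref, default_ref="refs/heads/main", base_ref=None):
    """Simplified check: may a run on run_ref restore a cache created for cache_ref?"""
    allowed = {run_ref, default_ref}
    if base_ref is not None:
        # Pull request runs may also read caches created for their base branch.
        allowed.add(base_ref)
    return cache_ref in allowed
```

Under this model a cache saved by one pull request (ref `refs/pull/1/merge`) is not visible to a run for another pull request (ref `refs/pull/2/merge`), while a cache saved on the default branch is visible to both.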

Apr 01 '24 05:04 zhztheplayer