[Wan] Optimize time & memory
What does this PR do?
This PR reduces the time and memory used when running Wan. I have tested the performance improvement, and I have also done a crash test (I put an error in place of my code and confirmed that the error is raised). The output remains the same.
Before submitting
- [x] Did you read the contributor guideline?
- [x] Did you read our philosophy doc (important for complex PRs)?
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Hey, it would be interesting to know the rough before and after metrics.
It would be great if this does reduce memory, as Wan's memory usage really shoots up as resolution and time increase.
I have implemented this code to benchmark:
```python
import time
...
start = time.time()
x1 = hidden_states[..., 0::2]
x2 = hidden_states[..., 1::2]
end = time.time()
print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! BENCHMARK !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
print(end - start)
```
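For anyone who wants to repeat this kind of measurement, here is a minimal sketch of the same idea wrapped in a reusable helper. It is not the code used for the numbers in this thread: the `timed` helper and the list used as a stand-in for `hidden_states` are hypothetical, and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic, higher-resolution clock. Note that if the timed operation launches GPU kernels, those run asynchronously, so a device synchronization would be needed before reading the clock.

```python
import time
from contextlib import contextmanager

timings = []  # one entry per timed call, in seconds


@contextmanager
def timed(store):
    """Append the wall-clock duration of the enclosed block to `store`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store.append(time.perf_counter() - start)


# Hypothetical stand-in for `hidden_states`; the benchmark above used a tensor.
hidden_states = list(range(1_000_000))

for _ in range(5):
    with timed(timings):
        x1 = hidden_states[0::2]  # even-indexed elements
        x2 = hidden_states[1::2]  # odd-indexed elements

print(f"calls={len(timings)} total={sum(timings):.6f}s")
```

Collecting the durations in a list instead of printing each one makes it easier to compare runs afterwards.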
The timed code runs 160 times during my startup. Here are the timings before the change, after the change, and the difference (in seconds):
| Before | After | Diff |
|---|---|---|
| 0.0421395301818847 | 0.0186080932617187 | -0.0235314369201660 |
| 0.0091154575347900 | 0.0033469200134277 | -0.0057685375213623 |
| 0.0087890625000000 | 0.0034732818603516 | -0.0053157806396484 |
| 0.0093810558319092 | 0.0033228397369385 | -0.0060582160949707 |
| 0.0087580680847168 | 0.0031294822692871 | -0.0056285858154297 |
| 0.0088031291961670 | 0.0031223297119141 | -0.0056807994842529 |
| 0.0055754184722900 | 0.0031778812408447 | -0.0023975372314453 |
| 0.0071923732757568 | 0.0030803680419922 | -0.0041120052337647 |
| 0.0078923702239990 | 0.0031263828277588 | -0.0047659873962402 |
| 0.0077276229858398 | 0.0031015872955322 | -0.0046260356903076 |
| 0.0077912807464600 | 0.0031595230102539 | -0.0046317577362061 |
| 0.0087156295776367 | 0.0030493736267090 | -0.0056662559509277 |
| 0.0075991153717041 | 0.0031275749206543 | -0.0044715404510498 |
| 0.0085408687591553 | 0.0030329227447510 | -0.0055079460144043 |
| 0.0046560764312744 | 0.0031311511993408 | -0.0015249252319336 |
| 0.0051045417785645 | 0.0030672550201416 | -0.0020372867584229 |
| 0.0057218074798584 | 0.0032899379730225 | -0.0024318695068359 |
| 0.0074028968811035 | 0.0030341148376465 | -0.0043687820434570 |
| 0.0047314167022705 | 0.0031266212463379 | -0.0016047954559326 |
| 0.0048007965087891 | 0.0029926300048828 | -0.0018081665039063 |
| 0.0061461925506592 | 0.0031211376190186 | -0.0030250549316406 |
| 0.0068418979644775 | 0.0030431747436523 | -0.0037987232208252 |
| 0.0066962242126465 | 0.0031459331512451 | -0.0035502910614014 |
| 0.0049102306365967 | 0.0029878616333008 | -0.0019223690032959 |
| 0.0093643665313721 | 0.0030882358551025 | -0.0062761306762695 |
| 0.0074284076690674 | 0.0029993057250977 | -0.0044291019439697 |
| 0.0057675838470459 | 0.0031495094299316 | -0.0026180744171143 |
| 0.0071196556091309 | 0.0030658245086670 | -0.0040538311004639 |
| 0.0077340602874756 | 0.0037443637847900 | -0.0039896965026856 |
| 0.0077130794525146 | 0.0030124187469482 | -0.0047006607055664 |
| 0.0061564445495605 | 0.0031197071075439 | -0.0030367374420166 |
| 0.0079400539398193 | 0.0030746459960938 | -0.0048654079437256 |
| 0.0058634281158447 | 0.0031714439392090 | -0.0026919841766357 |
| 0.0081622600555420 | 0.0029609203338623 | -0.0052013397216797 |
| 0.0069501399993896 | 0.0031454563140869 | -0.0038046836853027 |
| 0.0078878402709961 | 0.0030357837677002 | -0.0048520565032959 |
| 0.0080575942993164 | 0.0031151771545410 | -0.0049424171447754 |
| 0.0050995349884033 | 0.0030565261840820 | -0.0020430088043213 |
| 0.0080208778381348 | 0.0031700134277344 | -0.0048508644104004 |
| 0.0065584182739258 | 0.0030229091644287 | -0.0035355091094971 |
| 0.0053703784942627 | 0.0031318664550781 | -0.0022385120391846 |
| 0.0052416324615479 | 0.0030868053436279 | -0.0021548271179199 |
| 0.0054197311401367 | 0.0030879974365234 | -0.0023317337036133 |
| 0.0049960613250732 | 0.0030584335327148 | -0.0019376277923584 |
| 0.0074501037597656 | 0.0031375885009766 | -0.0043125152587891 |
| 0.0073106288909912 | 0.0029852390289307 | -0.0043253898620606 |
| 0.0046367645263672 | 0.0031688213348389 | -0.0014679431915283 |
| 0.0049192905426025 | 0.0030219554901123 | -0.0018973350524902 |
| 0.0060331821441650 | 0.0031657218933105 | -0.0028674602508545 |
| 0.0115311145782470 | 0.0030133724212646 | -0.0085177421569824 |
| 0.0118765830993652 | 0.0032989978790283 | -0.0085775852203369 |
| 0.0052525997161865 | 0.0031018257141113 | -0.0021507740020752 |
| 0.0048851966857910 | 0.0034267902374268 | -0.0014584064483643 |
| 0.0111300945281982 | 0.0031645298004150 | -0.0079655647277832 |
| 0.0047070980072021 | 0.0031282901763916 | -0.0015788078308105 |
| 0.0045855045318604 | 0.0032043457031250 | -0.0013811588287354 |
| 0.0094137191772461 | 0.0030689239501953 | -0.0063447952270508 |
| 0.0093262195587158 | 0.0030744075775146 | -0.0062518119812012 |
| 0.0091929435729980 | 0.0032446384429932 | -0.0059483051300049 |
| 0.0071072578430176 | 0.0030021667480469 | -0.0041050910949707 |
| 0.0094301700592041 | 0.0033464431762695 | -0.0060837268829346 |
| 0.0092351436614990 | 0.0032732486724854 | -0.0059618949890137 |
| 0.0054991245269775 | 0.0033721923828125 | -0.0021269321441650 |
| 0.0046093463897705 | 0.0031516551971436 | -0.0014576911926270 |
| 0.0101990699768066 | 0.0039906501770020 | -0.0062084197998047 |
| 0.0113568305969238 | 0.0030558109283447 | -0.0083010196685791 |
| 0.0070419311523438 | 0.0031654834747314 | -0.0038764476776123 |
| 0.0086443424224854 | 0.0030453205108643 | -0.0055990219116211 |
| 0.0099291801452637 | 0.0031201839447021 | -0.0068089962005615 |
| 0.0091631412506104 | 0.0031297206878662 | -0.0060334205627441 |
| 0.0095853805541992 | 0.0033917427062988 | -0.0061936378479004 |
| 0.0111463069915771 | 0.0034832954406738 | -0.0076630115509033 |
| 0.0105581283569335 | 0.0034265518188477 | -0.0071315765380859 |
| 0.0102081298828125 | 0.0030958652496338 | -0.0071122646331787 |
| 0.0094234943389893 | 0.0032963752746582 | -0.0061271190643311 |
| 0.0081713199615479 | 0.0032901763916016 | -0.0048811435699463 |
| 0.0074520111083984 | 0.0034911632537842 | -0.0039608478546143 |
| 0.0089154243469238 | 0.0031881332397461 | -0.0057272911071777 |
| 0.0088458061218262 | 0.0033829212188721 | -0.0054628849029541 |
| 0.0096502304077148 | 0.0033123493194580 | -0.0063378810882568 |
| 0.0244009494781494 | 0.0078990459442139 | -0.0165019035339355 |
| 0.0057473182678223 | 0.0030648708343506 | -0.0026824474334717 |
| 0.0046644210815430 | 0.0031671524047852 | -0.0014972686767578 |
| 0.0047345161437988 | 0.0029947757720947 | -0.0017397403717041 |
| 0.0048987865447998 | 0.0030910968780518 | -0.0018076896667481 |
| 0.0067203044891357 | 0.0029950141906738 | -0.0037252902984619 |
| 0.0048406124114990 | 0.0030899047851563 | -0.0017507076263428 |
| 0.0047273635864258 | 0.0029575824737549 | -0.0017697811126709 |
| 0.0061659812927246 | 0.0030968189239502 | -0.0030691623687744 |
| 0.0046279430389404 | 0.0030541419982910 | -0.0015738010406494 |
| 0.0047345161437988 | 0.0031464099884033 | -0.0015881061553955 |
| 0.0069465637207031 | 0.0029978752136230 | -0.0039486885070801 |
| 0.0075678825378418 | 0.0031361579895020 | -0.0044317245483398 |
| 0.0048172473907471 | 0.0030024051666260 | -0.0018148422241211 |
| 0.0048902034759521 | 0.0030891895294189 | -0.0018010139465332 |
| 0.0048506259918213 | 0.0029928684234619 | -0.0018577575683594 |
| 0.0049655437469482 | 0.0031177997589111 | -0.0018477439880371 |
| 0.0050978660583496 | 0.0030713081359863 | -0.0020265579223633 |
| 0.0050847530364990 | 0.0031337738037109 | -0.0019509792327881 |
| 0.0047094821929932 | 0.0030446052551270 | -0.0016648769378662 |
| 0.0046484470367432 | 0.0032041072845459 | -0.0014443397521973 |
| 0.0062952041625977 | 0.0029749870300293 | -0.0033202171325684 |
| 0.0047221183776855 | 0.0030632019042969 | -0.0016589164733887 |
| 0.0046348571777344 | 0.0029704570770264 | -0.0016644001007080 |
| 0.0047054290771484 | 0.0030844211578369 | -0.0016210079193115 |
| 0.0045263767242432 | 0.0030784606933594 | -0.0014479160308838 |
| 0.0047385692596436 | 0.0031981468200684 | -0.0015404224395752 |
| 0.0048210620880127 | 0.0030486583709717 | -0.0017724037170410 |
| 0.0045921802520752 | 0.0031194686889648 | -0.0014727115631104 |
| 0.0047745704650879 | 0.0030097961425781 | -0.0017647743225098 |
| 0.0049242973327637 | 0.0031056404113770 | -0.0018186569213867 |
| 0.0046339035034180 | 0.0029680728912354 | -0.0016658306121826 |
| 0.0048007965087891 | 0.0063762664794922 | 0.0015754699707031 |
| 0.0047740936279297 | 0.0031573772430420 | -0.0016167163848877 |
| 0.0047769546508789 | 0.0030333995819092 | -0.0017435550689697 |
| 0.0073404312133789 | 0.0030534267425537 | -0.0042870044708252 |
| 0.0077805519104004 | 0.0041611194610596 | -0.0036194324493408 |
| 0.0048308372497559 | 0.0030725002288818 | -0.0017583370208740 |
| 0.0047106742858887 | 0.0032036304473877 | -0.0015070438385010 |
| 0.0047028064727783 | 0.0045213699340820 | -0.0001814365386963 |
| 0.0046601295471191 | 0.0031633377075195 | -0.0014967918395996 |
| 0.0045568943023682 | 0.0031034946441650 | -0.0014533996582031 |
| 0.0048530101776123 | 0.0035943984985352 | -0.0012586116790772 |
| 0.0046441555023193 | 0.0034363269805908 | -0.0012078285217285 |
| 0.0048103332519531 | 0.0034394264221191 | -0.0013709068298340 |
| 0.0047457218170166 | 0.0033185482025146 | -0.0014271736145020 |
| 0.0047028064727783 | 0.0033512115478516 | -0.0013515949249268 |
| 0.0046455860137939 | 0.0030872821807861 | -0.0015583038330078 |
| 0.0047202110290527 | 0.0031988620758057 | -0.0015213489532471 |
| 0.0046472549438477 | 0.0030579566955566 | -0.0015892982482910 |
| 0.0044870376586914 | 0.0032095909118652 | -0.0012774467468262 |
| 0.0066292285919189 | 0.0030336380004883 | -0.0035955905914307 |
| 0.0045423507690430 | 0.0031654834747314 | -0.0013768672943115 |
| 0.0057387351989746 | 0.0030357837677002 | -0.0027029514312744 |
| 0.0047850608825684 | 0.0033471584320068 | -0.0014379024505615 |
| 0.0044982433319092 | 0.0030486583709717 | -0.0014495849609375 |
| 0.0045578479766846 | 0.0031671524047852 | -0.0013906955718994 |
| 0.0064446926116943 | 0.0030281543731689 | -0.0034165382385254 |
| 0.0066962242126465 | 0.0031011104583740 | -0.0035951137542725 |
| 0.0046284198760986 | 0.0030128955841064 | -0.0016155242919922 |
| 0.0046155452728271 | 0.0030755996704102 | -0.0015399456024170 |
| 0.0045783519744873 | 0.0029969215393066 | -0.0015814304351807 |
| 0.0047678947448730 | 0.0030941963195801 | -0.0016736984252930 |
| 0.0046646595001221 | 0.0030879974365234 | -0.0015766620635986 |
| 0.0045282840728760 | 0.0031120777130127 | -0.0014162063598633 |
| 0.0047643184661865 | 0.0030353069305420 | -0.0017290115356445 |
| 0.0049471855163574 | 0.0030617713928223 | -0.0018854141235352 |
| 0.0045933723449707 | 0.0030624866485596 | -0.0015308856964111 |
| 0.0044887065887451 | 0.0031399726867676 | -0.0013487339019775 |
| 0.0046794414520264 | 0.0030252933502197 | -0.0016541481018066 |
| 0.0047705173492432 | 0.0030777454376221 | -0.0016927719116211 |
| 0.0045073032379150 | 0.0029671192169189 | -0.0015401840209961 |
| 0.0047283172607422 | 0.0030846595764160 | -0.0016436576843262 |
| 0.0045824050903320 | 0.0030794143676758 | -0.0015029907226563 |
| 0.0047013759613037 | 0.0031030178070068 | -0.0015983581542969 |
| 0.0068626403808594 | 0.0030944347381592 | -0.0037682056427002 |
| 0.0079987049102783 | 0.0031914710998535 | -0.0048072338104248 |
| 0.0101602077484130 | 0.0030560493469238 | -0.0071041584014892 |
| 0.0124363899230957 | 0.0031447410583496 | -0.0092916488647461 |
| 0.0094563961029053 | 0.0031492710113525 | -0.0063071250915527 |
So the total time over these 160 executions is reduced by 0.555660486221313 seconds.
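As a rough illustration of how such paired timings can be summarized, the sketch below computes the total saving, the mean per-call times, and the overall speedup ratio. The `summarize` function is a hypothetical helper, and only the first three rows of the table are used as sample input, so its output does not reproduce the total quoted above.

```python
from statistics import mean


def summarize(before, after):
    """Summarize paired per-call timings (in seconds)."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    # Negative diff means the call got faster, as in the table's Diff column.
    diffs = [a - b for b, a in zip(before, after)]
    return {
        "diffs": diffs,
        "total_saved": -sum(diffs),
        "mean_before": mean(before),
        "mean_after": mean(after),
        "speedup": sum(before) / sum(after),
    }


# First three rows of the table above (sample only, not the full 160 rows).
before = [0.0421395301818847, 0.0091154575347900, 0.0087890625000000]
after = [0.0186080932617187, 0.0033469200134277, 0.0034732818603516]

stats = summarize(before, after)
print(f"total saved: {stats['total_saved']:.6f}s")
print(f"speedup: {stats['speedup']:.2f}x")
```

Reporting the ratio alongside the absolute saving helps readers judge the change independently of how many times the code path runs.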