sonic-buildimage

[Marvell-armhf] Fix SDK DMA allocation failure

Open pavannaregundi opened this issue 1 year ago • 6 comments

Why I did it

Fix the DMA memory allocation failure seen when the SDK drivers failed to find size-aligned memory.

pci 0000:01:00.0: dma_alloc_coherent failed to allocate aligned size of 0x200000 for phys0xbf300000
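As a worked check of the log line above, the sketch below tests whether the reported physical address is a multiple of the requested size. The helper names `is_size_aligned` and `misalignment` are hypothetical and are not taken from the driver; the sketch also assumes the size is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr is a multiple of size.
 * Assumes size is a power of two (0x200000 in the log above). */
static bool is_size_aligned(uint64_t addr, uint64_t size)
{
    return (addr & (size - 1)) == 0;
}

/* Hypothetical helper: byte offset of addr past the previous
 * size-aligned boundary (0 when addr is already aligned). */
static uint64_t misalignment(uint64_t addr, uint64_t size)
{
    return addr & (size - 1);
}
```

For the values in the log, `misalignment(0xbf300000, 0x200000)` is `0x100000`: the block starts 1 MB past the 2 MB boundary at `0xbf200000`, so it does not count as size aligned.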

Work item tracking
  • Microsoft ADO (number only):

How I did it

  • Fix the DMA driver allocation failure by adding more retries and moving the free of unaligned allocations to the end of the allocation logic.
  • Increase CMA memory to 32MB to allow for more retries.

How to verify it

Run the SONiC PTF tests with the fixes on the armhf-nokia_ixs7215_52x-r0 platform.

Which release branch to backport (provide reason below if selected)

  • [ ] 201811
  • [ ] 201911
  • [ ] 202006
  • [ ] 202012
  • [ ] 202106
  • [ ] 202111
  • [ ] 202205
  • [ ] 202211
  • [ ] 202305
  • [x] 202405

Tested branch (Please provide the tested image version)

  • [ ]
  • [ ]

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

pavannaregundi avatar Sep 06 '24 09:09 pavannaregundi

@Blueve @Pavan-Nokia Please review

pavannaregundi avatar Sep 06 '24 14:09 pavannaregundi

@pavannaregundi please help update sonic-mgmt test id for above change

Blueve avatar Oct 18 '24 02:10 Blueve

@Blueve We used the following sonic-mgmt commit ID to test this.

commit d0c4549ed4448cada922345c8a028eb8b476daf2 (HEAD -> new_commit, origin/master, origin/HEAD, master)
Author: Bobby McGonigle <[email protected]>
Date:   Wed Aug 7 23:12:15 2024 -0700

pavannaregundi avatar Oct 23 '24 06:10 pavannaregundi

Hi Jing, below are the Kusto test upload IDs:

  • d974dd9a-93a8-47f0-ba89-b5ac48c56832 - Mx
  • 5a0fcfca-3fae-4205-a436-8ccd09719c10 - M0

Pavan-Nokia avatar Oct 25 '24 13:10 Pavan-Nokia

@Pavan-Nokia should the following lines use continue instead? I don't understand how the retry works.

https://github.com/Marvell-switching/mrvl-prestera/compare/36fa3a3f4e317d8c0c111cc74aafffce12e1546d...9608c8c41e462998cd144ed34780e34f1b48e081#diff-d81733e3ed589ee79a5a8edab98acbe7c2cf441a2cb8b8d7381acf179403b372R397

Blueve avatar Oct 31 '24 14:10 Blueve

@Blueve The logic looks for an aligned memory allocation that matches the size of the request. If a non-aligned block is returned, it is stored in an array named non_aligned_dma_arr. If an aligned block is found at any stage, the loop breaks. At the end, the blocks held in non_aligned_dma_arr are freed.
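
The retry logic described above can be sketched roughly as follows. This is a user-space illustration and not the driver code. `mock_allocator`, `mock_alloc`, `mock_free`, `alloc_size_aligned` and `MAX_RETRIES` are hypothetical stand-ins; only the array name non_aligned_dma_arr comes from the explanation. Holding the unaligned blocks until the end presumably keeps a retry from being handed the same block again.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RETRIES 8 /* hypothetical retry budget, not the driver's value */

/* Hypothetical stand-in for the DMA allocator: hands out a fixed
 * sequence of block addresses, one per call. */
typedef struct {
    const uint64_t *addrs; /* addresses returned by successive calls */
    size_t count;          /* number of entries in addrs */
    size_t next;           /* index of the next address to hand out */
    size_t freed;          /* number of blocks released so far */
} mock_allocator;

static uint64_t mock_alloc(mock_allocator *a)
{
    return a->next < a->count ? a->addrs[a->next++] : 0; /* 0 = failure */
}

static void mock_free(mock_allocator *a, uint64_t addr)
{
    (void)addr;
    a->freed++;
}

/* Retry until a size-aligned block turns up. Unaligned blocks are held
 * in non_aligned_dma_arr and released only after the loop ends.
 * Returns the aligned address, or 0 if none was found.
 * Assumes size is a power of two. */
static uint64_t alloc_size_aligned(mock_allocator *a, uint64_t size)
{
    uint64_t non_aligned_dma_arr[MAX_RETRIES];
    size_t held = 0;
    uint64_t result = 0;

    for (size_t attempt = 0; attempt < MAX_RETRIES; attempt++) {
        uint64_t addr = mock_alloc(a);
        if (addr == 0)
            break;                          /* allocator has nothing left */
        if ((addr & (size - 1)) == 0) {
            result = addr;                  /* aligned: stop retrying */
            break;
        }
        non_aligned_dma_arr[held++] = addr; /* keep it held, try again */
    }

    /* Release every unaligned block collected along the way. */
    for (size_t i = 0; i < held; i++)
        mock_free(a, non_aligned_dma_arr[i]);

    return result;
}
```

In this sketch an unaligned result falls through to the next loop iteration, so no explicit continue is needed. With a mock sequence of 0xbf300000, 0xbf500000, 0xbf600000 and a size of 0x200000, the third block is returned and the first two are freed afterwards.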

pavannaregundi avatar Nov 04 '24 03:11 pavannaregundi

ADO: 30148643

Blueve avatar Nov 06 '24 06:11 Blueve

/azpw ms_conflict -f

Blueve avatar Nov 12 '24 05:11 Blueve

/azpw ms_conflict

liushilongbuaa avatar Nov 13 '24 07:11 liushilongbuaa

Cherry-pick PR to 202405: https://github.com/sonic-net/sonic-buildimage/pull/20781

mssonicbld avatar Nov 13 '24 07:11 mssonicbld