alluxio
JobMaster leaks memory when running too many distributedLoad jobs
Alluxio Version: 2.9.3
Describe the bug
After a large number of distributedLoad jobs are submitted in a production environment, the job master leaks memory and eventually fails with an OOM.
To Reproduce
1. Set up an Alluxio cluster with 1 master and 3 workers.
2. Create a large number of small files in the under file system.
3. Submit a large number of distributedLoad jobs. Note: use batchsize=1 as the load argument.
4. Observe the memory usage and GC activity of the JobMaster.
Expected behavior
JobMaster memory should be released once the jobs have finished. Instead, memory usage keeps increasing until the maximum memory size is reached, eventually causing an OOM.
Urgency
Yes
Are you planning to fix it
Yes
Additional context
The cause of this bug is that residual job information in mInfoMap is never deleted, so entries accumulate as more jobs are submitted.
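To illustrate the pattern described above, here is a minimal, self-contained sketch of a job-info map that retains entries after jobs finish, next to a variant that removes them. It is not Alluxio's actual code: `JobInfoLeakSketch`, `JobTracker`, `JobInfo`, and `retainedAfter` are hypothetical names used only for illustration, and removing the entry on completion is just one possible fix (a real fix might instead purge entries after a retention period).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JobInfoLeakSketch {
    /** Hypothetical stand-in for per-job bookkeeping; not an Alluxio class. */
    static final class JobInfo {
        final long jobId;
        volatile boolean finished;

        JobInfo(long jobId) {
            this.jobId = jobId;
        }
    }

    /** Hypothetical tracker; infoMap plays the role of mInfoMap from the report. */
    static final class JobTracker {
        private final Map<Long, JobInfo> infoMap = new ConcurrentHashMap<>();
        private final boolean removeOnFinish;

        JobTracker(boolean removeOnFinish) {
            this.removeOnFinish = removeOnFinish;
        }

        void submit(long jobId) {
            infoMap.put(jobId, new JobInfo(jobId));
        }

        void finish(long jobId) {
            JobInfo info = infoMap.get(jobId);
            if (info == null) {
                return;
            }
            info.finished = true;
            // Without this removal, every finished job stays reachable
            // through the map and can never be garbage collected.
            if (removeOnFinish) {
                infoMap.remove(jobId);
            }
        }

        int retained() {
            return infoMap.size();
        }
    }

    /** Submits and finishes the given number of jobs, then reports how many entries remain. */
    static int retainedAfter(int jobs, boolean removeOnFinish) {
        JobTracker tracker = new JobTracker(removeOnFinish);
        for (long id = 0; id < jobs; id++) {
            tracker.submit(id);
            tracker.finish(id);
        }
        return tracker.retained();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + retainedAfter(100_000, false)); // prints "leaky: 100000"
        System.out.println("fixed: " + retainedAfter(100_000, true));  // prints "fixed: 0"
    }
}
```

With batchsize=1, each file becomes its own job, so the number of retained entries in the leaky variant grows in step with the number of files loaded.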