
DFC: DirectoryUsage issues

Open andresailer opened this issue 10 years ago • 8 comments

Two Issues with the DirectoryUsage for the DFC:

a) I noticed some errors in the DirectoryUsage overview in the FileCatalogCLI

FC:/> size -l
directory: /
Logical Size: 2,533,639,186,906,667 Files: 11108048 Directories: 213751
    StorageElement                        Size                   Replicas 
=========================================================================
[...]
  8 PNNL-SRM                              -2,474,098,350,160     -118993 
[...]

Note the negative size and replica count. This was probably caused by my unregistering all the replicas on PNNL-SRM; in some cases I unregistered replicas that were no longer present at PNNL-SRM, but they were still subtracted from the usage.
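To illustrate the suspected failure mode, here is a minimal sketch that uses plain Python dictionaries as stand-ins for the replica and usage tables. The names `replicas`, `usage`, and `unregister_replica` are hypothetical and are not taken from the DFC code; the point is only that the counters should be decremented when a replica row was actually removed, and left alone otherwise.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the replica and usage tables.
replicas = {("/lfn/a", "PNNL-SRM"): 100}  # (LFN, SE) -> file size in bytes
usage = defaultdict(lambda: {"size": 0, "replicas": 0})
usage["PNNL-SRM"] = {"size": 100, "replicas": 1}


def unregister_replica(lfn, se):
    """Remove a replica; update usage only if a row was really deleted."""
    size = replicas.pop((lfn, se), None)
    if size is None:
        # Nothing was registered for this LFN at this SE:
        # leave the usage counters untouched.
        return False
    usage[se]["size"] -= size
    usage[se]["replicas"] -= 1
    return True


unregister_replica("/lfn/a", "PNNL-SRM")  # existing replica: counters decrease
unregister_replica("/lfn/a", "PNNL-SRM")  # repeated call: counters unchanged
```

Without the `size is None` guard, the second call would subtract again and push the counters below zero, which matches the symptom in the listing above.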

b) When I tried to rebuild the directory usage, this happened:

FC:/> rebuild catalog
Error: Socket read timeout exceeded

At first the DirectoryUsage was basically empty, but it has since filled up again. Is the DirectoryUsage table slowly being rebuilt? I can see some discrepancies between what I get directly from the FC_Files or FC_Replicas tables (number of replicas per SE, total size) and what "size -l" shows.

andresailer avatar May 22 '15 15:05 andresailer

In your case the DirectoryUsage somehow got out of sync with the actual contents of the catalog. Rebuilding the table was the right thing to do. This is a lengthy operation, so a timeout is to be expected (I will increase the timeout for this command). The rebuild recreates the DirectoryUsage tables completely. In the normal course of operations, the table is updated each time a file or replica is added to or removed from the catalog, so it should be in perfect sync with what you get from the database tables. If you see discrepancies, the table can be rebuilt. If you see discrepancies systematically, then there is a problem that we have to identify and address appropriately.
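As a rough illustration of how such discrepancies could be detected, the sketch below recomputes per-SE totals from a list of replica rows and compares them with a cached usage table. The row layout `(se, size)` and the function names are assumptions made for this example; they do not reflect the actual DFC schema or API.

```python
from collections import defaultdict


def usage_from_replicas(rows):
    """Aggregate (se, size) replica rows into {se: (total_size, n_replicas)}."""
    totals = defaultdict(lambda: [0, 0])
    for se, size in rows:
        totals[se][0] += size
        totals[se][1] += 1
    return {se: tuple(values) for se, values in totals.items()}


def find_discrepancies(cached, rows):
    """Return {se: (cached_value, recomputed_value)} wherever the two differ."""
    actual = usage_from_replicas(rows)
    return {
        se: (cached.get(se), actual.get(se))
        for se in set(cached) | set(actual)
        if cached.get(se) != actual.get(se)
    }


# Hypothetical data: SE-B has drifted in the cached table.
rows = [("SE-A", 10), ("SE-A", 5), ("SE-B", 7)]
cached = {"SE-A": (15, 2), "SE-B": (-7, -1)}
diff = find_discrepancies(cached, rows)
```

An empty result would mean the cached table agrees with the replica rows; any entry points at an SE whose usage needs to be rebuilt.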

atsareg avatar May 23 '15 21:05 atsareg

This is rather a topic for discussion in the DIRAC forum. Closing it here.

atsareg avatar May 24 '15 11:05 atsareg

As I said, I am pretty sure the negative size and number of files are caused by calling "unregister replica LFN SE" for replicas no longer present at the given SE, so this is a bug that needs to be fixed.

I think the rebuild command does not work properly when there are other file additions in the catalog while the command is running.

If it is normal to get a timeout, why not print a message telling the user about possible timeouts?

PS: The rebuild command needs an argument and crashes otherwise:

FC:/> rebuild
Traceback (most recent call last):
  File "/home/sailer/software/DIRAC/DiracDevV6r12/DIRAC/DataManagementSystem/scripts/dirac-dms-filecatalog-cli.py", line 57, in <module>
    cli.cmdloop()
  File "/home/sailer/software/DIRAC/DiracDevV6r12/Linux_x86_64_glibc-2.12/lib/python2.6/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/home/sailer/software/DIRAC/DiracDevV6r12/Linux_x86_64_glibc-2.12/lib/python2.6/cmd.py", line 219, in onecmd
    return func(arg)
  File "/home/sailer/software/DIRAC/DiracDevV6r12/DIRAC/DataManagementSystem/Client/FileCatalogClientCLI.py", line 2447, in do_rebuild
    _option = argss[0]
IndexError: list index out of range
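A minimal sketch of how the crash could be avoided, using the standard-library `cmd.Cmd` class that appears in the traceback. The class name `CatalogCLISketch`, the usage text, and the `last_option` attribute are hypothetical; only the `argss[0]` access mirrors the line shown above.

```python
import cmd


class CatalogCLISketch(cmd.Cmd):
    """Hypothetical stand-in for the catalog CLI, not the real DIRAC class."""

    last_option = None

    def do_rebuild(self, args):
        """rebuild <option>"""
        argss = args.split()
        if not argss:
            # No argument given: print usage instead of raising IndexError.
            print("Usage: rebuild <option>")
            return
        _option = argss[0]
        self.last_option = _option  # placeholder for the actual rebuild call


cli = CatalogCLISketch()
cli.onecmd("rebuild")          # prints the usage line, no traceback
cli.onecmd("rebuild catalog")  # stores "catalog" in last_option
```

Checking for an empty argument list before indexing keeps the interactive session alive when the argument is missing.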

andresailer avatar May 25 '15 12:05 andresailer

https://groups.google.com/forum/#!topic/diracgrid-forum/bXH45l-9ofw Bump

andresailer avatar Sep 06 '16 14:09 andresailer

Still the case ?

chaen avatar Aug 01 '18 09:08 chaen

I think so...

andresailer avatar Aug 01 '18 09:08 andresailer

@atsareg ping

chaen avatar Aug 20 '19 10:08 chaen

pong

chaen avatar Dec 06 '19 08:12 chaen