pdns icon indicating copy to clipboard operation
pdns copied to clipboard

batch deletion of records in a zone during AXFR to create smaller sql requests which do not reach gmysql-timeout

Open martinbarlow opened this issue 3 years ago • 3 comments

Authoritative server as secondary handle large zones

If a zone is massive (millions of records) it can take several minutes to purge a zone as part of an AXFR. This can exceed the default gmysql-timeout of 10 seconds.

It would be better to perform smaller batches of deletes, within a transaction.

Usecase

handle large zones on secondary servers, without having to increase gmysql-timeout to very large values. (5-10 minutes)

Description

Increasing gmysql-timeout to very large numbers does not feel optimal. It would be better to batch delete records in a zone in a transaction, instead of trying to do one big query which can take many minutes.

martinbarlow avatar Aug 09 '22 09:08 martinbarlow

I don't follow how doing more, smaller, queries would be more optimal than one big query. (But I haven't yet looked at what we do exactly). Can you explain why it would be better?

Habbie avatar Aug 09 '22 09:08 Habbie

habbie, thanks for your consideration.

Today when there is an AXFR for a large zone on secondary, it does

delete from records where domain_id=?:

for a very large zone (millions of records) , this can take many minutes and will take longer than gmysql-timeout

https://www.rathishkumar.in/2017/12/deleting-huge-number-of-records-in-mysql.html

instead do

begin; while more records in table: delete from records where domain_id=? limit 10000 commit;

increasing the gmysql-timeout to very large numbers does not seem ideal, as it likely causes problems in other places or mask real issues with database. gmysql-timeout is also a semi hidden internal timeout, that folks can hit arbitrarily without knowing the ability exists to hit it in normal operations.

martinbarlow avatar Aug 09 '22 10:08 martinbarlow

Ok, understood. The delete would actually take slightly longer, but cut up in chunks that do not hit the timeout. This makes sense to me. I will milestone this "auth-helpneeded", which means we consider the question valid, but do not plan to work on it ourselves any time soon - but we will consider patches if people PR them.

Habbie avatar Aug 09 '22 10:08 Habbie