anitya

You are going to be blocked from scraping MetaCPAN.org

Open ranguard opened this issue 7 months ago • 3 comments

Hi,

Please could you review your bot? This is 24 hours of its traffic hitting MetaCPAN.org:

Image

You are not remembering permanent redirects and are causing load we do not want to handle.
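For illustration only, "remembering permanent redirects" could look something like the sketch below. This is not Anitya's actual code: the `RedirectCache` class, the injected `fetch` callable, and the example.org URLs are all hypothetical, and the cache lives only in memory. The idea is that once a URL answers with 301 or 308, the client records the target and goes straight to it on later checks instead of requesting the old URL again.

```python
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urljoin

PERMANENT = {301, 308}                  # redirects a client may remember
REDIRECTS = {301, 302, 303, 307, 308}   # all redirects we follow

# Hypothetical fetcher: takes a URL, returns (status code, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


class RedirectCache:
    """Follow redirects, remembering permanent ones for later requests."""

    def __init__(self, fetch: Fetcher, max_hops: int = 5) -> None:
        self._fetch = fetch
        self._max_hops = max_hops
        self._permanent: Dict[str, str] = {}

    def canonical(self, url: str) -> str:
        """Resolve a URL through remembered permanent redirects, without any I/O."""
        seen = set()
        while url in self._permanent and url not in seen:
            seen.add(url)
            url = self._permanent[url]
        return url

    def get(self, url: str) -> Tuple[str, int]:
        """Return (final URL, status), skipping hops that are already known."""
        url = self.canonical(url)
        for _ in range(self._max_hops):
            status, location = self._fetch(url)
            if status not in REDIRECTS or not location:
                return url, status
            target = urljoin(url, location)  # Location may be relative
            if status in PERMANENT:
                self._permanent[url] = target  # temporary redirects are not stored
            url = target
        raise RuntimeError("too many redirects")
```

With a cache like this, the first check of a project costs two requests (the redirect plus the target) and every later check costs one. A real client would also need to persist the mapping between runs; that part is left out here.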

We will be blocking your scraper tomorrow - please update this ticket once the issue has been rectified and we will re-enable access.

Please also review whether scraping is the best approach, or whether the information published by https://pause.perl.org/pause/query?ACTION=pause_04about#indexer would be sufficient.

Kind regards

MetaCPAN team

ranguard avatar May 24 '25 20:05 ranguard

Thanks for letting me know about the issue; I wasn't aware of it. For metacpan.org, we check for new versions using the URL https://metacpan.org/release/{project.name}/. I will look at https://pause.perl.org to see whether it would be a better match for Anitya.

Sorry for the inconvenience.

Zlopez avatar May 26 '25 10:05 Zlopez

What's the status here? Seems like most of the CPAN version checks are failing.

opoplawski avatar Jun 21 '25 22:06 opoplawski

The patch with the fix is merged; I just need to find time to release a new Anitya version.

Zlopez avatar Jun 23 '25 11:06 Zlopez

@ranguard: #1904 was released, reducing the proportion of requests that will result in a redirection. Is this sufficient for MetaCPAN to unblock Anitya?

mavit avatar Jul 18 '25 10:07 mavit

Hi All,

Thanks for taking action.

I have removed the block. If there are any issues, please email noc @ metacpan.org

All the best

MetaCPAN team

ranguard avatar Jul 19 '25 08:07 ranguard

The checks are still failing as they did when they were blocked (see, e.g., https://release-monitoring.org/project/8399/). I have emailed noc @ metacpan.org as requested.

mavit avatar Jul 22 '25 12:07 mavit

I think this has been resolved - permanently this time!

Please email again if not.

ranguard avatar Jul 27 '25 12:07 ranguard