
URL changed when I set the url property of the WebPage's get method.

Open · zengyinggang opened this issue 4 years ago · 1 comment

Hi, my URL contains a path segment that ends in two dots. The program drops those dots and builds a wrong asset URL. For example, my URL is https://example.com/rims/pro/0aff25a9b7d8705d99d558e82a19f8f8/sec/HQv2cFXZHwS6kcNVuquD6etOFRPDO7kvo_XJ6lzbxFMYxHUNy3xND7XT9Hlpqvl04hcf77j9NqhV7bF5cF129THtfGkM4rvQBOUKqT027uIuN4A7M8rvNHupBhay1QNyenlkLVk3kipkNnS1urCAHg../sed/tipps/html/0d-1568269-master.html?docuNo=7cdeaaa7e7d69c174aca6a55b1221310

You can see that there are two dots just before "/sed". After I crawl the website, some of the asset URLs are changed: the "../" before "sed" is gone. For example: https://example.com/rims/pro/0aff25a9b7d8705d99d558e82a19f8f8/sec/HQv2cFXZHwS6kcNVuquD6etOFRPDO7kvo_XJ6lzbxFMYxHUNy3xND7XT9Hlpqvl04hcf77j9NqhV7bF5cF129THtfGkM4rvQBOUKqT027uIuN4A7M8rvNHupBhay1QNyenlkLVk3kipkNnS1urCAHgsed/tipps/assets/scss/hst2-param.css

The CSS link's href is "../assets/scss/hst2-param.css"
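To make the difference concrete, here is a minimal sketch of how the href would resolve under standard URL joining, compared with one guess at how the reported URL could arise. The shortened URL, the `TOKEN..` segment, and `page.html` are hypothetical stand-ins for the real values above, and the substring removal is only an assumed failure mode, not pywebcopy's confirmed implementation.

```python
from urllib.parse import urljoin

# Hypothetical, shortened stand-in for the page URL in this report.
# "TOKEN.." plays the role of the long path segment that ends in two dots.
page_url = "https://example.com/rims/pro/sec/TOKEN../sed/tipps/html/page.html?docuNo=1"
href = "../assets/scss/hst2-param.css"

# Standard resolution: only a segment that is exactly ".." means
# "parent directory", so "TOKEN.." is left untouched.
resolved = urljoin(page_url, href)
print(resolved)

# Assumed failure mode (a guess): deleting every "../" substring
# also eats the dots and slash at the end of "TOKEN../".
naive = resolved.replace("../", "")
print(naive)
```

The second result has the same shape as the broken URL above: the segment ending in two dots is fused with "sed".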

zengyinggang, Oct 05 '21 07:10

It isn't ignoring the dots. This is a basic security measure that keeps the program from touching user files outside the download directory. If this protection were removed, downloaded files could be saved to, read from, or deleted from unexpected directories. @zengyinggang
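For illustration only, a path-traversal guard of this kind can work on whole path segments, which would keep a segment such as `TOKEN..` intact while still discarding real `..` segments. The function name `safe_local_path` and the host-then-path layout are hypothetical; this is a sketch of the general technique, not pywebcopy's code, and it does not decode percent-encoded segments.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def safe_local_path(url: str) -> PurePosixPath:
    """Map a URL to a relative save path (hypothetical helper).

    Only segments that are exactly "", "." or ".." are dropped, so the
    result can never climb above the download root, while names that
    merely contain dots are preserved.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    return PurePosixPath(parts.netloc, *segments)


# A segment that ends in dots survives unchanged.
print(safe_local_path("https://example.com/a/TOKEN../b.css"))
# Real traversal segments are discarded.
print(safe_local_path("https://example.com/../../etc/passwd"))
```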

rajatomar788, Oct 09 '21 08:10