DesktopCommanderMCP
Impossible to read heavy HTML pages
Example:
```json
{
  "path": "https://www.amazon.es/-/en/dp/B081ZKNYY1/?coliid=I2COYY6AELQXYP&colid=3KN6H693EIOW6&ref_=list_c_wl_lv_ov_lig_dp_it&th=1",
  "isUrl": true
}
```
Got:
result exceeds maximum length of 1048576
It seems to be a limit in the HTTP client.
Yeah, that limit is there because a page of that size would already eat 25% of the chat context. If the page is any larger, you risk not being able to do anything useful with it.
I was thinking of improving this over time. But a question: what is it that you are trying to do? Do you care about the CSS/HTML/JavaScript on that page, or only about the text?
Pulling the full page in will not work, only pulling it in parts will. To do this better, I need to understand what people are trying to do, so the interface can intelligently pull only the needed parts of the page.
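For context, the cap works roughly like this (a simplified sketch with hypothetical names, not the actual code):

```ts
// Rough sketch: fetch a URL and refuse to return more than a fixed number
// of characters, so a single page cannot swallow a large share of the
// chat context. MAX_RESULT_LENGTH is a hypothetical name for the 1 MB cap.
const MAX_RESULT_LENGTH = 1_048_576;

async function fetchUrlCapped(url: string): Promise<string> {
  const res = await fetch(url); // global fetch, Node 18+
  const body = await res.text();
  if (body.length > MAX_RESULT_LENGTH) {
    throw new Error(`result exceeds maximum length of ${MAX_RESULT_LENGTH}`);
  }
  return body;
}
```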
I have a similar issue. When I look at the MCP log, the content is very big. It says Claude's response was interrupted.
@wonderwhy-er I don't care about any CSS/HTML/JavaScript; I need only the text. I ran into this issue while trying to analyze some text from an HTML page. There was actually not much real content, but because of all the junk (CSS/JS), the page didn't fit within the limit.
@romnovi Yeah, I am thinking of adding options to request only parts of the page, so you can get text only, Markdown style, or other reduced forms.
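For example, a text-only mode could be as simple as something like this (a minimal sketch, not what will actually ship):

```ts
// Reduce a fetched page to plain text before returning it to the model:
// strip <script> and <style> blocks, drop remaining tags, and collapse
// whitespace. Most of the "junk" bytes on heavy pages disappear this way.
function htmlToText(html: string): string {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ') // remove inline JavaScript
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')   // remove inline CSS
    .replace(/<[^>]+>/g, ' ')                    // drop remaining tags
    .replace(/&nbsp;/g, ' ')                     // decode the most common entity
    .replace(/\s+/g, ' ')                        // collapse whitespace
    .trim();
}
```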
I have a similar problem, but not with URL requests. Some of my JS files have almost 5k lines of code, and unfortunately it can't read all of them. I edited the project to suit my own needs so that it reads in 90k parts. But when desktop-commander updates, sadly I lose access to these changes. I would be very happy if you could add this as an option in the config.
Is it 90k characters?
I would usually recommend splitting files into smaller ones. I try not to go over 300 lines per file. That helps LLMs read only what is needed instead of loading large files.
But we can add this to the configuration, or think of a better way to let you opt out of that "best practice".
Yes, it gives a maximum-length error at 100k characters. I read in 90k-character chunks, and at the end of each chunk I tell it that the file has not been fully read yet and that it should read again from the next character start position. That way it can read an entire file in two, at most three, reads. The reason I kept it at 90k is that the prompts related to read_file are added to the beginning and end of the content, so 90k makes sure the total stays under 100k. If it is possible, it would be good to at least have a config option for reading large files; if you want, I can share the change I made.
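Roughly, the workaround looks like this (a simplified sketch with hypothetical names, not my actual patch):

```ts
import { open } from 'node:fs/promises';

// Read CHUNK_SIZE bytes starting at startPos and, if the file is not
// finished, append a note telling the model to request the next chunk.
// Note: this reads bytes, so a multi-byte UTF-8 character sitting on a
// chunk boundary could be split in this simplified version.
const CHUNK_SIZE = 90_000; // 90k leaves headroom under the 100k limit for
                           // the prompts wrapped around the file content

async function readFileChunk(path: string, startPos = 0): Promise<string> {
  const handle = await open(path, 'r');
  try {
    const { size } = await handle.stat();
    const buf = Buffer.alloc(Math.max(0, Math.min(CHUNK_SIZE, size - startPos)));
    await handle.read(buf, 0, buf.length, startPos);
    let out = buf.toString('utf8');
    const nextPos = startPos + buf.length;
    if (nextPos < size) {
      out += `\n[file not fully read: call read_file again with start pos ${nextPos}]`;
    }
    return out;
  } finally {
    await handle.close();
  }
}
```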
We just released a version that allows reading files of any size in chunks and changes the defaults for how much is read at once.
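For reference, a chunked read call looks roughly like this; I'm assuming offset/length-style parameters here, so check the read_file tool schema for the exact names and units:

```json
{
  "path": "/path/to/large-file.js",
  "offset": 0,
  "length": 1000
}
```

Repeating the call with an increased offset walks through the file chunk by chunk.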