
Rewrite this one in C(++) or reduce functionality

Open akidee opened this issue 15 years ago • 5 comments

Doing it the quick and dirty way, like this http://github.com/tautologistics/node-htmlparser/blob/master/utils_example.js, is so much faster! I have ported a piece of code from PHP to node, and Apricot is really slow by comparison when processing many large HTML files.

akidee avatar Aug 02 '10 23:08 akidee

Rewrite what? Are you able to provide some benchmarks? If you just want a fast parser, use htmlparser.

silentrob avatar Aug 03 '10 16:08 silentrob

Yes, I am still using htmlparser for parsing and it's pretty fast. I have tried to get elements with Sizzle selectors. Doing it manually with the rudimentary DOM support of htmlparser is many times faster. I will provide you with some benchmarks in a few days.

akidee avatar Aug 03 '10 19:08 akidee
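
For readers wondering what "doing it manually" might look like, here is a minimal sketch of collecting elements by tag name with a plain tree walk instead of a selector engine. It is not code from this thread: the function name `findByTagName`, the sample tree, and the node shape (`type`, `name`, `attribs`, `children`) are assumptions made for illustration and may not match what htmlparser actually produces.

```javascript
// Hypothetical node shape, assumed for this sketch:
//   { type: 'tag' | 'text', name?: string, attribs?: object, children?: array }

// Collect every element with the given tag name, in document order.
function findByTagName(dom, tagName) {
  const wanted = tagName.toLowerCase();
  const found = [];
  // Explicit stack instead of recursion, so deeply nested documents
  // cannot overflow the call stack.
  const stack = dom.slice().reverse();
  while (stack.length > 0) {
    const node = stack.pop();
    if (
      node.type === 'tag' &&
      typeof node.name === 'string' &&
      node.name.toLowerCase() === wanted
    ) {
      found.push(node);
    }
    const children = node.children || [];
    // Push children in reverse so they are popped in document order.
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  }
  return found;
}

// Tiny hand-built tree standing in for real parser output.
const sampleDom = [
  {
    type: 'tag',
    name: 'div',
    children: [
      {
        type: 'tag',
        name: 'a',
        attribs: { href: '/first' },
        children: [{ type: 'text', data: 'First' }],
      },
      {
        type: 'tag',
        name: 'p',
        children: [
          { type: 'tag', name: 'a', attribs: { href: '/second' }, children: [] },
        ],
      },
    ],
  },
];

const links = findByTagName(sampleDom, 'a');
console.log(links.map((node) => node.attribs.href));
```

A walk like this visits each node once and does no selector parsing or matching, which is one plausible reason a hand-rolled lookup could beat a general selector engine for simple tag queries.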

I'm having some performance troubles as well. Consider the following code:

https://gist.github.com/97db243b2ba3a3f9f458

time node index.js
Documented loaded
Elements found

real    0m15.752s
user    0m12.399s
sys     0m0.061s

Pretty much all of those 15 seconds are spent executing the find('a') call on the document, so something seems wrong here.

--fg

felixge avatar Aug 23 '10 14:08 felixge

Interesting, thanks, I'll dig in.

silentrob avatar Aug 23 '10 15:08 silentrob

Thanks for the quick reply. I was thinking of creating a small node app that lists all existing node.js modules by scraping various sources and lets you sort things by GitHub forks or Google backlinks.

felixge avatar Aug 23 '10 15:08 felixge