Greg Lindahl

Results 33 issues of Greg Lindahl

```
//youtu.be/foo => https://www.youtube.com/watch?v=foo
Also y2u.be works like youtu.be but is not run by google (!) That can't be good
https://redd.it/7tczf9 => https://www.reddit.com/tb/7tczf9
https://app.instapage.com/route/9475232/?url=www.nat.ai/careers => http://www.nat.ai/careers
```
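The three rewrites listed above could be sketched roughly as follows. This is only an illustration: the function name `expand_short_url` is hypothetical, the rules are inferred solely from the example pairs in the issue, and y2u.be is deliberately left out.

```python
from urllib.parse import urlsplit, parse_qs


def expand_short_url(url):
    """Rewrite a few known short-link forms; return the input unchanged otherwise."""
    # Accept scheme-relative input such as //youtu.be/foo for parsing only.
    candidate = "https:" + url if url.startswith("//") else url
    parts = urlsplit(candidate)
    host = (parts.hostname or "").lower()
    path_id = parts.path.strip("/")

    if host == "youtu.be" and path_id:
        return "https://www.youtube.com/watch?v=" + path_id
    if host == "redd.it" and path_id:
        return "https://www.reddit.com/tb/" + path_id
    if host == "app.instapage.com" and parts.path.startswith("/route/"):
        # The real destination is carried in the "url" query argument.
        target = parse_qs(parts.query).get("url")
        if target:
            dest = target[0]
            return dest if "://" in dest else "http://" + dest
    return url


if __name__ == "__main__":
    for example in (
        "//youtu.be/foo",
        "https://redd.it/7tczf9",
        "https://app.instapage.com/route/9475232/?url=www.nat.ai/careers",
    ):
        print(example, "=>", expand_short_url(example))
```

Anything that does not match one of the three patterns is returned exactly as given, so the function is safe to run over arbitrary URLs.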

Right now your configuration list of CGI args is expressed as source. This certainly gets the job done, but it might be a lot prettier to move the list into...

enhancement

Hi. I'm a search engine guy, and I'm very interested in a well-tested list of strippable CGI args to reduce the work my crawler has to do. I tried to...

tracker
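As a rough sketch of what a strippable-argument list kept as data (rather than as source) might look like in use: the JSON layout, the helper names `load_strippable` and `strip_args`, and the four argument names below are all assumptions for illustration, not the project's actual list or format.

```python
import json
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical data file contents; a real list would be longer and well tested.
STRIPPABLE_JSON = '{"strip": ["utm_source", "utm_medium", "utm_campaign", "fbclid"]}'


def load_strippable(text):
    """Parse the JSON document and return the argument names as a frozenset."""
    return frozenset(json.loads(text)["strip"])


def strip_args(url, strippable):
    """Drop every query argument whose name is in `strippable`."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in strippable
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    names = load_strippable(STRIPPABLE_JSON)
    print(strip_args("https://example.com/a?id=7&utm_source=x&fbclid=y", names))
```

Note that `urlencode` re-encodes the arguments it keeps, so a crawler that needs byte-for-byte preservation of the remaining query string would need a different approach.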

## Long story short

I have been fetching the front pages of millions of websites using aiohttp, and collected a large number of cases where aiohttp client's http parser throws...

I was attempting to talk my boss through using `py-spy top` on his long-running process ("ok type ps, now note the pid of your python process, then..."), and it struck...

I'm writing a package using PyAthena that might or might not have extremely large result sets, so I've been interested in memory usage. It seems that the default cursor is...

```
cdxt --cc --from 2021 --to 2020 -v -v --limit 1 iter https://www.pbm.com/
INFO:cdx_toolkit.cli:set loglevel to DEBUG
DEBUG:cdx_toolkit.myrequests:getting https://index.commoncrawl.org/collinfo.json None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): index.commoncrawl.org:443
DEBUG:urllib3.connectionpool:https://index.commoncrawl.org:443 "GET /collinfo.json HTTP/1.1"...
```

Right now the only interface for getting at the record content is `record.content_stream().read()`, which is streaming. I can't do that twice. So if I'm passing a record around in a...
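One possible workaround for the read-once limitation described above is to wrap the stream and cache its bytes on first access. The class `CachedContent` below is hypothetical and uses an `io.BytesIO` as a stand-in for the record's stream; it is a sketch of the idea, not an interface the library provides.

```python
import io


class CachedContent:
    """Read a one-shot stream once and keep the bytes for later readers."""

    def __init__(self, stream):
        self._stream = stream
        self._data = None

    def content(self):
        # The underlying stream is consumed only on the first call.
        if self._data is None:
            self._data = self._stream.read()
        return self._data

    def content_stream(self):
        # Each call returns a fresh, independent stream over the cached bytes.
        return io.BytesIO(self.content())


if __name__ == "__main__":
    one_shot = io.BytesIO(b"<html>hello</html>")  # stand-in for a record stream
    cached = CachedContent(one_shot)
    print(cached.content())
    print(cached.content_stream().read())
```

The trade-off is memory: the whole payload stays resident for as long as the wrapper lives, which may matter for very large records.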

When reporting problems, class DigestChecker expects check_digests to be 'raise' or 'log'; otherwise it prints nothing. It does always set self._passed to False. So the boolean value `check_digests=True` passes all...
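The behaviour described above can be reproduced with a small stand-in. `MiniDigestChecker` is a hypothetical simplification written only from this description, not the library's code; its `messages` list stands in for whatever the real class prints or logs.

```python
class MiniDigestChecker:
    """Simplified stand-in for the reporting logic described in the issue."""

    def __init__(self, check_digests):
        self.check_digests = check_digests
        self._passed = None
        self.messages = []  # stand-in for log output

    def problem(self, message):
        # The failure is always recorded...
        self._passed = False
        # ...but it is only reported for the two recognised string values.
        if self.check_digests == "raise":
            raise ValueError(message)
        elif self.check_digests == "log":
            self.messages.append(message)
        # Any other value, including True, falls through silently.


if __name__ == "__main__":
    checker = MiniDigestChecker(check_digests=True)
    checker.problem("digest mismatch")
    print(checker._passed, checker.messages)
```

With `check_digests=True` the failure is recorded in `_passed` but nothing is raised or logged, which matches the silent behaviour the issue reports.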