Radim Řehůřek

318 comments by Radim Řehůřek
@pminervini almost certainly; let us know if you find something 8)

@pminervini GraphChi sounds great, thanks for the link! Also check out gensim for fast SVD (gensim targets topic modelling though, not collaborative filtering).

I don't think so – the README links work.

@kmike this version seems to work – what's your plan to merge & release? The wheels (incl Windows) are not blocking for us – installing from source works fine once...

@cadnce could we squeeze this into the next release, v6.3.0? We're planning to release soon (~in a week or so).

@iliadmitriev this repo seems no longer maintained… would you like to start a fork?

@pauldmccarthy also bitten by this… is it enough if I manually call `seek(tell())` after each read, or is the workaround more involved? **EDIT**: OK I tried the above and it...
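For readers unfamiliar with the pattern, here is a minimal sketch of what "call `seek(tell())` after each read" looks like in practice. It uses the standard library's `gzip` module as a stand-in, since the behaviour of `indexed_gzip` itself is exactly what is in question here; the helper name `read_and_reseek` is made up for illustration, and nothing in this sketch confirms whether the workaround is sufficient.

```python
import gzip
import io


def read_and_reseek(fileobj, size):
    """Read up to `size` bytes, then seek to the current offset (hypothetical helper)."""
    data = fileobj.read(size)
    # Seeking to the position we are already at is a logical no-op for the
    # data stream; the open question is whether it has side effects inside
    # the file object (e.g. on its index or buffers).
    fileobj.seek(fileobj.tell())
    return data


# Stand-in demo with the stdlib gzip module (not indexed_gzip).
payload = b"abcdefghij" * 1000
compressed = gzip.compress(payload)

chunks = []
with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
    while True:
        chunk = read_and_reseek(f, 4096)
        if not chunk:
            break
        chunks.append(chunk)

result = b"".join(chunks)
```

The data read this way is identical to a plain sequential read; only the extra `seek` call differs.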

Sure, thanks for looking into this. I created this minimal example:

```python
import io
import tarfile
import random
from string import ascii_letters

from indexed_gzip import IndexedGzipFile

def generate_tgz(path, num_files=10000, file_size=5*1024):
    ...
```

My running hypothesis is that it's somehow related to input buffering – either inside tarfile, or igzip's buffer (set to 4 MB above).

The problem is seek points are not being created as the file is being read (which is why I'm hijacking this particular issue, I thought it's related). Instead, there are...