Dan Luu

Results 101 issues of Dan Luu

When I run the example in the readme, `checkreturntypes(Base)`, I get `The total number of failed methods in Base is 31`. However, it's difficult to see what the errors are,...

1. Write some examples of how to use `sort`
2. Update `help(sort)` in Julia core.

And other built-in data structures.

This seems like it would screw up phrase queries? An alternate fix would be to add Lucene's stopword list to our parser and submit the appropriately modified phrase query.
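As a rough illustration of the alternate fix, the sketch below strips stopwords from both the phrase query and the document text before matching. The names `STOPWORDS`, `strip_stopwords`, and `phrase_matches` are hypothetical, and the word set is a small stand-in, not Lucene's actual list or this project's parser.

```python
# Illustrative stopword set; a stand-in, NOT Lucene's actual list.
STOPWORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})


def strip_stopwords(text: str) -> list:
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [term for term in text.lower().split() if term not in STOPWORDS]


def phrase_matches(query: str, document: str) -> bool:
    """Check whether the stopword-stripped query occurs as a contiguous
    run of terms in the stopword-stripped document."""
    q = strip_stopwords(query)
    d = strip_stopwords(document)
    if not q:
        # A query made only of stopwords has nothing left to match.
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))
```

One consequence of this approach, under these assumptions, is that a query such as `lord of the rings` would also match text reading `lord and rings`, since both reduce to the same two terms.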

After processing Wikipedia with the fixes as of `274293f3af97c507416f6387020507ee99ca3238`, the tail of the DocFreqTable has a lot of n-grams: ~~~ 724ddeaf8cb3c269,1,0,1.93455e-07,Vasilije Veljko Milovanović e802585d5e004af1,1,0,1.93455e-07,2014 All-Arena Team 7c401744d5d61355,1,1,1.93455e-07,f.a.cortez dafa24ba41b2a01d,1,0,1.93455e-07,Coeliades ramanatek 1a8055b58daaf330,1,0,1.93455e-07,Jeff...

If we look at the Wikipedia dump currently hosted on Azure, the modal number of postings per document is `5`, and things drop off rapidly from there: ~~~ Postings,Count 0,5...

See:
https://en.wikipedia.org/?curid=28831157
https://en.wikipedia.org/?curid=1468119
https://en.wikipedia.org/?curid=31533859
In combination with the `--lists` issue, this is resulting in empty documents instead of one-posting documents. With the `--lists` issue fixed, these should turn into...

For example:
~~~
6d6b8015505c7099,1,1,4.61273e-07,2c_thrissur
3e8f9e5769458e9f,1,1,4.61273e-07,government_medical_college
~~~
We also have terms with double underscores that appear to be some kind of metadata?
~~~
868661c0426526a7,1,1,0.000557102,__noeditsection__
a135c90cbb896da0,1,1,2.97521e-05,__notoc__
14a64ebade034c85,1,1,3.11359e-06,__nogallery__
~~~
As well as weird...

I thought that we were doing this. If we're not and that's on purpose, that's fine, but we have (for example) `downlink`, `downlinks`, and `downlinked` in our DocumentFrequencyTable when we...

I'll update this for Linux, but that will still leave missing instructions for Windows and OS X. I think it's fine to have those be missing, but we should have...