KNN section work

Open pmaji opened this issue 6 years ago • 10 comments

Had a great discussion with @TWilliamBell as a result of my reddit post sharing the work I had started on this repo.

I believe the fruit of that conversation will be the addition of some materials under the classification section that will cover KNN methods building off of / summarizing the excellent work already started over in this repo here.

Opening this issue for the discussion of all things pertaining to the prospective KNN section, and I have opened a subfolder for KNN under classification with a placeholder text file for now.

pmaji avatar Sep 04 '18 02:09 pmaji

I have been busy lately, going to work more on it soon!

TWilliamBell avatar Sep 11 '18 01:09 TWilliamBell

@TWilliamBell Fantastic! Looking forward to the collaboration 👍

pmaji avatar Sep 11 '18 02:09 pmaji

@pmaji There is more in my repo now, I will add to yours soon.

TWilliamBell avatar Sep 11 '18 05:09 TWilliamBell

@pmaji Alright, I think I have it how I want it. I've addressed most of your suggestions (it is in md, and I have included an example with a benchmark comparing a library implementation to my version).

Do you have any suggestions for better visuals? I am open to ideas if you've seen something designed for this before (I know you've mentioned it, but if you could be more specific). I realize my visuals aren't as strong as they could be.

TWilliamBell avatar Sep 14 '18 21:09 TWilliamBell

@TWilliamBell This is awesome work! I have one main suggested change that is a high-level formatting proposal. Presently, your markdowns are rendered as straight HTML, which is actually what keeps the plots, etc., from being displayed in the browser when you view the markdown.

If you check out any of my Rmds -- say the logit one for example -- you'll see I have a few options there that make it tailored to display via GitHub. I've pasted the crux of the code below. If you transform your Rmds into that format and submit a new PR, it'll add a files folder, override the old MDs, and all your plots will display :). See below for the header / YAML code that will do the trick for GitHub markdowns. (Sadly the pasted code loses its indent structure, which is necessary to get the code to run. As such, I'd recommend just going to any of my markdowns and copying the output: call.)

`---

title: "Logistic Regression" author: Paul Jeffries output: github_document: toc: TRUE toc_depth: 2

---`
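For anyone reading along, the flattened header above can be laid out with its indentation restored. This is only a reconstruction, assuming the usual R Markdown front-matter nesting in which `toc` and `toc_depth` sit under `github_document`, which in turn sits under `output`; the exact spacing in the original Rmds may differ.

```yaml
---
title: "Logistic Regression"
author: Paul Jeffries
# Render to GitHub-flavored markdown instead of HTML
output:
  github_document:
    toc: TRUE       # include a table of contents
    toc_depth: 2    # list headings down to level 2
---
```

With a header along these lines, re-knitting the Rmd should produce a `.md` file that GitHub can render, with the plot images written to an accompanying files folder.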

Finally, the other thing I might do is submit some formatting changes to the main markdown--just to be able to lay out in a slightly more seamless way what sort of progressions folks should go through via the various Rmds you have. Those are edits I can make later though :)

This is a truly awesome start! Once the plots are actually displaying via the fix I suggested I can think through what other viz's might be appropriate.

pmaji avatar Sep 14 '18 23:09 pmaji

I had your thought process in mind because the Rmds don't work on GitHub, but I thought it was good enough that I had the mds display it in a proper literate programming code-visuals-text manner. What do you think is the benefit of the changes over the mds already there?

I'm sorry if you think you've already mentioned it but I might have missed it.

TWilliamBell avatar Sep 16 '18 00:09 TWilliamBell

@TWilliamBell sure thing! I probably wasn't clear--that's on me :)

What I mean is that changing the output type of your markdowns will actually mean that folks will be able to see your outputs and plots on GitHub--presently we can't see any of your plots, sadly.

To add more clarity to my previous comment about what to change, it's just one line of code and a re-render. Check out the explanation at the link below and let me know if that makes sense.

https://rmarkdown.rstudio.com/github_document_format.html

pmaji avatar Sep 16 '18 01:09 pmaji

Additional explanation of what I mean here on Stack Overflow too:

https://stackoverflow.com/questions/39814916/how-can-i-see-output-of-rmd-in-github

pmaji avatar Sep 16 '18 04:09 pmaji

Ah okay, I think I understand. I had to do something similar anyway to make the mds I posted, but now I see the point of doing that as opposed to just making the mds.

TWilliamBell avatar Sep 16 '18 15:09 TWilliamBell

Hey @TWilliamBell -- I was doing some repo maintenance and I never realized that you went back and made the formatting changes I requested, so now all the visuals are viewable on GitHub! This is awesome!

Does this repo here represent the most recent version of your work? If so, I'll fork it shortly and add it into my repository here, cross-linking to your repo.

Excited to add in your contributions!

pmaji avatar Feb 20 '19 15:02 pmaji