Make all tutorials multi-language

Open david-cortes opened this issue 7 months ago • 8 comments

ref https://github.com/dmlc/xgboost/pull/11410

After the PR above gets merged, we'll have a mechanism to show language tabs throughout the online docs, but it would only be used in 2 out of 25 tutorials. This could be a nice way to make all the current tutorials multi-language, by having a tab for 'R' next to the tab for 'Python' - example:

[Image: example of the language tabs]

I think most of the tutorials would be straightforward to write as R code (save for ones that don't apply, like distributed mode), but it's quite a lot of work. Leaving the idea here in case someone wants to work on porting at least some tutorials (CC @mayer79 @jameslamb @dfsnow).

david-cortes, Apr 17 '25 15:04

I love this!

I have wanted tab-per-language tutorials like this in LightGBM for a while, never took the time to investigate it. I'd love to adopt something like what you did for https://github.com/dmlc/xgboost/pull/11410 in LightGBM, with at least Python and R.

I'd love to expand https://github.com/microsoft/LightGBM/blob/master/docs/Features.rst with one section for each of like "early stopping", "linear trees", "custom objective function", etc. I'd welcome a PR over there to introduce this plugin and one such example (let's say "linear trees") if you have the time and interest. If not, no worries, I'll adopt what you've put together in https://github.com/dmlc/xgboost/pull/11410.

> it's quite a lot of work

This can be a good candidate for a good first issue to attract more outside contributors, I think. If you all go that route, I recommend enumerating all of the examples to be done in a to-do list, like I did here: https://github.com/microsoft/LightGBM/issues/6361

I would also be happy to help with some of these examples 😊

jameslamb, Apr 17 '25 15:04

Likewise happy to help! But I'm pretty busy these days, so I would massively appreciate a to-do list à la James' suggestion to make the work more bite-sized.

dfsnow, Apr 21 '25 18:04

@dfsnow Thank you for volunteering! Since @david-cortes has already set up the sphinx build, you can focus on the tutorials. Perhaps one tutorial at a time: https://github.com/dmlc/xgboost/tree/master/doc/tutorials ?

trivialfis, Apr 22 '25 12:04

Hi, I'd love to help out, should I go the tutorials route as well?

@trivialfis

shivamchhuneja, Jun 03 '25 17:06

Thank you @shivamchhuneja. Yup.

trivialfis, Jun 04 '25 02:06

Cool thanks!

just wanted to quickly check before jumping in - would it be okay if I start by picking one or two tutorials and port them to R first?

want to get a feel for the scope + flow, and then can plan how to go about the rest.

also, are there any you’d suggest I not start with (like ones that might not translate well to R)?

happy to keep it async and slow paced, just want to make sure we’re all aligned before diving in :)

PS. bear with me as I have limited R exposure, so the first couple might be slow as I get back to it, and then I'll pick up speed. (it's been a few years since I actively used R instead of Python)

shivamchhuneja, Jun 04 '25 04:06

> would it be okay if I start by picking one or two tutorials and port them to R first?

For sure!

> also, are there any you’d suggest I not start with (like ones that might not translate well to R)?

Distributed training.

> bear with me as I have limited R exposure

No worries. This is a good chance for participants to get familiar with the new R interface, and to provide feedback on what we need to improve.

trivialfis, Jun 16 '25 00:06

Awesome, will be picking these up this week - will open up a WIP with the scope and cc you in the same :)

shivamchhuneja, Jun 16 '25 16:06