Allocation of units for batch7
This is a work in progress! Template: https://github.com/LDSSA/wiki/issues/330
Description
The goal of this issue is to assess interest and have a pre-allocation of batch7's teaching work and QA.
The units, the overall work needed, and the release and delivery dates are listed below.
Units
Admissions
- Unit
- Adjustment to the new Python and Pandas versions.
- Learning notebooks - minimum change: review
- Example notebooks - minimum change: review
- Exercise notebook - minimum change: review, new datasets
- To be released on 23 October 2023
- To be ready on: soon
| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA |
|---|---|---|---|---|
| SLU01 | Pandas 101 | @majkah0 | @majkah0 | @Jujulian3 |
| SLU02 | Subsetting Data in Pandas | @jgomes959 | @jgomes959 | @jgerebelo |
| SLU03 | Visualization with Pandas & Matplotlib | @Gustavo-SF | @kagglekim | @SaraOGomes |
| Test | @fabiocruz | @danizao @majkah0 @minhhoang1023 @Gustavo-SF | ||
| Test on 23 October 2023 |
- Batch 7 QA Lead SLU01/SLU02/SLU03: @FilipaPereira78
- Batch 7 backup QA SLU01/SLU02/SLU03: @CaitlinHulse

*The QA lead is responsible for checking the SLUs.
**The backup QA is responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people do the test verification.
Specialization 1 + Bootcamp
- Project manager: José Rebelo @jgerebelo
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- Learning notebook
- Example notebook
- Exercise notebook
- To be released on:
- SLU04 - SLU10 learning notebooks: 19 November 2023
- SLU04 - SLU10 exercise notebooks: 26 November 2023
- SLU11 - SLU19 learning and exercise notebooks: 26 November 2023
- To be ready in November
| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| SLU04 | Basic Stats with Pandas | @SaraOGomes | @cmm79 | @jgomes959 | @BG2602 |
| SLU05 | Covariance & Correlation | @kagglekim | @cmm79 | @anaritarc | @BG2602 |
| SLU06 | Dealing with Data Problems | @majkah0 | @TeignmouthElectron | @SaraOGomes | @BG2602 |
| SLU07 | Regression with Linear Regression | @jgerebelo | @joaogilsa | @carlacotas | @Mohamedgaber9 |
| SLU08 | Metrics for Regression | @marianahenriques1 | @joaogilsa | @cd702 | @Mohamedgaber9 |
| SLU09 | Classification with Logistic Regression | @majkah0 | @majkah0 | @carlacotas | @caitlinhulse |
| SLU10 | Metrics for Classification | @phgui | @majkah0 | @majkah0 | @caitlinhulse |
| SLU11 | Tree-Based Models | @anaritarc | @margaridantunes | @carlacotas | @Mohamedgaber9 |
| SLU12 | Feature Engineering (aka Real World Data) | @danizao | João Nobre | @anaritarc | @Mohamedgaber9 |
| SLU13 | Bias-Variance tradeoff & Model Selection | @jgerebelo | @rodrigomverissimo | @anaritarc | @BG2602 |
| SLU14 | Model complexity & Overfitting | @Gustavo-SF | @Gustavo-SF | @Jujulian3 | @BG2602 |
| SLU15 | Hyperparameter Tuning | @jgomes959 | @jgomes959 | @SaraOGomes | @BG2602 |
| SLU16 | Workflow | @cimendes | @fabiocruz | @TeignmouthElectron | |
| SLU17 | Ethics & Fairness | @hershaw | @majkah0 | @Gustavo-SF | @TeignmouthElectron |
| SLU18 | Support Vector Machines (SVM) (optional unit) | @cimendes | @majkah0 | @Jujulian3 | |
| SLU19 | k-Nearest Neighbors (kNN) (optional unit) | @cimendes | @majkah0 | @Jujulian3 | |

| Group | SLUs | Batch 7 QA lead* | Batch 7 backup QA** |
|---|---|---|---|
| QA1 | SLU04, SLU05, SLU06 | @BG2602 | Caitlin Hulse |
| QA2 | SLU07, SLU08 | @Mohamedgaber9 | Cora |
| QA3 | SLU09, SLU10 | Caitlin Hulse | |
| QA4 | SLU11, SLU12 | @Mohamedgaber9 | |
| QA5 | SLU13, SLU14, SLU15 | @BG2602 | |
| QA6 | SLU16, SLU17 | @TeignmouthElectron | |

*The QA lead is responsible for checking the SLUs.
**The backup QA is responsible for checking the SLUs in case there are major changes (exercise or material updates).
Bootcamp presentations
Bootcamp presentations will be split into two parts. Presentations will be given by senior instructors. This is what is expected from each instructor:
- The presentation should be <= 60 min including student questions. The presentation should be on concepts and insights for the given topic, not the technical implementation in Python.
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y).
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
Bootcamp part 1, Sunday 26 November 2023
- Instructor 1: LTPlabs
- ~ 30 min: Intro to data science, SLU04 - Basic Stats with Pandas, SLU05 - Covariance and Correlation
- ~ 30 min: SLU06 - Dealing with Data Problems
- Instructor 2: José Rebelo, EDP
- 30 - 60 min: SLU07 - Regression with Linear Regression, SLU08 - Metrics for Regression
- Instructor 3: LTPlabs
- 30 - 60 min: SLU09 - Classification with Logistic Regression, SLU10 - Metrics for Classification
Bootcamp part 2, Sunday 3 December 2023
- Instructor 4: João Ascensão, Stratio - TBC
- 45-60 min: SLU11 - Tree-Based Models, SLU12 - Feature Engineering
- Instructor 5: Maria Cristina Dominguez
- ~60 min: SLU13 - Bias-Variance tradeoff & Model Selection, SLU14 - Model complexity and Overfitting, SLU15 - Hyperparameter Tuning
- Instructor 6: Sam Hopkins, DareData
- 30-60 min: SLU16 - Workflow, SLU17 - Ethics and Fairness
- Hackathon 1
- Come up with a new problem for the hackathon
- Create the description, find a dataset and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 17 December 2023
- To be ready in November
| Work unit | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| Hackathon 1 | Binary Classification | @wilsonramos1 | @jgomes959 | ? | . |
Specialization 2, 8 January 2024 - 4 February 2024
- Project manager: Kim Pronk @kagglekim
- Senior instructor:
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 8 January (BLU01), 15 January (BLU02), 22 January (BLU03)
- To be ready in **December 2023**
- Hackathon 2
- Come up with a new problem for the hackathon
- Create the description, find a dataset and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 4 February (Hackathon 02)
- To be ready in mid January
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @martinb-bb | ||
| BLU01 | Messy Data | @JerBouma | @majkah0 | |
| BLU02 | Advanced Wrangling | @minhhoang1023 | @cd702 | |
| BLU03 | Data Sources | @jmaslek | @anaritarc | |
| Hackathon 2 | Data Wrangling | @martinb-bb @JerBouma @minhhoang1023 @DidierRLopes |
- Batch 7 QA Lead BLU01/BLU02/BLU03: @AhmedEmad2525
- Batch 7 backup QA BLU01/BLU02/BLU03:

*The QA lead is responsible for checking the BLUs.
**The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people do the hackathon verification.
Specialization 3, 5 February - 3 March 2024
- Project manager: Mária Hanulová @majkah0
- Senior instructor: Telmo Felgueira, Loka / JungleAI
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 5 February (BLU04), 12 February (BLU05), 19 February (BLU06)
- To be ready in January
- Hackathon 3
- Come up with a new problem for the hackathon
- Create the description, find a dataset and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 3 March (Hackathon 03)
- To be ready in mid February
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @TSFelg | @TSFelg | |
| BLU04 | Time Series Concepts | @PedroRibeiro80 | @Sonia-se | |
| BLU05 | Classical Time Series Models | @jgerebelo | @carlacotas | |
| BLU06 | Machine Learning for Time Series | @jdpsc | @TeignmouthElectron | @SaraOGomes |
| Hackathon 3 | Timeseries | @TSFelg | @Gustavo-SF |
- Batch 7 QA Lead BLU04/BLU05/BLU06: @Mohamedgaber9
- Batch 7 backup QA BLU04/BLU05/BLU06:

*The QA lead is responsible for checking the BLUs.
**The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people do the hackathon verification.
Specialization 4, 4 March - 31 March 2024
- Project manager:
- Senior instructor:
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 4 March (BLU07), 11 March (BLU08), 18 March (BLU09)
- To be ready in February
- Hackathon 4
- Come up with a new problem for the hackathon
- Create the description, find a dataset and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 31 March (Hackathon 04)
- To be ready in **mid March**
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @CatarinaSilva | ||
| BLU07 | Feature Extraction | @CatarinaSilva | @cd702 | |
| BLU08 | Dimensionality Reduction | @CatarinaSilva | @majkah0 | |
| BLU09 | Information Extraction | @CatarinaSilva | @carlacotas | |
| Hackathon 4 | NLP | BancoBPI |
- Batch 7 QA Lead BLU07/BLU08/BLU09: @CaitlinHulse
- Batch 7 backup QA BLU07/BLU08/BLU09: @BG2602

*The QA lead is responsible for checking the BLUs.
**The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people do the hackathon verification.
Specialization 5 - this will be an optional specialization
- Project manager:
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- To be released in March/April
- To be ready in March
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | |||
| BLU10 | Non-personalised Recommender | @majkah0 @anaritarc | ||
| BLU11 | Personalized Recommenders | @majkah0 @anaritarc | ||
| BLU12 | Workflow | @majkah0 @anaritarc | ||
| Hackathon 5 | Recommender Systems |
- Batch 7 QA Lead BLU10/BLU11/BLU12: @TeignmouthElectron
- Batch 7 backup QA BLU10/BLU11/BLU12:

*The QA lead is responsible for checking the BLUs.
**The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people do the hackathon verification.
Specialization 6, 1 April - 28 April 2024
- Project manager:
- Senior instructor: Gustavo Fonseca, LDSA
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 1 April (BLU13), 8 April (BLU14), 15 April (BLU15)
- To be ready in March
- Hackathon 6
- Come up with a new problem for the hackathon
- Create the description, find a dataset and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 28 April (Hackathon 06)
- To be ready in mid April
Extra session about Venture Capital: Armilar
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @cimendes | @Gustavo-SF | |
| BLU13 | Basic model Deployment | @cimendes | @carlacotas | |
| BLU14 | Deployment in the real world | @cimendes | ||
| BLU15 | Model CSI | @cimendes | @carlacotas | |
| Hackathon 6 | Data science in real world | @CatarinaSilva @cimendes @InesPessoa |
- Batch 7 QA Lead BLU13/BLU14/BLU15:
- Batch 7 backup QA BLU13/BLU14/BLU15:

*The QA lead is responsible for checking the BLUs.
**The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people do the hackathon verification.
Capstone, 29 April - 15 July 2024
- Preparing a strong dataset and problem
- Helping to build documents, forms, etc.
- Replying to students' Q&A
- Beta-testing/QAing
- Grading capstone
- To be released on 29 April
- To be ready in mid April
| Work unit | Name | Last year instructor(s) | Batch 7 instructor(s) |
|---|---|---|---|
| - | Capstone | @minhhoang1023 @cimendes @fabiocruz @Gustavo-SF @anaritarc @majkah0 |
Other possible extra sessions:
- NOS (LLM, Data Science in Real World);
- AICEP (Classification, Data Science in Real World);
- BPI (Classification, Data Science in Real World)
I am available for QA for both the SLU2 and the admission test.
Note: I don't have access to this repository
I'm in for SLU09 logistic regression and something else, maybe the optional SLUs. Also for the exam QA. I would also be in for an optional hackathon training SLU, number 10.5 and I'd like to have Rita for QA for that :slightly_smiling_face: . What will be the schedule for batch 7? If I could have one student success wish, it would be to change the bootcamp structure - 1 week longer, with lectures divided in two and 2 office hours. :grin: We have discussed this a bit during this year, but then no conclusions were made.
Hello, I would also like to know if there is already an update on the schedule?
I would be interested in QA for SLU06 or SLU08 as well as 12 or 13, but it is so quiet here I do not know if this is the right place...
Hi @cd702 thank you, this is the right place! In fact, I was thinking about contacting you today :) I'm starting to bring in some life now. Would you also be in as instructor? We are doing just minimal maintenance this year, fixing the errors.
I will have another look at the units and let you know. Are the dates in the instructors repo confirmed?
@cd702 Hi Cora, sorry for the late reply, the dates will be confirmed this weekend.
@majkah0 Hi Mária, I can still be responsible for SLU02 this year :)
Perfect, thank you @jgomes959 . Just to warn you - Pandas has changed row subsetting, so there might be a lot of corrections to do.
Hi, doesn't Vasco already have a list of who will be responsible for each teaching unit? Also, maybe we should update it per Spec, even for the bootcamp, and highlight the fact that the process has changed.
@cd702 Hi Cora, this is finally moving :) You were interested in QA for SLU 06, 08, 12 or 13. Can I sign you up for some of those? These units will go out at the end of November/beginning of December. We are also looking for QA for the admissions units and test which will go out in mid October. This year, there will be minimal unit development. The QA should happen first, then the instructor will correct all the issues. So you can already start, the instructor repo is set up.
Hi Maria, that's great news :) in this case you can sign me up as QA for units 6 and 8 as well as 12. If you need me there I can also have a look over SLU 1 and try out the admissions test once it's ready :)
Thank you very much Cora!
Added myself to QA of Admissions test and SLU14 Instructor.
I am available for QA SLU11, SLU12, SLU14
Hey @mafaldavs, welcome! SLU12 is already taken. Is there any other you would like to take?
Hi,
Sorry @mafaldavs, @Gustavo-SF and @cd702, this is not the structure for QA this year. I still have to update it, but basically each person will be responsible for 3 SLUs.
I'll update it. This template is not meant to be used for QA; I'm thinking about the best way to incorporate it here.