pythermalcomfort icon indicating copy to clipboard operation
pythermalcomfort copied to clipboard

Add initial versions for charts leading into plotting functionality

Open t-kramer opened this issue 11 months ago • 11 comments

PR introduces preliminary versions for a first set of charts that can be used along and/or integrated with pythermalcomfort. Before we improve the charts, clean the code and add more, we should discuss how we want to integrate such plots in general. See more on this below in the Motivation bit.

Changes

  • add charts.py: contains initial code for an Adaptive chart (+ alternative version with timeseries summary), PMV chart, and a Heatmap (again with a summary option)
  • add theme.py: defines a Plotly template, enhancing consistency and streamlining plot functions - very basic for now, but can be leveraged much more
  • add chart_development.ipynb: loads and preprocesses datasets used as input for charts (to be removed later)
  • add 'test_data` folder: stores EPW files for testing (to be removed later)

Added dependencies

  • https://github.com/building-energy/epw: to process EPW files

Screenshots

Images of charts created using Plotly: Most of them are based of code that I've written earlier and some functions that we used for Clima and the new Comfort Tool version. All plots are interactive, so they provide additional information when hovering over them, e.g. exact x, y, z data behind plotted data points.

adaptive

adaptive_with_summary

pmv

heatmap

heatmap_with_summary

Motivation / Ideas behind the charts

  • big strength of pythermalcomfort over the CBE Comfort Tool: people can use it with multiple samples of input data instead of just one configuration

  • makes the package very interesting for energy modelers and other AEC people, e.g. analyzing time series data

  • plots should focus on these applications

  • in pythermalcomfort, we have mainly three types of functions: PMV-based indexes, Adaptive indexes and others who either generate a temperature-based output or 'comfort' category (e.g. Heat Index)

  • for a start, we could create one plot for each of these types: summarize time-series data, e.g. hourly simulation or measurement results and longitudinal field study data

  • heatmap is very flexible: allows to plot the data (continuous or categorical) for most of the indices we offer

  • added version with a little subplot on the right: summary is important for AEC people, especially when comparing different datasets - visually, this is often not clear enough and needs to be backed by some numbers, e.g. in per cent or hour counts

  • can add something similar to the PMV plot, too

(Potential) Next Steps

  • [ ] visually further enhance charts
  • [ ] develop alternative for psychrometric chart / PMV chart
  • [ ] create one more chart as an alternative to the heatmap, maybe something that doesn't necessarily focus on mapping timeseries data
  • [ ] clean code and improve modularity and versatility

Up for discussion - How do we want to integrate this?

Initially, I was leaning towards integrating the plotting functionality as much as possible with the main codebase, for example using dot notation. However, I changed by mind a bit. I think especially when it comes to plotting, we should grant lots of flexibility. Pythermalcomfort is great, because it simplifies peoples workflows significantly (since they don't have to write all the included functions themselves), but also offer lots and lots of flexibility on how to use these functions.

So for everything that goes beyond the main codebase and it's functions, e.g. the plots, we could just provide a few examples of how to use pythermalcomfort on typical datasets in Pandas dataframe format (the examples don't do that yet, although I think it's the main use case) and provide users with a few insightful and nicely formatted plots, wrapped in functions they can manipulate easily if they want. Internally, we can keep curating these plots and update them on a rolling basis, for example when we create new plots using pythermalcomfort for data processing for papers or our tools. Considering the variety of applications of pythermalcomfort, writing some highly specific plots will limit their application in practice.

@FedericoTartarini @stefanoschiavon

Keen to hear your feedback and thoughts. 🙋🏼‍♂️

\T

t-kramer avatar Dec 31 '24 05:12 t-kramer

@t-kramer great work and thank you for contributing. You have done a great work and I am very happy to see the charts implemented in pythermalcomfort.

I have a few comments below and inline in the code in the pull request.

code

  1. I would create a subdirectory called charts which will contain the code for all the charts. Please add all the files here pythermalcomfort/charts
  2. I would create a Python file for each chart, using the same approach used for the models, where each model has its own file.
  3. chart_development.ipynb can be moved to the examples directory. There we can showcase a few charts. However, we need to make sure that we provide examples so all the charts can be generated without the need of using a Jupiter notebook. I really do not like jupiter files since they do not even render here in GitHub and are impossible to review.
  4. test_data can also be move to the examples folder
  5. epw could be a dev dependency.
  6. for now do not implement more chart since we want to first consolidate how results are calculated and then how charts are called.

charts

the chart look great and very modern.

  1. I am not sure what the round things in chart 2 mean (applicability, acceptability)
  2. we need to make sure that people can pass the chart size as input so the font can be scaled by python.
  3. PMV and adaptive charts - it would be nice to add a count of the datapoints within and outside the comfort region. Similar to chart 5.
  4. Chart 4 and 5 are the same. By default we should show chart 5 and then people can hide the bar chart if they want.

motivation

heatmap is very flexible: allows to plot the data (continuous or categorical) for most of the indices we offer

agree also with the following comment, the bar chart summary is essential, the heatmap alone is beautiful but useless. We should also consider different grouping, month, time, etc.

for a start, we could create one plot for each of these types: summarize time-series data, e.g. hourly simulation or measurement results and longitudinal field study data

next step

  1. add the T, RH chart.
  2. discuss how the code is called and how people can use it.

I think especially when it comes to plotting, we should grant lots of flexibility ... we could just provide a few examples of how to use pythermalcomfort on typical datasets in Pandas dataframe format (the examples don't do that yet, although I think it's the main use case) and provide users with a few insightful and nicely formatted plots, wrapped in functions they can manipulate easily if they want.

agree, we should probably do not include it with dot notation.

FedericoTartarini avatar Jan 07 '25 01:01 FedericoTartarini

@FedericoTartarini I finished most of my work yesterday, but didn't have the time to update the PR, so here it comes.

Adjustments

I implemented most of your requested changes.

  • improved and added summary for all charts
  • created new modules for charts in pythermalcomfort/charts
  • replaced chart_development.ipynb with example scripts and moved these to pythermalcomfort/examples
  • moved theme.py to pythermalcomfort/charts
  • additional minor fixes

More information below and in the responses to your inline comments.

Charts

  1. I am not sure what the round things in chart 2 mean (applicability, acceptability)

This was another way of summarizing the data based on this paper and Stefano's suggestion. The outer circles represent 100% and the inner circles the actually achieved number. Anyway, I removed this style and added the horizontal bar chart summary for all (see screenshot).

  1. we need to make sure that people can pass the chart size as input so the font can be scaled by python.
  2. PMV and adaptive charts - it would be nice to add a count of the datapoints within and outside the comfort region. Similar to chart 5.
  3. Chart 4 and 5 are the same. By default we should show chart 5 and then people can hide the bar chart if they want.

Agree on all of these points, changes have been implemented accordingly.

Here are the updated charts (all including the summary, which can be hidden):

Screenshot 2025-01-11 at 6 47 21 PM Screenshot 2025-01-11 at 6 47 30 PM Screenshot 2025-01-11 at 6 47 41 PM Screenshot 2025-01-11 at 6 47 49 PM

I know, the bar chart for the UTCI seems a bit crowded. Need to find a way how to handle this in a flexible way. Also, the missing data points (white pixels) are those that are out of range and therefore NaN.

Further, I do not agree completely with this comment:

the bar chart summary is essential, the heatmap alone is beautiful but useless

While the heatmap alone can't give you a detailed summary of the whole year, it is very helpful in pointing out specific design problems when plotting indoor environmental data, e.g. too much solar radiation causing overheating at very specific hours of the year. So for some cases, people might want to use the heatmap without the summary.

Next Steps

  1. add the T, RH chart.
  2. discuss how the code is called and how people can use it.

Let's do it! 🚀

Thanks again for providing the detailed review, @FedericoTartarini! ❤️ Let me know if you have any further changes to implement before merging.

t-kramer avatar Jan 12 '25 04:01 t-kramer

Great work. I left a few more comments in line with the code. Below a response to some of your comments and a few observations about the chart. I love how the charts look.

  1. adaptive chart - it says that 0.0% of the data are out of range however this is not true. We should probably say < 0.1%
  2. In the adaptive chart, should we not be showing the bars in the following order? Acceptable, acceptable for 80%, discomfort and out of range?
  3. I've noticed that the chart size is still not an input of the function.
  4. In the PMV chart the X axis and the Y axis have a solid line, but in the adaptive this is not the case.
  5. In the PMV chart should be considering allowing to colour the dot as a function of the PMV value?
  6. In the UTCI chart should not be the green in the center of the color bar?
  7. Heatmap alone - OK let's then allowed the user to choose whether they want to see or not the summary.

Thank you so much for working on this. I really appreciate your help.

I have noticed that I can no longer see all my all comments, however, only a few of them have not been addressed yet.

FedericoTartarini avatar Jan 12 '25 22:01 FedericoTartarini

@FedericoTartarini Thanks again for a great review! Most of the comments are super clear and will be easy to implement. There are a few where I'd like to learn more and understand why you prefer to avoid certain things, but I'll ask for more detail in my response to the comments. 👍🏼

I will try to continue working on this and implement most requested changes before Thursday (PT).

Regarding the comments you can't find, can any of these be the reasons?

  • Responded to most comments directly in VS Code and before pushing: I resolved the conversations for comments that have been addressed; does that delete the conversations? I guess not.
  • Most comments you made were in charts.py, which has been deleted and transformed into individual files for each chart in the last commit. Could that be the reason?
  • After I pushed the changes, I realized that I had forgotten to address two; are these the ones you are referring to?

Anyway, I did not delete any on purpose and will make sure that for the next changes, that I am a bit more careful in dealing with the comments. Whats your usual workflow? Do you respond to the comments in IDE or GitHub? Do respond to them individually and directly after addressing them or at the end, just before pushing?

t-kramer avatar Jan 13 '25 01:01 t-kramer

@t-kramer thank you for dedicating sometime to work on this before the meeting. Don't worry about the comments. I think they are all there. Perhaps I forgot to add a few last time. You did an amazing job replying to my comments, so don't worry. Sorry, sometimes for being annoying and asking you to change some stuff, but I just want to make sure that the code is easy to read and consistent.

I generally have the comments from GitHub, simply because I am lazy, and I've never learned how to do it inside by PyCharm. I am actually feeling embarrassed about this as I am typing the message. This is my new year resolution :)

FedericoTartarini avatar Jan 13 '25 04:01 FedericoTartarini

@t-kramer I also forgot to mention that it would be great if you could extract the thermal comfort function from the plotting function. For example, considering that we have a different PMV model, we don't want to have a chart function for each specific formulation. It is better that the user passes in a column with the PMV results and then after the code does the plotting.

FedericoTartarini avatar Jan 13 '25 11:01 FedericoTartarini

@t-kramer thank you for the great discussion today. It is always great to chat with you. Below is code snippet of the dataclasses. Inside the data class so we can also embed functions if we want for example to return the colors of all the risk categories

from pydantic import BaseModel
from typing import List, Optional
from enum import Enum
from dataclasses import fields, dataclass, field
from datetime import datetime
from pytz import all_timezones

# BaseModel class for recommendations
class Recommendation(BaseModel):
    text: str  # Text of the recommendation
    icon: Optional[str] = None  # Optional icon for the recommendation


# BaseModel class for risk values
class RiskValues(BaseModel):
    risk_name: str
    color: str
    threshold: List[float]
    description: Optional[str] = None
    recommendations: Optional[List[Recommendation]] = None
    control_strategies: Optional[List[str]] = None
    value_risk: Optional[float] = None


# BaseModel class for risk models values
class RiskModelsValues(BaseModel):
    risks: List[RiskValues]  # List of risk values associated with the model
    name_model: Optional[str] = None  # Optional name of the risk model


# Enum class for different risk models
class RiskModels(Enum):
    world_rugby = RiskModelsValues(
        name_model="World Rugby",  # Name of the risk model
        risks=[
            # Defining various risk levels with their associated values and recommendations
            RiskValues(
                risk_name="Low",
                recommendations=[
                    Recommendation(
                        text="No control required", icon="carbon:checkmark-outline"
                    )
                ],
                threshold=[0, 100],
                color="#0096FF",
            ),

FedericoTartarini avatar Jan 17 '25 00:01 FedericoTartarini

@FedericoTartarini I just pushed my latest changes and will share a video link shortly to explain them in more detail (avoiding an overly long description here).

Summary of Changes:

  1. Dataclasses for Adaptive Chart: Partially implemented dataclasses for the adaptive chart. Most settings for the base plot and traces (data samples, overlays, and summaries) are now defined in centralized class definitions with methods.

  2. Modular Plot Wrappers: Refactored the plot functions into wrappers that host class instances, making the design more modular. Adding or removing overlays is now user-configurable and straightforward.

  3. Tweaks to heat_index_rothfusz: Made minor adjustments to allow the function to return stress_categories (similar to the utci model function, as discussed in #168).

Especially on the first two points, I'd like you to review before I continue implementing the some structure for the other charts.

Next Steps:

Most open comments will be resolved once dataclasses are fully implemented across all functions. However, a few items still need clarification/discussion/additional changes:

  • Plotting Functions for Model Variants: Should we create separate plotting functions for different versions (e.g., EN vs. ASHRAE), or implement logic based on the units argument?
  • Responsiveness of charts to units Argument: Do you have a preferred approach for handling this?
  • Docstrings and Comments: I'll refine and expand these once we decide on the general structure to improve the clarity of the code.
  • Better Input Data for Adaptive Model: I still need to make the switch to a more suitable dataset than the ASHRAE Global Thermal Comfort Database II (e.g., one allowing calculation of the running mean outdoor temperature)

Let me know your thoughts on these points and how you'd like to proceed!

t-kramer avatar Jan 28 '25 22:01 t-kramer

Great idea recording the video and great work. Brief comments are below:

  1. please create a separate pull request for the Heat Index so I can add it to pythermalcomfort.
    1. I want to remove the units as input as much as possible especially in pythermalcomfort since are hard to manage and people can figure out themselves how to convert to C. Hence please remove the units from the heat_idex_code and just implement the mapping changes.
    2. Please also add tests for this new functionality.
  2. I really like the dataclass approach.
    1. Not sure we need a base class since basically we need to define everything anyway when we create the AdaptiveBase class. Naming will be important and I would rename AdaptiveBase to AdaptivePlot.
    2. same comment with the Trace class, do we need a base class? can we just have Observations?
    3. Observations should be called ScatterPlot? then the function called generate_trace should be called plot()
    4. I do not like the code row=1 if base.show_summary is not very intuitive.
    5. How will people be able to change the classes?
    6. I am not sure why we need type within the classes
    7. we need to make it very explicity that people will have the columns names defined in line 123
    8. maybe we should not have the functions within the class
  3. I will need to play a bit with the code
  4. Please do not use lambda.
  5. The adaptive function should not be included inside the plot but the users should pass these values. I did not get the point :)
  6. In future please share the code.

Here is the video I recorded.

FedericoTartarini avatar Jan 29 '25 08:01 FedericoTartarini

@FedericoTartarini Thanks for the feedback. I'll keep working on it.

Quick point: I did push the code yesterday before updating this PR, so you should be able to see the changes? Please check again.

t-kramer avatar Jan 29 '25 14:01 t-kramer

sorry, I think I have heard in the video that you mentioned that you did not push the code so I did not check the pull request. Please let me know if my comments make sense and please let me know if you do not agree with them.

FedericoTartarini avatar Jan 29 '25 22:01 FedericoTartarini

@t-kramer

Suggestions to improve the graphs for the adaptive thermal comfort model chart: Screenshot 2025-10-03 at 5 17 22 PM

  1. use the same green colors as the actual tool
  2. Reduce the number of x-labels, from one every 2°C, to 5°C at a minimum.
  3. Reduce the number of y-labels, from one every 2°C, to 5°C at a minimum.
  4. Remove the "comfort temperature", it is not a concept valid within 55
  5. VIP. Acceptable for 80% must include also all the people that is accpetable for 90%, so it should be 64.6%.
  6. Consider rounding at zero decimal points.
  7. Change from "Out for range" to " Our of applicability range"
  8. In general, reduce word capitalization as much as possible.

Suggestions to improve the graphs for the PMV psychrometric model chart: Screenshot 2025-10-03 at 5 24 11 PM

  1. Remove the neutral PMV, this concept does not exist in ASHRAE.
  2. Change from "Too Hot" to "Warmer than comfortable" or "Uncomfortably warm". I prefer the former. Same for "Too cold"
  3. Change from "Out for range" to " Our of applicability range"

Suggestions to improve the graphs for the HI: Screenshot 2025-10-03 at 5 27 22 PM

  1. VIP. Remove "No Risk", this category does not exist in US.
  2. You could use the colors from the NYT https://www.nytimes.com/interactive/2022/us/heat-wave-map-tracker.html or the one from NOAA: https://www.noaa.gov/media/image_download/79db348b-7b5a-4e19-94e0-c0a87209a8ba. I tend to trust NYT more and it seems that your colors are closer to NOAA

For the UTCI and HI index show the hours every 3 o 6 hours, not every 2.

stefanoschiavon avatar Oct 04 '25 00:10 stefanoschiavon

Thank you @stefanoschiavon for the comments. We will incorporate them in the new plotting functions. However, I was hoping to get your feedback on this discussion https://github.com/CenterForTheBuiltEnvironment/pythermalcomfort/discussions/252 not on this pull request.

FedericoTartarini avatar Oct 04 '25 02:10 FedericoTartarini

@FedericoTartarini, yes I know that I have to work on 252, I will do it on Monday. I have also promised to Toby to review his pull, so he can finalize the "educational" examples that will not need to be implemented in the code but could be there as good examples.

stefanoschiavon avatar Oct 04 '25 17:10 stefanoschiavon