calamine icon indicating copy to clipboard operation
calamine copied to clipboard

Implement styles

Open ddimaria opened this issue 5 months ago • 15 comments

Add style support for Calamine.

Basic api for getting styles

fn test_worksheet_style_iter() {
    let mut xlsx: Xlsx<_> = wb("styles.xlsx");
    let styles = xlsx.worksheet_style("Sheet 1").unwrap();

    for (row, styles) in styles.rows().enumerate() {
        for (col, style) in styles.iter().enumerate() {
            println!("row: {}, col: {}, style: {:?}", row, col, style);
        }
    }
}

Here is the relevant structs so far:

struct Style {
    font: Option<Font>,
    fill: Option<Fill>,
    borders: Option<Borders>,
    alignment: Option<Alignment>,
    number_format: Option<NumberFormat>,
    protection: Option<Protection>,
    style_id: Option<u32>,
}

struct Color {
    alpha: u8,
    red: u8,
    green: u8,
    blue: u8,
}

struct Borders {
    left: Border,
    right: Border,
    top: Border,
    bottom: Border,
    diagonal_down: Border,
    diagonal_up: Border,
}

struct Font {
    name: Option<String>,
    size: Option<f64>,
    weight: FontWeight,
    style: FontStyle,
    underline: UnderlineStyle,
    strikethrough: bool,
    color: Option<Color>,
    family: Option<String>,
}

struct Alignment {
    horizontal: HorizontalAlignment,
    vertical: VerticalAlignment,
    text_rotation: TextRotation,
    wrap_text: bool,
    indent: Option<u8>,
    shrink_to_fit: bool,
}

struct Fill {
    pattern: FillPattern,
    foreground_color: Option<Color>,
    background_color: Option<Color>,
}

struct NumberFormat {
    format_code: String,
    format_id: Option<u32>,
}

struct Protection {
    locked: bool,
    hidden: bool,
}

ddimaria avatar Jul 24 '25 21:07 ddimaria

Just a quick note to say that overall this looks good and I think it is going in the right direction.

I'll give more detailed feedback early next week.

For code comments you are already using correct sentence-case but could you also add a full stop/period at the end of each line. This makes it easier to change comments into documentation. It isn't a rule that has been adhered to in the existing code but I would like to improve that going forward.

jmcnamara avatar Jul 26 '25 10:07 jmcnamara

Man, i wish i discovered this pull early, i wanted to know if a cell is stricken or not. Sat whole day and edited the code myself. Came till adding if stricken or not from the styles.

Learnt a lot from my experience though. i'll test this pull against my code.

azarmadr avatar Jul 29 '25 13:07 azarmadr

Is the data and styles coming together? If not, then would we have to read the excel twice to get data, once and then style?

azarmadr avatar Jul 30 '25 16:07 azarmadr

Is the data and styles coming together? If not, then would we have to read the excel twice to get data, once and then style?

Data and Formats have always been processed separately (e.g. worksheet_range() and worksheet_formula()). This PR follows the format pattern (worksheet_style() for styles and worksheet_layout() for column widths and row heights).

When I finish the base work, I'm going to look into optimizations:

  1. More optimal storage (Vec vs Run-Length Encoding) for styles
  2. Looping (to your point, combining read operations to reduce memory/cpu usage)

ddimaria avatar Jul 30 '25 16:07 ddimaria

i got it, even formulae were being used similarly. Was able to satisfy my requirement with worksheet_sytle over a range.

Another doubt, why every item in the style is an option? All the fonts, borders, and others are available as ids in cellxfs?

hope i am not bothering you.😊

azarmadr avatar Jul 31 '25 05:07 azarmadr

Another doubt, why every item in the style is an option?

@azarmadr this is because the particular style may not exist for a given cell.

ddimaria avatar Aug 13 '25 17:08 ddimaria

@jmcnamara We're currently using this fork in production and it's working well. What do I need to do to get this merged with Calamine? Some things to discuss:

  • Is the API agreeable?
  • What type of documentation is needed?
  • Add more testing?
  • Should this be a compiler feature flag?
  • This is only implemented for xlsx, is that OK?

ddimaria avatar Aug 13 '25 17:08 ddimaria

What do I need to do to get this merged with Calamine?

You should create a non-draft PR with all the WIP files and code removed. I will try to do a review on the current code.

Is the API agreeable?

Overall I think it is okay but I may have specific comments during review.

What type of documentation is needed?

Copy the style of the Xlsx documentation:

  • One line description.
  • One-two paragraph explanation.
  • Parameters section.
  • Errors section if the method Result<(), XlsxError>.
  • Examples section.

https://docs.rs/calamine/latest/calamine/struct.Xlsx.html#method.merged_regions_by_sheet

Add more testing?

That is always good. :-)

Should this be a compiler feature flag?

I think not.

This is only implemented for xlsx, is that OK?

Probably not but lets try get it right for Xlsx first.

jmcnamara avatar Aug 14 '25 16:08 jmcnamara

First review done. Some general comments:

  • Please squash and or rebase WIP code and reverts into a small number of logical commits or one commit.
  • Fix the clippy and or format warnings the are breaking the CI.
  • Add a full stop/period at the end of all comments.
  • Avoid /// comments for non-public non-doc comments. Use // instead.
  • Post an explanation here for the setting functions in CellData and Style.

jmcnamara avatar Aug 14 '25 18:08 jmcnamara

@ddimaria Any update on this?

jmcnamara avatar Sep 04 '25 09:09 jmcnamara

@ddimaria Any update on this?

@jmcnamara sorry for the delay, I got pulled into other projects. My goal is to polish this PR and address feedback next week. We've been using the fork in production, so the base of the code is working for us. As a stretch goal I'd like to implement a better storage structure to account for repeating elements (RLE).

ddimaria avatar Sep 04 '25 13:09 ddimaria

My goal is to polish this PR and address feedback next week. We've been using the fork in production, so the base of the code is working for us. As a stretch goal I'd like to implement a better storage structure to account for repeating elements (RLE).

@ddimaria I would still like to merge this since it would fix a lot of other feature requests. I would avoid the stretch goal for now.

jmcnamara avatar Oct 21 '25 12:10 jmcnamara

@ddimaria I would still like to merge this since it would fix a lot of other feature requests. I would avoid the stretch goal for now.

@jmcnamara sorry for the delay on this, it's been chugging along for us in production as well, but I'd love to get this merged to avoid maintaining a fork. I got approval to spend some time today on this.

ddimaria avatar Oct 21 '25 17:10 ddimaria

@jmcnamara can you confirm that these test failures are unrelated to this PR? I'm happy to fix them, but may be lacking in some context.

---- issue_391_shared_formula stdout ----

thread 'issue_391_shared_formula' panicked at tests/test.rs:1824:5:
assertion `left == right` failed
  left: Some((6, 0))
 right: Some((2, 0))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- issue_438_charts stdout ----

thread 'issue_438_charts' panicked at tests/test.rs:1861:10:
could not open worksheet range: XmlEof("sheetData")


failures:
    issue_391_shared_formula
    issue_438_charts

ddimaria avatar Oct 21 '25 19:10 ddimaria

can you confirm that these test failures are unrelated to this PR?

They are unrelated. I think it is due in some way to an incorrect rebase to master. At least the network graph for your branch doesn't look correct: https://github.com/tafia/calamine/network

jmcnamara avatar Oct 21 '25 19:10 jmcnamara