
Consider tighter definition of variable in expr_parser

Open ptmcg opened this issue 3 years ago • 4 comments

The definition of variable in expr_parser is:

variable = pp.Word(pp.alphanums + "_.")

This will accept invalid identifiers such as

.__
a..c....d
...
123..456

Consider changing to the 2-argument form of defining a pp.Word:

variable_alphas = pp.pyparsing_unicode.Latin1.alphas
variable = pp.Word(variable_alphas + "_", variable_alphas + pp.nums + "_.")

This is better, since it accepts more identifier characters, and ensures that the leading character is a valid Python leading identifier character.

Still not perfect, as it will also accept multiple consecutive "." characters.

Best would be:

variable_alphas = pp.pyparsing_unicode.Latin1.alphas
identifier = pp.Word(variable_alphas + "_", variable_alphas + pp.nums + "_")  # <-- no "."
variable = pp.delimitedList(identifier, delim=".", combine=True)

This now ensures that you parse only proper "."-delimited variables.
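As a rough check of the difference, the sketch below compares the original definition with the delimited one on the invalid strings listed earlier. It assumes pyparsing is installed and uses the same camelCase names as the snippets above. The `accepts` helper and the `loose`/`strict` names are made up for this illustration and are not part of handcalcs or pyparsing.

```python
import pyparsing as pp

# Original definition: any run of alphanumerics, "_" and "."
loose = pp.Word(pp.alphanums + "_.")

# Suggested definition: "."-separated identifiers, each of which
# must start with a letter or "_" and contains no "."
variable_alphas = pp.pyparsing_unicode.Latin1.alphas
identifier = pp.Word(variable_alphas + "_", variable_alphas + pp.nums + "_")
strict = pp.delimitedList(identifier, delim=".", combine=True)


def accepts(parser, text):
    """Hypothetical helper: True if `parser` matches the whole of `text`."""
    try:
        parser.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False


# The first four strings are the invalid identifiers from the list above;
# the last two are ordinary names that both definitions should accept.
for text in [".__", "a..c....d", "...", "123..456", "sigma_R", "a.b.c"]:
    print(f"{text!r:14} loose={accepts(loose, text)} strict={accepts(strict, text)}")
```

The original pattern accepts all six strings, while the delimited form rejects the four invalid ones and still accepts `sigma_R` and `a.b.c`.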

If you want to accept all Unicode alpha characters, change variable_alphas to:

variable_alphas = pp.pyparsing_unicode.alphas

(This may be of help in addressing Issue #92.)

ptmcg avatar Aug 17 '21 09:08 ptmcg

Hi Paul!

Thank you so much for these suggestions! I have not been able to try them out yet but am hoping to do so this weekend. This will probably help with a couple of issues!

connorferster avatar Aug 19 '21 15:08 connorferster

Before you use the "all the alphas defined in Unicode" option, please check the timing. This property is evaluated lazily, so you don't pay the penalty when you import pyparsing, just when you access the property. (Latin1 is a much smaller range of code points, so shouldn't add any measurable time.)
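One rough way to check the timing, assuming pyparsing is installed, is to measure the first access to each property in a fresh interpreter. This is only a sketch: the variable names are illustrative, and the numbers will depend on the machine and the pyparsing version.

```python
import time

import pyparsing as pp

# Time the first access to the full Unicode alphas property
start = time.perf_counter()
all_alphas = pp.pyparsing_unicode.alphas
all_seconds = time.perf_counter() - start

# Time the first access to the much smaller Latin1 set for comparison
start = time.perf_counter()
latin1_alphas = pp.pyparsing_unicode.Latin1.alphas
latin1_seconds = time.perf_counter() - start

print(f"all Unicode alphas: {len(all_alphas)} chars in {all_seconds:.4f} s")
print(f"Latin1 alphas:      {len(latin1_alphas)} chars in {latin1_seconds:.4f} s")
```

Run it in a new Python process each time, so that neither property has already been evaluated before the measurement starts.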

ptmcg avatar Aug 19 '21 15:08 ptmcg

Just ran into an issue that I reckon this would solve: I was using Greek letters as variable names in JupyterLab, but the expressions were not rendered, only the results:

%%render
σ_R = V_R * μ_R

σ𝑅 = 341.250 kN

%%render
sigma_R = V_R * mu_R

𝜎𝑅 = 𝑉𝑅⋅𝜇𝑅 = 0.05⋅6.825 MN = 341.250 kN

ccaprani avatar May 30 '22 09:05 ccaprani

Thanks for bringing this one up. I am working on an update right now and will add this to the test suite.

connorferster avatar May 30 '22 13:05 connorferster