Mathics

Added calls to replace_wl_with_plain_text in boxes_to_text and boxes_to_xml for String and Symbol

Open GarkGarcia opened this issue 4 years ago • 6 comments

This is a follow-up to https://github.com/orgs/mathics/teams/maintainers/discussions/20/comments/32. With this PR, named characters are replaced selectively, only inside strings or symbols. This makes calling replace_wl_with_plain_text from the clients unnecessary: a client only needs to set the use_unicode field of Evaluation appropriately.

After this is merged, PRs to the clients will follow.
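
To make the idea concrete, here is a minimal sketch of selective replacement. It is not the code in this PR: ToyEvaluation, ToyLeaf, toy_replace and TOY_TABLE are hypothetical stand-ins, the "Raw" kind is invented for the demo, and the two table entries are illustrative only. Only the names use_unicode, boxes_to_text and replace_wl_with_plain_text come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Illustrative table (assumption): WL private-use character -> (unicode, ascii).
TOY_TABLE: Dict[str, Tuple[str, str]] = {
    "\uf522": ("\u2192", "->"),  # \[Rule]
    "\uf74e": ("\u2148", "I"),   # \[ImaginaryI]
}


def toy_replace(text: str, use_unicode: bool) -> str:
    """Stand-in for replace_wl_with_plain_text; not the real implementation."""
    index = 0 if use_unicode else 1
    return "".join(TOY_TABLE[ch][index] if ch in TOY_TABLE else ch for ch in text)


@dataclass
class ToyEvaluation:
    """Hypothetical stand-in for Evaluation; only the flag matters here."""
    use_unicode: bool = True


@dataclass
class ToyLeaf:
    """Hypothetical box leaf: kind is "String", "Symbol", or something else."""
    kind: str
    text: str

    def boxes_to_text(self, evaluation: ToyEvaluation) -> str:
        # Named characters are converted only inside String and Symbol leaves.
        if self.kind in ("String", "Symbol"):
            return toy_replace(self.text, evaluation.use_unicode)
        return self.text


leaves = [ToyLeaf("Symbol", "x"), ToyLeaf("Symbol", "\uf522"), ToyLeaf("Raw", "\uf522")]
unicode_out = "".join(leaf.boxes_to_text(ToyEvaluation(use_unicode=True)) for leaf in leaves)
ascii_out = "".join(leaf.boxes_to_text(ToyEvaluation(use_unicode=False)) for leaf in leaves)
```

In this sketch the client never calls the replacement function itself. It only picks the value of use_unicode, and the leaf marked "Raw" passes through untouched in both cases.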

GarkGarcia, Jan 28 '21 19:01

I am concerned about the implications of this PR. Maybe it's better to leave the conversion to the clients, even if we have to traverse the AST twice.

GarkGarcia, Jan 28 '21 20:01

@rocky @mmatera Opinions?

GarkGarcia, Jan 28 '21 20:01

I ran into similar problems while working with the graph engine. Later I will send you a summary of how I think we can approach this.

mmatera, Jan 28 '21 22:01

Well, this took me longer than I had expected. After working on #1140 and taking a closer look at how to implement the proposal in this PR, I think that @GarkGarcia is right: the best choice seems to be adding a parameter to Evaluation. Let me explain why, and what the alternatives could be. In #1140 I implemented a way to override the default behavior of MakeBoxes, so that a client can redefine it. This mechanism could also be used to deal with Strings: in the client, we could define a custom MakeBoxes for String expressions that converts (or not) between the different string representations (Wolfram-like UTF-8, standard UTF-8, ASCII named characters). However, this solution is difficult to implement for general Symbols and operators.

A second possibility would be to hack the output through the $PrePrint symbol. This is what I did in IWolfram, because I needed it to work with a WMA kernel as well, and there I do not have access to the internals. The drawback is that if a user overwrites $PrePrint, the interface breaks.

On the other hand, the Evaluation class already controls the formatting step, so it would be natural to introduce a property that controls how the final string is printed. There would be two ways for the front-end to control this behavior of Evaluation. One is what @GarkGarcia did: just add a string parameter. The other is to also allow a custom callback function for this parameter, one that takes the standard (WL) UTF-8 output and processes it in whatever way is most convenient for the front-end.
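
A rough sketch of these two options, just to fix ideas. Everything here is hypothetical: ToyEvaluation is not the real Evaluation class, and the names encoding, output_filter, finish_output and toy_to_ascii are made up for this example. The single replacement in toy_to_ascii stands in for a full named-character table.

```python
from typing import Callable, Optional


def toy_to_ascii(text: str) -> str:
    """Stand-in conversion (assumption): only handles \\[Rule]."""
    return text.replace("\uf522", "->")


class ToyEvaluation:
    """Hypothetical stand-in showing the two ways a front-end could choose the output."""

    def __init__(
        self,
        encoding: str = "wl",
        output_filter: Optional[Callable[[str], str]] = None,
    ) -> None:
        self.encoding = encoding            # option 1: a plain string parameter
        self.output_filter = output_filter  # option 2: a front-end callback

    def finish_output(self, formatted: str) -> str:
        # The callback, when given, takes precedence over the string parameter.
        if self.output_filter is not None:
            return self.output_filter(formatted)
        if self.encoding == "ascii":
            return toy_to_ascii(formatted)
        return formatted  # leave the (WL) UTF-8 output unchanged


wl_output = "a \uf522 b"
as_wl = ToyEvaluation().finish_output(wl_output)
as_ascii = ToyEvaluation(encoding="ascii").finish_output(wl_output)
as_custom = ToyEvaluation(
    output_filter=lambda s: s.replace("\uf522", "\u2192")
).finish_output(wl_output)
```

The string parameter is simpler for clients that only need one of a few fixed encodings. The callback leaves the conversion entirely to the front-end, at the cost of a wider interface.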

mmatera, Mar 08 '21 17:03

replace_wl_with_plain_text() is not something we want in the long run, because it can't see the structure of what it is converting, and that structure is important. Down the line, it would need to be better integrated and driven by the format routine.

rocky, Mar 08 '21 17:03

I am working on a better proposal. Once I finish with the checks, I am going to open a PR against this branch.

mmatera, Mar 08 '21 17:03