OrchardCoreContrib.PoExtractor
OrchardCoreContrib.PoExtractor copied to clipboard
Remove some Limitations / Pitfalls
Hi There
I have a proposal to improve the PoExtractor and would like to get some feedback from you whether you the proposed solution is agreeable with the project and whether you would accept a PR.
Motivation
My team discussed using this tool and we anticipate the limitations regarding name convention and msgctxt to lead to some friction:
- the fact that the member name (
T,S,H) is used to identity usage ofIStringLocalizer- it's brittle. Spelling it differently means localization just stops working, without any errors.
- our coding style says field members need to start with a lower case character. We'd have a choice of:
- ignoring "warnings" in the IDE --> this lessens the values of IDE warnings overall
- each and every time add a
// ReSharper disable once InconsistentNaming requireddirective --> cumbersome.
- deriving the
msgctxtfrom the class theIStringLocalizeris used in creates a few scenarios where the PoExtractor uses a differentmsgctxtthanOrchardCore.Localizationwill use. These can happen by accident. The code will still compile, and we'll only find out once someone reports localizations as missing - creating friction in the process. This can happen accidentally, for example, when you move some code usingIStringLocalizer<T>to another class.- theoretically we could write an Analyzer ourselves to ensure code is conformant with the conventions used in the PoExtractor. However, that would duplicate some of what PoExtractor is doing - so not better to maintain. Furthermore, if we can write an analyzer which can judge conformity (correct name for
IStringLocalizer<T>,Tis correctly chosen), we can most likely also just remove the limitations with similar code. And we think that's worth more.
- theoretically we could write an Analyzer ourselves to ensure code is conformant with the conventions used in the PoExtractor. However, that would duplicate some of what PoExtractor is doing - so not better to maintain. Furthermore, if we can write an analyzer which can judge conformity (correct name for
Solution Idea
What's missing in the current implementation is, instead of just reading the AST, also resolving types.
From some quick research it seems to me that for this we need a roslyn SemanticModel besides the SyntaxTree. And the SemanticModel requires a Compilation.
I've done a very dirty proof of concept: See here and here
The result of this is, that the field/property the IStringLocalizer<T> is assigned to can have any name.
The part of generating a different msgctxt depending on the type parameter T is not addressed, but from what I see can be done with the information already made available in this change.
I'm not really experienced with Roslyn, so there's a few questions-marks / risks:
- maybe there's a better way to achieve the same
- haven't tested how it works when the code doesn't compile
- haven't tested whether/how it works when the source references another version of the
Microsoft.Extensions.Localization, which defines theIStringLocalizer<T>interface - maybe a
Compilationdoesn't really mean compiling it... I just don't understand this well enough, yet ;-)
Having said that, I think it's not unlikely that it would only work on compiling (error-free) code (warnings would probably be fine). If that limitation is fine, maybe the PoExtractor could be simplified by removing explicit support for razor (and even liquid?) templates by opening + building the whole project. My expectation here is that this would deal with generating the classes from .razor files automatically and the tool could then work on any class and not care for whether it's razor/liquid or whatnot. I would, however, expect this to slow down processing time.
For us, both would be acceptable.
Alternative Solution
Instead of using Roslyn, and analyze source code, one could also analyze the IL with Mono.Cecil. From past experience, I would say, for the problem at hand, Mono.Cecil is simpler to work with than SyntaxTree + SemanticModel but:
- only ever works on compiled code. The current solution does not need a compiled result, it doesn't even need code without errors / compiling code.
- Cecil development seems to be on the backburner, and I'm not sure about continued support/development
- don't know how well this integrates with AoT / native compilation
- consider me biased because I have experience with Mono.CeCil but almost none with Roslyn
I appreciate your input @BrunoJuchli, all the feedback is welcome:
Regarding localization field names, I remember that we suggested in the past to make optional, the user can provide his/her names
deriving the msgctxt from the class the IStringLocalizer is used in creates a few scenarios where the PoExtractor uses a different msgctxt than OrchardCore.Localization will use.
Could you please elaborate?
Instead of using Roslyn, and analyze source code, one could also analyze the IL with Mono.Cecil.
I don't think it's something that can we go for unless there are some good reasons - regardless of your experience in Cecil :) -
I'm glad to collaborate with you guys to make the PO Extractor works as expected with your project(s)
I suggest that you write a failing test so that we can start to fix the issues that you are facing
I appreciate your input @BrunoJuchli, all the feedback is welcome:
I appreciate your quick feedback and willingness to hear what I have to say :-) Thanks!
Regarding localization field names, I remember that we suggested in the past to make optional, the user can provide his/her names
This can address the problem of the convention being in violation of the "users" coding style guide.
However, it does not address the problem where these are accidentally misnamed. These cases are likely to fall through safety nets and make it to production. In our case, our release cycle works with late-localization. That is, we first deploy a release where not all texts are localized. Localization then happens some time after (few days). That is, it's not unexpected to see "developer texts". On the translation side, we have tooling support to tell translators that there's translations missing. Of course, it doesn't pick up what isn't extracted to the PO, so probably no-one will notice these untranslated texts for quite a while. The result is:
- degradation in the quality of our software. Instead of taking days, it could take months.
- customers need to reach out to us to inform us about a bug. This costs the customer, our support team, and ultimately binds development resources. All involved parties of course have better use for their time ;-)
Therefore my goal is to get rid of the conventions in the PO Extraction tool (you can still use them in your SW, it doesn't matter). Instead, it should follow the rule: If it compiles, extraction works.
deriving the msgctxt from the class the IStringLocalizer is used in creates a few scenarios where the PoExtractor uses a different msgctxt than OrchardCore.Localization will use.
Could you please elaborate?
namespace ExampleApp;
public class Foo(
IStringLocalizer<Bar> S)
{
public void DoSomething()
{
Write(
S["this is some translatable text"];
}
}
The po extractor will create:
msgctxt "ExampleApp.Foo"
msgid "this is some translatable text"
msgstr ""
And OrchardCore.Localization will search for:
msgctxt "ExampleApp.Bar" msgid "this is some translatable text"
Note: I originally wrote "a few scenarios". For code-examples, it's just this one. I was thinking in scenarios how you could accidentally get to this state of violating the convention and expressed myself poorly. But here's how you can accidentally achieve this scenario:
- copy + paste or move the
IStringLocalizer<T>definition to another class. - duplicate a class, but forget to adapt the
TofIStringLocalizer<T>. - (automated) extract class refactoring
Instead of using Roslyn, and analyze source code, one could also analyze the IL with Mono.Cecil.
I don't think it's something that can we go for unless there are some good reasons - regardless of your experience in Cecil :) -
I understand and concur with your assessment.
I suggest that you write a failing test so that we can start to fix the issues that you are facing
I'll do that.
If you've got some idea/knowledge about the compilation / SemanticModel aspect I'd appreciate it very much if you could give me some feedback whether you think it would be acceptable to use it.
In your example above, there's an issue that we are facing frequently, the generic type SHOULD be the same as the class name, we suggested creating an analyzer instead
@sebastienros @Skrypt your thoughts
In your example above, there's an issue that we are facing frequently, the generic type SHOULD be the same as the class name, we suggested creating an analyzer instead
Writing an analyzer is more work than removing these conventions from PO extractor.
The reason for this is, that the analyzer would use the same tools (roslyn) that PO extractor is already using. It requires the same code to determine whether the correct type is used as the PO extractor would use to get rid of the convention.
From a usability perspective, I would also argue that the analyzer is worse.
Do you agree?
I'm writing an analyzer in the client app that uses this library.
I think we could fix the generated localization resource here but the issue no one will tell the app the types should match, right?
I'm writing an analyzer in the client app that uses this library.
I think we could fix the generated localization resource here but the issue no one will tell the app the types should match, right?
When the PoExtractor is changed, why should the types match? It'll be up to consumer to decide whether they should match. Each developer/team can decide for their own conventions (or lack thereof). If they decide for a convention, then an analyzer would be useful --> so I see why offering an Analyzer makes sense.
Is there a reason why the convention would need to be followed by everyone using Orchard? And the PoExtractor?
(In my case, I would certainly prefer it to be opt-in, but I'm not using orchard itself).
Even if you are not using OC if you check ASP.NET Core samples, you will notice generic types for the logger, string localizer .. etc should match the class, so this is not OC specific. I agree for naming the localizer fields they should be framework agnostic
Even if you are not using OC if you check ASP.NET Core samples, you will notice generic types for the logger, string localizer .. etc should match the class, so this is not OC specific. I agree for naming the localizer fields they should be framework agnostic
Ah yes. I blame microsoft for establishing this practice 🙄 - a pity. I do understand that as component authors it makes sense to follow "established practice". Even though arguably there exists better practice, and so I would say it would be nice to support both ;-)
Instead of using a type parameter, pass the info into the ctor, say like:
public class Logger : ILogger
{
public Logger(
Type contextType)
{
...
}
}
And when injecting the ILogger into FooService, the DI container would pass typeof(FooService) as the type parameter to the Logger implementation. There's plenty of DI containers which are capable of doing so. For some examples, see here.
That's less typing for developers, no chance for mistakes, and it doesn't require writing an analyzer... ;-)
What do you think about supporting both patterns?
The PoExtractor could also emit an error if the type parameter T of IStringLocalizer<T> doesn't match the type of the class it's used in.
Ah yes. I blame microsoft for establishing this practice 🙄 - a pity.
lol, could you please elaborate on why you are using a different type in the generic type?
The PoExtractor could also emit an error if the type parameter T of IStringLocalizer<T> doesn't match the type of the class it's used in.
Could be, but I think this is not a tool responsibility, right?
Ah yes. I blame microsoft for establishing this practice 🙄 - a pity.
lol, could you please elaborate on why you are using a different type in the generic type?
Because I make mistakes, and for our localization tooling it doesn't matter much ;-)
But, after our discussion, I understand see that there's no value in deliberately choosing a different T. And therefore, the analyzer makes sense, because it's better to fail fast in the IDE than later when executing extractpo.
So thanks for being that patient with me ;-)
(Like I said, it would be even nicer if the type would be left out instead, and would be deduced reliably per convention (so, basically, what PoExtractor is doing, but not what OrchardCore.Localization is doing)).
The PoExtractor could also emit an error if the type parameter T of IStringLocalizer doesn't match the type of the class it's used in.
Could be, but I think this is not a tool responsibility, right?
I think this depends on intended usage of the tool. If you pass it invalid input, the tool should give back an error. That is:
- if
extractporespect theTargument ofIStringLocalizer<T>(that is, deduce themsgctxtfrom it), then it should not warn in caseTdoesn't equal the classIStringLocalizer<T>is used in. - if
extractpodeducesmsgtxtfrom the classIStringLocalizer<T>is used in, then usingIStringLocalizer<Bar>inside ofFoois invalid data. There is no point in creating POT entries for these, instead a warning/error should be generated.
Vision
So, to cover all cases optimally, I think it should work as follows:
Analyzer
Searches for usages of IStringLocalizer<T>. When the T doesn't match the type it's used in (for example, using IStringLocalizer<Bar> inside of Foo) the analyzer generates an error.
Optionally, it could also offer a fix-up, which would be to change the T from Bar to Foo.
ExtractPo
Generates POT files from usages of IStringLocalizer[...] and IStringLocalizer<T>[...].
The name of fields and properties the IStringLocalizer/IStringLocalizer<T> is assigned to does not matter.
The following rules would be used to define msgctxt:
IStringLocalizer<T>--> full name ofT.IStringLocalizer--> full name of the type theIStringLocalizeris used in
Notes
The Analyzer would not necessarily be part of this Repo because the PoExtractor does not care if you follow this convention or not.
Also, OrchardCore.Localization doesn't care either, so it could be something completely separate?
(sidenote: here's an anlyzer doing the same for ILogger<T>: https://github.com/PavelStefanov/MicrosoftLogger.Analyzer)
- To make it clear I didn't say that the analyzer should be a part of this repo :)
- For
T,S, andHI agree with you, I will try to make it optional, there's a related issue in the repo - For generic parameters, we could use the passed type and log a warning during the extraction process
Hope the above suggestions are fine for you