Add an option to split generated pilota files
Fixes #454
@Millione @PureWhiteWu Hi. I couldn't find tests for the related builders, could you please point me to one if they exist?
Otherwise, I changed one example to use the new split_generated_files option. It compiles and runs. I also verified that the files are actually split and that the RustRover IDE still navigates to the symbols in the generated split files.
Perhaps you can add one integration test in the tests folder for workspace mode?The builder is here https://github.com/cloudwego/volo/blob/main/volo-build/src/workspace.rs#L11
@missingdays Hi, are things going well here?
@Millione Hi, I'm waiting for the reply here - https://github.com/cloudwego/volo/pull/510#discussion_r1819020976
Sorry for that, I forgot to say that can you check out why the generated code fails to compile in split mode? It seems related to the split mode implementation in pilota-build
@Millione As I mentioned, Windows cannot handle files that only differ by case. For testService and TestService, the generated files are considered the same, so the compilation fails.
Maybe we can change the filename that only differ by case a little bit? For testService and TestService, it can be TestService1 and TestService2 (just for example)
We can do that, I'll try to implement it
@Millione Please see https://github.com/cloudwego/pilota/pull/280. I added a test as well
@missingdays https://github.com/cloudwego/pilota/pull/280 gets merged, can you check out whether it works fine for here?
Here can we split it in files?
There is one more question left. There are some structs generated in one file because of volo-build, I think we should split it either.
@Millione
gets merged, can you check out whether it works fine for here?
The tests pass now
There are some structs generated in one file because of volo-build
In impl pilota_build::CodegenBackend for VoloThriftBackend, there's no API that would allow splitting the stream into multiple files, since there's no access to the directory where the additional files should be placed. We could change the API in a couple of ways:
- Provide a base directory in which additional files can be placed. This will solve this specific problem, but it's a bit specific I think
- Instead of passing
stream: &mut String, passstream: PilotaStreamBuilder, which would have methods such aspush_str,start_new_file_generation(name),finish_file_generation. So it could be used like
stream.start_new_file_generation("ServiceResponseRecv.rs");
stream.push_str(...); // push the impl of ServiceResponseRecv
stream.finish_file_generation();
This way, PilotaStreamBuilder can always be extended with new methods which will allow more precise generation without changing CodegenBackend each time.
- Provide the base directory in
Contextwhich is accessible fromCodegenBackend
@missingdays Great thoughts. Point 2 and 3 look good to me, I think you can go for it.
@Millione After exploring the code a bit more, I realized that we could use Context#mode + Context#item_path to get all the required information to locate the files. But mode is pub(crate) and so cannot be accessed from volo. I sent an mr to make it pub - https://github.com/cloudwego/pilota/pull/286
@Millione Hi. Splitting service_ImageService requires more work than I originally anticipated, basically re-writing all the logic from pilota again in volo.
Considering all the other splitting, I'd think if it's not split, it will still not get too large to cause any problems, wdyt?
@missingdays I think it will still be too large and leaving it unsplit looks inconsistent to the user. Is the workload heavy? Can you elaborate on what you're going to do? We can work on it together to find out the best solution.
@Millione you are right, it would be inconsistent.
Is the service name deduplication needed here as well? Can there be 2 services named articleService and ArticleService in the rpc_article/src/article module for example?
@missingdays Yes, I think it is needed here either.
@Millione I added the functionality to split the services generated by volo into several files. Let me know if this splitting is enough or if we should split it into more files.
As for the name deduplication, I didn't find a good way to do it. What would be the correct way to get all the needed service names so that we could check if the service name already exists?
There are also so more changes to pilota required, so I'm using a custom branch fix/context-mode-pub from my repo https://github.com/missingdays/pilota. We can make all the changes there and then merge them all together before merging this MR, to avoid a bunch of small MRs to pilota.
@missingdays
I added the functionality to split the services generated by volo into several files. Let me know if this splitting is enough or if we should split it into more files.
I think we can split it into more files, commented in the code specifically.
What would be the correct way to get all the needed service names so that we could check if the service name already exists?
As for this, you can refer to cx.names(def_id), if its name duplicates, it will return true.
As for this, you can refer to cx.names(def_id), if its name duplicates, it will return true.
The deduplication has to check all the names, not just the current one. Because we need to transform all names to the lowercase and only then check for duplicates. For example, service ArticleService and service articleService will have different def_id, but their related directory names would clash.
The deduplication has to check all the names, not just the current one. Because we need to transform all names to the lowercase and only then check for duplicates.
What's behind cx.names(def_id) is that we've already converted all the names to the rust style to check for duplicates, and stored all the corresponding def_ids in a HashSet. I think this will work as you describe?
but their related directory names would clash.
I haven't understood it well for how their related directory names affect this, can you explain it more? As far as I'm concerned, the cx.names(def_id) for service ArticleService and service articleService will both return true, and then we know we should make a change for the filename.
Thanks, I see your point. However, we need to know how to deduplicate. If for both ArticleService and articleService we get true, how to know for which service to generate a postfix _1 and for which _2? If the ordered list of all names was available, we could check if ArticleService or articleService is first in that list and assign it the _1 postfix, and the other _2.
Oh, I get it.
If for both ArticleService and articleService we get true, how to know for which service to generate a postfix _1 and for which _2?
I have thought about using a global AtomicU64, but maybe not a best way to solve this (the postfix maybe different every time generated and influences the IDE cache)
If the ordered list of all names was available, we could check if ArticleService or articleService is first in that list and assign it the _1 postfix, and the other _2
Great, I think you can put any information you need into cx, and the ordered list of all names information I think you can get from https://github.com/cloudwego/pilota/blob/main/pilota-build/src/middle/context.rs#L382.
@missingdays Any help needed here? I can help implementing the rest if you don't have enough time.
@Millione Yeah sorry, I've been busy most of the last week.
I think only deduplication of the folders is left to be implemented? If you could implement it, that would be great.
I added you as a collaborator to both pilota and volo forks, feel free to push any changes to the fix/context-mode-pub branch if necessary.
@missingdays
I think only deduplication of the folders is left to be implemented?
Yes, I just implement it, please take a look if there's something I'm missing. Futhurmore, for the rest of job, I think there is only some detailed code style to change.
@Millione Nice!
I think there is only some detailed code style to change.
I'm ready to fix them if you have any comments.
@missingdays Well done! I think it's ready to merge. May you submit a PR to pilota first?