cecil icon indicating copy to clipboard operation
cecil copied to clipboard

Saved modules grow in size

Open KrisVandermotten opened this issue 15 years ago • 13 comments

I compile an assembly using Visual Studio 2010, C#. Target Framework is version 4 Client Profile. Output type is class library. Platform target is Any CPU. Optimize code is on. Allow Unsafe code is off. There are no managed resources and no native resources. The assembly is signed (strong name).

The PE Header says: Section Alignment = 8192 File Alignment = 512

.text section is 254 KB .rsrc section is 1 KB .reloc section is 512 bytes

CLR Header says MetaData is 147,59 KB

Total file size is 256 KB.

I then open the module using

ModuleDefinition md = ModuleDefinition.ReadModule(modulepath);

and save it using

WriterParameters wp = new WriterParameters
{
    WriteSymbols = md.HasSymbols
};

md.Write(modulepath, wp);

The new file is now 258.5 KB.

The .text section has grown to 255.5 KB A .sdata section has been added, 512 bytes

CLR Header says MetaData is now down to 147.5 KB.

The .text section should not grow. A .sdata section should not be added. The total file size should not increase (It would be ok if it went down without loosing information).

I have not tested this on other modules.

KrisVandermotten avatar May 12 '10 08:05 KrisVandermotten

For your information, these are the section flags in the original module:

.text : 254 KB (ContainsCode, MemExecutable, MemReadable)
.rsrc : 1 KB (ContainsInitializedData, MemReadable)
.reloc : 512 bytes (ContainsInitializedData, MemDiscardable, MemReadable)

These are the section flags in the new module: .text : 255,5 KB (ContainsCode, MemExecutable, MemReadable) .sdata : 512 bytes (ContainsInitializedData, MemReadable, MemWritable) .rsrc : 1 KB (ContainsInitializedData, MemReadable) .reloc : 512 bytes (ContainsInitializedData, MemDiscardable, MemReadable)

As you can see, the original module had no section with the MemWritable flag.

KrisVandermotten avatar May 12 '10 08:05 KrisVandermotten

Please make available a sample module that I can roundtrip to reproduce this enormous 2kb growth.

jbevain avatar May 12 '10 10:05 jbevain

I used Mono.Cecil.dll source code.

BTW, i would not say that a 1% growth is "enormous".

KrisVandermotten avatar May 12 '10 11:05 KrisVandermotten

Then what's the big deal? As long as the metadata is preserved, and the resulting file fully functional ?

jbevain avatar May 12 '10 11:05 jbevain

I didn't say the growth per se was a big deal; it's nothing like a true functional bug. But if it can be avoided, it should be. I'm thinking about basing a peephole optimizer on Cecil, and it looks bad if an optimizer increases the size of the module, especially if it's supposed to optimize for space.

What I care more about is the nature of the change. For example, I wonder why the output module needs a MemWritable section, when the input module does not. I feel that, unless the growth is fully understood, it could be an obscure bug waiting to surface. To put it in your own words: will the resulting files always be fully functional? What about non-functional differences?

Also, I can imagine scenarios in which even 1% growth is a big deal. Modules that are downloaded over slow networks for example. Or modules used in memory-constrained situations.

Anyway, I report issues to you for you to decide what to do about them, if anything at all. After all, it's your product, so you decide and prioritize. Some of the things I report may be due to my lack of understanding the product, to different goals, or other factors.

In this specific case, this issue may be blocking my use of Cecil; I haven't decided yet (and it depends on how Cecil evolves).

KrisVandermotten avatar May 12 '10 12:05 KrisVandermotten

The issue with feeling about obscure bug waiting to surface is that you can have them about pretty much every single line of code in every single product.

Cecil, like System.Reflection.Emit, emits fields initializers in a sdata section, as it's easier to patch than the .text section. The price to pay is that this section has to be aligned. I for one don't mind doing the extra work of moving the data to the .text section, but using a sdata section was the easiest way to get going.

But it doesn't change the fact that both .net and Mono handle them the same way.

jbevain avatar May 12 '10 12:05 jbevain

On the other hand, the .text increase is totally worth investigating, as Cecil 0.6 usually did a better job than most compilers at emitting it.

jbevain avatar May 12 '10 12:05 jbevain

I understand the point on the sdata section.

Could the increase in the .text size be related to the fact that you use a .sdata section, in the sense that "stuff" serializes differently if the sdata section is used?

If so, removing the use of the sdata section is a "very nice to have".

If not, removing it is still "nive to have", but I then also look forward to the results of your investigation.

KrisVandermotten avatar May 12 '10 14:05 KrisVandermotten

You said :The issue with feeling about obscure bug waiting to surface is that you can have them about pretty much every single line of code in every single product.

True. And there can be a gas leak in about every building that uses gas for heating. No smell doesn't prove there is no leak (or bug), just like a smell doesn't prove there is one. Still, I feel more comfortable in a building where it doesn't smell like gas.

Just an analogy to clarify my position.

KrisVandermotten avatar May 12 '10 14:05 KrisVandermotten

You said: But it doesn't change the fact that both .net and Mono handle them the same way.

As you know very well, there's nothing wrong with mono doing a better job than .net. In fact, it wouldn't be the first time...

KrisVandermotten avatar May 12 '10 14:05 KrisVandermotten

That's a fear driven decision process. I have a test suite saying me that my assemblies are fully functional after round-trip. Don't get me wrong, I'm all for optimizing the resulting assembly, and make it as compact as possible. I'm just saying that until I'm proven that something is actually not working properly, I don't see a need to take action, based on a feeling.

jbevain avatar May 12 '10 16:05 jbevain

I don't think it's fear, rather prudence, being careful. I've tracked down many bugs in my time based on a smell.

Also, unit tests are very valuable, but they're just that: unit tests. In addition to unit tests, I always run a battery of static analysis tools on my code. Often I also perform stress tests and other kinds of tests.

In this specific case, I run evertything that comes out of Cecil (after being modified by my code) through its battery of tests and validations, but also through peverify and other low level pe file analysers. I will also look at JIT compiler output, to verify that my modifications do not disable JIT optimizations.

Like you say, there can be an obscure bug everywhere, but the more eyes (tests and validaion tools) look at the code (in different ways), the smaller the chances are.

KrisVandermotten avatar May 12 '10 18:05 KrisVandermotten

@jbevain think this one can be closed

SimonCropp avatar Sep 06 '14 01:09 SimonCropp