Would organization of Modules be in scope?

Open RamblingCookieMonster opened this issue 9 years ago • 13 comments

Would it be worth including bits on organizing files and general structure for modules? For example:

  • Do functions get their own files? If so, how to name these files? If mixed, when to separate these into individual files? My take: Regardless of lines of code, I prefer to separate every function into its own file.
  • Is there a preferred / 'best practice' for organization? For example:
    • Repository Root
      • Tests
        • Integration
        • Unit
      • ModuleName
        • ModuleName.psd1
        • ModuleName.psm1
        • \Public - Public functions in here (or at root?)
        • \Private - Private functions in here
        • \lib - Optional folder for libraries
        • \bin - Optional folder for binaries
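
For illustration only, a layout like this is commonly paired with a small loader in ModuleName.psm1 that dot-sources every file and exports only what lives in Public. The sketch below builds a throwaway module in a temp folder so it can run as-is; `DemoModule`, `Get-Greeting` and `Format-Name` are hypothetical names, not part of any real project.

```powershell
# Create a throwaway module folder with Public and Private subfolders.
$root      = Join-Path ([IO.Path]::GetTempPath()) "DemoModule_$([guid]::NewGuid().ToString('N'))"
$moduleDir = Join-Path $root 'DemoModule'
foreach ($sub in 'Public', 'Private') {
    New-Item -ItemType Directory -Path (Join-Path $moduleDir $sub) -Force | Out-Null
}

# One function per file; the file name mirrors the function name.
Set-Content -Path (Join-Path $moduleDir 'Private/Format-Name.ps1') -Value @'
function Format-Name { param([string]$Name) $Name.Trim() }
'@
Set-Content -Path (Join-Path $moduleDir 'Public/Get-Greeting.ps1') -Value @'
function Get-Greeting { param([string]$Name) "Hello, $(Format-Name $Name)!" }
'@

# The loader: dot-source everything, export only the Public functions.
Set-Content -Path (Join-Path $moduleDir 'DemoModule.psm1') -Value @'
$public  = @(Get-ChildItem -Path (Join-Path $PSScriptRoot 'Public')  -Filter '*.ps1' -ErrorAction SilentlyContinue)
$private = @(Get-ChildItem -Path (Join-Path $PSScriptRoot 'Private') -Filter '*.ps1' -ErrorAction SilentlyContinue)
foreach ($file in @($private + $public)) { . $file.FullName }
Export-ModuleMember -Function $public.BaseName
'@

Import-Module (Join-Path $moduleDir 'DemoModule.psm1') -Force
$greeting = Get-Greeting -Name '  World '
$exported = @((Get-Command -Module DemoModule).Name)
```

With this approach the Public folder doubles as the export list, so adding a public command means adding a file rather than editing the psm1.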

I could see this spiraling out of control, and most (or all) of it would be subjective, so might not be appropriate here.

Cheers!

RamblingCookieMonster avatar May 29 '15 00:05 RamblingCookieMonster

Yeah, it could get messy. For example, I use Assembly as the name of the folder where I put all assemblies, and I don't use folders to separate private and public functions, since I use the main module psm1 to hold all private functions and dot-source the other parts of the module as ps1 files. Each person has a different organization style depending on their needs and experience. Some people put a single ps1 per function.

darkoperator avatar May 29 '15 01:05 darkoperator

I don't think there is actually a best practice here yet -- I think there are dozens of conflicting practices from different users, and I know that I'm still trying different things in different projects. That said, I'm not shy (nor bothered if you convince me I'm wrong) so here's my (current) take on the subject:

The primary reason to write a module is to get a private shared scope for your functions.

Given that, I believe that separating your module into many script files is an anti-pattern, because it separates what is, in fact, one variable scope into a bunch of files.

  1. Separating the files means that you have to keep track of everything that's in scope in your head. Your tools won't help you remember the name of the variable you defined in another file, or even the name of the commands. In fact I think doing this may actually break some tools, because they can't tell which files share scope.
  2. When we talk about best practices and style, we're thinking of working with teams of people, and about the scripters who may come along months or years later to make a bug fix or add a feature. I think that breaking the scope up makes discovering shared private variables and functions harder, and makes it more likely that you can end up with conflicts without catching it in a code review.
From a C# developer's point of view:

We always tell people one-file per class (so in C# you would actually have one file per cmdlet), but that's because in C# the class is the unit of variable scope, whereas in PowerShell script modules, you share a single variable scope across the whole module.

I would further argue that in a PowerShell script module, functions are like object methods (see what happens with Import-Module -AsCustomObject), and I would suggest that you only need a file per scope.
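
To make the "functions are like object methods" comparison concrete, here is a minimal sketch. It uses New-Module -AsCustomObject on an inline script block rather than Import-Module on a real module, purely so the example is self-contained; `Add-Count` and `Get-Count` are hypothetical names.

```powershell
# A dynamic module whose two functions share one script-scope variable.
$counter = New-Module -AsCustomObject -ScriptBlock {
    $script:count = 0                      # module-level state, private to the module

    function Add-Count { $script:count++; $script:count }
    function Get-Count { $script:count }

    Export-ModuleMember -Function Add-Count, Get-Count
}

# The exported functions surface as methods on the returned object.
# Names containing a hyphen have to be quoted when called this way.
$null  = $counter.'Add-Count'()
$null  = $counter.'Add-Count'()
$total = $counter.'Get-Count'()
```

Both methods read and write the same `$script:count`, which is the shared scope the argument above is about.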

Counter Counter Points

The only arguments I've seen in favor of splitting up into many files are about not being able to deal with the size of the code base in a single file, and I guess at that point we're down to age-old arguments about whether the length of files matters when we've got code editors that can fold on function names -- which seems as silly as arguing about an 80 column line length, when we're all using 24" widescreen monitors.

If it's that large, maybe you should be developing more than one module. You can always create sub-modules for any set of commands that share access to variables or internal functions, or that share the same noun, and there are no problems with multiple files if those files are each modules...

Jaykul avatar May 29 '15 06:05 Jaykul

Funny, that's not my primary reason for writing a module. My primary reason is to create a package that is easy to distribute/share with others and that can be loaded or unloaded on demand. Across the modules I work on, there aren't that many where I care about sharing a private scope with other commands, except for being able to access internal private functions to keep code DRY.

In modern tools, where module layout is represented much like a C# project is laid out in a tree in Visual Studio, if you have public/private folder names with files whose names mirror the names of the commands that they contain, there really isn't a question about what is public/private when you are reviewing code and see a command you don't recognize. Is it in the file list? Is it in a public/private folder? Then you know. Otherwise it's outside of the module scope. I find very high value in being able to jump into a file containing the one and only function I want to modify. It makes maintenance fast and easy.

Regarding scope, each function introduces its own scope, so I don't know that I agree with the whole scope argument.

And really, if you define/initialize private variables in your psm1 file, it's trivial to get their name right in other files, even without variable name completion. But I know, we're in a world where people without their GPS are like ants whose path has been rubbed out. :)

KirkMunro avatar May 29 '15 13:05 KirkMunro

ISE is still the primary PowerShell editor. Last I checked it doesn't even have "goto definition" -- never mind any of the rest of that. It just has tabs, and they don't show the path at all.

I'm trying really hard not to make style or practice recommendations that start with "get a better code editor" :speak_no_evil: because my anti-ISE stance has already gotten me some flak.

In any case, it's pretty clear there is no consensus about this stuff.

Jaykul avatar May 30 '15 05:05 Jaykul

So, it's been many months, and I thought I'd come back and say that for the project, I suppose that I do like @RamblingCookieMonster's original suggestion, with the caveat that I still think you should build a single .psm1 for the module package for distribution.

That is, I don't really think it's a big deal if you ship it all separated, but there is a slight penalty at load, and a single file is simpler for end users to get their heads around. Plus, creating a single file gives you something to run PSScriptAnalyzer against without getting false warnings or errors.

My original warnings are still valid, and you should know that if you have script-scope module level variables your tools will refuse to help you with them (and may even warn you against them). I recommend at least declaring them in a file in Private\ that starts with an underscore (to ensure it ends up at the top of any list)...

I'm curious what you think if we flesh it out like this:

  • ModuleName (this is the Project Root)
    • README.md (or ReadMe.md or just README)
    • LICENSE (or license.md)
    • build.ps1
    • packages.config (optional nuget config for package dependencies)
    • Tests\ (or Specs\ for the BDD crowd)
    • Source\ (or "src" or "ModuleName" again)
      • ModuleName.psd1
      • ModuleName.psm1
      • Public\ - Public functions in here
      • Private\ - Private functions in here

I would recommend that the build.ps1 script should create a folder with a version number as the output, right in the project root. This way, if the project root is in a PSModulePath folder, the versioned build will be immediately importable.
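
A minimal sketch of what such a build.ps1 might do, assuming the Source\Public and Source\Private layout above: read the version from the manifest, create a folder named after it in the project root, and merge the function files into a single .psm1. Everything here (`BuildDemo`, `Get-Answer`, `Get-Secret`, the temp project) is hypothetical scaffolding so the example runs on its own, and Import-PowerShellDataFile requires PowerShell 5.0 or later.

```powershell
# Scaffold a tiny hypothetical project in a temp folder.
$project = Join-Path ([IO.Path]::GetTempPath()) "BuildDemo_$([guid]::NewGuid().ToString('N'))"
$source  = Join-Path $project 'Source'
foreach ($sub in 'Public', 'Private') {
    New-Item -ItemType Directory -Path (Join-Path $source $sub) -Force | Out-Null
}
Set-Content -Path (Join-Path $source 'Private/Get-Secret.ps1') -Value 'function Get-Secret { 42 }'
Set-Content -Path (Join-Path $source 'Public/Get-Answer.ps1')  -Value 'function Get-Answer { Get-Secret }'
Set-Content -Path (Join-Path $source 'BuildDemo.psd1') -Value "@{ RootModule = 'BuildDemo.psm1'; ModuleVersion = '1.2.0'; FunctionsToExport = @('Get-Answer') }"

# --- the part a build script would contain ---
# Read the version from the manifest and use it as the output folder name.
$manifest = Import-PowerShellDataFile -Path (Join-Path $source 'BuildDemo.psd1')
$output   = Join-Path $project $manifest.ModuleVersion
New-Item -ItemType Directory -Path $output -Force | Out-Null

# Merge every function file into one .psm1 and copy the manifest next to it.
$folders = @((Join-Path $source 'Private'), (Join-Path $source 'Public'))
Get-ChildItem -Path $folders -Filter '*.ps1' |
    Get-Content -Raw |
    Set-Content -Path (Join-Path $output 'BuildDemo.psm1')
Copy-Item -Path (Join-Path $source 'BuildDemo.psd1') -Destination $output

# Keep a handle on the output folder so callers can see where the build went.
$built = Get-Item -Path $output
```

In this sketch the manifest's FunctionsToExport decides what is public in the merged module, so the Public/Private split only matters during development.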

I would strongly recommend that you avoid putting binary dependencies in your source control if you can help it -- that is, please don't create an "Assemblies" folder with .dll files that you include in your git repository. Instead:

  • Assembly dependencies that are available in Nuget should be handled by a packages.config and then downloaded by the build script to a Packages\ folder in the root. The build script should copy just the relevant .dll(s) into a "lib" (or "libraries" or "assemblies") folder in the actual module output package.
  • Alternatively, if you need sub-repositories rather than nuget dependencies, put them in a "lib" (or "libraries" or "assemblies") folder in the root, and trigger their builds in your build.ps1 script.

Jaykul avatar Aug 18 '16 05:08 Jaykul

README.md and LICENSE (or LICENSE.txt) - check. build.ps1 should do a basic build without requiring any parameters, if possible. If the project uses PSake, then put the psake targets in build.psake.ps1 and then write build.ps1 to import psake and pass the correct parameters to invoke-psake on the build.psake.ps1 file. Given the prevalence of what I see on GitHub, I'd recommend src and test.

As for Public/Private, I prefer to put private helper functions in the same script file as the public functions that use them. That's probably my C#/OO encapsulation bias coming through. :-) When I have shared, private helper functions, I still wouldn't segregate them into Public/Private folders. That convention just seems odd to me. That said, I don't think the default New Module Template for Plaster will go into that level of detail. I do see it creating src/test folders - test if you want it to scaffold Pester test support.

Regarding the build output directory, this is something we've been toying with in Plaster for a default New Module Template. I've vacillated between {workspaceRoot}/Release/<module-name> and {workspaceRoot}/.publish/<module-name>.

This way, if the project root is in a PSModulePath folder, the versioned build will be immediately importable.

I don't see that being very common. I've about given up on messing with PSModulePath. It just seems too brittle.

rkeithhill avatar Aug 18 '16 05:08 rkeithhill

@rkeithhill what do you mean, PSModulePath is brittle?

I modify it in my profile all the time -- I add, at a minimum, a folder in my "synced" location (e.g. ~/OneDrive/PowerShell/Modules) ...

I keep my in-development module projects in: ~/Projects/Modules/ so I generally add that root to my PSModulePath too ... which is why at the end of the day, ~/Projects/Modules/ModuleName/1.0.0.0/ModuleName.psd1 ends up being a valid module location, and you can even Publish-Module from there without problems. Plus I can keep old versions around without even trying 😉
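
As a rough sketch of the profile tweak described here, the snippet below appends a development folder to PSModulePath for the current process only, and only if it is not already listed. The `Projects/Modules` location is just the example path from this comment; nothing is written to the machine or user environment.

```powershell
# Hypothetical development root; adjust to wherever your module projects live.
$devModules = Join-Path $HOME 'Projects/Modules'

# PSModulePath uses ';' on Windows and ':' elsewhere, so ask .NET for the separator.
$separator = [IO.Path]::PathSeparator
$paths     = @($env:PSModulePath -split [regex]::Escape($separator) | Where-Object { $_ })

# Append rather than overwrite, so the standard locations stay in place.
if ($paths -notcontains $devModules) {
    $env:PSModulePath = ($paths + $devModules) -join $separator
}
```

Because this changes only the process-level value, closing the session undoes it, which sidesteps installers or updates that rewrite the persistent variable.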

I played with an Output or Release folder, but ended up deciding I don't need it if the output folder is a version number anyway, and in particular, it was getting in the way of working on multiple modules together.

One thing I am leaning toward is making it so the build script outputs the module folder so that people know where it went. If we can't agree on a standard for output locations, that's particularly important...

For what it's worth, the whole thing with creating /Public or /Private folders is something I just don't care very much about -- I'm still playing with it for various reasons. Personally, I think that if you do split things up, you should put them back together to ship them, just for simplicity and performance, and I feel like you should commit: do it or avoid it. Either:

  • Have only module files (.psm1)
  • Break each function into its own file

Going halfway doesn't make sense to me.

Jaykul avatar Aug 25 '16 05:08 Jaykul

what do you mean, PSModulePath is brittle?

Remember the MVP alias thread on PSModulePath back in July? How some module uninstalls were breaking PSModulePath. You said:

Any change could break things -- that's why we're talking about it -- third party installers DO break things when they change it, today.

Some PowerShell updates (5.1) reset PSModulePath. Creating the environment variable PSModulePath will prevent PowerShell from appending the standard locations. That's what I mean by brittle. Unfortunately, it's an easy mechanism to break.

rkeithhill avatar Aug 27 '16 16:08 rkeithhill

Would there be any reason not to simply leave the .Tests. files with the respective functions? I say that because New-Fixture automatically creates this for you and it would make it easy to call it for new functions/etc. I am also strongly in support of compiling all functions into a single .psm1 for distribution.

Root
|
+-- Builds
|   |
|   \-- ModuleName
|       |
|       \-- 0.1.0
|           +-- README.md
|           +-- LICENSE.md
|           +-- ModuleName.psm1
|           \-- ModuleName.psd1
|
+-- .git
|
+-- src
|   |
|   +-- Private
|   |   |
|   |   +-- Private-Function.ps1
|   |   \-- Private-Function.Tests.ps1
|   |
|   +-- Public
|   |   |
|   |   +-- Public-Function.ps1
|   |   \-- Public-Function.Tests.ps1
|   |
|   +-- ModuleName.psm1
|   \-- ModuleName.psd1
|
+-- README.md
+-- LICENSE.md
+-- build.ps1
+-- .gitignore
\-- packages.config

miketheitguy avatar Jan 09 '19 07:01 miketheitguy

However, I'm a bit more partial to using a Builds subfolder because you can include that easily in your .gitignore file when developing the module :)

miketheitguy avatar Jan 09 '19 07:01 miketheitguy

I much prefer to have my tests in a separate folder to my functions, it keeps things contained in a more logical structure and allows me to just ship out my src folder if I'm dot sourcing in the psm1 (which is very rare now).

I also ensure my tests are importing my compiled module rather than their individual function files, this way I can ensure I'm testing the same thing as I'm publishing.

ChrisLGardner avatar Jan 09 '19 08:01 ChrisLGardner

I came here to say exactly what @ChrisLGardner just has. I do exactly the same thing. I had a few issues around importing the compiled module without hard coding it in the Pester tests themselves but solved it with InvokeBuild environment variables.

I do agree with @Jaykul:

  • Have only module files (.psm1)
  • Break each function into its own file

Going halfway doesn't make sense to me.

Trying to go through the flow of a function when it's half 'n half is pain that you don't need to be inflicting on people. Be consistent. One or the other.

pauby avatar Jan 09 '19 08:01 pauby

The ModuleBuilder project is building tools for modules that want to have file-per-function and ship a single psm1 -- the file layout(s) which we came up with for that was a compromise between a half dozen or more people starting at last year's PowerShell Summit and Microsoft MVP Summit...

Having spent almost a year building and using that, I guess I'm fine with two separate questions:

First: how do you organize a module for shipping to the PowerShell Gallery?

  • Ship a psd1 and a single psm1 and/or dll.

Some of the notable exceptions:

  • You may also ship OVF tests.
  • You may also include help xml files (this is mandatory if you have any compiled cmdlets).
  • You might want to write an about_ModuleName.help.txt (or ship your ReadMe renamed as that)
  • If you have binary dependencies, put them in a "lib" or "bin" folder
  • If you need to split your module up for scope reasons, create a psm1 per scope, and list them as "NestedModules" in your manifest.
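
The last point can be sketched as follows: two .psm1 files, each with its own script scope, listed as NestedModules in one manifest. The module and function names (`NestedDemo`, `Get-CacheState`, `Get-QueueState`) are hypothetical, and the manifest is written by hand here only to keep the example short.

```powershell
# Throwaway module folder in the temp directory.
$base      = Join-Path ([IO.Path]::GetTempPath()) "NestedDemo_$([guid]::NewGuid().ToString('N'))"
$moduleDir = Join-Path $base 'NestedDemo'
New-Item -ItemType Directory -Path $moduleDir -Force | Out-Null

# Each nested module uses the same variable name, but in its own scope.
Set-Content -Path (Join-Path $moduleDir 'Cache.psm1') -Value @'
$script:store = 'cache'
function Get-CacheState { $script:store }
'@
Set-Content -Path (Join-Path $moduleDir 'Queue.psm1') -Value @'
$script:store = 'queue'
function Get-QueueState { $script:store }
'@

# The manifest ties the two scopes together into one shippable module.
Set-Content -Path (Join-Path $moduleDir 'NestedDemo.psd1') -Value @'
@{
    ModuleVersion     = '0.1.0'
    NestedModules     = @('Cache.psm1', 'Queue.psm1')
    FunctionsToExport = @('Get-CacheState', 'Get-QueueState')
}
'@

Import-Module (Join-Path $moduleDir 'NestedDemo.psd1') -Force
$cacheState = Get-CacheState
$queueState = Get-QueueState
```

Neither file can see the other's `$script:store`, so a name clash between the two groups of commands does not turn into shared state.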

Second: how do you organize a module during development?

It depends. Is it a binary (i.e. C#) module, or one with some binary code? Is it a small module, with just a few commands? Do you have tests?

My honest recommendation for script module development structure is basically: use ModuleBuilder and follow the patterns we're establishing there. It basically matches the recommendations I've already given above.

  • If you see something you don't like, open an issue there, and let's have a conversation.
  • If you need examples, look for modules that are using ModuleBuilder (like any of mine that I've republished in the last couple of months).

Frankly, if you read through this whole thread (and if you look through the code/commits/issues on ModuleBuilder) I think you'll see exactly why I've been unwilling to put any of this into the best practices guide so far.

The "best practices" for module development are still changing all the time, and are largely made up of workarounds for problems with the way PowerShell handles code-signature checks and module and command discovery, and how Pester and other tools work.

Jaykul avatar Jan 10 '19 04:01 Jaykul