cpython icon indicating copy to clipboard operation
cpython copied to clipboard

bpo-22253: Add support to unnamed sections in ConfigParser

Open pslacerda opened this issue 8 years ago • 37 comments

Add support to files that have key-value pairs before any section or no section at all. Unnamed sections are enabled when passing allow_unnamed_section=True to configparser.ConfigParser and are stored as an empty string.

pslacerda avatar Jul 16 '17 19:07 pslacerda

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

the-knights-who-say-ni avatar Jul 16 '17 19:07 the-knights-who-say-ni

This PR closes https://bugs.python.org/issue22253 from 2014

While waiting for this change to land in Python, a huge amount of people seem to use these workarounds:

  • https://stackoverflow.com/questions/2819696/parsing-properties-file-in-python
  • https://stackoverflow.com/questions/22501121/configparser-missingsectionheadererror-when-parsing-rsyncd-config-file-with-glob

ottok avatar Apr 25 '21 20:04 ottok

This PR is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Feb 20 '22 00:02 github-actions[bot]

This PR should be merged and not closed for being "stale".

ottok avatar Feb 20 '22 00:02 ottok

This PR is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Mar 25 '22 00:03 github-actions[bot]

Why is this not merged yet? It would be great to have!

ChristianSi avatar Jul 01 '22 10:07 ChristianSi

@ChristianSi Probably because https://github.com/python/cpython/pull/2735#pullrequestreview-50481801 is not addressed.

arhadthedev avatar Jul 01 '22 10:07 arhadthedev

This PR is stuck for five years, so cc @ambv as a configparser expert.

arhadthedev avatar Jul 01 '22 10:07 arhadthedev

Any news on this one?

ChristianSi avatar Aug 02 '22 09:08 ChristianSi

This PR is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Sep 02 '22 00:09 github-actions[bot]

@pslacerda It seems this PR got stalled because you didn't address the proposal to use a sentinel object instead of the empty string (as per your own suggestion).

Could you implement that change to get this moving again? Also, the branch would have be to rebased or main merged into it to get it up-to-date.

Would be great if we could finally get this completed!

ChristianSi avatar Sep 20 '22 07:09 ChristianSi

Seems someone already picked and implemented this feature on the meantime. It is named 'DEFAULT'.

pslacerda avatar Sep 20 '22 10:09 pslacerda

It seems that it write '' empty named sections with 'DEFAULT' which isn't the desired behaviour. No way to fix that without break retrocompatibility.

Now, after the changes, maybe this would be a WONTFIX.

pslacerda avatar Sep 20 '22 10:09 pslacerda

Why did you close this?

The DEFAULT section is a named section can be used to provide default values for all other sections. But it must be named, just like any other section. It's not possible to read or write config files without a section header using that "default" feature.

But that's exactly what this PR was meant to be about. So it's not obsolete, and it's still needed.

As for the parser treating '' as a reference to the DEFAULT section: I don't know why it does so, nor whether it's a bug or just a "surprise feature".

So '' can't be used to refer to the unnamed section, true. But that idea was already rejected anyway.

But if you switch to a sentinel object such as configparser.UNNAMED_SECTION, this feature can still be implemented as intended.

ChristianSi avatar Sep 20 '22 12:09 ChristianSi

But if DEFUALT is a string so UNNAMED must be also... and so break backwards compatibility.

Em ter., 20 de set. de 2022 09:02, Christian Siefkes < @.***> escreveu:

Why did you close this?

The DEFAULT section is a named section can be used to provide default values for all other sections. But it must be named, just like any other section. It's not possible to read or write config files without a section header using that "default" feature.

But that's exactly what this PR was meant to be about. So it's not obsolete, and it's still needed.

As for the parser treating '' as a reference to the DEFAULT section: I don't know why it does so, nor whether it's a bug or just as "surprise feature".

So '' can't be used to refer to the unnamed section, true. But that idea was already rejected anyway.

But if you switch to a sentinel object such as configparser.UNNAMED_SECTION, this feature can still be implemented as intended.

— Reply to this email directly, view it on GitHub https://github.com/python/cpython/pull/2735#issuecomment-1252253929, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABEO3UOO3P6YF7XWR3BUG3V7GRTVANCNFSM4DTFSEYQ . You are receiving this because you modified the open/close state.Message ID: @.***>

pslacerda avatar Sep 20 '22 12:09 pslacerda

Why must it be a string? In contrast to the name of the DEFAULT section, which is written to the file just like any other section header (and so must be a string), it's never written to the file – it rather signals the absence of any section header. So it can be anything. Simply using UNNAMED_SECTION = object() should work, and you had earlier suggested it yourself.

ChristianSi avatar Sep 20 '22 13:09 ChristianSi

Both test cases were successful: test_no_first_section (Lib.test.test_configparser.SectionlessTestCase) ... ok test_no_section (Lib.test.test_configparser.SectionlessTestCase) ... ok SO examples work, but if you print the key you get DEFAULT and if you want to see the get() you need to use a blank key ''.

What about this case? It should print something like '<UNNAMED_SECTION>' or somewhat?

pslacerda avatar Sep 20 '22 13:09 pslacerda

Why must it be a string? In contrast to the name of the DEFAULT section, which is written to the file just like any other section header (and so must be a string), it's never written to the file – it rather signals the absence of any section header. So it can be anything. Simply using UNNAMED_SECTION = object() should work, and you had earlier suggested it yourself.

I see your point and agree with it. I'll try to implement it next week.

pslacerda avatar Sep 20 '22 13:09 pslacerda

@ChristianSi, I got time and implemented the feature. Can you check the source code?

EDIT: I'm still missing documentation...

pslacerda avatar Sep 21 '22 14:09 pslacerda

Also, it seems that Misc/NEWS.d/next/ still needs to be updated. I guess that's what you mean with "I'm still missing documentation"?

ChristianSi avatar Sep 22 '22 08:09 ChristianSi

I think it's done now. Otherwise you're welcome to commit on this PR. I also added your name in Misc/NEWS.d.

pslacerda avatar Sep 23 '22 12:09 pslacerda

I was on a congress, monday I'll continue. Sorry for the delay.

pslacerda avatar Sep 30 '22 10:09 pslacerda

I will accept this. My only ask is to change the term from "unnamed section" to "top-level section". My reasoning for this is that an "unnamed section" brings to mind something like

[]
a=b

Here, we actually want a section for top-level options.

What do you think?

ambv avatar Oct 04 '22 18:10 ambv

Or single-level options like https://manual.gromacs.org/documentation/current/reference-manual/file-formats.html#mdp.

Maybe name it "sectionless".

pslacerda avatar Oct 04 '22 19:10 pslacerda

That makes me think: a top-level section suggests that values in it will be easily accessible via interpolation from other sections. This is not the case in the current form of the PR. I guess it would make sense to support that? Like

project_path = /home/user/Documents/project

[assets]
images_path = %(project_path)s/images
scripts_path = %(project_path)s/scripts

or

scripts_path = ${project_path}/scripts

using ExtendedInterpolation.

In effect, it seems to me like the top-level section should just be a nameless DEFAULTSECT. That way interpolation works. So it would rather be implemented by setting default_section=configparser.TOP_LEVEL? (it cannot be an empty string because that is already possible and produces files with a [] section... that cannot be read back, but that's another story)

ambv avatar Oct 04 '22 21:10 ambv

@ambv, just implemented as you said. Hope now you'll agree.

pslacerda avatar Oct 05 '22 13:10 pslacerda

No no no, I would urgently suggest to revert these latest commits! I think mixing the DEFAULT section and the unnamed section is a very bad idea. It would make key/values pairs from the unnamed section leak into all other sections. That would be very unexpected and I think in at least 99 out of 100 cases it's not what users will want or need.

DEFAULT sections are a Python extension, as far as I can tell, there aren't otherwise used in INI files.

Some INI files don't have any section header, or they have keys before the first section header. Therefore it's good to support such a nameless section, which is what this PR was meant to be about. But such INI files don't support or provide any DEFAULT behavior. Making the unnamed section the DEFAULT section would in fact make it impossible to parse them correctly.

Therefore, let's keep these things separate, as they should be. Sectionless key/value pairs are good for some use cases, a DEFAULT section is good for certain cases. But one should be explicit if one wants that, so merging the two is not the right thing to do.

ChristianSi avatar Oct 05 '22 16:10 ChristianSi

As for the section name, I honestly don't care much. It could be UNNAMED_SECTION, NAMELESS_SECTION, TOP_LEVEL_SECTION, or just TOP_LEVEL. All are fine with me.

But note that, if we change the name, we need of course adjust the comment explaining it accordingly. And I think we should adjust the name of the private object as well, so as not to confuse future coders.

I don't think we could name it SECTIONLESS_SECTION. While that would be logical in a way, it would also be confusing. Just SECTIONLESS might be possible, but somehow "a section named SECTIONLESS" doesn't strike me as right.

ChristianSi avatar Oct 05 '22 16:10 ChristianSi

Some INI files don't have any section header, or they have keys before the first section header. Therefore it's good to support such a nameless section, which is what this PR was meant to be about.

Yes, the current implementation in this PR does this.

DEFAULT sections are a Python extension, as far as I can tell, there aren't otherwise used in INI files.

There isn't really an INI standard so every implementation is different. Top-level options are themselves an "extension" only supported by a few parsers. What's most important is internal consistency within a single parser.

But such INI files don't support or provide any DEFAULT behavior.

But Python's configparser does and it would be entirely unexpected for configparser users to have a top-level section from which you cannot interpolate values. Like, literally, such interpolation would be impossible because it wouldn't work in the basic interpolation and the nameless section couldn't be addressed in extended interpolation.

Therefore, treating a top-level section as the default is the only solution here.

In fact, other APIs in configparser already treat "an empty string" to denote DEFAULTSECT, for instance:

https://docs.python.org/3.10/library/configparser.html#configparser.ConfigParser.has_option

Making the unnamed section the DEFAULT section would in fact make it impossible to parse them correctly.

In what way?

ambv avatar Oct 05 '22 17:10 ambv

I never saw in the wild INI with a nameless section followed by regular sections, so both ideas are acceptable for me. Would be nice to have interpolation but nasty to have top-level section leaking configs to other sections.

pslacerda avatar Oct 05 '22 17:10 pslacerda