freeradius-server
freeradius-server copied to clipboard
rlm_python3: attribute values that fail utf8 conversion will be returned as bytes, not str
which will return a byte string if it a string can't be interpreted as UTF8.
Also improve exception handling so that no python exceptions remain after the do_python_single call. Otherwise the next request will immediately fail.
Maybe just do the conversion by default? Don't see the need for a magic knob.
I'm fine with that. However, I added the knob because it's a potentially breaking change. Without the knob, suddenly, someone using the python3 module who is expecting every string in authorize() (for example) to be a str is suddenly getting bytes objects too where they didn't before.
All RADIUS attributes should be UTF8, so it's unlikely they were getting binary data in string attributes anyway.
The problematic attribute is "MS-CHAP-Error". It frequently has binary data embedded in it when an MS-CHAP error is encountered.
https://datatracker.ietf.org/doc/html/rfc2548#section-2.1.5 says
This field contains specially formatted ASCII text, which is
interpreted by the authenticating peer.
So anything putting non-ASCII data in there is likely wrong.
For FreeRADIUS, the rlm_mschap module prints it as ASCII text. See function mschap_error()
I'll capture an example and post it in here; it's possible this has exposed another issue elsewhere.
Here's the attribute:
Error: Conversion to Unicode failed, returning MS-CHAP-Error as bytes: ?E=691 R=1 C=34991e4639c657a2c4870ec5fce94731 V=3 M=Authentication rejected
No obvious non-ascii chars. This is formatted with the vp_prints_value() function (with arguments vp_prints_value(buf, sizeof(buf), vp, '\0'))
vp_prints_value() uses fr_prints() which says it will Escape any non printable or non-UTF8 characters in the input string so I'm confused as to why Python is getting (or thinks it's getting) any non-UTF8 to begin with.
Is there an easy way to dump the attribute in hex with just the vp pointer? (if I had access to the request object it looks like I could call rad_print_hex on request->packet but it's not easily available in this context.
@aren The key is:
returning MS-CHAP-Error as bytes: ?E=691 R=1 C=34991e4639c657a2c4870ec5fce94731 V=3 M=Authentication rejected
The first ? is really a non-ASCII character. It's the one octet "ident" which is used to match requests and responses.
So the MS-CHAP-Error attributes is partly binary:
https://www.rfc-editor.org/rfc/rfc2548#section-2.1.5
...
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Vendor-Type | Vendor-Length | Ident | String...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
My take is to skip the flag, and automatically convert the attribute to bytes if it isn't UTF-8.
The issue is that while the RFCs say things like User-Name is UTF-8, there is no way to enforce that. So clients can send any random garbage in any attribute of type string. And the server just has to live with it.
Since the previous behavior was to fail conversion, I think it's safe to just convert the bad data to bytes.
The only issue is that when it's converted back from Python to VALUE_PAIR, it has to check the destination data type. An attribute of type string can have arbitrary binary data added to it, and the length must not depend on doing strlen(data). Instead, the string length is the length of the Python data, plus one for a trailing zero.
Hi, I have something similar in the logs. Sometimes, I see this sometimes
Mon Sep 18 10:01:11 2023 : Error: python_error_log:209, Exception type: <class 'UnicodeDecodeError'>, Exception value: 'utf-8' codec can't decode byte 0xdd in position 1: invalid continuation byte
Mon Sep 18 10:01:11 2023 : Error: do_python_single:496, authorize - mod_populate_vps failed
Mon Sep 18 10:01:11 2023 : Error: do_python_single:675, authorize - RLM_MODULE_FAIL
Mon Sep 18 10:01:11 2023 : Error: mod_populate_vptuple:402, vp->da->name: User-Password
I think if I understand it correctly I can't do anything about it, i.e. users need to send the password or other field in correct encoding to not have such errors in the logs. I saw this issue, and just want to confirm my understanding of it. Am I right or I am missing something?
@vo-va If it's complaining about the User-Password containing invalid data, then the shared secret is wrong. When you run the server in debug mode it will give you a large error message saying this.
The only issue is that when it's converted back from Python to
VALUE_PAIR, it has to check the destination data type. An attribute of typestringcan have arbitrary binary data added to it, and the length must not depend on doingstrlen(data). Instead, the string length is the length of the Python data, plus one for a trailing zero.
@alandekok circling back to this PR so I can get it closed out. Does another module (perl?) do this, so I can use it as an example?
@aren You can just use the function fr_pair_value_bstrncpy(). It takes the parent VP, a pointer to data, and a length of data. It will produce the correctly formatted VP.
@alandekok fr_pair_value_bstrncpy() didn't work on its own, but when I mimicked what rlm_perl does (by using fr_pair_value_from_str() in the non-string case) it seems to work. Is this what you had in mind? Feedback welcome.
Why doesn't fr_pair_value_bstrncpy() work? if you pass it a string + length, it will copy all of the data into the pair.
fr_pair_value_bstrncpy() doesn't work with if vps type is something other than PW_TYPE_STRING.
For example, this assignment: config:Auth-Type=Accept which is of type integer in dictionary.freeradius.internal (and probably translates to PW_TYPE_INTEGER)
Should it handle this case? I can dive deeper if so. But it's a special case in rlm_perl so I assumed fr_pair_value_bstrncpy() could only handle PW_TYPE_STRING types.
The fr_pair_value_bstrncpy() function is only for assigning string values to a string type. It doesn't parse the a string into a data type like integer.
Ok great, that's what I figured. Then I think this PR does the right thing.