freeradius-server icon indicating copy to clipboard operation
freeradius-server copied to clipboard

rlm_python3: attribute values that fail utf8 conversion will be returned as bytes, not str

Open aren opened this issue 1 year ago • 18 comments

which will return a byte string if it a string can't be interpreted as UTF8.

Also improve exception handling so that no python exceptions remain after the do_python_single call. Otherwise the next request will immediately fail.

aren avatar Dec 06 '23 23:12 aren

Maybe just do the conversion by default? Don't see the need for a magic knob.

arr2036 avatar Dec 25 '23 11:12 arr2036

I'm fine with that. However, I added the knob because it's a potentially breaking change. Without the knob, suddenly, someone using the python3 module who is expecting every string in authorize() (for example) to be a str is suddenly getting bytes objects too where they didn't before.

aren avatar Dec 26 '23 19:12 aren

All RADIUS attributes should be UTF8, so it's unlikely they were getting binary data in string attributes anyway.

arr2036 avatar Dec 27 '23 12:12 arr2036

The problematic attribute is "MS-CHAP-Error". It frequently has binary data embedded in it when an MS-CHAP error is encountered.

aren avatar Dec 27 '23 17:12 aren

https://datatracker.ietf.org/doc/html/rfc2548#section-2.1.5 says

      This field contains specially formatted ASCII text, which is
      interpreted by the authenticating peer.

So anything putting non-ASCII data in there is likely wrong.

For FreeRADIUS, the rlm_mschap module prints it as ASCII text. See function mschap_error()

alandekok avatar Dec 27 '23 17:12 alandekok

I'll capture an example and post it in here; it's possible this has exposed another issue elsewhere.

aren avatar Dec 27 '23 17:12 aren

Here's the attribute:

Error: Conversion to Unicode failed, returning MS-CHAP-Error as bytes: ?E=691 R=1 C=34991e4639c657a2c4870ec5fce94731 V=3 M=Authentication rejected

No obvious non-ascii chars. This is formatted with the vp_prints_value() function (with arguments vp_prints_value(buf, sizeof(buf), vp, '\0'))

vp_prints_value() uses fr_prints() which says it will Escape any non printable or non-UTF8 characters in the input string so I'm confused as to why Python is getting (or thinks it's getting) any non-UTF8 to begin with.

Is there an easy way to dump the attribute in hex with just the vp pointer? (if I had access to the request object it looks like I could call rad_print_hex on request->packet but it's not easily available in this context.

aren avatar Dec 29 '23 07:12 aren

@aren The key is:

returning MS-CHAP-Error as bytes: ?E=691 R=1 C=34991e4639c657a2c4870ec5fce94731 V=3 M=Authentication rejected

The first ? is really a non-ASCII character. It's the one octet "ident" which is used to match requests and responses.

So the MS-CHAP-Error attributes is partly binary:

https://www.rfc-editor.org/rfc/rfc2548#section-2.1.5

  ...

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Vendor-Type  | Vendor-Length |     Ident     |    String...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

My take is to skip the flag, and automatically convert the attribute to bytes if it isn't UTF-8.

The issue is that while the RFCs say things like User-Name is UTF-8, there is no way to enforce that. So clients can send any random garbage in any attribute of type string. And the server just has to live with it.

Since the previous behavior was to fail conversion, I think it's safe to just convert the bad data to bytes.

The only issue is that when it's converted back from Python to VALUE_PAIR, it has to check the destination data type. An attribute of type string can have arbitrary binary data added to it, and the length must not depend on doing strlen(data). Instead, the string length is the length of the Python data, plus one for a trailing zero.

alandekok avatar Dec 29 '23 18:12 alandekok

Hi, I have something similar in the logs. Sometimes, I see this sometimes

Mon Sep 18 10:01:11 2023 : Error: python_error_log:209, Exception type: <class 'UnicodeDecodeError'>, Exception value: 'utf-8' codec can't decode byte 0xdd in position 1: invalid continuation byte
Mon Sep 18 10:01:11 2023 : Error: do_python_single:496, authorize - mod_populate_vps failed
Mon Sep 18 10:01:11 2023 : Error: do_python_single:675, authorize - RLM_MODULE_FAIL
Mon Sep 18 10:01:11 2023 : Error: mod_populate_vptuple:402, vp->da->name: User-Password

I think if I understand it correctly I can't do anything about it, i.e. users need to send the password or other field in correct encoding to not have such errors in the logs. I saw this issue, and just want to confirm my understanding of it. Am I right or I am missing something?

vo-va avatar Jan 10 '24 17:01 vo-va

@vo-va If it's complaining about the User-Password containing invalid data, then the shared secret is wrong. When you run the server in debug mode it will give you a large error message saying this.

alandekok avatar Jan 10 '24 17:01 alandekok

The only issue is that when it's converted back from Python to VALUE_PAIR, it has to check the destination data type. An attribute of type string can have arbitrary binary data added to it, and the length must not depend on doing strlen(data). Instead, the string length is the length of the Python data, plus one for a trailing zero.

@alandekok circling back to this PR so I can get it closed out. Does another module (perl?) do this, so I can use it as an example?

aren avatar Feb 21 '24 01:02 aren

@aren You can just use the function fr_pair_value_bstrncpy(). It takes the parent VP, a pointer to data, and a length of data. It will produce the correctly formatted VP.

alandekok avatar Feb 21 '24 15:02 alandekok

@alandekok fr_pair_value_bstrncpy() didn't work on its own, but when I mimicked what rlm_perl does (by using fr_pair_value_from_str() in the non-string case) it seems to work. Is this what you had in mind? Feedback welcome.

aren avatar Mar 08 '24 06:03 aren

Why doesn't fr_pair_value_bstrncpy() work? if you pass it a string + length, it will copy all of the data into the pair.

alandekok avatar Mar 08 '24 19:03 alandekok

fr_pair_value_bstrncpy() doesn't work with if vps type is something other than PW_TYPE_STRING.

For example, this assignment: config:Auth-Type=Accept which is of type integer in dictionary.freeradius.internal (and probably translates to PW_TYPE_INTEGER)

Should it handle this case? I can dive deeper if so. But it's a special case in rlm_perl so I assumed fr_pair_value_bstrncpy() could only handle PW_TYPE_STRING types.

aren avatar Mar 08 '24 22:03 aren

The fr_pair_value_bstrncpy() function is only for assigning string values to a string type. It doesn't parse the a string into a data type like integer.

alandekok avatar Mar 09 '24 14:03 alandekok

Ok great, that's what I figured. Then I think this PR does the right thing.

aren avatar Mar 10 '24 22:03 aren