FlashMQ icon indicating copy to clipboard operation
FlashMQ copied to clipboard

Another protocol violations or bugs in FlashMQ

Open songxpu opened this issue 1 year ago • 11 comments

Hi @halfgaar, some findings still have not been reported in #103, so I have submitted this issue and apologize for any inconvenience caused.

Running command:

./FlashMQBuildRelease/flashmq -c /home/ubuntu/conf/flashmq.conf

flashmq.conf:

listen {
    protocol mqtt
    inet_protocol ip4
    inet4_bind_address 0.0.0.0
    port 1883
}

bridge {
    address 192.168.222.128
    port 1884
    tls off
    protocol_version mqtt5
    keepalive 600
    bridge_protocol_bit false
    publish topic 1
    subscribe topic 1
}

allow_anonymous true
log_file    /tmp/flashmq/flashmq.log
storage_dir /tmp/flashmq

songxpu avatar May 13 '24 11:05 songxpu

According to the specification of MQTTv3.1.1 and MQTTv5.0:

[MQTT-3.1.0-1]
After a Network Connection is established by a Client to a Server, the first packet sent from the Client to the Server MUST be a CONNECT packet.

However, FlashMQ unexpectedly responded to any PINGREQ initiated by clients who had not yet established an MQTT connection with the broker, that is, the first request could be a PINGREQ message instead of a CONNECT, which violates protocol specification.

echo c000  | xxd -p -r | nc 172.17.0.5 1883

image


Similarly, as defined by the specification, PINGREQ is sent from the client to the server.

However, this seems to be violated in bridged mode, where FlashMQ and the remotely bridged broker have the identities of client and server, respectively.

It is permissible behavior for FlashMQ to initiate a PINGREQ to the remote broker (IP 192.168.222.128) and receive a PINGRES.

However, I found that if the remote broker accidentally sends a PINGREQ message to FlashMQ and FlashMQ returns a PINGRESP, this seems to be a bug, and I've noticed that other open-source MQTT brokers (EMQX、HiveMQ、Vernemq、NanoMQ) don't seem to have this problem.

I think this is essentially due to the mistake of treating the remote broker and the local client as the same identity, but essentially they are different.

image

songxpu avatar May 13 '24 11:05 songxpu

According to the specification of MQTTv5.0:

[MQTT-3.3.4-6] 
A PUBLISH packet sent from a Client to a Server MUST NOT contain a Subscription Identifier.

However, if we send such a packet:

echo 106100044d5154540540b3b037119afb60e317001901215c5326000f766e72366d4541644d78553c44327800173049574d36324268715a6179524b5a62536749534a31360013317675535434755733374e64397846585a38570008676235716d5836363554000841684b3146454962f71a16010109000c79a506aff5eef39ed5210cd60bba849b523155596942337761334c376e765936573739413862666a46414e4e3172647544345773415778724a6667386d3258653363 | xxd -p -r | nc 172.17.0.5 1883

FlashMQ unexpectedly returned a response instead of rejecting such a request. image

songxpu avatar May 13 '24 11:05 songxpu

According to the specification of MQTTv5.0:

[MQTT-3.4.2-1] 
The Client or Server sending the PUBACK packet MUST use one of the PUBACK Reason Codes.

# note that Table 3‑4  define the PUBACK Reason Codes.

However, if we send such a PUBACK (with invalid reason codes like 0x01) to FlashMQ, FlashMQ seems to have received such messages normally.

# the packet contain CONNECT -> PUBACK -> PUBLISH (QoS1)
echo 10800100044d51545405c20001411701212e3b22464426001a5a554c7742485a7a676a6e6579664f77447230496a374a46304c001a38436f4c643657616b6a6a4e74306c436b6f78536b593143356c001b6d6c33456b677234313272796d6d707a5a7664534e6c735965594200086732514e52776a38000b4743684f424f6c7a7967624004b65d0100320f0005746f706963000100746f706963  | xxd -p -r | nc 172.17.0.5 1883

The response is CONACK->PUBACK, which proves that the FlashMQ received PUBACK was not rejected as expected.

image

songxpu avatar May 13 '24 11:05 songxpu

According to the specification of MQTTv5.0:

[MQTT-3.6.1-1]

Bits 3,2,1 and 0 of the Fixed Header in the PUBREL packet are reserved and MUST be set to 0,0,1 and 0 respectively. The Server MUST treat any other value as malformed and close the Network Connection.

However, the situation is different:

# The PUBREL (the reserve field in fixed header is 0, 0, 0, 0)
echo 107900044d5154540574625e0a1700190121de6022e03400126c5141593872614674494b73536a7956444d250101027f4801a708001b4a43394d424c416c5033654d704d457a626a5177527259663777330008465a416b4f4f4367001638344d77536b3430774f526451496463617259773337000636725930564c603ac6b000361f0017736d73496a39436331524a6d7541307644425a373968792600136e656f3666454367794c4a457948686e64696f000479326175 | xxd -p -r | nc 172.17.0.5 1883 

The expected behavior was considered illegal and the network connection was closed, but the corresponding request was unexpectedly received.

image

songxpu avatar May 13 '24 11:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include Request Problem Information more than once, or to have a value other than 0 or 1.

If we send a Connect (the value of Resequest Problem Information is set to 2), FlashMQ returns the CONNACK with successful code unexpectedly.

echo 105600044d5154540580404724118580d11d16000478697441170219012600075869563769414100083237746a6f4c3972001169587932653251324d6a3845697435517100126e63424f613833736f464e4e4d5944366442 | xxd -p -r | nc 172.17.0.5 1883

image image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include Authentication Data if there is no Authentication Method.

Replay:

echo 107500044d515454054069d62216001a3836444175657452417668423854684a735a7474744935793341274e45fab6001974496e5645793058343872467a735361747763776c6849787a001d575a796d784c4e55317a335a7175744936434e673748576f34307a4f42000c4b4c32484b46385933343065 | xxd -p -r | nc 172.17.0.5 1883

image

image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Receive Maximum value more than once or for it to have the value 0.

Replay such packet:

echo 105100044d5154540540ea5c2e11a5b8420316001956343176417a6c647a593745557238787133336b526e324850190021e74721e7472785e0551c000000147648777435796e39333565667a586a5773763766 | xxd -p -r | nc 172.17.0.5 1883

image

image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Message Expiry Interval more than once.

Replay such packet:

echo 108a0100044d515454059469d30519012181a700154e4838466b6a54617152696e326c797052316b50664103001c4935556b465530434d3778366d736371444e5a777a6b35654637386508000950624a747043434d4a0900132cc6f6912ac734b5e6a8b2acdc301886dd273600095a6f6a6632473268710007526a744d756576000b64364d61664257353733593c8d0100036e6b45dd3b510252ab83340252ab833403001854556873333351354b6f42643261696c4258364c6a547138080005724f4a4e6a09001e06ec9601ab2f750b9df8c182d84259566a61556c59d1a95fd38688652caa23409657777a6d484d344e5342766631556c5248653130516865757a665768594b787957636c5a7143394b713848737070504942687469 | xxd -p -r | nc 172.17.0.5 1883

The expected behavior is to consider this message as a protocol error, but unexpectedly received the corresponding PUBREC. image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Response Topic more than once.

Replay:

echo 10ae0100044d51545405c6f8d7021701000f46356d30636e4b34633958413030332c020748306803000026000449454148001b366f694536686241734132583462704368426b6a77355832314d450018416b484869364752656558436c53725a36534a6f5a393962001733555a41655261594764374341486f4f4a78474d756c300014425243504679514b6e42534a4373453552434c570018496451446b6a736d44366a66796d513556557a7a6c4a68703d4a00025959c10f1a010002d31586140800013108000132260003417968000350754f4e616b6937656a796a696e77434a66775573544e377376644f4c38544f32434e6431365972304c4f6ece00 | xxd -p -r | nc 172.17.0.5 1883

The expected behavior is to consider this message as a protocol error, but unexpectedly receive the corresponding PUBREC.

image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Content Type more than once.

Replay such packet:

echo 10950100044d515454054e28d00a1120b290b627036e6116000261453d01000800074679685377495218ef612d5926000c72376a72376d747062733257001b735a595569453539795a487333616b4833494e77496b4762494157000658415873325a00166c47547172693658526b536f4e724666496d486f5573001c306d54567544763978426f526b59374c3672346a433142514c503838318d010014556b7768636d324d5371307667584f696943623351010103001a4e624553556651574a394b354f58367333397a6b6a636157373203001a4e624553556651574a394b354f58367333397a6b6a636157373308000d6856344a4e7873487062727a390bbeaee12f52675542547a5373475876796d644d68373161785a336d715430305a414b5346524c746e55ce00  | xxd -p -r | nc 172.17.0.5 1883

I debug with gdb and found it seems that FlashMQ is iteratively reading multiple ContentType property values and saving them together.

image

image

songxpu avatar May 13 '24 12:05 songxpu

According to the specification of MQTTv5.0:

It is a Protocol Error to include the Topic Alias value more than once.

The situation is the same as the above. Replay such packet:

echo 101000044d5154540502003c03210014000030150006746f706963310623000a23000b313233313233 | xxd -p -r | nc 172.17.0.5 1883

songxpu avatar May 13 '24 12:05 songxpu

Hi @halfgaar, some findings still have not been reported in #103, so I have submitted this issue and apologize for any inconvenience caused.

No apologies necessary. I just didn't know you weren't done. I'm OK with you opening a separate issue per finding to avoid that problem.

Anyway, fixes are underway. I'll push some code soon.

Some open up some considerations. Like the invalid puback code. FlashMQ doesn't actually do anything with it; the only thing it can do is clear the message from the queue, which it must do always. Checking it would just add overhead to a potentially hot path of the code. :thinking:

halfgaar avatar May 13 '24 23:05 halfgaar

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

bigsmoke avatar May 14 '24 08:05 bigsmoke

Thanks, I'm honored to help make FlashMQ better!

songxpu avatar May 14 '24 08:05 songxpu

I addressed most of the issues in this branch: https://github.com/halfgaar/FlashMQ/tree/more-protocol-fixes

It's not final yet.

Currently, two are in an undecided state:

  • Sending PINGRESP even when it's a client/bridge. It doesn't really do harm, and the spec doesn't say it's a protocol error? I've seen weirder BTW. I've seen websocket clients that decide on their own to send pongs ...
  • The puback response code is currently still ignored. I just kind of dislike having to parse it merely for checking the protocol violation.

halfgaar avatar May 14 '24 12:05 halfgaar

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

How come a smiling face is "abuse"?

JaylinYu avatar May 14 '24 12:05 JaylinYu

According to the specification of MQTTv5.0:

[…]

I don't understand why and who marked this excellent, well-written comment by @songxpu as “abuse”. I know for a fact that @halfgaar loves to receive such detailed bug reports. Excellent stuff, @songxpu! 👏

How come a smiling face is "abuse"?

That comment was marked as abuse. I unmarked it again. I'm not sure what happened there.

halfgaar avatar May 14 '24 13:05 halfgaar

I had forgotten one mqtt5 property to check for 'more than once'. I fixed that. It's been pushed to master. A new release with this included will be made soon.

I decided to leave the ping/pong and suback codes unchanged. If you can show how it will actually be a problem, you can always open a new issue, or perhaps a discussion.

Thanks for the reports, and the good reproduction code :+1:. These 'not more than once' things were somewhere on my TODO list, so your report was good incentive.

halfgaar avatar May 15 '24 02:05 halfgaar

Hi, while organizing previous data, I discovered some bugs that I forgot to report earlier. Upon verification, they still exist in the latest version. Therefore, I have attached their information below.

# MQTT v5
3.3.2.3.2 Payload Format Indicator

· 0 (0x00) Byte Indicates that the Payload is unspecified bytes, which is equivalent to not sending a Payload Format Indicator.
· 1 (0x01) Byte Indicates that the Payload is UTF-8 Encoded Character Data. The UTF-8 data in the Payload MUST be well-formed UTF-8 as defined by the Unicode specification [Unicode] and restated in RFC 3629 [RFC3629].

According to this description, when the Payload Format Indicator in the PUBLISH message is set to 1, the payload in the PUBLISH message must be UTF-8 encoded. However, it appears that FlashMQ does not validate this.

PoC:

echo 106e00044d515454056e866e0711c4f23d2e17010012764d78635176754e4b6163615164477247521d0300027a4e260009577852505166527043000a6e467067664149535563000f487a4134464c4d3352526c6231486c00104e364958486246414654713051374b79000559304c56433d3900087751586170725838415f060101090001b8423663546c627950717a306b363233664d3249426c36717a6a5a474546456a6f3845363038fe | xxd -r -p | nc 127.0.0.1 1883

image

songxpu avatar Jul 26 '24 04:07 songxpu

Similarity,

# MQTT v5
3.1.3.2.3 Payload Format Indicator

Followed by the value of the Payload Format Indicator, either of:
· 0 (0x00) Byte Indicates that the Will Message is unspecified bytes, which is equivalent to not sending a Payload Format Indicator.
· 1 (0x01) Byte Indicates that the Will Message is UTF-8 Encoded Character Data. The UTF-8 data in the Payload MUST be well-formed UTF-8 as defined by the Unicode specification [Unicode] and restated in RFC 3629 [RFC3629]

According to this description, when the Payload Format Indicator in the CONNECT message is set to 1, the will message in the CONNECT message must be UTF-8 encoded. However, it appears that FlashMQ does not validate this.

POC:

106300044d51545405a6fd850c11987da8451900270b5434c60004534d763907010118e7f5bf8600104371794d4a3374706e773872744b364300126c397a7a71763353325647714f6a475951fe0016354278463574494f433735413474654d745869475467

image

songxpu avatar Jul 26 '24 07:07 songxpu

MQTTv5
format: $share/{ShareName}/{filter}

The ShareName MUST NOT contain the characters "/", "+" or "#", but MUST be followed by a "/" character. 
This "/" character MUST be followed by a Topic Filter.

If we send a packet the shared topic is "$share/{ShareName}/", i.e., filter is empty, and the broker will not reject it. I guess it is an unexpected behavior.

Packet:

101000044d5154540502003c032100140000
820f00010000092473686172652f312f00

image

songxpu avatar Jul 26 '24 08:07 songxpu

# MQTTv5
It is a Protocol Error to set the No Local bit to 1 on a Shared Subscription [MQTT-3.8.3-4].

But, if we send such an invalid packet to the broker, it responds to a SUBACK (code: successful) message rather than reject.

105700044d5154540540a3110d118fb95f59190121012f22cc75001e66304e79317a416b5369314f46396937397464756d53665a41686f4f6575001d774d51623130793346627158754c3043305455556b55364b5132364b45
8218ff000000122473686172652f653862676633616e2f474706

image

songxpu avatar Jul 26 '24 09:07 songxpu

In MQTT 5.0, the requester can specify an expected Response Topic in the request message. After taking appropriate action based on the request message, the responder publishes a response message to the Response Topic carried in the request. If the requester has subscribed to that Response Topic, it will receive the response.

Ref 1: https://www.emqx.com/en/blog/mqtt5-request-response

In MQTT 5.0, I believe this property field should not be allowed to be empty, because it will play a role in message transmission in some scenarios.

Packet:

10a10100044d51545405a6d79b13170122e432260002766100024d5127c1f00b75000c3667713668573039766670314102dd44adf20300144b526a5343646e53427066393358395531756b470800002600094835774f764d6e4f37001466345a314b636936466648545078614461337063000f476234475670636c737132496830440005307648626400194d46344f6d6c72764951773877755a6a666f3331636c457278

image

songxpu avatar Jul 26 '24 09:07 songxpu

I noticed that when a client communicates with the broker using MQTT version 5.0, after the client sends a PUBLISH packet with QoS==1, the broker returns a PUBACK in MQTT version 3.1.1 format. I believe that such protocol version downgrading is not expected behavior and may impact some clients.

# MQTTv5
+----------------+-----------------+------------+------------------+
| Fixed Header   | Message ID      | Reason Code| Properties       |
+----------------+-----------------+------------+------------------+
| 0x40           | Message ID (2 bytes) | 0x00  | Properties (if any) |
+----------------+-----------------+------------+------------------+
#MQTTv3.1.1
+----------------+-----------------+
| Fixed Header   | Message ID      |
+----------------+-----------------+
| 0x40           | Message ID (2 bytes) |
+----------------+-----------------+

replay.py

import socket
import binascii

broker_address = '127.0.0.1'
broker_port = 1883

messages = [
    '10800100044d51545405c20001411701212e3b22464426001a5a554c7742485a7a676a6e6579664f77447230496a374a46304c001a38436f4c643657616b6a6a4e74306c436b6f78536b593143356c001b6d6c33456b677234313272796d6d707a5a7664534e6c735965594200086732514e52776a38000b4743684f424f6c7a796762',
    '320f0005746f706963000100746f706963'
]

def send_and_receive_message(sock, message):
    binary_message = binascii.unhexlify(message)
    sock.sendall(binary_message)
    response = sock.recv(1024)
    hex_response = binascii.hexlify(response).decode()
    return hex_response

def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((broker_address, broker_port))
    
    try:
        for message in messages:
            hex_response = send_and_receive_message(sock, message)
            print(f"Sent message: {message}")
            print(f"Received response: {hex_response}")
    finally:
        sock.close()

if __name__ == "__main__":
    main()

FlashMQ: image

image

songxpu avatar Jul 26 '24 10:07 songxpu

I noticed that when a client communicates with the broker using MQTT version 5.0, after the client sends a PUBLISH packet with QoS==1, the broker returns a PUBACK in MQTT version 3.1.1 format. I believe that such protocol version downgrading is not expected behavior and may impact some clients.

Actually, from the spec:

Byte 3 in the Variable Header is the PUBACK Reason Code. If the Remaining Length is 2, then there is no Reason Code and the value of 0x00 (Success) is used.

Property length: If the Remaining Length is less than 4 there is no Property Length and the value of 0 is used.

halfgaar avatar Jul 26 '24 18:07 halfgaar

The other issues are addressed in a branch. It needs some reviewing. Will push later.

halfgaar avatar Jul 26 '24 20:07 halfgaar

I noticed that when a client communicates with the broker using MQTT version 5.0, after the client sends a PUBLISH packet with QoS==1, the broker returns a PUBACK in MQTT version 3.1.1 format. I believe that such protocol version downgrading is not expected behavior and may impact some clients.

Actually, from the spec:

Byte 3 in the Variable Header is the PUBACK Reason Code. If the Remaining Length is 2, then there is no Reason Code and the value of 0x00 (Success) is used.

Property length: If the Remaining Length is less than 4 there is no Property Length and the value of 0 is used.

really interesting. Thanks for your replay.

songxpu avatar Jul 27 '24 02:07 songxpu

The other issues are addressed in a branch. It needs some reviewing. Will push later.

Fixes for those others have been pushed to master.

halfgaar avatar Jul 27 '24 17:07 halfgaar