PcapPlusPlus Add Modbus Protocol Support

Hello PcapPlusPlus maintainers,

I’d love to contribute to the library by implementing support for the Modbus protocol, a widely used communication protocol in industrial automation. Would this be a welcome addition to the project? If so, I'd appreciate any guidance or recommendations for implementing it according to PcapPlusPlus' standards.

Looking forward to your feedback!

May 12 '25 21:05 yahyayozo

Sure @yahyayozo, every contribution is highly welcome and appreciated, thank you! 🙏

Can you provide some info about the Modbus protocol? Mostly about the different PDUs and message types. Based on this, I can recommend a similar protocol you can use as a reference.

May 13 '25 09:05 seladb

Modbus is a client/server data communications protocol in the application layer, used mainly in industrial automation. It is mainly used to read and write register values on the server.

The frame starts with an MBAP header (MODBUS Application Protocol header, 7 bytes):

Transaction Identifier (2 bytes): A unique ID for matching client requests with server responses.
Protocol Identifier (2 bytes): Set to 0x0000 for Modbus protocol.
Length (2 bytes): Specifies the length of the remaining bytes (Unit Identifier + PDU).
Unit Identifier (1 byte): Identifies the target device.

The PDU consists of:

Function Code (1 byte): Defines the operation (e.g., read/write)
Data (variable): Contains operation-specific parameters (e.g., register address, data values). For responses, it includes requested data or exception codes.
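
The layout above can be sketched as a small standalone parser. This is only an illustration and not PcapPlusPlus code: MbapHeader, readBE16 and parseMbap are hypothetical names, and the sketch assumes the 16-bit fields are big-endian on the wire, as the Modbus specification describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain struct mirroring the 7-byte MBAP header plus the function code.
struct MbapHeader
{
    uint16_t transactionId;
    uint16_t protocolId;
    uint16_t length;      // counts the Unit Identifier plus the PDU
    uint8_t unitId;
    uint8_t functionCode; // first byte of the PDU
};

// Reads a big-endian 16-bit value from two consecutive bytes.
inline uint16_t readBE16(const uint8_t* p)
{
    assert(p != nullptr);
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// Fills 'out' from the start of a Modbus TCP frame.
// Returns false if the buffer is too short or the protocol identifier is not 0x0000.
inline bool parseMbap(const uint8_t* data, size_t dataLen, MbapHeader& out)
{
    constexpr size_t minLen = 7 /* MBAP header */ + 1 /* function code */;
    if (data == nullptr || dataLen < minLen)
        return false;

    out.transactionId = readBE16(data);
    out.protocolId = readBE16(data + 2);
    out.length = readBE16(data + 4);
    out.unitId = data[6];
    out.functionCode = data[7];
    return out.protocolId == 0x0000;
}
```

For a Read Coils request the Length field would be 6: one byte for the Unit Identifier plus five bytes of PDU.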

You can refer to this official document from Modbus for its specifications. You can find all the function codes and their meanings there

May 13 '25 18:05 yahyayozo

Haven't looked at it in depth, but a way I can see it implemented in the framework can be something like this:

A main ModbusLayer that parses the frame header.
A collection of helper structures that represent known PDU configurations.

Each PDU must contain a method to serialize / deserialize itself to/from a byte span. The method must be standard across all PDUs. Deserialization is probably better to be a static member function that acts as a factory for the PDU.

The main layer can have the following methods:

general frame header accessors/mutators.
getPDUFunctionCode for parsing the first byte of the PDU.
decodePDU<T>() - templated method where T is a class satisfying the requirements for PDU, namely static T T::deserialize(uint8_t const* data, size_t dataLen);. The method passes the encoded PDU section of the packet data to the deserialization function and returns the constructed type.
setPDU<T>(T const& pdu) - templated method where T is a class satisfying the requirements for PDU, namely size_t T::serialize(uint8_t* data, size_t dataLen) const;. The method passes the allocated PDU data buffer of the layer to the PDU class for writing and handles any need for reallocation, extension of the layer, or shrinking if overwriting a larger PDU.

PDU concept requirements:

static T T::deserialize(uint8_t const* data, size_t dataLen) - constructs a valid PDU from the buffer. Throws on error, or returns optional<T> if we want to backport it from C++17, depending on the expected failure rate.
size_t T::serialize(uint8_t* data, size_t dataLen) const - writes the serialized representation of the PDU (function code included) to the buffer. Returns the number of bytes written. If dataLen is insufficient or data is nullptr, nothing is written, and the required number of bytes in the buffer is returned instead. The data written by this method must be able to be parsed with T::deserialize.
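
A minimal sketch of this contract, assuming a hypothetical WriteSingleRegisterPDU (function code 0x06) and a hypothetical encodePDU helper; neither exists in PcapPlusPlus. It shows the two-call pattern: ask for the required size with a null buffer, allocate, then write. For brevity the sketch reports the required size for any null buffer instead of throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical PDU for Write Single Register (function code 0x06).
struct WriteSingleRegisterPDU
{
    uint16_t registerAddress = 0;
    uint16_t registerValue = 0;

    size_t serialize(uint8_t* data, size_t dataLen) const
    {
        constexpr size_t requiredSize = 5; // function code + address + value
        if (data == nullptr || dataLen < requiredSize)
            return requiredSize; // nothing written, only the required size is reported

        data[0] = 0x06;
        data[1] = static_cast<uint8_t>(registerAddress >> 8); // high byte first
        data[2] = static_cast<uint8_t>(registerAddress & 0xFF);
        data[3] = static_cast<uint8_t>(registerValue >> 8);
        data[4] = static_cast<uint8_t>(registerValue & 0xFF);
        return requiredSize;
    }

    static WriteSingleRegisterPDU deserialize(uint8_t const* data, size_t dataLen)
    {
        if (data == nullptr || dataLen < 5 || data[0] != 0x06)
            throw std::invalid_argument("not a Write Single Register PDU");

        WriteSingleRegisterPDU result;
        result.registerAddress = static_cast<uint16_t>((data[1] << 8) | data[2]);
        result.registerValue = static_cast<uint16_t>((data[3] << 8) | data[4]);
        return result;
    }
};

// Two-call pattern: query the size with a null buffer, allocate, then write.
template <typename T>
std::vector<uint8_t> encodePDU(T const& pdu)
{
    std::vector<uint8_t> buffer(pdu.serialize(nullptr, 0));
    const size_t written = pdu.serialize(buffer.data(), buffer.size());
    assert(written == buffer.size());
    (void)written; // silences the unused-variable warning when NDEBUG is defined
    return buffer;
}
```

Because encodePDU only relies on serialize, it works for any type that meets the requirements above.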

IMO, this framework should be robust and flexible enough to allow for both the public function code PDUs, as well as allowing users to define custom PDUs, as I read that the protocol has function codes reserved for user defined PDUs.

This does put the burden of selecting the correct PDU on the user code, but IMO, the user would already be doing that anyway if direct interaction with the PDU is required.

PS: Function names can be subject to change.

May 13 '25 19:05 Dimi1010

@yahyayozo my understanding is that the Modbus layer has:

7 first bytes - always present and have the same structure (as mentioned above)
1 byte - to specify the message type
Variable size - according to the message size

Is that correct? If yes, maybe we can build something similar to the BgpLayer, which has this general structure:

ModbusLayer - an abstract base class that includes getters/setters for the common header fields (the first 8 bytes)
A child class for each specific message that inherits from ModbusLayer

Please let me know what you think

May 14 '25 08:05 seladb

@seladb Imo, a composition approach would be better than an inheritance one here, as there are a lot of PDU functions and the potential for people wanting to define their custom PDUs. What do you think about my proposal?

May 14 '25 08:05 Dimi1010

@Dimi1010 how do you envision people using this layer (once it's been deserialized)?

auto modbusLayer = modbusPacket.getLayerOfType<pcpp::Modbus>();
// get the common header fields
modbusLayer->getTransactionId();
modbusLayer->getLength();

// get the specific PDU and use its fields - how do you envision that? How can the user know which PDU it is?

May 14 '25 08:05 seladb

@seladb Here is an example.

auto modbusLayer = modbusPacket.getLayerOfType<pcpp::Modbus>();
// get the common header fields
modbusLayer->getTransactionId();
modbusLayer->getLength();

auto fnCode = modbusLayer->getPDUFunctionCode(); // Parses only the first byte of the PDU payload.
/* select PDU type based on function code. Via switch probably. Can possibly be encapsulated in a support class later, if needed. */

// May possibly return optional or unique_ptr<PDU> to allow for graceful error condition if the user selects the wrong type?
PDUType pdu = modbusLayer->decodePDU<PDUType>();

/* User uses the PDU in its decoded state as he wishes */
auto someField = pdu.someField;

// If the pdu needs to be updated or constructed the user constructs/modifies a PDU
pdu.value = 55;
modbusLayer->setPDU(pdu); // <- The layer updates the internal buffer with data from pdu variable.

This approach does mean the user must select the correct PDU to decode with, but the same thing must also be done with dynamic_casting to specific layer types, if the PDUs were represented by different layers and the user wanted to access a specific PDU information.

It also allows people to add new user defined PDU's without having to implement a whole new layer class.
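
The "select PDU type based on function code" step could look roughly like the following. It is a sketch only: describeFunctionCode is a hypothetical helper that maps a few public function codes from the Modbus specification to names, where real user code would call decodePDU<T>() in each case instead of returning a string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Maps a few public Modbus function codes to their names.
inline std::string describeFunctionCode(uint8_t fnCode)
{
    // An exception response echoes the request's function code with the top bit set.
    if ((fnCode & 0x80) != 0)
        return "Exception response for " + describeFunctionCode(static_cast<uint8_t>(fnCode & 0x7F));

    assert((fnCode & 0x80) == 0);
    switch (fnCode)
    {
    case 0x01: return "Read Coils";
    case 0x02: return "Read Discrete Inputs";
    case 0x03: return "Read Holding Registers";
    case 0x04: return "Read Input Registers";
    case 0x05: return "Write Single Coil";
    case 0x06: return "Write Single Register";
    case 0x0F: return "Write Multiple Coils";
    case 0x10: return "Write Multiple Registers";
    default:   return "Unknown";
    }
}
```

The default branch is where user-defined function codes would end up unless the caller handles them first.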

May 14 '25 10:05 Dimi1010

@Dimi1010 @seladb thank you for your comments. @Dimi1010, regarding your last explanation: is "PDUType" a struct for a public PDU type? If so, do we need to define a different struct for each public message type? Also, can you please add an example of how the user can add his own defined PDU type?

May 14 '25 20:05 yahyayozo

@yahyayozo yes, PDUType is a placeholder for a public PDU struct.

Here is an example implementation using the ReadCoils public function code. For simplicity the example ignores endianness.


class ModbusLayer : public Layer
{
public:
  /* common header accessors / mutators */

  template<typename T>
  T decodePDU() const
  {
     auto pduPtr = /* set to the beginning of the PDU */;
     auto pduLen = /* set to the total length of the PDU buffer */;
    
    // May alternatively be constructor?
    return T::deserialize(pduPtr, pduLen);
  }

  template<typename T>
  void setPDU(T const& pdu)
  {
     auto pduPtr = /* set to start of pdu buffer allocated in the packet */;
     auto pduLen = /* set to the length of the pdu buffer allocated in the packet */;

     auto writtenBytes = pdu.serialize(pduPtr, pduLen);
     if(writtenBytes < pduLen)
     { 
        // Buffer is larger than required
        // Shrink the PDU buffer to the written size, discarding the extra space.
     }
     else if (writtenBytes > pduLen)
     {
        // Buffer is smaller than required
        // Extend the buffer to the required size and redo serialization.
        extendLayer(/* args */);
        pduPtr = ...;  // Recalculate - reallocation would have invalidated the pointer
        pduLen = ...;  // Recalculate - reallocation would have changed the length
        writtenBytes = pdu.serialize(pduPtr, pduLen); // Retries the write.
        if (writtenBytes != pduLen)
        {
           // Throw exception. The retry should have worked.
        }
     }
  }
};

struct ModbusReadCoilsPDURequest
{
   // These probably would benefit from getters and setters to enforce constraints. 
   uint16_t startingAddress; // allowed range - full
   uint16_t coilQuantity;      // allowed range - [1, 2000]

    size_t serialize(uint8_t* data, size_t dataLen) const
    {
       constexpr size_t requiredSize = 1 /* function code */ + 2 /* address */  + 2 /* quantity */;
       // TODO: Also return requiredSize if data == nullptr and dataLen == 0;
       //  If data == nullptr, but dataLen != 0 throw exception as that is probably a logic error.
       if(dataLen < requiredSize)
           return requiredSize;

       // TODO: This ignores endianness for brevity.
       data[0] = 0x01;
       data[1] = static_cast<uint8_t>(startingAddress & 0xFF); // Write low
       data[2] = static_cast<uint8_t>((startingAddress >> 8) & 0xFF); // Write high
       
       data[3] = static_cast<uint8_t>(coilQuantity& 0xFF); // Write low
       data[4] = static_cast<uint8_t>((coilQuantity>> 8) & 0xFF); // Write high

       return requiredSize;
    }

    static ModbusReadCoilsPDURequest deserialize(uint8_t const* data, size_t dataLen)
    {
       if(data == nullptr) { /* throw exception, nullptr buffer */}
       if(dataLen < 5) { /* throw exception, insufficient buffer size to deserialize */ }

       if(data[0] != 0x01 /* function code */) { /* throw exception, bad function code */ }

       ModbusReadCoilsPDURequest result;
       
       // TODO: This ignores endianness for brevity.
       result.startingAddress = static_cast<uint16_t>(data[1]) | (static_cast<uint16_t>(data[2]) << 8);
       result.coilQuantity = static_cast<uint16_t>(data[3]) | (static_cast<uint16_t>(data[4]) << 8);

       return result;
    }
};

struct ModbusReadCoilsPDUResponse
{
   // Leaving coil status as a byte buffer for now. It is a packed bitmask, so we can add helper functions later?
   // Could also be an std::array<uint8_t, 255> to avoid heap allocations?
   std::vector<uint8_t> coilStatus;  

   size_t serialize(uint8_t* data, size_t dataLen) const
   {
       // TODO: Add exception if coilStatus.size() is over 255 as it will overflow the 1 byte length slot of the PDU spec.
       
       const size_t requiredSize = coilStatus.size() + 1 /* function code (1 byte) */ + 1 /* buffer len (1 byte) */;

       // TODO: Also return requiredSize if data == nullptr and dataLen == 0;
       //  If data == nullptr, but dataLen != 0 throw exception as that is probably a logic error.
       if(dataLen < requiredSize)
           return requiredSize;

       data[0] = 0x01; // Function code. Possibly have an enum?
       data[1] = static_cast<uint8_t>(coilStatus.size());  // Check for overflow at start of serialize.
       std::copy(coilStatus.begin(), coilStatus.end(), data + 2);
       return requiredSize;
   }

   static ModbusReadCoilsPDUResponse deserialize(uint8_t const* data, size_t dataLen)
   {
       if(data == nullptr) { /* throw exception, nullptr buffer */}
       if(dataLen < 2) { /* throw exception, insufficient buffer size to deserialize */ }

       if(data[0] != 0x01 /* function code */) { /* throw exception, bad function code */ }
       
      ModbusReadCoilsPDUResponse result;

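       uint8_t const* bufferPtr = data + 2; // data is const, so the pointer into it must be const too
       size_t bufferLen = data[1];
       if(dataLen < 2 + bufferLen) { /* throw exception, byte count exceeds the buffer */ }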
       if(bufferLen > 0)
       {
           result.coilStatus.resize(bufferLen);
           std::copy(bufferPtr, bufferPtr + bufferLen, result.coilStatus.begin());
       }

       return result;
   }
};
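
The helper functions for the packed coilStatus bitmask mentioned in the comments above could start from something like this. It is a sketch under assumptions: unpackCoils is a hypothetical name, and the bit order follows the Modbus specification, where the least significant bit of the first byte is the first coil addressed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands the packed coil bytes of a Read Coils response into one bool per coil.
// coilQuantity comes from the matching request, because the response only carries
// a byte count and the last byte may contain padding bits.
inline std::vector<bool> unpackCoils(std::vector<uint8_t> const& coilStatus, size_t coilQuantity)
{
    std::vector<bool> coils;
    coils.reserve(coilQuantity);
    // Stops early if the buffer holds fewer bits than requested.
    for (size_t i = 0; i < coilQuantity && i / 8 < coilStatus.size(); ++i)
    {
        const uint8_t byte = coilStatus[i / 8];
        coils.push_back(((byte >> (i % 8)) & 0x01) != 0);
    }
    return coils;
}
```

With the bytes 0xCD 0x01 and a quantity of 10, coil 0 is on, coil 1 is off, and coil 8 is on.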

This is how it can be decoded. Both the Request and Response PDUs use the same function code, so unless I have missed something, I think the user has to decide whether the PDU is a request or a response.

auto modbusLayer = modbusPacket.getLayerOfType<pcpp::Modbus>();
/* ... */

auto fnCode = modbusLayer->getPDUFunctionCode(); // Assume returns 0x01 for the purposes of this example;

ModbusReadCoilsPDUResponse pdu = modbusLayer->decodePDU<ModbusReadCoilsPDUResponse>();

// Do whatever with the data.
auto xxx = pdu.coilStatus;

The way a layer can be constructed.


ModbusReadCoilsPDURequest requestPdu; // Possibly have a constructor for direct assignment?
requestPdu.startingAddress = 14574;
requestPdu.coilQuantity = 500;

ModbusLayer modbusLayer; // Possibly have a constructor for direct construction from a PDU?

modbusLayer.setPDU(requestPdu);

/* use the layer */

also, can you please add an example of how the user can add his own defined PDU Type?

By defining his own structure that implements:

size_t UserStruct::serialize(uint8_t* data, size_t dataLen) const;
static UserStruct UserStruct::deserialize(uint8_t const* data, size_t dataLen);

Because ModbusLayer::decodePDU and ModbusLayer::setPDU are defined as template methods, any structure that satisfies the requirements can be used to encode or decode to the layer buffer.
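
To make that concrete, here is a small self-contained sketch. PduBuffer is a hypothetical stand-in for the proposed layer that only owns the PDU bytes, and VendorEchoPDU is an invented PDU that uses 0x41 (65), which to my knowledge falls in one of the ranges the Modbus specification reserves for user-defined function codes (65 to 72 and 100 to 110).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the proposed layer. Not the real ModbusLayer.
class PduBuffer
{
public:
    template <typename T>
    void setPDU(T const& pdu)
    {
        m_Data.resize(pdu.serialize(nullptr, 0)); // size query, nothing is written
        pdu.serialize(m_Data.data(), m_Data.size());
    }

    template <typename T>
    T decodePDU() const
    {
        return T::deserialize(m_Data.data(), m_Data.size());
    }

    size_t size() const { return m_Data.size(); }

private:
    std::vector<uint8_t> m_Data;
};

// Invented vendor-specific PDU: function code 0x41 followed by a one-byte token.
struct VendorEchoPDU
{
    uint8_t token = 0;

    size_t serialize(uint8_t* data, size_t dataLen) const
    {
        constexpr size_t requiredSize = 2;
        if (data == nullptr || dataLen < requiredSize)
            return requiredSize;
        data[0] = 0x41;
        data[1] = token;
        return requiredSize;
    }

    static VendorEchoPDU deserialize(uint8_t const* data, size_t dataLen)
    {
        if (data == nullptr || dataLen < 2 || data[0] != 0x41)
            throw std::invalid_argument("not a VendorEchoPDU");
        VendorEchoPDU result;
        result.token = data[1];
        return result;
    }
};
```

Nothing in PduBuffer knows about VendorEchoPDU; the templates accept it purely because it provides serialize and deserialize.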

Hope that answered your questions.

May 14 '25 21:05 Dimi1010

@Dimi1010 thank you for the detailed explanation, it makes things much clearer. For the pull request, do you suggest I update it continuously with my progress, or submit it after I finish the whole implementation? I prefer the first approach, as it lets me know whether I'm on the right track!

May 14 '25 22:05 yahyayozo

@yahyayozo The first approach is preferred, but it's up to you.

You can open a PR to the dev branch in draft mode while you are working. That way you can ask questions or receive input while in-progress. It would also allow you to have the CI pipeline check the work.

May 14 '25 22:05 Dimi1010

Thank you @Dimi1010 for the detailed explanation! I think this approach is cleaner than what I proposed and makes a lot of sense.

A few comments / questions:

What should happen if the user tries to call decodePDU with the wrong type? Should it throw an exception?
We can probably add another method bool isPduOfType<T>() that can look at the PDU function code and tell whether T is the PDU type

May 15 '25 06:05 seladb

What should happen if the user tries to call decodePDU with the wrong type? Should it throw an exception?

Ideally, yes. Although for performance reasons, we should probably also have a tryDecode that operates without throwing anywhere in the path if the decode fails for an expected reason. (Bad data, null/insufficient buffer, etc).

Ideally, I would use an std::optional<T> for return type, but that is cpp17. We might want to add an optional backport implementation or use std::unique_ptr<T> at the cost of heap allocation.

The method can be something like this:

1. Run a helper function bool T::canDeserialize(uint8_t const* data, size_t dataLen).
2.1 If 1 returns true, run decodePDU.
2.2 If 1 returns false, return nothing (nullptr, nullopt).
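
A rough sketch of that flow, using the std::unique_ptr<T> variant. The names tryDecodePDU and ReadCoilsRequest are placeholders for illustration, and the 16-bit fields are read big-endian as the Modbus specification describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Placeholder PDU for a Read Coils request (function code 0x01).
struct ReadCoilsRequest
{
    uint16_t startingAddress = 0;
    uint16_t coilQuantity = 0;

    // Cheap, non-throwing check of everything deserialize depends on.
    static bool canDeserialize(uint8_t const* data, size_t dataLen)
    {
        return data != nullptr && dataLen >= 5 && data[0] == 0x01;
    }

    // Precondition: canDeserialize(data, dataLen) returned true.
    static ReadCoilsRequest deserialize(uint8_t const* data, size_t dataLen)
    {
        assert(canDeserialize(data, dataLen));
        (void)dataLen; // only used by the assert above
        ReadCoilsRequest result;
        result.startingAddress = static_cast<uint16_t>((data[1] << 8) | data[2]);
        result.coilQuantity = static_cast<uint16_t>((data[3] << 8) | data[4]);
        return result;
    }
};

// Returns nullptr instead of throwing when the buffer does not hold a valid T.
template <typename T>
std::unique_ptr<T> tryDecodePDU(uint8_t const* data, size_t dataLen)
{
    if (!T::canDeserialize(data, dataLen))
        return nullptr;
    // Plain new keeps the sketch valid for C++11, which lacks std::make_unique.
    return std::unique_ptr<T>(new T(T::deserialize(data, dataLen)));
}
```

The heap allocation is the cost mentioned above; an optional backport would keep the same control flow without it.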

We can probably add another method bool isPduOfType<T>() that can look at the PDU function code and tell whether T is the PDU type

That can be a good addition. It can run a check on the function code and maybe forward to a static method of T for additional checks?
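
One possible shape for this, as a sketch only. It assumes each PDU type exposes a static functionCode constant and a static matches() hook for the additional checks; both are invented for the example, as are the two tag types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Invented example types; only the static members matter for the check.
struct ReadCoilsRequestTag
{
    static constexpr uint8_t functionCode = 0x01;
    // Extra check beyond the function code: a request is exactly 5 bytes long.
    static bool matches(uint8_t const*, size_t dataLen) { return dataLen == 5; }
};

struct WriteSingleRegisterTag
{
    static constexpr uint8_t functionCode = 0x06;
    static bool matches(uint8_t const*, size_t dataLen) { return dataLen == 5; }
};

// Free-function version of the proposed isPduOfType<T>(); in the layer it would
// read the PDU pointer and length from the layer's own buffer instead.
template <typename T>
bool isPduOfType(uint8_t const* pduData, size_t pduLen)
{
    if (pduData == nullptr || pduLen == 0)
        return false;
    if (pduData[0] != T::functionCode)
        return false;
    return T::matches(pduData, pduLen); // forward to T for additional checks
}
```

Since requests and responses share a function code, the function code check alone cannot tell them apart; the matches() hook is where a type could narrow that down.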

May 15 '25 09:05 Dimi1010

return type, but that is cpp17. We might want to add an optional backport implementation or use std::unique_ptr<T> at the cost of heap allocation.

I guess decodePDU<T>() can return std::unique_ptr<T>, which would be nullptr if decoding is not possible. I agree that we should probably avoid throwing exceptions.

May 16 '25 05:05 seladb

@yahyayozo is there anything else you need from us in order to get started?

If you want, you can implement it gradually using multiple PRs, which will probably be easier to review. For example:

1st PR: a minimal implementation of ModbusLayer, without any PDU decoding
2nd PR: add PDU decoding infrastructure with one PDU as an example
3rd PR: add the rest of the PDUs

Any other breakdown could also work. Please let me know what you think

May 16 '25 05:05 seladb

@seladb I think I can start. Is there any code style I need to follow? Or is doxygen documentation needed?

May 16 '25 06:05 yahyayozo

@seladb I think I can start. Is there any code style I need to follow? Or is doxygen documentation needed?

Yes, you can read most of it in CONTRIBUTING.md. We use cppcheck for static analysis and clang-format for code formatting. We add doxygen for every API exposed to users; you can look at other layers to see the style. Our doxygen documentation is here: https://pcapplusplus.github.io/api-docs/v25.05/ Please let me know if you have any additional questions.

May 16 '25 06:05 seladb

@seladb thanks, I'll keep you posted on any updates.

May 16 '25 06:05 yahyayozo

Done in #1823. Thank you so much @yahyayozo for working on it! 🙏

Aug 26 '25 06:08 seladb
