opcua-asyncio icon indicating copy to clipboard operation
opcua-asyncio copied to clipboard

load_data_type_definitions() fails on string data type identifier that contains double quotes

Open sroy66 opened this issue 3 years ago • 13 comments

Description
When loading the data type definitions from a Siemens S71500 PLC that has a UDT (user data type), the identifier of that data type is a string that also contains double quotes. In my case the node identifier is ns=3;s=DT_"sData". That name is generated by the PLC. The method fails with the following output:

Failed to execute auto-generated code from UA datatype: 

@dataclass
class sData:

    '''
    sData structure autogenerated from StructureDefinition object
    '''
    data_type = ua.NodeId.from_string("ns=3;s=DT_"sData"")

    int_value: ua.Int16 = 0
    real_value: ua.Float = 0
    timestamp: ua.UInt64 = 0
Traceback (most recent call last):
  File "/home/stroy/src/pydaq/env/lib/python3.7/site-packages/asyncua/common/structures104.py", line 245, in _generate_object
    exec(code, env)
  File "<string>", line 9
    data_type = ua.NodeId.from_string("ns=3;s=DT_"sData"")
                                                      ^
SyntaxError: invalid syntax
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 107, in run
    asyncio.run(self.schedule_coroutines())
  File "/usr/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 104, in schedule_coroutines
    await asyncio.gather(self.opc_subscribe(), self.opc_reply(), self.tcp_stream())
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 78, in opc_subscribe
    await client.load_data_type_definitions()
  File "/home/stroy/src/pydaq/env/lib/python3.7/site-packages/asyncua/client/client.py", line 676, in load_data_type_definitions
    return await load_data_type_definitions(self, node, overwrite_existing=overwrite_existing)
  File "/home/stroy/src/pydaq/env/lib/python3.7/site-packages/asyncua/common/structures104.py", line 324, in load_data_type_definitions
    env = await _generate_object(dts.name, dts.sdef, data_type=dts.data_type)
  File "/home/stroy/src/pydaq/env/lib/python3.7/site-packages/asyncua/common/structures104.py", line 245, in _generate_object
    exec(code, env)
  File "<string>", line 9
    data_type = ua.NodeId.from_string("ns=3;s=DT_"sData"")
                                                      ^
SyntaxError: invalid syntax

To Reproduce
The only way I know how to reproduce this would be to create a UDT named sData on a Siemens S71500 PLC then call load_data_type_definitions() on the client

    async def opc_subscribe(self):
        client = Client(url=self.config["opc-server"])
        logger.info(f'{self.config["opc-server"]}')
        async with client:
            await client.connect()
            await client.load_data_type_definitions()
...

Expected behavior
A class sData should be created

Workaround
I changed make_structure_code() in structures104.py to use single quotes with the to_string() method

def make_structure_code(data_type, struct_name, sdef):
    """
    given a StructureDefinition object, generate Python code
    """
    if sdef.StructureType not in (ua.StructureType.Structure, ua.StructureType.StructureWithOptionalFields):
        # if sdef.StructureType != ua.StructureType.Structure:
        raise NotImplementedError(f"Only StructureType implemented, not {ua.StructureType(sdef.StructureType).name} for node {struct_name} with DataTypdeDefinition {sdef}")

    code = f"""

@dataclass
class {struct_name}:

    '''
    {struct_name} structure autogenerated from StructureDefinition object
    '''
    data_type = ua.NodeId.from_string('{data_type.to_string()}')
...
"""

Version
Python-Version: 3.73
opcua-asyncio Version :0.9.92

sroy66 avatar Feb 01 '22 22:02 sroy66

thats an issue with python not with the library! https://realpython.com/invalid-syntax-python/

most safest way is to use tripple quotes 😅

AndreasHeine avatar Feb 02 '22 05:02 AndreasHeine

I think it's a bit more subtle.

The code in question is already in triple quotes as it's attempting to dynamically write a Python class on the fly, based on information from the OPC server (PLC). In general with Python, a string may be defined by double or single quotes, they are functionally equivalent, until you put a double or single quote into your string. You can wrap a double quote in a single quote (or vice-versa), but if you must use all double quotes, then you need to escape with \. PEP does not have any opinion on this, save for using triple quotes for multiline/docstring type strings.

The problem in this case, is that the Siemens PLC OPC server injects double quotes into node IDs and when trying to parse the user data type (UDT), you need to know that the quotes are in there.

I wonder if a better place to resolve this is in the to_string() method, where we could check if the node ID contains single or double quotes, and parse/return the string accordingly?

tdpu avatar Feb 02 '22 06:02 tdpu

I "think" that perhaps you could do something like:

# asyncua/ua/uatypes.py NodeId.to_string()
...
    def to_string(self):
        string = []
        if self.NamespaceIndex != 0:
            if '"' in self.NamespaceIndex:
                string.append(f'ns={self.NamespaceIndex}')
            else:
                string.append(f"ns={self.NamespaceIndex}")
...

This seems a bit hacky, and I'm not sure if there are other places where this might come up.

(note: @sroy66 and I work together, we can potentially test and submit a PR, I don't have the hardware with me right now)

tdpu avatar Feb 02 '22 06:02 tdpu

We already have code to remove strange characters from extension object names. It should be possible to also use it for strange string nodeids

oroulet avatar Feb 02 '22 07:02 oroulet

tried to have a short look: https://github.com/FreeOpcUa/opcua-asyncio/pull/784

oroulet avatar Feb 02 '22 07:02 oroulet

Depending on the combination, a wrapping may work. Otherwise we need to add escape sequences before such things:


>>> a = "normal wrapped ' stuff' "
>>> a
"normal wrapped ' stuff' "
>>> b = 'other way "round"'
>>> b
'other way "round"'
>>> c = 'with escape it\'s fine'
>>> c
"with escape it's fine"
>>> d = "even with \"double quotes\" this works"
>>> d
'even with "double quotes" this works'
>>> 

So we need to regex stuff for single and double quotes and may have to add an \ to it.

swamper123 avatar Feb 02 '22 09:02 swamper123

I suppose that technically the Node ID is compatible with Python syntax (w.r.t. #784), but not if you try to wrap that particular string in double quotes. You could argue (as I do) that putting quotes in something that might be used as a variable name is a bad idea, but apparently that's what happens by default.

Does the Node ID for the UDT have to match, or can we change it (by removing quotes) - I don't know enough about the load_data_type_definitions(), but the workaround that the OP suggested in fact works for us.

The name of the UDT is DT_"sData", which when inserted into the Node ID string "ns=3;s=DT_"sData"" gives you the offending, unescaped double quote situation.

tdpu avatar Feb 04 '22 06:02 tdpu

The workaround should work for Node IDs of type string that contain double quotes but is otherwise brittle. I cannot find anything in the standard limiting the use of strings in Node IDs so something like ns=3;s=DR_'sData' (single quotes) could occur, too. The proper Python-way to represent a string so that it becomes a string literal (which can be inserted into a code fragment like in the template), is using repr(). So maybe a better replacement for line structures104.py:169

data_type = ua.NodeId.from_string("{data_type.to_string()}")

would be

data_type = ua.NodeId.from_string({repr(data_type.to_string())})

Not sure about implications in other parts of the code, though.

padelt avatar Feb 14 '22 13:02 padelt

does the last version here solves the issue? https://github.com/FreeOpcUa/opcua-asyncio/pull/784

oroulet avatar Feb 26 '22 16:02 oroulet

I does not solve the issue. I get the same error however the solution from padelt works

data_type = ua.NodeId.from_string({repr(data_type.to_string())})

sroy66 avatar Mar 04 '22 00:03 sroy66

@sroy66 Can you show the output? The solution with repr changes the node id by adding some extra quotes one more time, it will break things. The only correct thing to do is to escape the quotes using backslash

oroulet avatar Mar 04 '22 07:03 oroulet

My PLC program has changed a bit so it fails on a different struct first but it's appears to be the same problem.

2022-03-07 09:17:08,010 - pydaq.plugin.demo - INFO - setup channel: demo/channels/time
2022-03-07 09:17:08,011 - pydaq.plugin.demo - INFO - setup channel: demo/channels/AI00
2022-03-07 09:17:08,011 - pydaq.plugin.demo - INFO - setup channel: demo/channels/AI01
2022-03-07 09:17:08,011 - pydaq.plugin.demo - INFO - setup channel: demo/channels/AI02
Requested session timeout to be 3600000ms, got 30000ms instead
2022-03-07 09:17:08,218 - pydaq.widget.real_time_chart - INFO - Connecting buffer chart1/time to channel demo/channels/time
2022-03-07 09:17:08,218 - pydaq.widget.real_time_chart - INFO - Connecting buffer chart1/BUF00 to channel demo/channels/AI00
2022-03-07 09:17:08,218 - pydaq.widget.real_time_chart - INFO - Connecting buffer chart1/BUF01 to channel demo/channels/AI01
2022-03-07 09:17:08,219 - pydaq.widget.real_time_chart - INFO - Connecting buffer chart1/BUF02 to channel demo/channels/AI02
None_ - EnumField(Value=0, DisplayName=LocalizedText(Locale=None, Text='None'), Description=LocalizedText(Locale=None, Text=None), Name='None') False 

class TraceLevel(IntEnum):

    '''
    TraceLevel EnumInt autogenerated from EnumDefinition
    '''

    Error = 32
    Warning = 48
    System = 56
    Info = 60
    Debug = 62
    Content = 63
    All = -1
    None_ = 0

PgmTest - EnumField(Value=19, DisplayName=LocalizedText(Locale=None, Text='PgmTest'), Description=LocalizedText(Locale=None, Text=None), Name='PgmTest') False 

class SimaticOperatingState(IntEnum):

    '''
    SimaticOperatingState EnumInt autogenerated from EnumDefinition
    '''

    NotSupported = 0
    StopFwUpdate = 1
    StopSelfInitialization = 3
    Stop = 4
    Startup = 6
    Run = 8
    RunRedundant = 9
    Halt = 10
    RunSyncUp = 11
    SyncUp = 12
    Defective = 13
    ErrorSearch = 14
    NoPower = 15
    CiR = 16
    STOPwithoutODIS = 17
    RunODIS = 18
    PgmTest = 19

renamed "OpcMethodSetPlcTime"."UAMethod_InParameters" to _OpcMethodSetPlcTime_UAMethod_InParameters_ due to Python syntax
Failed to execute auto-generated code from UA datatype: 

@dataclass
class typeOpcUaStatus:

    '''
    typeOpcUaStatus structure autogenerated from StructureDefinition object
    '''

    data_type = ua.NodeId.from_string("ns=3;s=DT_"typeOpcUaStatus"")

    done: ua.Boolean = True
    busy: ua.Boolean = True
    error: ua.Boolean = True
    status: ua.UInt16 = 0
Traceback (most recent call last):
  File "/home/stroy/src/opcua-asyncio/venv-idname/lib/python3.7/site-packages/asyncua/common/structures104.py", line 245, in _generate_object
    exec(code, env)
  File "<string>", line 10
    data_type = ua.NodeId.from_string("ns=3;s=DT_"typeOpcUaStatus"")
                                                                ^
SyntaxError: invalid syntax
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 315, in run
    asyncio.run(self.schedule_coroutines())
  File "/usr/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 311, in schedule_coroutines
    await asyncio.gather(self.opc_subscribe(), self.tcp_client())
  File "/home/stroy/src/pydaq-plugins/pydaq-opcua/opcua_client_plugin/opcua_client.py", line 252, in opc_subscribe
    self.typeDefinitions = await self.client.load_data_type_definitions()
  File "/home/stroy/src/opcua-asyncio/venv-idname/lib/python3.7/site-packages/asyncua/client/client.py", line 676, in load_data_type_definitions
    return await load_data_type_definitions(self, node, overwrite_existing=overwrite_existing)
  File "/home/stroy/src/opcua-asyncio/venv-idname/lib/python3.7/site-packages/asyncua/common/structures104.py", line 324, in load_data_type_definitions
    env = await _generate_object(dts.name, dts.sdef, data_type=dts.data_type)
  File "/home/stroy/src/opcua-asyncio/venv-idname/lib/python3.7/site-packages/asyncua/common/structures104.py", line 245, in _generate_object
    exec(code, env)
  File "<string>", line 10
    data_type = ua.NodeId.from_string("ns=3;s=DT_"typeOpcUaStatus"")
                                                                ^
SyntaxError: invalid syntax

sroy66 avatar Mar 07 '22 17:03 sroy66

I think the escape quote approach should work, but it doesn't appear to be getting through to the make_structure_code function.

If we hardcode in our data type with escape characters on the doublequotes:

#structures104.py
def make_structure_code(data_type, struct_name, sdef):
...
    code = f"""

@dataclass
class {struct_name}:

    '''
    {struct_name} structure autogenerated from StructureDefinition object
    '''

    #data_type = ua.NodeId.from_string("{data_type.to_string()}")
    data_type = ua.NodeId.from_string("ns=3;s=DT_\\"sData\\"")

"""
...

It actually works... but currently with #784 the data_type.to_string() method returns "ns=3;s=DT_"sData"" and raises a SyntaxError.

As an additional point, we don't see why the repr fix shouldn't work... at the point where the SyntaxError occurs you're adding double-quotes around the return value of data_type.to_string() - repr also does this, but defaults to single quotes - but it will escape quotes (double or single) automatically such that you get the correct string out. With limited testing we also confirmed that it will work for both single and double quotes.

tdpu avatar Mar 08 '22 22:03 tdpu