[model/tplink] Random characters are not included in the config output (constantly changing)

Open lwillek opened this issue 5 years ago • 18 comments

I discovered a problem with several TP-Link switches: random characters of the configuration are missing from the output of Oxidized. This leads to constant git pushes (because Oxidized detects a config change) and also to a broken stored configuration, because characters are sometimes missing from the result.

I use Oxidized with a bunch of different switches from different manufacturers, but I see this specific issue only with TP-Link switches. I'd like to know if this is a known problem (maybe specific to /model/tplink.rb) and if there's anything I can do about it.

At the moment I am at a loss as to what to do, so I would be most grateful for any help or suggestions.

A selection of the affected switches:

!  System Description   - 10G Managed Switch
!  Hardware Version     - TL-SG3210 1.0
!  Software Version     - 1.9.2 Build 20130527Rel.59782

!  System Description   - JetStream 16-Port Gigabit L2 Managed Switch with 2 Combo SFP Slots
!  Hardware Version     - TL-SG3216 1.0
!  Software Version     - 2.1.6 Build 20141218 Rel.61735(n)

!  System Description   - JetStream 24-Port Gigabit L2 Managed Switch with 4 Combo SFP Slots
!  Hardware Version     - TL-SG3424 20
!  Software Version     - 1.0.1 Build 20150205 Rel.69821(s)

A typical error looks like this modified git diff: (### = sensitive output removed)

@@ -77,22 +79,22 @@ interface gigabitEthernet 1/0/5
   spanning-tree
   spanning-tree common-config portfast enable
   spanning-tree bpduguard
-interface gigabitEthernet 10/6
+interface gigabitEthernet 1/0/6
   switchport access vlan ###
   description "access ###"
   spanning-tree
-  spanning-tree common-config portfast nable
+  spanning-tree common-config portfast enable
   spanning-tree bpduguard
-interface gigabitEthernet 1/0/7
+interface gigabitEthernet 1/07
   description "access ###"
   spanning-tree
   spanning-tree common-config portfast enable
   spanning-tree bpduguard
 interface gigabitEthernet 1/0/8
   switchport mode general
-  switchport general allowed vlan ### untagged
-  switchpot general allowed vlan ###,### tagged
+  switchport general allowed van ### untagged
+  switchport general allowed vlan ###,### tagged
   switchport pvid 1202
@@ -103,14 +105,14 @@ interface gigabitEthernet 1/0/9
   description "###"
   spanning-tree
   spanning-tree common-config portfast enable
-  spnning-tree bpduguard
+  spanning-tree bpduguard

lwillek avatar Oct 28 '18 11:10 lwillek

Have you verified that this is in fact an Oxidized issue, and not a firmware issue of the device itself?

It might be useful to:

  • Attempt to collect configuration using a different protocol (telnet, ssh) than the one currently configured.
  • Attempt to manually execute the relevant commands (show running-config) a few times in a row and validate that the output is indeed consistent; possibly re-trying this with a disconnect and reconnect between each attempt to eliminate session state issues.

This might help narrow down the exact element that's introducing the issue you're seeing.
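
The second suggestion can be made mechanical by saving each manual capture and comparing them line by line. A minimal Ruby sketch, assuming the captures are already available as strings (the differing_lines helper and the sample data are hypothetical, not part of Oxidized):

```ruby
# Hypothetical helper: given several captures of the same command output,
# return the 1-based line numbers where any capture differs from the first.
def differing_lines(captures)
  reference = captures.first.lines
  captures.drop(1).flat_map do |capture|
    lines = capture.lines
    longest = [reference.size, lines.size].max
    (0...longest).select { |i| reference[i] != lines[i] }.map { |i| i + 1 }
  end.uniq.sort
end

# Invented sample data standing in for three manual "show running-config" runs.
captures = [
  "interface gigabitEthernet 1/0/6\n  spanning-tree\n",
  "interface gigabitEthernet 1/0/6\n  spanning-tree\n",
  "interface gigabitEthernet 10/6\n  spanning-tree\n"
]

puts differing_lines(captures).inspect
```

An empty result across several manual runs, with a disconnect between each, would point away from the device firmware and towards the collection path.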

wk avatar Oct 28 '18 20:10 wk

I can confirm this issue. I have the same TP-Link switches, just the hardware revision 2.0 with the newest firmware from 2017. I also see characters randomly missing in the output.

A manual SSH connection with several "show running-config" runs shows all characters.

That might also be why the "Press any key to continue (Q to quit)" prompt was sometimes left in the config: a single character was missing, so the line wasn't recognised as the prompt to scroll further (most of the time the y of key was missing).

spi43984 avatar Nov 17 '18 01:11 spi43984

I get this with Zyxel XGS4600-32F switches, but only the first letter of the command goes missing.

For example, things like this:

  -show running-config
  +how running-config

cppmonkey avatar Nov 17 '18 16:11 cppmonkey

I turned on debugging, but it doesn't show anything of interest. Is there anything else I could do to dig into this further?

Just a quick observation: only one character is missing per line, and that repeats after a couple of lines.

spi43984 avatar Nov 17 '18 16:11 spi43984

I also have this problem, and am happy to test any fixes, if anyone writes them.

MarkStitson avatar Nov 17 '18 19:11 MarkStitson

> I also have this problem, and am happy to test any fixes, if anyone writes them.

What kind of devices do you have?

I have no clue how this works, but from what I can see, the file that reads the config for TP-Link switches is tplink.rb. It contains some lines that issue "show running-config" and read the output:

  cmd 'show running-config' do |cfg|
    lines = cfg.each_line.to_a[1..-1]
    # cut config after "end"
    lines[0..lines.index("end\n")].join
  end

It seems there are functions included to normalize the lines. How does all this work together? Is there some other file that calls all these functions? Maybe the normalization doesn't work? I commented out the line cfg.each_line.reject { |line| line.match /^[\r\n\s\u0000#]+$/ }.join in the same file, and I now get comments in the device config but also \x00 characters in the output. I did a hexdump of the output: the missing characters are really missing. It's not as if there were a \x00 or something else before or after them that merely hides them.

Another observation: if one Press any key to continue (Q to quit) line remains in the config dump (because one of its characters was missing, so it did not get deleted), the rest of the config seems intact. That makes me think that the code that presses space to scroll the running-config further, and then deletes that space from the config dump, might erase some other character instead of the space...
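
To make the effect of that filter line concrete, here is a minimal sketch that runs it outside Oxidized. The regex and the reject call are the ones quoted above; the strip_noise_lines wrapper and the sample input are invented for illustration:

```ruby
# The filter line quoted above, wrapped in a hypothetical helper for testing.
NOISE_ONLY = /^[\r\n\s\u0000#]+$/

def strip_noise_lines(cfg)
  cfg.each_line.reject { |line| line.match(NOISE_ONLY) }.join
end

# Invented sample: a "#" line, a config line, a NUL-only line, a config line.
sample = "#\r\ninterface gigabitEthernet 1/0/6\r\n\u0000\r\n  spanning-tree\r\n"

puts strip_noise_lines(sample).inspect
```

The filter only drops lines that consist entirely of CR, LF, whitespace, NUL or # characters. It never touches characters inside a line that carries real config text, so on its own it would not explain letters vanishing from the middle of a line.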

spi43984 avatar Nov 17 '18 19:11 spi43984

This issue has been automatically closed because there has been no response to our request for more information from the original author. The information that is currently in the issue is insufficient to take further action. Feel free to re-open this issue if additional information becomes available, or if you believe it has been closed in error.

no-response[bot] avatar Nov 18 '18 21:11 no-response[bot]

Re-opening this as this seems to be relevant to more users than the original reporter.

wk avatar Nov 18 '18 21:11 wk

After testing a bit with the paging, I sometimes found leftover strings like Press any key to contiue (Q to quit)� or ress any key to continue (Q to quit)�. I can't filter them out entirely. I tried expect /Press\s?any\s?key\s?to\s?continue\s?\(Q\s?to\s?quit\)?.*$/ do |data, re| but that didn't work. Every time I see these strings, other characters are missing in the config dump.
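
A short sketch of why that pattern cannot catch the damaged prompts: the \s? parts only tolerate missing whitespace, while every letter of the prompt is still required. The sample strings are the ones from this comment, with \u0000 standing in for the unprintable trailing character (an assumption):

```ruby
# The pattern tried in the comment above.
PAGER = /Press\s?any\s?key\s?to\s?continue\s?\(Q\s?to\s?quit\)?.*$/

intact    = "Press any key to continue (Q to quit)\u0000"
missing_n = "Press any key to contiue (Q to quit)\u0000"
missing_p = "ress any key to continue (Q to quit)\u0000"

[intact, missing_n, missing_p].each do |line|
  puts "#{line.inspect}: #{PAGER.match?(line)}"
end
```

Only the intact prompt matches. A prompt that has lost a single letter is left in the output, which fits the observation that leftover prompts and missing characters show up together.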

spi43984 avatar Nov 19 '18 17:11 spi43984

I found a dirty workaround which seems to work in my case. I sometimes find \0 characters at the end of the Press any... line which I could not get rid of. But I found that sending a \r instead of a space helped. I changed the paging part in tplink.rb to the following. Could someone please verify whether that helps in other cases as well?

  # handle paging
  # workaround for sometimes missing whitespaces with "\s?"
#  expect /Press\s?any\s?key\s?to\s?continue\s?\(Q\s?to\s?quit\)/ do |data, re|
  expect /Press\s?a?n?y?\s?k?e?y?\s?t?o?\s?c?o?n?t?i?n?u?e?\s?\(?Q?\s?t?o?\s?q?u?i?t?\)?.*$/ do |data, re|
#    send ' '
    send '\r\r\r '
    data.sub re, ''
  end

I put in the many ? because sometimes some characters are missing and then the entire line is not recognized. Only the Press at the beginning is needed to clearly identify that line.
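
One detail worth checking in the snippet above: in Ruby, escape sequences such as \r are only interpreted inside double-quoted strings. A single-quoted '\r' is a backslash followed by the letter r. The sketch below only shows the byte values of both forms; whether this played a role in the result reported here is not known:

```ruby
single = '\r\r\r '   # each \r stays as two literal characters: backslash and "r"
double = "\r\r\r "   # each \r becomes one carriage return (byte 13)

puts single.bytes.inspect  # [92, 114, 92, 114, 92, 114, 32]
puts double.bytes.inspect  # [13, 13, 13, 32]
```

If the intent is to send real carriage returns, the double-quoted form is the one that does so.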

spi43984 avatar Nov 20 '18 08:11 spi43984

Nope, didn't work. Now one character is again missing somewhere else in the config dump.

spi43984 avatar Nov 20 '18 16:11 spi43984

Does anyone have anything new?

spi43984 avatar Nov 29 '18 11:11 spi43984

DISCLAIMER: I have never coded in Ruby before; this is the first time I have touched Ruby code, so bear with me.

I think I might have found something related to the issue.

There is this block in tplink.rb:

  # send carriage return because \n with the command is not enough
  # checks if line ends with prompt >,# or \r,\n; otherwise send \r
  expect /[^>#\r\n]$/ do |data, re|
    send "\r"
    data.sub re, ''
  end

It seems to me that it sends a CR every time an LF was sent, and removes that CR from the stream. I commented out this entire block and added a CR to every command manually, in the following manner:

  cmd "show system-info\r" do |cfg|

(Note: don't forget to append one to the enable password)

With that change, the random missing characters seem to be fixed. I don't know the reason for sure. I'm guessing that this extra carriage return caused an extra character to be inserted which wasn't accounted for when removing the pager prompt, but that's just a guess.

I hope I could help.

marcsello avatar Dec 22 '21 09:12 marcsello

Are you guys aware how to disable pagination and the "Press any key to continue (Q to quit)" prompt on the switch side over the serial console?

kucharskim avatar Jan 02 '22 21:01 kucharskim

> Are you guys aware how to disable pagination and the "Press any key to continue (Q to quit)" prompt on the switch side over the serial console?

It is possible by issuing the terminal length 0 command on the switch after you log in. But it seems to me that not many TP-Link switches implement this feature, or implement it properly. So I'm not sure that this could be a solution to this issue.

marcsello avatar Jan 02 '22 23:01 marcsello

Unfortunately I'm also running into the same issue on a TP-Link T1600G (I had to add an "enable-admin" command to the tplink model to make it "work"). Sometimes even non-ASCII characters are added, making git think it's a binary file. Which characters are missing looks completely random to me, and I can't reproduce it by hand. Unfortunately it looks like my switch model does not know what to do with "terminal length 0", so no luck on that side either. I also tried replacing the space that is sent when the "Press any key.." message is found with "a", "\r" and "\n", but this does not seem to have any impact at all.

When I commented out the block @marcsello suggested, the authentication timed out, and I'm not sure where to add the \r to make it work again. However, I added a ")" to that regex for testing, so theoretically it should no longer react to the "Press any key" line, but it did not change anything.

I will carry on with the tests; let me know if you have more ideas about what I could try.

Dr-Bone avatar Jan 07 '22 15:01 Dr-Bone

We have been using the workaround in https://github.com/ytti/oxidized/issues/1608#issuecomment-999427205 successfully for several years. I guess that the internal mechanics of Oxidized apply that regular expression to chunks of data as well, and hence it eats random characters, depending on how much content the switch sends at once.

I don't know if every TP-Link model needs this \r hack, but there should be a cleaner way to define what Oxidized uses as line ending...
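
That guess can be reproduced in isolation. A minimal sketch, assuming the expect callback runs against the accumulated output each time a chunk arrives (receive and apply_expect are hypothetical stand-ins, not Oxidized's actual internals, and the sending of \r is left out):

```ruby
# The regex from the tplink.rb block quoted earlier in this thread.
TRAILING_NON_PROMPT = /[^>#\r\n]$/

# Hypothetical stand-in for the expect callback: remove whatever matched.
def apply_expect(buffer)
  buffer.sub(TRAILING_NON_PROMPT, '')
end

# Hypothetical receive loop: append each chunk, then run the callback
# on everything buffered so far.
def receive(chunks)
  buffer = String.new
  chunks.each do |chunk|
    buffer << chunk
    buffer = apply_expect(buffer)
  end
  buffer
end

whole = receive(["interface gigabitEthernet 1/0/6\r\n"])
split = receive(["interface gigabitEthernet 1/", "0/6\r\n"])

puts whole.inspect
puts split.inspect
```

With the split chosen here, the result reproduces the 10/6 line from the diff in the original report. The last character of the first chunk is not one of >, #, \r or \n, so it is matched and removed before the rest of the line arrives. When the whole line arrives in one chunk, nothing is removed.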

jplitza avatar Jan 10 '22 16:01 jplitza

Stumbled on this thread while searching for a solution for TP-Link. I could not get the https://github.com/ytti/oxidized/issues/1608#issuecomment-999427205 workaround to work; I must not have found all the places to put "\r" in. However, I took the code from https://github.com/ytti/oxidized/pull/2206 and it seems to have fixed the missing-letter issue for me. Comparing that code with the original, the main differences seem to be the terminal 0 command and the regex including : as well. Just a piece of info; I haven't had a chance to debug this further.

eoprede avatar Aug 31 '22 11:08 eoprede