Network Decrypt Key Question

Open JohnOmernik opened this issue 6 years ago • 13 comments

Hello!

I am finding that if I leave my capture running for a long time (this is tshark live, piped to unhexilfy.py, piped to semonitor.py), it will sometimes take a REALLY long time (hours, though it does eventually start) to begin producing data. I wonder why this is, and one idea I had was to look at the network key.

I think I've read that the key from the RS485 or RS232 interface has to be mixed with a message from the server to get the full key. Is this a temporal thing? Does this resultant key change or is it static?

I was thinking about diving into the code and, when we get the resultant key from the message, saving it off (pickle to a file, or just save it in a txt file) so we can preload the key and not have to wait for the magic packet. Would there be value in this? If so, I will get on it. I'm just trying to understand the setup before I go reversing things to add functionality.

John

JohnOmernik avatar Nov 30 '18 20:11 JohnOmernik

You could try that. The place that the decryption stuff happens is in msg.py, in the function parseMsg. The message it is waiting for is the 0503 message, which comes from the inverter, not the server. I don't know if the contents are always the same or if it changes periodically.

jbuehl avatar Dec 01 '18 16:12 jbuehl

So I've done some checking on this by locally modifying my msg.py to log when the 0503 messages come through. The times below show that it's not often: roughly every 4.5 to 6 hours on my HDWave SE6000 inverter (other inverters or firmware versions may send it more often). So if you have to stop your monitoring, your app will be missing the last key until the inverter sends it again. This can be frustrating when monitoring live. I believe this acts as a sort of poor man's key rotation; it's not super secure, but this is inverter data, not banking data. Looking at this, I do see value in saving the current 0503 message to a text file and loading that at startup. Essentially, if I'm inside the window (say I stopped my monitoring at 2018-12-04 19:43:00 and needed to restart it at 2018-12-04 19:45:00), it should be able to pick up monitoring right away, and then when the key rotates, it will work as planned. If the window has closed (let's say my script errored out at 2018-12-04 19:43:00 but I didn't restart it until 2018-12-05 03:00:00), then it just won't decrypt until 2018-12-05 05:11:03 rolls around (using last night as an example). This is acceptable. In other words, I will gain some, but there may be times when I have to wait for a key.
To keep things simple, I am going to stop this comment here and move to another comment for design input.

Times my key changed last night:

2018-12-04 18:43:14
2018-12-05 00:46:04
2018-12-05 05:11:03
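
For reference, the gaps between those three timestamps can be worked out with a few lines of Python. This is only an illustrative sketch; the helper name key_intervals is made up here and is not part of semonitor.

```python
from datetime import datetime

# Times at which a 0503 message was logged (copied from the list above).
key_times = [
    "2018-12-04 18:43:14",
    "2018-12-05 00:46:04",
    "2018-12-05 05:11:03",
]

def key_intervals(stamps):
    """Return the gaps between consecutive timestamps as timedeltas."""
    parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [later - earlier for earlier, later in zip(parsed, parsed[1:])]

for gap in key_intervals(key_times):
    print(gap)   # prints 6:02:50, then 4:24:59
```

Two intervals is a very small sample, so this says little about how regular the rotation is.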

JohnOmernik avatar Dec 05 '18 13:12 JohnOmernik

So onto design.

Currently, I've modified msg.py to store the key (appending it) with a date and time to a file. I will discuss that in my next comment; here I want to discuss the design of the rotated key save.

Initial Design


  • I want to store the 0503 message (the 34 bytes), hex encoded, in a file. Every time msg.parseMsg gets a 0x0503 message and a key, it clobbers the global cipher object by instantiating a new SECrypto object. It is there, in that init, that the most recent 34-byte 0x0503 message will automatically be saved to a file (overwriting whatever is there; no need to save a history).
  • One question I have is this: should I make the location of that file an option? My gut tells me no. This is an internal thing; a user can read and find it, but they shouldn't need to change where it is (I just need to add the name to the .gitignore so it doesn't get uploaded). Given that, I think the root of solaredge is the proper place: I need to get the full path of the semonitor.py script and store it there. I'll call it last0503.msg or something like that. I am open to discussion here.
  • Line 95 (# cryptography object \n cipher = None) is where I think I will automatically look for last0503.msg (or, if the outcome of the previous point is that I SHOULD make it a user-customizable location, I will look there). Essentially a try/except block that tries to load the file, validates that it's 34 bytes of hex, and then calls cipher = SECrypto(keyStr.decode("hex"), data) to load it right away. If I have it, there's no sense in not trying to use it. If the file doesn't exist, no SECrypto object is loaded (leaving cipher as None). If the data in the file doesn't validate (34 bytes of hex-encoded data), the SECrypto object isn't loaded either (and perhaps an "invalid last0503.msg" message is logged via logger.warning()?).
  • Basically, I "believe" that having the WRONG cipher will have the same or a similar effect as having no cipher (except that "Decryption key not yet available" won't get logged; instead it will be an invalid checksum). The program will still function; all messages will just get invalid CRCs until a proper 0x0503 is received. I wonder whether I should extend the invalid checksum message to read "Checksum error, or loaded Decryption Key incorrect", just so we can be clear. Advice welcome there. Would there be a way to identify a wrong key during the validateMsg function? I am guessing no. Is this acceptable (i.e. is it acceptable to report an invalid CRC when the decryption key is just wrong)?
  • So when I log the last 0x0503 message, I am also going to log a date/time stamp (probably just epoch seconds). If the message is older than, say, 24 hours, I won't even bother trying to load it; that should help with the previous point, as we won't try to load obviously old rotating-key messages. If it's within the 24-hour window, then fine: we'll accept the danger of a window of invalid CRCs rather than "Decryption key not yet available" messages, but if the 0x0503 is old, there's no point in confusing things.
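
To make the save/validate/age-check idea above concrete, here is a minimal sketch. It is not the code in msg.py: the function names, the one-line "<epoch seconds> <hex>" file layout, and the MAX_AGE constant are all assumptions made for illustration, and it is written for Python 3.

```python
import binascii
import time

MSG_LEN = 34               # length of the 0503 message in bytes, per the design above
MAX_AGE = 24 * 60 * 60     # assumed cutoff: ignore saved messages older than 24 hours

def save_last_0503(path, data, now=None):
    """Overwrite `path` with '<epoch seconds> <hex of data>' (hypothetical layout)."""
    now = int(time.time()) if now is None else int(now)
    with open(path, "w") as f:
        f.write("%d %s\n" % (now, binascii.hexlify(data).decode("ascii")))

def load_last_0503(path, now=None):
    """Return the saved 34 bytes, or None if missing, malformed, or too old."""
    now = time.time() if now is None else now
    try:
        with open(path) as f:
            stamp, hex_msg = f.read().split()
        saved_at = int(stamp)
        data = binascii.unhexlify(hex_msg)
    except (OSError, ValueError):
        # Missing file, wrong number of fields, bad timestamp, or bad hex.
        return None
    if len(data) != MSG_LEN or now - saved_at > MAX_AGE:
        return None
    return data
```

A caller would pass the returned bytes to SECrypto only when the result is not None, and otherwise leave cipher as None and wait for the next 0503 as before.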

That should cover it. Advice sought, and welcome.

JohnOmernik avatar Dec 05 '18 14:12 JohnOmernik

So in one of my previous posts, I mentioned modifying msg.py to log all the 0x0503 messages received. This was an append log, so it kept getting larger (not by much; I only logged the date/time and the 34-byte message). The question I have here is this: should I add a debug option in semonitor.py to make this a feature rather than a debugging step? I don't see it as being super useful. This is independent of the "storing the last rotating key" feature discussed above. It's not hard to do, but I am a big fan of only doing things that are actually useful in a program. I think this troubleshooting step proved that storing the current key IS a valuable thing for us to do, but I do not believe storing all the keys is valuable. If you are looking at previous data, you should (maybe?) also have a previous key in the data. Thoughts? I would do the features separately if I did both, and this one would not be a priority.

JohnOmernik avatar Dec 05 '18 14:12 JohnOmernik

Some comments from my side because I went through all this before when I was setting up my own datalogging project.

  • A new key is generated whenever the inverter starts a new communication session with the server. If you pull the network cable for a couple of minutes, it'll send a new key as soon as it gets back online.
  • I didn't see much added value in storing each key. They are in the pcap files anyway, so you can easily scrape them out if you want to do forensics on them.
  • You can tell whether you have a working key by checking the contents of the deciphered message. It should start with the usual four-byte header (12 34 56 79), since it should be a valid command message.
  • If you are using the man-in-the-middle setup (as opposed to a network hub or a replicated port), you could bring down the inverter's network port for a short while when you detect that your saved 0503 key has expired. That will prevent the inverter from sending any more data you can't decipher, and it will give you a new 0503 key once you bring it back up.

Jerrythafast avatar Dec 05 '18 16:12 Jerrythafast

One useful improvement to semonitor.py would be to detect an invalid key. Since I am not configured to test this, I won't make the change, but if someone wants to try it out and then submit a pull request, it would be appreciated.

Here's a proposed modification to msg.py

diff --git a/se/msg.py b/se/msg.py
index 3ebd131..7bbf00e 100644
--- a/se/msg.py
+++ b/se/msg.py
@@ -153,7 +153,11 @@ def parseMsg(msg, keyStr=""):
                 # decrypt the data and validate that as a message
                 logger.data("Decrypting message")
                 (seq, dataMsg) = cipher.decrypt(data)
-                (msgSeq, fromAddr, toAddr, function, data) = validateMsg(dataMsg[4:])
+                if dataMsg[0:4] != magic:
+                    logger.data("Invalid decryption key")
+                    return (0, 0, 0, 0, "")
+                else:
+                    (msgSeq, fromAddr, toAddr, function, data) = validateMsg(dataMsg[4:])
             else:  # don't have a key yet
                 logger.data("Decryption key not yet available")
                 return (0, 0, 0, 0, "")

jbuehl avatar Dec 05 '18 17:12 jbuehl

Solid, I will try this @jbuehl and see if it works. If it does, I will preload the key, and if it's wrong, unload it right away so that we can get the proper messages.

@Jerrythafast

  • Interesting. Given the time frame of the logging, I think it's easy enough to store the most recent working key and try it, especially with @jbuehl's detection of an invalid key.
  • Storing each key? Are you referring to the overall project of storing the most recent key, or the sub-comment about storing/logging each key? I agree there is not much value in storing the historical keys, but the current key has already shown value in my logging. I can make changes and update things easily.
  • I am assuming this is in conjunction with the check for an invalid key. I will check for this, and if cipher is not None, set it to None when the key is invalid.
  • This should become less of a problem if we have the active detection code... correct? It will try to load the file once, and on the first message that doesn't decrypt, it will set cipher to None and wait for the proper key as usual.

JohnOmernik avatar Dec 05 '18 17:12 JohnOmernik

> Storing each key? Are you referring to the overall project of storing the most recent key, or the sub-comment about storing/logging each key? I agree there is not much value in storing the historical keys, but the current key has already shown value in my logging. I can make changes and update things easily.

I was indeed referring to storing the entire history of keys. Storing the most recent key is quite useful, as you already noticed.

> I am assuming this is in conjunction with the check for an invalid key. I will check for this, and if cipher is not None, set it to None when the key is invalid.

@jbuehl's implementation above is precisely the check I suggested. If dataMsg[0:4] != magic, the decryption key used to create the cipher object has expired.

> This should become less of a problem if we have the active detection code... correct? It will try to load the file once, and on the first message that doesn't decrypt, it will set cipher to None and wait for the proper key as usual.

What I was saying was that you can avoid having to wait hours for a new key (and losing all data sent in that time) by disconnecting the inverter temporarily when this happens. It would be convenient if the script did this for you automatically, but of course you can pull the network cable manually as well.

Jerrythafast avatar Dec 05 '18 20:12 Jerrythafast

Ya, disconnecting can work, but that's physical. I wonder if there is a packet we can send, either to SolarEdge or to the inverter, that makes it say "muerp, something's wrong, time for a new key".

JohnOmernik avatar Dec 05 '18 20:12 JohnOmernik

Also, after we get @jbuehl's detection stuff into main (I rolled too many changes into one pull; he's deconflicting now), I will push up my code to save the key, then start looking at the reset packet.

JohnOmernik avatar Dec 05 '18 20:12 JohnOmernik

Perhaps you could send a TCP packet with the RST flag set, don't know if that's possible. I believe that's how the Great Firewall works too.

Jerrythafast avatar Dec 06 '18 05:12 Jerrythafast

@Jerrythafast That may work, but that's at the network layer, not the application layer. Many applications will just wait for the network to resync and continue on their merry way. You are on the right path though: I want to figure out whether I can craft a packet that forces confusion, in the hope that the error handling in the protocol itself will send the proper resync. I have some ideas there, but that's a different issue/pull request :)

Back to the conversation on saving the key. Here is what I am hearing thus far on these various topics:

Historic Keys

  • No need to store historic keys, don't add extra code, even for debug to do so.

Bad Key Detection

  • This is a needed but separate item, addressed in Pull Request #118. When a bad key is detected, the application will set the cipher object to None and wait for the proper packet. Forcing a key exchange actively will be discussed as a secondary priority. I will open a separate issue for that discussion (#119).
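
As a rough illustration of "detect a bad key, then drop the cipher", here is a small sketch. It is not the code from the pull request: KeyState and decrypt_or_drop are hypothetical names, KeyState stands in for the module-level cipher global, and the only things taken from this thread are the four-byte header (12 34 56 79) and the fact that cipher.decrypt returns a (seq, dataMsg) pair.

```python
MAGIC = b"\x12\x34\x56\x79"   # four-byte header quoted earlier in this thread

class KeyState(object):
    """Holds the current cipher; hypothetical stand-in for the `cipher` global."""
    def __init__(self, cipher=None):
        self.cipher = cipher

def decrypt_or_drop(state, data):
    """Return the decrypted message after the header, or None.

    If the plaintext does not start with MAGIC, the key is treated as
    stale and the cipher is discarded, so the caller falls back to
    waiting for the next 0503 message.
    """
    if state.cipher is None:
        return None                      # no key yet
    seq, plain = state.cipher.decrypt(data)
    if plain[:4] != MAGIC:
        state.cipher = None              # wrong or expired key: unload it
        return None
    return plain[4:]
```

The part after the header would then go to validateMsg as in the diff above. Dropping the cipher on the first bad message means a preloaded stale key costs one message at most before the normal "key not yet available" path takes over.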

File location customization of last 0x0503 message

  • I've seen no strong opinions on this. What I've done in my dev branch is find the location of the running semonitor.py and create a file called last0503.msg there. I don't see a need to add logic to customize this. It's stored hex encoded, so opening the file won't cause weird shell issues. It's also validated on open; if the validation fails, a message of "Bad last 0503 message" or something to that effect is logged, and the program continues with cipher = None.
  • Also of note, the file is loaded on the first message processed by parseMsg. This is done to contain the whole loading logic in a function. It will not try to load the message every time a message is parsed and cipher is None. It will try once, and only once. After that it waits for a new 0503 message.
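
The "once, and only once" behaviour can be sketched with a small wrapper like the one below. This is an assumption about how one might structure it, not the code in my branch; OneShotLoader is a made-up name, and the loader argument is any zero-argument function that returns the saved 0503 bytes or None.

```python
class OneShotLoader(object):
    """Calls `loader` at most once; every later call returns None."""
    def __init__(self, loader):
        self._loader = loader
        self._tried = False

    def get(self):
        if self._tried:
            return None          # never retry, even if the first attempt failed
        self._tried = True
        try:
            return self._loader()
        except Exception:
            # A broken or unreadable file must not stop message processing.
            return None
```

The first message parsed while cipher is None would call get(); whatever the outcome, later messages skip the file and simply wait for a fresh 0503.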

When to attempt last 0x0503

  • As noted above, I am storing the timestamp of the last time the 0503 was loaded and worked. If it's too old, I am not even going to bother loading it.

This is where I am now on the project. It is already working under these assumptions; I just want to make sure we are all good with them before I start cleaning things up for a pull request.

Thanks!

John

JohnOmernik avatar Dec 06 '18 14:12 JohnOmernik

I merged the PR, but can you look at adding a test for your changes? This would involve a .pcap input file, a .json output file, a .key file, and a last0503.msg file in the test/ directory. Thanks.

jbuehl avatar Dec 08 '18 16:12 jbuehl