Feature proposal: over-temperature protection

Open nbgsmk opened this issue 3 years ago • 16 comments

Jan, another thought on temperatures. I discussed this briefly with Hugen, and he assures me that most of the heat comes from the mixers and PLLs, so slowing down the acquisition rate would not reduce the temperature. Is that true?

Either way, a possible firmware feature comes to mind: since we are already measuring the temperature, how about an internal shutdown of some kind on any dramatic increase, to save the device before it damages itself?

nbgsmk avatar Nov 25 '20 16:11 nbgsmk

most of the heat comes from the mixers and plls

Yes, that is correct. Power consumption is about 6W (depends a bit on the operating mode) and the two PLLs (source and first LO) together with the three first stage mixers use about 3.6W of that. The rest is split between the second stage mixers, various LDOs and of course the ADCs and FPGA. Using lower ADC sample rates could save a little bit but it won't really make a significant difference.

Over-temperature shutdown could be a useful feature but I won't work on that anytime soon (unless you operate the VNA in something like a 60°C environment, it will not damage itself). I'll put it on the list and if I ever get bored, I'll implement it ;) And of course you are always welcome to add it yourself :) By the way: while you can disable the PLLs, the mixers don't have a true shutdown. The only way to really get the consumption down would be to shut down the 6V rail, but that is not possible without changing the hardware.

jankae avatar Nov 25 '20 17:11 jankae

Nice challenge. I will have that in mind. In case I get bored :) :)

I'll put it on the list and if I ever get bored, I'll implement it ;) And of course you are always welcome to add it yourself :)

nbgsmk avatar Nov 25 '20 20:11 nbgsmk

It does sometimes get very hot to the touch, particularly if I'm running it quite fast.

Here's a sped-up video (original length is 3 minutes) showing the S11 drift immediately after a Load/Isoln (50R loads on both ports) re-calibration. The S21 drift quickly follows, but my arms gave out holding the camera.

https://youtu.be/vRYIbsdYJuw

I don't like relying on fans, noisy or otherwise, as it's more cumbersome, requires more power and adds a failure point. Passive cooling is best.

OneOfEleven avatar Dec 02 '20 13:12 OneOfEleven

I couldn't quite see in your video: is the upper curve S11, and how big is the drift? Funny thing, it gets less noisy towards the end. How do we explain that?

Here's a sped-up video (original length is 3 minutes) showing the S11 drift immediately after a Load/Isoln (50R loads on both ports) re-calibration. The S21 drift quickly follows, but my arms gave out holding the camera.

nbgsmk avatar Dec 02 '20 16:12 nbgsmk

I couldn't quite see in your video: is the upper curve S11, and how big is the drift? Funny thing, it gets less noisy towards the end. How do we explain that?

Yes, the upper brown curve is S11, the lower blue curve is S21.

Thinking out loud, I'd say it gets less noisy because it's moving up away from the noise floor, so the base noise level has less effect on the signal received from the RF bridge sensor area. But I can't say for sure, as I don't yet know the exact cause of the large changes my unit goes through with temperature. I will take the cases off, re-clean and reapply thermal compound to the areas (again) that contact the various chips for heat sinking, and see if that helps. The case does get quite hot, so I'd say the thermal contact to the chips is working well as is; otherwise the case wouldn't get so warm in such a short time.

I don't know whether you guys want me to report these happenings or not, so please let me know either way. I really love the VNA and want to help. It makes the Nanos feel like toys when I go back to them.

OneOfEleven avatar Dec 02 '20 18:12 OneOfEleven

I am thankful for every reported problem or suggested improvement, so please keep posting any issues you might find. I might not always be able to do something about it, but I always add it to my long list of possible improvements.

jankae avatar Dec 03 '20 17:12 jankae

Inspired by @pentti12 https://github.com/jankae/VNA2/issues/14#issuecomment-736566119 I put mine on an old UHF amplifier. No big deal about the amplifier, it's just a larger metal block I had (no heatsink in the lab, can you believe it!). The idea was to see if I could lower the temperature with very little effort. After several hours, it now hovers at ~45 degrees, give or take a few, which I find excellent (it used to go up to 65 degrees). I will find a nicer heatsink, and I consider this "problem" permanently solved. But Jan's challenge to add some kind of over-temperature protection in firmware remains. ;-)

IMG_20201207_170447_resized_20201207_050543186

nbgsmk avatar Dec 07 '20 16:12 nbgsmk

Inspired by @pentti12 #14 (comment) I put mine on an old UHF amplifier. No big deal about the amplifier, it's just a larger metal block I had (no heatsink in the lab, can you believe it!). The idea was to see if I could lower the temperature with very little effort. After several hours, it now hovers at ~45 degrees, give or take a few, which I find excellent (it used to go up to 65 degrees). I will find a nicer heatsink, and I consider this "problem" permanently solved. But Jan's challenge to add some kind of over-temperature protection in firmware remains. ;-)

Hi nbgsmk, what material is the shell you use made of? I am considering which material has the better shielding effect.

blackberryer avatar Dec 22 '20 16:12 blackberryer

Hello @blackberryer, I believe this is some variant of aluminum, although I didn't make the shield myself, so it was not my choice. In terms of RF shielding, as long as you don't use silver or gold plating, in my opinion there should not be much difference between other common metals. I believe aluminum is used most often because it is the best compromise among price, thermal characteristics and ease of machining. Regarding thermal behavior, the bottom block (in my attached photo) is also aluminum, just differently anodized (if that is the right English word). I would be glad to hear if someone has more scientific info than just my opinions. Zoran

nbgsmk avatar Dec 23 '20 19:12 nbgsmk

When my board is finished, I plan to take an IR photo as the unit warms up, then machine another layer of aluminium with cooling channels routed over the hot spots. A small Noctua fan will blow air through those channels; these fans are really quiet.

hedleyd avatar Jan 02 '21 07:01 hedleyd

Here are thermal photos of my device: img_thermal_1612115606162 img_thermal_1612115647269 img_thermal_1612115719569 img_thermal_1612115740062 img_thermal_1612115749999

DiSlord avatar Feb 01 '21 18:02 DiSlord

Hello Jan! Just considering this part-time, not ready to implement yet. Lowest priority question... :) I'm thinking of two options:

a) Check every device log message within the pc application->DeviceLog::addLine. Search the string for "temperature" plus some arbitrary limit (suggestions?) and alert if it is crossed. This is the easiest way, but searching for a literal string like that is a bit of a hack, imho.

b) In the embedded application->Hardware.cpp, below this

GetTemps(&tempSource, &tempLO);
LOG_INFO("PLL temperatures: %u/%u", tempSource, tempLO);

check whether the temperature is over a limit and send an additional message, e.g. LOG_CRIT. Capture that in the PC application. Since I only use STMCubeIDE, I can not build the embedded app and test this myself.
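A minimal sketch of what option b) could look like, written so it compiles on a PC. Everything here is an assumption for illustration: the limit value `kTempLimitC`, the function names, and the `printf` calls that stand in for the firmware's `LOG_INFO` and `LOG_CRIT` macros. It is not taken from the LibreVNA sources.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical threshold in degrees Celsius; the thread does not settle on a value.
constexpr uint8_t kTempLimitC = 70;

// Returns true if either PLL temperature is strictly above the limit.
bool isOverTemperature(uint8_t tempSource, uint8_t tempLO,
                       uint8_t limit = kTempLimitC) {
    return tempSource > limit || tempLO > limit;
}

// Stand-in for the code that follows GetTemps() in Hardware.cpp.
// printf replaces the firmware's LOG_INFO / LOG_CRIT here.
bool reportTemperatures(uint8_t tempSource, uint8_t tempLO) {
    std::printf("[INFO] PLL temperatures: %u/%u\n",
                static_cast<unsigned>(tempSource),
                static_cast<unsigned>(tempLO));
    const bool over = isOverTemperature(tempSource, tempLO);
    if (over) {
        // Extra critical message that the PC application could react to.
        std::printf("[CRIT] PLL over-temperature (limit %u)\n",
                    static_cast<unsigned>(kTempLimitC));
    }
    return over;
}
```

Calling `reportTemperatures(45, 48)` would only print the info line, while `reportTemperatures(45, 75)` would also print the critical line and return `true`.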

It seems device log messages can be captured even if the device log dock is closed, so the idea looks ok.

In any case, if we proceed: how should the PC application handle this event? A warning popup only, a (suggested) device disconnect, or something else?

Ideas, comments? Thanks!

nbgsmk avatar Apr 18 '21 16:04 nbgsmk

Check every device log message within the pc application->DeviceLog::addLine

The string for that line is created from the DeviceInfo packet, it would be easier to use the temperatures of that directly instead of converting the string back to a number. But I think the overtemperature handling should be done by the device (in case of a hardware upgrade with possibly different temperature limits the application should not have to handle that).

I'd suggest adding a status flag to the DeviceInfo struct. There are already error flags for other conditions, an "overtemperature" flag could be added easily. The implementation is not difficult but what should happen? Maybe disable whatever hardware can be disabled (mixers will stay active) and show a warning to the user?
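As a rough illustration of the status-flag idea, the sketch below uses a simplified, hypothetical stand-in for the DeviceInfo packet. The struct name, its fields and the `makeDeviceInfo` helper are assumptions; the real struct in the project has a different layout and more fields.

```cpp
#include <cstdint>

// Hypothetical, heavily simplified stand-in for the DeviceInfo packet.
struct DeviceInfoSketch {
    uint8_t tempSource;          // source PLL temperature, degrees Celsius
    uint8_t tempLO;              // LO PLL temperature, degrees Celsius
    uint8_t overTemperature : 1; // proposed new status flag
};

// The device decides whether the flag is set, so the PC application does not
// need to know the limit of a particular hardware revision.
DeviceInfoSketch makeDeviceInfo(uint8_t tempSource, uint8_t tempLO,
                                uint8_t limit) {
    DeviceInfoSketch info{};
    info.tempSource = tempSource;
    info.tempLO = tempLO;
    info.overTemperature = (tempSource > limit || tempLO > limit) ? 1 : 0;
    return info;
}
```

Keeping the comparison on the device side matches the point above: the application only reads the flag and never has to carry hardware-specific limits.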

jankae avatar Apr 18 '21 20:04 jankae

The string for that line is created from the DeviceInfo packet, it would be easier to use the temperatures of that directly instead of converting the string back to a number. But I think the overtemperature handling should be done by the device (in case of a hardware upgrade with possibly different temperature limits the application should not have to handle that).

I'd suggest adding a status flag to the DeviceInfo struct. There are already error flags for other conditions

Agree on all points. Getting the temperature from the log line string is a hack, for sure. I already didn't like my own idea :) :) I just couldn't find the right place for it. So thanks for the suggestion! I will look into that.

What should happen? Well, exactly that: disable what is possible and issue a warning. If the device could handle this alone, it would be very bullet-proof (imagine a problem when it is powered through a DC jack and no PC is connected). The only further action I can imagine is that the PC application issues a device disconnect. But is this any different in terms of what gets disabled or stays powered on? And do we need to go that far? Sorry, I tend to go overboard with error checks and protection issues ha ha...

nbgsmk avatar Apr 20 '21 23:04 nbgsmk

But is this any different in terms of what gets disabled or stays powered on?

There is no difference in power consumption between setting the device to idle and (logically) disconnecting the USB (physically disconnecting gets you down to zero of course). In both cases, the PLLs, amplifier, second stage mixers, ADC drivers and ADCs are powered down. The first stage mixers are always active as they don't have a true shutdown feature.

In my mind the perfect over-temperature protection would work like this:

  • a hard temperature limit (e.g. the max. operating point of the PLLs, +85°C). If this temperature is exceeded, the device itself goes into idle mode (and maybe disconnects the USB?)
  • a configurable soft temperature limit with a user-selectable action (e.g. do nothing, just display a warning, set the device to idle)

Some hints for a possible implementation:

In the device firmware:

  • Check for the hard limit and, if it is exceeded, disconnect the USB (off the top of my head I don't know how to disconnect, but it should be possible in the USB driver) and set the hardware to idle mode
  • Add an over-temperature flag to the DeviceInfo struct, send a message with this flag set just before disconnecting (so the application knows what is happening)
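The firmware-side sequence described in these two hints could be sketched as follows. This is a PC-compilable model only, not firmware code: `FirmwareSketch`, its members and the recorded action strings are hypothetical, and the 85°C value is the example hard limit mentioned earlier in this comment. The actual USB disconnect call is left out because the thread does not say how it would be done.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Example hard limit from the discussion (max. operating point of the PLLs).
constexpr int kHardLimitC = 85;

enum class DeviceState { Active, Idle };

// Hypothetical model of the firmware's reaction to a temperature sample.
struct FirmwareSketch {
    DeviceState state = DeviceState::Active;
    bool usbConnected = true;
    bool tripped = false;
    std::vector<std::string> actions; // records the order of operations

    void onTemperatureSample(int tempSource, int tempLO) {
        if (tripped) {
            return; // protection already triggered, nothing more to do
        }
        if (std::max(tempSource, tempLO) <= kHardLimitC) {
            return; // within limits
        }
        tripped = true;
        // 1. Tell the application what is about to happen.
        actions.push_back("send DeviceInfo with over-temperature flag");
        // 2. Drop the USB connection (mechanism not specified in the thread).
        actions.push_back("disconnect USB");
        usbConnected = false;
        // 3. Power down whatever can be powered down.
        actions.push_back("set hardware to idle");
        state = DeviceState::Idle;
    }
};
```

The flag is sent before the disconnect so that the application has a chance to learn why the device went away.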

In the GUI:

  • Add the soft temperature threshold and the selectable action (QComboBox?) to the preferences dialog
  • Whenever a DeviceInfo message is received, check the temperatures in that and trigger the selected action if threshold exceeded
  • If at any point the over-temperature flag is set in the DeviceInfo, disconnect the device (it will do so on its own anyway) and show a message about that
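The GUI-side decision logic from the list above could look roughly like this. It is plain C++ without Qt so it stays self-contained; all type and function names (`SoftLimitAction`, `GuiReaction`, `PreferencesSketch`, `DeviceInfoSketch`, `onDeviceInfo`) are hypothetical and not part of the existing application.

```cpp
#include <algorithm>

// User-selectable action for the soft limit (e.g. chosen in a combo box).
enum class SoftLimitAction { DoNothing, ShowWarning, SetIdle };

// What the GUI should do after inspecting one DeviceInfo message.
enum class GuiReaction { None, ShowWarning, SetDeviceIdle, DisconnectAndNotify };

// Hypothetical preference values from the preferences dialog.
struct PreferencesSketch {
    int softLimitC;
    SoftLimitAction action;
};

// Hypothetical, simplified view of a received DeviceInfo message.
struct DeviceInfoSketch {
    int tempSource;
    int tempLO;
    bool overTemperature; // hard-limit flag set by the device
};

GuiReaction onDeviceInfo(const PreferencesSketch& prefs,
                         const DeviceInfoSketch& info) {
    // The device-side hard limit always wins over the soft limit.
    if (info.overTemperature) {
        return GuiReaction::DisconnectAndNotify;
    }
    const int hottest = std::max(info.tempSource, info.tempLO);
    if (hottest <= prefs.softLimitC) {
        return GuiReaction::None;
    }
    switch (prefs.action) {
        case SoftLimitAction::ShowWarning: return GuiReaction::ShowWarning;
        case SoftLimitAction::SetIdle:     return GuiReaction::SetDeviceIdle;
        case SoftLimitAction::DoNothing:   break;
    }
    return GuiReaction::None;
}
```

Returning a reaction value instead of acting directly keeps the decision separate from the dialog and device-handling code that would carry it out.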

Let me know if you want to take that on (or need help along the way)

jankae avatar Apr 21 '21 16:04 jankae

It would be interesting to be able to buy a heatsink specially designed for this tool. It would be the quickest way to fix the problem. I think that for 30 ... 100 € everyone could buy one without having to waste time making their own at a definitely higher cost and with a worse result.

lucasub avatar Sep 08 '22 09:09 lucasub