diyBMSv4Code
Optimization ideas for LTO batteries
Hi there, big fan of open source hardware and software. Great project! And I like your KiCad skills. :) I've been following the Lithium Titanate (LTO) thread over on https://community.openenergymonitor.org/t/diybms-for-lithium-titanate-battery-cells-lto/16285 and I have some ideas on how to improve the platform for LTO:
The useful LTO cell voltage at 1C discharge runs from 2.65 volts down to around 2.0 volts. This is for the Yinlong cells out of China that basically everybody uses. A cell may still drop below 1.8 volts, though, for example at high-C discharge rates, with wrong charger settings, or when a cell in a block goes bad. To keep the cell module running longer, even deep into the 1.x volt range, I suggest:
- Lower the internal clock from 8 MHz to 512 kHz. Or does the Atmel need to run at 8 MHz? I couldn't find anything particularly compute-intensive. The activation threshold for Vpor is min 0.6, typ 1.3, max 1.6 volts, so I think we could keep the chip running down to 1.3 volts, maybe even lower. Since you run a slow 2400 8N1 link, a UBRR of 12 would give 2462 baud, a 2.5% error over 2400 baud, or in double-speed mode a UBRR of 26 would give an error of only 1.25% at 2370 baud. I haven't checked other parts of the code that may need adjustment.
- Before attempting any EEPROM write, the code should check Vcc by reading the internal 1.1 V reference against it. For example code see https://github.com/cano64/ArduinoSystemStatus/blob/master/SystemStatus.cpp. As can be seen, this code employs workarounds to get a valid result (a 2 ms settle wait instead of 1 ms, and discarding the first conversion). This should prevent one source of EEPROM corruption: low Vcc. Which reminds me, looking at "ModuleV421_Round.jpg": shouldn't there be a 1uF cap close to the AVR to catch voltage spikes, and a 100nF ceramic bypass cap to filter noise?
- Use the EEPROM to store more than one copy of myConfig. Always use the latest copy, of course, but fall back to the most recent other copy if the latest is corrupt. My thinking: when a large consumer like an induction stove is suddenly turned on and the inverters draw a lot of power from the cells, a brownout in the AVR could still occur while it is rewriting its EEPROM.
In addition:
- Communication wires are twisted but unshielded and potentially run in EMI-heavy environments. I think I remember some diyBMS user suffering from bit errors due to inverter noise. Keep 2400 8N1 but employ a 24/16-bit Forward Error Correction code on the wire. See http://www.robotroom.com/Hamming-Error-Correcting-Code-1.html and https://github.com/blinkenrocket/firmware/tree/master/src/Hamming for an example that sends 3 bytes for every 2 data bytes.
- The PCB seems to be missing the https://www.oshwa.org/open-source-hardware-logo/ logo of pride in the silkscreen?
- (Going overboard here, but since the chip supports it...) Introduce self-update using a small "flasher" at the end of flash (some linker magic) that can accept new firmware and/or EEPROM contents from the ESP32. It will probably need a Vcc check and polled serial I/O with interrupts and the WDT off. Rewrite program flash in 64-byte chunks.
I realize most enthusiasts will still buy or build something with LiFePo4 these days. But 5x the cycle count of LiFePo4, a large operating temperature range, 99% efficiency, and especially the huge charge/discharge C rates will win LTO many new friends in the coming years.
Hi @bsdice thank you for the comprehensive message.
I like the suggestions you make, I don't have time to comment on them all at the moment this side of Christmas, but happy to work with you to make some of the improvements.
The current V4.21 design does work down to 1.8V (and the newer LTO-style boards I made are custom designed for this voltage), so I'm not sure there is anything to be gained by dropping the internal clock. Below 2 MHz the chip still needs a minimum of 1.7 volts to operate, so you are not gaining much compared to 1.8V.
The controller would also have spotted if a voltage was low or a module stopped communicating, so it could have taken action to disable power/start charging etc.
Additionally, the fast clock also means that the ATTINY isn't "awake" very much - this is the trade-off between slower and faster clock speeds. I've also had a LOT of problems with baud rate - the ATTINY is very poor at regulating its internal clock frequency for the UART, which is why I had to settle on 2400 baud - ideally I would like faster. I do have a v4.3 design board which uses an external oscillator, but I haven't tested it much at the moment.
I'll take a look at the EEPROM and communication suggestions. The comms isn't as bad as people sometimes make out; often it's a faulty solder joint or a problem with crimping a JST connector. If you take a look at one of Adam Welsh's videos on YouTube, his DIYBMS v4 has over a million packets through it and no errors.
I'm thinking of moving to a different connector type - any suggestions?
I have considered doing the self-update as well - it is currently a pain to upgrade 20+ ATTINY modules! I thought about a custom boot loader of some sort, although the code footprint would need to drop to allow this to work/fit.
There is a massive change in the code base in one of the branches - https://github.com/stuartpittaway/diyBMSv4Code/tree/modulepwmchanges - feel free to take a look and try it if you already have the hardware.
Stuart, thanks for reading, and happy holidays!
I watched Ross Mitchell on YouTube and I liked his LTO cabinet build. I shun commercial options like the upcoming Zenaji (https://zenaji.com/eternity/) because of price, the risk of the supplier going out of business (bye-bye spare parts), and the usual lack of an open API for influxdb, mqtt, grafana.
As for the topics at hand:
- Atmel Vcc. I don't want the BMS to go into power-on reset early in the event of mistreated LTO cells. LTO can even tolerate a 100% discharge. At really low frequencies like 512 kHz the Atmel should continue operating even at 1.3V. The higher the clock, the higher the minimum voltage the chip requires. If you look at datasheet pages 238 and 301 you can even see that at 1.8V the curve stops at 4 MHz, so we would already be overclocking. At 1.7V the curve ends at 2 MHz.
- Atmel Vcc. My second point, which needs verification, is the problem with sudden current inrush into inverters, which is pretty much a raison d'être for LTO. Going below 1.8V could temporarily reset the BMS just because a large consumer has turned on. LTO can do 10C, which works out to 400 amps per cell. The voltage drop is V_ESR = R_ESR * I. See for example https://www.onsemi.com/site/pdf/EETAsia-April2010.pdf. As you can see on PDF page 11 of https://www.mdpi.com/2313-0105/5/1/31/pdf from https://www.mdpi.com/2313-0105/5/1/31, during high discharge the voltage at the battery poles can drop below 1.6V. Hence my idea to push this limit down to 1.5V, or whatever the POR threshold is.
- Internal clock drift. Yes, that is a problem to be aware of. The on-chip oscillator will speed up or slow down with voltage and temperature changes. I just checked for the 512 kHz ULP clock in datasheet page 299: for the relevant 1.7 to 2.7 volts at 25 to 85 degC operating temperature the clock changes very little. However, when going from 15degC to 85degC, e.g. during load shedding, the oscillator will see a 5-6% clock slowdown. This is already at the maximum baud rate tolerance for the chip's USART (see datasheet page 174) and 100% beyond recommended timing error tolerances. Ideas: a) Using a crystal will incur a 1000-fold slower start-up time vis-a-vis the internal clock, because the crystal has to stabilise. b) At 512 kHz, to stay within 1% truncation error, no baud rate higher than 4800 bps can be used (ratio of CLK to baud rate 1 : 0.01). c) Introducing some kind of temperature-compensated calibration into the protocol will blow up the code size and reduce manageability. d) I think I would go with an external crystal and pick an HC49-SMD 4.9152 MHz in CKDIV8 mode, giving a 614.4 kHz system clock. In async normal mode with UBRR = 7 this delivers exactly 4800 baud.
- GCC. The last project I did was a TMK USB-to-PS/2 keyboard converter with an ATMEGA32U4. After some reading and tinkering I found that avr-gcc 8.4.0, with avr-libc compiled by it, gave the most compact and trouble-free code. Later GCC versions all had increasing nuisances - no real miscompiles, just bloated code. So I recommend using the latest 8.x branch if possible.
- Connector type. No idea beyond the JST in use. I'd want light cabling.
- What worries me a little is the long-term stability of the solder pads for the load resistors, especially with lead-free solder. How long will those last? 3V * 1.2A means 3.6 watts of dump load, so 2.5 ohms, correct? Did you only choose SMD because JLC had them ready for placement in the factory? I think I would feel better with a Vitrohm KV214-310B2R2, or four Vitrohm PO593-05T10R 2-watt 10-ohm resistors in parallel, with 100% headroom.
Some more notes so I don't forget...
I looked at the v4 schematics and for comparison also what Batrium has in their CellMon modules (https://s3.amazonaws.com/helpscout.net/docs/assets/5af3b7ec0428631126f1e589/attachments/5df41bf52c7d3a7e9ae51d65/BMon-and-LMon-LeafMon-Data-Sheet-Ver-2.0.pdf)
In the v4 schematics I can't find any reverse polarity or overvoltage protection. Was that a design choice, to simplify the circuit? Considering LTO's low 1.8 to 2.8 volt range compared to Li-Ion or LiFePo4, this presents some challenges. Most IC+MOSFET-type solutions I have checked from Maxim, TI etc. only work at 2.5V and up, or guzzle a lot of power, in the mA range. A good solution shouldn't draw any power at all, in order not to empty the batteries. Should such protection ever be included here, my recommendation would be a "crowbar circuit" as shown here: https://www.electronicshub.org/crowbar-circuit/ (scroll to the middle of the page, just above "Crowbar using TRIAC"). For the fuse, maybe some PTC with 200mA I_h would already trip readily enough; that is already derated for 80degC, i.e. plus 100% for hold and trip current. For the zener, a 4.3V part like the BZX84C4V3, so Q1 fires well below 5V, here at +50% of LTO's maximum cell voltage. For reverse polarity protection I suggest the PMEG1020EA, which has a super-low Vf of typically 100mV at If=10mA. See https://assets.nexperia.com/documents/data-sheet/PMEG1020EA.pdf - it's one big-ass SMD Schottky, though...
Now in case somebody reverses polarity, the load-dump MOSFET's body diode will conduct (correct?) and the battery will discharge through the resistor(s)? Does anything go up in smoke at -2.8V?
Finally, after digesting yesterday's X-mas buffet some more, I again looked into keeping the MCU working much deeper than 1.8V. I threw out the idea of underclocking the chip so that it might continue running at 1.5V, until POR asserts due to low voltage. I still think this is useful for diagnosing dying cell blocks, where really low voltage readings could still be returned instead of nothing. I did some calculations based on probably wrong ESR values for Yinlong LTO cells and watched some videos of people doing LTO load testing and nearly blowing their gear up. At 10C loads, cell voltage appears to sag 200-300mV. I browsed through some joule-thief designs until I found something nice and, imho, well suited: the TI TPS61099x, https://www.ti.com/lit/ds/symlink/tps61099.pdf. It can start from 0.7 volts and after that will go as low as 0.4 volts. Very low power. Insurance against load and charging noise from cheap inverters with too-small buffer caps and cheap solar charge controllers. The DRV package will be a pain to solder, but it would allow running the AVR stably at 8 MHz from a basically flat LTO cell.
I just received an e-mail because this post has shown up in Google search, and got sent this link about TI as a heads-up: https://www.eevblog.com/forum/projects/psa-do-not-use-the-tps61099-boost-reg-in-your-designs/
Wow... looks like TI really has lost its mojo.
Alternatives as per that thread and some research:
- MCP16251 https://ww1.microchip.com/downloads/en/DeviceDoc/20005173B.pdf
- MCP1640 http://ww1.microchip.com/downloads/en/DeviceDoc/20002234D.pdf
- NCP1423 https://www.onsemi.com/pub/Collateral/NCP1423-D.PDF
- MAX1722 https://datasheets.maximintegrated.com/en/ds/MAX1722-MAX1724.pdf
- LTC3429 https://www.analog.com/media/en/technical-documentation/data-sheets/3429fa.pdf
- LTC3525-3.3 https://www.analog.com/media/en/technical-documentation/data-sheets/3525fc.pdf
- LTC3526L-2 https://www.analog.com/media/en/technical-documentation/data-sheets/3526llb2fa.pdf
- LTC3528 https://www.analog.com/media/en/technical-documentation/data-sheets/3528bfd.pdf
- SGM663 http://www.sg-micro.com/uploads/soft/20190626/1561534022.pdf
- PT1301 http://www.micro-bridge.com/data/crpowtech/pt1301e.pdf
- XC6367 https://www.torexsemi.com/file/xc6367/XC6367-XC6368.pdf
- ME2108 https://datasheet.lcsc.com/szlcsc/Nanjing-Micro-One-Elec-ME2108A33M3G_C236804.pdf
- L6920 https://www.st.com/resource/en/datasheet/l6920.pdf
The MCP16251 looks like a no-brainer: SOT-23 case for easy soldering, a no-load input current of only 14uA, starts at 0.82V and then runs down to 0.35V, and an extensive, useful and readable datasheet.
In the v4 schematics I can't find any reverse polarity or overvoltage protection. Was that a design choice? To simplify the circuit?
Yes, a design choice. These boards can be built for less than $2 USD, so if the user connects one backwards then so be it; easier to replace at that sort of cost. The cell would start to drain and the board becomes very hot - the RED LED should also indicate this.
The previous V4 design of the DIYBMS used a buck/boost converter to provide a stable voltage to the ATTINY85. This worked fine but constantly drained power from the cells - what does the MCP16251 look like in this respect? Adding even this one part also drives up the BOM cost.