FLASH driver
Adds a sync API for most CH32 parts.
Tested on the 208 and 307 (see examples).
Is there any particular blocker to merging this? I need this for embassy-boot support
I will try to take a look at this soon.
@Codetector1374 all done on my side.
FYI, the ch32v305 examples fail to build on current nightly on my machine too, but build successfully on nightly-2025-06-01.
Maybe pinning the Rust version in the CI would be a good idea.
Yeah let me work on fixing that soon, then we can merge
I pushed a fix onto main. I believe you need to rebase, then it should pass.
Is there any particular blocker to merging this? I need this for embassy-boot support
FYI: these chips have a weird flash erase value of 0xE339, which isn't quite compatible with embedded-storage's trait and therefore embassy-boot. See rust-embedded-community/embedded-storage#35 and also my embassy fork.
(only just saw this, busy moving)
Changing that const in embassy-boot is actually a decent use case for cargo-patch. That would unblock people at least
@chmousset The available program flash for the CH32V203 is 224KB, covering both the zero-wait (cached) and the non-zero-wait flash:
/* CH32V203C8T6 */
MEMORY
{
FLASH : ORIGIN = 0x00000000, LENGTH = 64K
RAM : ORIGIN = 0x20000000, LENGTH = 20K
/* Non Zero Wait Flash, 224K - 64K = 160K */
FLASH1 : ORIGIN = 0x00010000, LENGTH = 160K
}
At least, we should be able to read/write/erase beyond the zero-wait area instead of depending on pub const FLASH_SIZE: usize = 65536; // from pac.rs
https://github.com/chmousset/rs-ch32-hal/blob/add_flash/src/flash/common.rs
impl<'d, MODE> Flash<'d, MODE> {
/// Blocking read.
///
/// NOTE: `offset` is an offset from the flash start, NOT an absolute address.
/// For example, to read address `0x0800_1234` you have to use offset `0x1234`.
pub fn blocking_read(&mut self, offset: u32, bytes: &mut [u8]) -> Result<(), Error> {
blocking_read(FLASH_BASE as u32, FLASH_SIZE as u32, offset, bytes)
}
/// Blocking write.
///
/// NOTE: `offset` is an offset from the flash start, NOT an absolute address.
/// For example, to write address `0x0800_1234` you have to use offset `0x1234`.
pub fn blocking_write(&mut self, offset: u32, bytes: &[u8]) -> Result<(), Error> {
unsafe {
blocking_write(
FLASH_BASE as u32,
FLASH_SIZE as u32,
offset,
bytes,
write_chunk_unlocked,
)
}
}
/// Blocking erase.
///
/// NOTE: `from` and `to` are offsets from the flash start, NOT an absolute address.
/// For example, to erase address `0x0801_0000` you have to use offset `0x1_0000`.
pub fn blocking_erase(&mut self, from: u32, to: u32) -> Result<(), Error> {
unsafe { blocking_erase(FLASH_BASE as u32, from, to, erase_sector_unlocked) }
}
}
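For reference, here is a hypothetical usage sketch of this API. hal::init and Flash::new_blocking are assumptions modeled on the embassy-style HALs, not necessarily this PR's exact constructors; offsets stay within the 64K zero-wait area:

// Hypothetical usage sketch; constructor names are assumptions.
let p = hal::init(Default::default());
let mut flash = Flash::new_blocking(p.FLASH);

let mut buf = [0u8; 256];
// offsets are relative to the flash base, within the 64K zero-wait area
flash.blocking_read(0xF000, &mut buf).unwrap();
flash.blocking_erase(0xF000, 0xF100).unwrap(); // one 256-byte page
flash.blocking_write(0xF000, &buf).unwrap();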
@jsprog TBH I didn't want to open that can of worms.
- if ch32-data is wrong, or lacks chip-specific information, we should create a PR in ch32-data
- I'm all for giving the dev access to all the hardware available, but not for blessing them with the opportunity to lose hours (if not days) trying to figure out why certain portions of the code suddenly lose 50 or 75% of their performance
So really the only approach I think makes sense is the one taken with STM32, where the Flash sectors are exposed in ch32-data and handled in ch32-hal. I would also add an optional feature to generate a linker script that places code in the entire Flash, which will force the dev to read the doc and be exposed to the (big) limitations that go with full access.
As plenty of WCH chips have non-zero wait Flash sectors, a generic approach would benefit tons of parts at once.
It's a bit more than I can chew right now, hence the working-but-not-ideal solution I PR'd here.
@chmousset I understand that going beyond the zero-wait area is more involved than just providing a different value for FLASH_SIZE.
According to a note in the reference manual:
- The fast programming related functions can only be used on the zero-wait area flash (256-byte page write/erase size).
- Fast programming: this method uses page operations (recommended). After unlocking through a specific sequence, single 256-byte program, 256-byte erase, 32K-byte erase, and whole-chip erase operations are performed.
Beyond the zero-wait area, only the standard programming mode is available:
- Standard programming: this is the default (compatible) programming method. In this mode, the CPU programs in single 2-byte units, and performs erasure in single 4K-byte units or as a whole-chip erase.
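To summarize the two modes' granularities (a sketch; the constant names are mine, the values come from the manual notes above):

// Programming/erase granularities for CH32V flash (hypothetical names,
// values from the reference-manual notes quoted above).
const FAST_PAGE_PROGRAM: usize = 256; // fast mode: 256-byte page program
const FAST_PAGE_ERASE: usize = 256;   // fast mode: 256-byte page erase (also 32K and whole-chip)
const STD_PROGRAM: usize = 2;         // standard mode: single half-word (2-byte) program
const STD_SECTOR_ERASE: usize = 4096; // standard mode: 4K-byte sector erase (also whole-chip)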
I'll likely live with the current limits for storing serialized configuration, taking advantage of the embedded-storage wrappers for wear leveling, KV storage, and fewer erase cycles, and use the non-zero-wait area for log keeping (custom code).
Finally, I'd appreciate it if you could answer this question:
- As they're using separate in-package flash, what makes the half-word erase value 0xE339 instead of 0xFFFF?
https://github.com/tweedegolf/sequential-storage/issues/93#issue-3390373024
@jsprog
As they're using separate in-package flash, what makes the half-word erase value 0xE339 instead of 0xFFFF?
Unfortunately I didn't find any clue in the documentation or online... like most people, I found out when trying to read a blank device. At least it's consistent across zero-wait and non-zero-wait flash (and probably system flash as well, although that one isn't writable).
Anyway, I gave more thought to the current API and its limitations. I think we can start with this PR, which obviously doesn't allow access to the non-zero-wait flash, and upgrade it later with access to the full flash and maybe async as well. If we use a similar API to the STM32 HAL's, user code should remain the same when moving to the new API.
@Codetector1374 is there anything blocking this PR on your side?
@chmousset My last resort was to avoid embedded-storage and any kind of third-party storage library, and instead use the PAC directly. My requirements were quite simple: store limited system configuration with some circular logs, with support for wear leveling. I'll probably share some code when I have time to polish it.
@jsprog so does it work? If you can share a PoC, even a rough one, that could be useful.
@chmousset It's an unfinished library and I haven't touched the code since last September, but I'm planning to share it very soon (likely before the end of this year). I created it for one of my projects that required storing system configuration and operation logs without additional external memory.
Due to write restrictions imposed by the CH32V, which prevent byte-level writes without an erase, I had to give up on various high-level libraries (sequential-storage, ekv, etc.), and instead I created a lightweight library from scratch. I also abstracted over a custom BlockStorage trait to allow implementations on PC/RAM, and allowed batching multiple writes with the higher-level abstractions LogsStore (circular) and KVStore (unsorted). Pages are CRC-checked, and any page with errors is considered free for use.
The library is designed so that if you need a LogsStore, you reserve a BlockStorage space, then assign it when creating the LogsStore instance. The LogsStore is power-failure resilient, and I'm planning to prevent collisions between storage spaces at compile time very soon.
let storage = BlockStorageImplementation::reserve_space(OFFSET, SIZE_IN_BYTES).unwrap();
let mut logs_store = LogsStore::new(storage);
logs_store.mount().await.unwrap(); // this will scan the storage for the last sequence number to select the current page
// any erased or corrupted page is considered free for use
let _r = logs_store.append(LogEntry { ts: 0xEE, defer: false, content: .... }).await;
// storage/src/block_storage.rs
use core::future::Future;
use crate::Error;
pub trait BlockStorage: Sized
{
const READ_SIZE: usize = 4; // min read size
const BLOCK_SIZE: usize = 256; // block size (driver implementation)
const SIZE: usize; // full storage size
const OFFSET: usize; // storage offset (e.g: Flash Offset)
fn capacity(&self) -> usize; // capacity per instance space; not the full storage size
fn blocks_count(&self) -> usize { self.capacity() / Self::BLOCK_SIZE }
fn read(&self, offset: usize, bytes: &mut [u8]) -> impl Future<Output = Result<(), Error>> + Send;
fn erase_block(&mut self, offset: usize) -> impl Future<Output = Result<(), Error>> + Send;
fn write_block(&mut self, offset: usize, bytes: &[u8]) -> impl Future<Output = Result<(), Error>> + Send;
}
// storage/src/logs/log_header.rs
use serde::{Serialize, Deserialize};
/// LogHeaderBytes
/// 48-bits: ver(8), flags(4), len(12), seq(24)
/// flags could be used with higher level implementations; 0000 -> Logs, 0001 -> KV
/// or when writing larger logs in multiple pages
pub type LogHeaderBytes = [u8; 6];
#[derive(Clone)]
pub struct LogHeader {
pub ver: u8, // version for storage format
pub len: u16, // bytes length except crc
pub seq: u32, // sequence number (higher is the most recent)
}
impl LogHeader {
pub fn to_bytes(&self) -> LogHeaderBytes {
self.into()
}
}
impl TryFrom<LogHeaderBytes> for LogHeader {
type Error = ();
fn try_from(bytes: LogHeaderBytes) -> Result<Self, Self::Error> {
let ver = bytes[0];
let _reserved = bytes[1] & 0xF0;
let len = (((bytes[1] & 0x0F) as u16) << 8) + bytes[2] as u16;
let seq = ((bytes[3] as u32) << 16) + ((bytes[4] as u32) << 8) + (bytes[5] as u32);
// let items_count = (bytes[6] as u16) << 8 + (bytes[7]);
Ok(Self { ver, len, seq })
}
}
impl From<&LogHeader> for LogHeaderBytes {
fn from(val: &LogHeader) -> Self {
let mut bytes = [0_u8; 6];
// 8-bits: version
bytes[0] = val.ver;
// 16-bits: 4-bits(reserved) + 12-bits(length/count)
bytes[1] = ((val.len >> 8) & 0x0F) as u8;
bytes[2] = val.len as u8;
// 24-bits: sequence
bytes[3] = (val.seq >> 16) as u8;
bytes[4] = (val.seq >> 8) as u8;
bytes[5] = val.seq as u8;
bytes
}
}
impl Serialize for LogHeader {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer {
        let bytes: LogHeaderBytes = self.into();
        // serialize as a fixed-size array (raw bytes in postcard) so it matches
        // the array-based Deserialize impl below; serialize_bytes would prepend
        // a length and break the round-trip
        bytes.serialize(serializer)
}
}
impl<'de> Deserialize<'de> for LogHeader {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de> {
use serde::de::Error;
let bytes: LogHeaderBytes = Deserialize::deserialize(deserializer)?;
let log_header: LogHeader = bytes.try_into()
.map_err(|_err| Error::custom("Invalid LogHeader"))?;
Ok(log_header)
}
}
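A quick round-trip through the packed representation (a sketch using only the conversions above, no serde involved):

// Round-trip sketch: pack a header into 6 bytes and parse it back.
let header = LogHeader { ver: 0x00, len: 0x123, seq: 0x00AB_CDEF };
let bytes: LogHeaderBytes = (&header).into();
let parsed = LogHeader::try_from(bytes).unwrap();
assert_eq!(parsed.len, 0x123);       // 12-bit length survives
assert_eq!(parsed.seq, 0x00AB_CDEF); // 24-bit sequence survives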
// storage/src/logs/log_entry.rs
use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize)]
pub struct LogEntry {
pub ts: u64,
pub defer: bool, // whether to defer logs or not.
// deferred logs are batched and only pushed to storage
// after timeout, filling page, or if followed with non-defered log
// pub content: LogContent, // I'll decide later about this (generics, impl Trait, or bytes)
}
// storage/src/utils/crc.rs
pub(crate) const CRC_LENGTH: usize = 4;
const CRC: crc::Crc<u32> = crc::Crc::<u32>::new(&crc::CRC_32_ISO_HDLC);
/// CRC32 over `buf` (the caller passes the block contents minus the trailing CRC field).
pub fn compute_crc(buf: &[u8]) -> u32 {
    CRC.checksum(buf)
}
pub fn validate_crc<const BLOCK_SIZE: usize>(buf: &[u8]) -> Result<(), ()> {
    // fail if buf length is not exactly one block
    if buf.len() != BLOCK_SIZE { return Err(()); }
    let crc_expected = u32::from_le_bytes(buf[(BLOCK_SIZE - CRC_LENGTH)..].try_into().unwrap());
    let crc_computed = compute_crc(&buf[..BLOCK_SIZE - CRC_LENGTH]);
if crc_expected == crc_computed {
Ok(())
} else {
Err(())
}
}
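For example, a block passes validation only when its last four bytes hold the little-endian CRC of everything before them (a sketch, assuming a 256-byte block):

// Sketch: build a 256-byte block whose trailing 4 bytes are the LE CRC32
// of the first 252 bytes, then check the passing and failing cases.
let mut block = [0u8; 256];
block[0] = 0xAB;
let crc = compute_crc(&block[..252]); // 252 = BLOCK_SIZE - CRC_LENGTH
block[252..].copy_from_slice(&crc.to_le_bytes());
assert!(validate_crc::<256>(&block).is_ok());
block[0] ^= 0xFF; // corrupt one byte -> CRC mismatch
assert!(validate_crc::<256>(&block).is_err());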
// storage/src/logs/mod.rs
mod log_header; pub use log_header::*;
mod log_entry; pub use log_entry::*;
use heapless_07::Vec;
use crate::utils::crc::{compute_crc, validate_crc, CRC_LENGTH};
use crate::{Error, BlockStorage};
use core::mem::size_of;
// #[cfg(not(feature = "std"))]
// use hal::println;
const LOGS_VERSION: u8 = 0x00;
const ERASE_VALUE: u32 = 0xFFFF_FFFF; // RAM-backed tests erase to 0xFF; on CH32V flash use 0xE339_E339
const PADDING_VALUE: u8 = 0x00;
pub struct LogsStore<S: BlockStorage, const BLOCK_SIZE: usize>
// where [(); S::BLOCK_SIZE]:
{
pub storage: S,
// pub storage: BlockStorage<BLOCK_SIZE>,
next_block: usize,
next_sequence: usize,
payload: Vec<u8, BLOCK_SIZE>, // the first byte is the count of payload items
// other stats:
// - count for blocks with valid entries
// - count for all valid entries in stored blocks
}
impl<S: BlockStorage, const BLOCK_SIZE: usize> LogsStore<S, BLOCK_SIZE>
// where [(); S::BLOCK_SIZE]:
{
pub fn new(storage: S) -> Self {
if BLOCK_SIZE != S::BLOCK_SIZE {
#[cfg(feature = "std")]
panic!("BLOCK_SIZE must equal to S::BLOCK_SIZE")
}
Self {
storage,
next_block: 0,
next_sequence: 0,
payload: Vec::new(),
}
}
pub async fn mount(&mut self) -> Result<(), Error> {
let mut buf = [0_u8; BLOCK_SIZE];
// iterate/check blocks + format corrupted ones
for b_idx in 0..self.storage.blocks_count() {
// read block
let offset = b_idx * S::BLOCK_SIZE;
self.storage.read(offset, &mut buf).await?;
            // skip erased blocks (compare whole little-endian words against the
            // erase pattern; avoids the potentially unaligned *const u32 cast)
            let block_is_erased = buf
                .chunks_exact(4)
                .all(|w| u32::from_le_bytes(w.try_into().unwrap()) == ERASE_VALUE);
            if block_is_erased { continue; }
// try to validate header, then crc
let (header_bytes, _payload) = postcard::take_from_bytes::<LogHeaderBytes>(&buf).unwrap();
let header: Result<LogHeader, _> = header_bytes.try_into();
if let Ok(header) = header {
let crc_check = validate_crc::<BLOCK_SIZE>(&buf);
if crc_check.is_err() { continue; }
// todo: update state (next_block, max_sequence, count, etc...)
// self.count += 1; // fixme: inspect valid entries stored within the log (instead of 1)
if (header.seq + 1) as usize > self.next_sequence {
self.next_sequence = (header.seq + 1) as usize;
self.next_block = (b_idx + 1) % self.storage.blocks_count();
// if self.next_block >= BLOCK_SIZE { self.next_block = 0 }
}
}
}
Ok(())
}
pub async fn append(&mut self, log_entry: LogEntry) -> Result<(), Error>{
let mut buf = [0_u8; BLOCK_SIZE];
let new_log_bytes = postcard::to_slice(&log_entry, &mut buf).unwrap();
        // flush previously deferred entries if the current one wouldn't fit in the same block
let overflow = self.payload.len() + new_log_bytes.len() > (S::BLOCK_SIZE - size_of::<LogHeaderBytes>() - CRC_LENGTH);
if overflow {
self.flush().await?;
}
// ensure that empty payload has the 0 count field
        if self.payload.is_empty() { self.payload.push(0).unwrap(); }
// add new entry to payload
self.payload.extend_from_slice(&new_log_bytes)
.map_err(|_| Error::BufferOverflow)?;
        // increment items count
        self.payload[0] += 1;
        // flush non-deferred entries immediately
if !log_entry.defer {
self.flush().await?;
}
// todo: estimate if current payload could never be fitted with the next log_entry
// - flush as needed
Ok(())
}
    pub async fn flush(&mut self) -> Result<(), Error> {
        if self.payload.is_empty() { return Ok(()) }
        let mut buf = Vec::<u8, BLOCK_SIZE>::new();
        let bytes_len_excluding_crc: u16 = (size_of::<LogHeaderBytes>() + self.payload.len()) as u16;
        // header (seq must be the monotonic sequence number, not the block index)
        let header = LogHeader { ver: LOGS_VERSION, seq: self.next_sequence as u32, len: bytes_len_excluding_crc };
let header_bytes = header.to_bytes();
buf.extend_from_slice(&header_bytes).unwrap();
// payload (count + entries)
buf.extend_from_slice(&self.payload).unwrap();
// pad to BLOCK_SIZE leaving crc
for _ in buf.len()..(S::BLOCK_SIZE-CRC_LENGTH) {
buf.push(PADDING_VALUE)
.map_err(|_| Error::BufferOverflow)?;
}
        // crc: every block ends with a little-endian CRC32; after these four
        // bytes buf is exactly one block, so no further padding is needed
        let crc = compute_crc(&buf);
        buf.extend_from_slice(&crc.to_le_bytes())
            .map_err(|_| Error::BufferOverflow)?;
// persist
self.storage.write_block(self.next_block * S::BLOCK_SIZE, &buf).await?;
// next block
self.next_block();
Ok(())
}
fn next_block(&mut self) {
// advance state + reset buffers
self.payload.clear();
self.next_sequence += 1;
self.next_block += 1;
if self.next_block >= self.storage.blocks_count() {
self.next_block = 0;
}
}
}
@chmousset
Here I'm sharing the RAM implementation for testing the library on a PC. I also introduced the print_block method for storage inspection during development. You'll also notice I'm using RwLock for concurrency.
// storage/src/impls/ram_storage.rs
use crate::{BlockStorage, Error};
#[cfg(feature = "std")]
type RwLock<T> = async_lock::RwLock<T>;
#[cfg(not(feature = "std"))]
type RwLock<T> = embassy_sync::rwlock::RwLock<embassy_sync::blocking_mutex::raw::CriticalSectionRawMutex, T>;
// #[cfg(not(feature = "std"))]
// type BlockingMutex<T> = embassy_sync::blocking_mutex::Mutex<embassy_sync::blocking_mutex::raw::CriticalSectionRawMutex, T>;
const CAPACITY: usize = 224 * 1024;
struct Storage { ram: [u8; CAPACITY] }
pub struct RamStorage {
offset: usize,
capacity: usize,
}
impl RamStorage {
fn storage() -> &'static RwLock<Storage> {
static STORAGE: RwLock<Storage> = RwLock::new(Storage { ram: [0; CAPACITY] });
&STORAGE
}
pub fn new(offset: usize, capacity: usize) -> Result<Self, Error> {
// todo: validate block alignment
// todo: reserve non-overlapping block space
Ok(Self { offset, capacity })
}
pub async fn print_block(&self, offset: usize) -> Result<(), Error> {
if offset >= CAPACITY { return Err(Error::OutOfBounds) }
if offset % Self::BLOCK_SIZE != 0 { return Err(Error::NotAligned) }
let storage = Self::storage().read().await;
let bytes = &storage.ram[offset..(offset+Self::BLOCK_SIZE)];
for (idx, b) in bytes.iter().enumerate() {
            print!("{:02X} ", b);
            if idx % 32 == 31 { println!() }
        }
        println!();
Ok(())
}
}
impl BlockStorage for RamStorage {
// const READ_SIZE: usize = 4; // default
// const BLOCK_SIZE: usize = 256; // default
fn capacity(&self) -> usize { self.capacity }
    async fn read(&self, offset: usize, bytes: &mut [u8]) -> Result<(), Error> {
        if offset % Self::READ_SIZE != 0 { return Err(Error::NotAligned) }
        if bytes.len() % Self::READ_SIZE != 0 { return Err(Error::NotAligned) }
        if offset + bytes.len() > self.capacity() { return Err(Error::OutOfBounds) }
        let storage = Self::storage().read().await;
        let offset = self.offset + offset;
        // copy exactly the requested length (the original hardcoded a 256-byte slice)
        bytes.copy_from_slice(&storage.ram[offset..(offset + bytes.len())]);
Ok(())
}
    async fn erase_block(&mut self, offset: usize) -> Result<(), Error> {
        if offset % Self::BLOCK_SIZE != 0 { return Err(Error::NotAligned) }
        if offset + Self::BLOCK_SIZE > self.capacity() { return Err(Error::OutOfBounds) }
        let offset = self.offset + offset;
        // check the absolute range after applying the instance offset
        if offset + Self::BLOCK_SIZE > CAPACITY { return Err(Error::OutOfBounds) }
        let mut storage = Self::storage().write().await;
        // simulate a NOR-flash erase: set every byte in the block to 0xFF
        for idx in offset..(offset + Self::BLOCK_SIZE) {
            storage.ram[idx] = 0xFF;
        }
Ok(())
}
async fn write_block(&mut self, offset: usize, bytes: &[u8]) -> Result<(), Error> {
if offset % Self::BLOCK_SIZE != 0 { return Err(Error::NotAligned) }
if bytes.len() % Self::BLOCK_SIZE != 0 { return Err(Error::NotAligned) }
if offset + bytes.len() > self.capacity() { return Err(Error::OutOfBounds) }
let offset = self.offset + offset;
let mut storage = Self::storage().write().await;
storage.ram[offset..(offset+Self::BLOCK_SIZE)].copy_from_slice(bytes);
Ok(())
}
}
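Putting the pieces above together, a hypothetical PC-side smoke test could look like this (assumes the std feature and some async executor, e.g. futures::executor::block_on; the 4 KiB space and offsets are arbitrary):

// Hypothetical PC-side smoke test for the RAM-backed LogsStore.
let storage = RamStorage::new(0, 4 * 1024).unwrap(); // 16 blocks of 256 bytes
let mut logs: LogsStore<RamStorage, 256> = LogsStore::new(storage);
logs.mount().await.unwrap(); // scan for the newest valid page
logs.append(LogEntry { ts: 0xEE, defer: false }).await.unwrap(); // non-deferred: flushes immediately
logs.storage.print_block(0).await.unwrap(); // inspect the first block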
@chmousset
@jsprog so does it work?
I implemented BlockStorage for the CH32V using the PAC directly and it worked, giving me the higher-level abstractions (LogsStore + KVStore) for free. Rest assured, I'll share the code very soon.
About the KVStorage: it's a simple, unsorted key-value store with support for only one entry per page. Wear leveling is achieved by reserving multiple pages for the storage space. Adding support for multiple batched entries in the same page is feasible, but it introduces complexity when reclaiming free space after removing entries (compaction, garbage collection, ...), so it's better to keep things simple given the small 256-byte page size. A hypothetical lookup sketch follows the header code below.
// storage/src/kv/kv_header.rs
// note: this isn't final and may change
use serde::{Serialize, Deserialize};
pub struct KVHeader {
pub ver: u8,
pub len: u16,
pub seq: u32,
pub hashkey: u64,
}
pub type KVHeaderBytes = [u8; 14];
impl KVHeader {
pub fn to_bytes(&self) -> KVHeaderBytes {
self.into()
}
}
impl TryFrom<KVHeaderBytes> for KVHeader {
type Error = ();
    fn try_from(bytes: KVHeaderBytes) -> Result<Self, Self::Error> {
        let ver = bytes[0];
        let _reserved = bytes[1] & 0xF0;
        let len = (((bytes[1] & 0x0F) as u16) << 8) + bytes[2] as u16;
        let seq = ((bytes[3] as u32) << 16) + ((bytes[4] as u32) << 8) + (bytes[5] as u32);
        // 64-bit hashkey stored little-endian in bytes 6..14 (the original left this at 0)
        let hashkey = u64::from_le_bytes(bytes[6..14].try_into().unwrap());
        Ok(Self { ver, len, seq, hashkey })
}
}
impl From<&KVHeader> for KVHeaderBytes {
fn from(val: &KVHeader) -> Self {
let mut bytes = [0_u8; 14];
// 8-bits: version
bytes[0] = val.ver;
// 16-bits: 4-bits(reserved) + 12-bits(length/count)
bytes[1] = ((val.len >> 8) & 0x0F) as u8;
bytes[2] = val.len as u8;
// 24-bits: sequence
        bytes[3] = (val.seq >> 16) as u8;
        bytes[4] = (val.seq >> 8) as u8;
        bytes[5] = val.seq as u8;
        // 64-bits: hashkey, little-endian (missing from the original sketch)
        bytes[6..14].copy_from_slice(&val.hashkey.to_le_bytes());
        bytes
}
}
impl Serialize for KVHeader {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer {
        let bytes: KVHeaderBytes = self.into();
        // serialize as a fixed-size array so it matches the array-based Deserialize below
        bytes.serialize(serializer)
}
}
impl<'de> Deserialize<'de> for KVHeader {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de> {
use serde::de::Error;
let bytes: KVHeaderBytes = Deserialize::deserialize(deserializer)?;
        let kv_header: KVHeader = bytes.try_into()
            .map_err(|_err| Error::custom("Invalid KVHeader"))?;
        Ok(kv_header)
}
}
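To illustrate the one-entry-per-page design mentioned above, a hypothetical lookup could scan every page and keep the newest CRC-valid entry for a given hashkey. This find_latest function is not part of the shared code; it assumes crate::{BlockStorage, Error}, the CRC helpers, and the KVHeader types are in scope:

// Hypothetical sketch: find the newest valid page holding `hashkey`.
// Wear leveling means the same key can live on several pages; the
// highest sequence number wins.
async fn find_latest<S: BlockStorage, const B: usize>(
    store: &S,
    hashkey: u64,
) -> Result<Option<usize>, Error> {
    let mut best: Option<(u32, usize)> = None; // (seq, block index)
    let mut buf = [0u8; B];
    for b_idx in 0..store.blocks_count() {
        store.read(b_idx * B, &mut buf).await?;
        if validate_crc::<B>(&buf).is_err() { continue; } // erased or corrupted
        let header_bytes: KVHeaderBytes = buf[..14].try_into().unwrap();
        if let Ok(h) = KVHeader::try_from(header_bytes) {
            if h.hashkey == hashkey && best.map_or(true, |(s, _)| h.seq > s) {
                best = Some((h.seq, b_idx));
            }
        }
    }
    Ok(best.map(|(_, idx)| idx))
}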