filesystem stdlib design
C3 needs a cross-platform way to manipulate paths and files, because in C programmers have to write platform-specific code to work with the file system. Differing path encodings are an issue too.
There will be some kind of `path_t` object that stores paths, but there are several options and I can't decide which one is better:
- All paths are stored as UTF-8.
  Pros:
  - Printing and other string uses have no overhead, and the programmer gets a chance to write cross-platform code.
  Cons:
  - Even `fopen` may need a memory allocation to convert the path into the right encoding (e.g. UTF-16 on Windows), and this overhead occurs on every call to `fopen`, `stat`, `mkdir`, etc.
- All paths use the native path encoding (e.g. UTF-16 on Windows).
  Pros:
  - Closer to the platform.
  - All file operations work without memory allocation (though an allocation is still needed for the `FILE` object).
  Cons:
  - All operations on the string representation of a path will be very expensive; in this case some kind of `PathBuilder` should be provided so the programmer can manipulate paths faster.
- Store all paths in UTF-8, but reserve space for conversions. For example, use `N * 2` bytes to store a path whose length is `N`.
  Pros:
  - UTF-8 <-> UTF-16 conversions can be done in place without any additional memory.
  - `path_t` object will
  Cons:
  - Even though a plain string conversion is cheaper than a memory allocation, there is still a time overhead.
  - Concurrency? (What if a second thread accesses a `path_t` object that is currently in native encoding, not in UTF-8?)
- Use UTF-32?
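To make the third option more concrete, here is a minimal sketch in C of an in-place UTF-8 to UTF-16 conversion inside a buffer of `N * 2` bytes. The function name `utf8_to_utf16_inplace` is hypothetical, and the sketch assumes well-formed UTF-8 and does no validation. `N * 2` bytes are enough because each UTF-8 byte produces at most one UTF-16 unit. The UTF-8 text is first moved to the upper half of the buffer, so that forward writes never overtake bytes that have not been read yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an existing C3 or libc function.
   `buf` has room for n UTF-16 units (2 * n bytes) and holds n bytes of
   well-formed UTF-8 at its start. Converts in place and returns the
   number of UTF-16 units written. No terminator is added. */
size_t utf8_to_utf16_inplace(uint16_t *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    if (n == 0) return 0;
    unsigned char *bytes = (unsigned char *)buf;

    /* Park the UTF-8 text in the upper half of the buffer. After that,
       the write position (2 * units) can never pass the read position. */
    memmove(bytes + n, bytes, n);
    const unsigned char *s = bytes + n;

    size_t i = 0; /* UTF-8 bytes consumed */
    size_t u = 0; /* UTF-16 units produced */
    while (i < n) {
        uint32_t cp;
        unsigned char b = s[i];
        /* Decode one code point completely before writing anything. */
        if (b < 0x80) {
            cp = b;
            i += 1;
        } else if ((b & 0xE0) == 0xC0) {
            cp = ((uint32_t)(b & 0x1F) << 6) | (s[i + 1] & 0x3F);
            i += 2;
        } else if ((b & 0xF0) == 0xE0) {
            cp = ((uint32_t)(b & 0x0F) << 12)
               | ((uint32_t)(s[i + 1] & 0x3F) << 6)
               | (s[i + 2] & 0x3F);
            i += 3;
        } else {
            cp = ((uint32_t)(b & 0x07) << 18)
               | ((uint32_t)(s[i + 1] & 0x3F) << 12)
               | ((uint32_t)(s[i + 2] & 0x3F) << 6)
               | (s[i + 3] & 0x3F);
            i += 4;
        }
        if (cp >= 0x10000) {
            /* Outside the BMP: encode as a surrogate pair. */
            cp -= 0x10000;
            buf[u++] = (uint16_t)(0xD800 | (cp >> 10));
            buf[u++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            buf[u++] = (uint16_t)cp;
        }
    }
    return u;
}
```

The conversion overwrites the UTF-8 form, which is exactly why the concurrency question above matters: a second reader would see the buffer in whichever encoding the last caller left it in. The first option would run the same decoding loop but write into a freshly allocated buffer on every call.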
We can also add both platform-independent and platform-dependent versions of the code. If the user wants to optimize, the platform-dependent versions can be used.
I think it's better to have a platform-independent version, so option (1) here: all paths stored as UTF-8. That makes it easier to keep the API stable, with the implementation converting to the platform specifics underneath. It can be coupled with a platform API that exposes the underlying platform directly. When platform-dependent features or maximum performance are needed, the platform API can be used.
So something like:
- `std::files` (platform-independent code)
- `std::files::win` (platform-dependent code for Windows)
- `std::files::mac` (platform-dependent code for Mac)

etc.
C++17 https://en.cppreference.com/w/cpp/filesystem as example?
Do you mean for namespace or for functionality? It's useful to also look at Ruby, Java and the ObjC functionality in Cocoa. It might be better to split this into multiple tasks for each part later on.
Java and ObjC use a general URL-like approach, of which a file path is merely a subtype. That has both good and bad parts.
> Do you mean for namespace or for functionality?
Both, I guess.
> Ruby, Java and the ObjC
Or D and Rust? :)
But there are probably a lot of tasks in between:
- Unicode
- strings
- algorithms
In general, having libc available is obviously a stopgap, but it's there.
Currently I am working on this. `Path` is the normalized (and safe) path. The various operations (appending a file, getting the extension, etc.) all work on `Path`. It is UTF-8, but it knows whether the path it holds uses Windows or POSIX conventions.
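A minimal sketch in C of what such a type could look like, purely as an illustration and not the actual C3 implementation. The names `Path`, `PathKind`, `path_init`, `path_append` and `path_extension` are hypothetical, and "normalization" here is reduced to converting separators and collapsing repeated ones; it does not resolve `.` or `..` and ignores UNC prefixes. Scanning byte by byte is safe because `/`, `\` and `.` never occur inside a multi-byte UTF-8 sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { PATH_POSIX, PATH_WIN } PathKind;

typedef struct {
    PathKind kind;   /* which separator convention the path follows */
    size_t len;      /* length of str in bytes */
    char str[260];   /* UTF-8 text, always NUL-terminated */
} Path;

static char path_sep(PathKind kind) { return kind == PATH_WIN ? '\\' : '/'; }

static bool is_sep(PathKind kind, char c)
{
    /* Windows accepts both separators, POSIX only '/'. */
    return c == '/' || (kind == PATH_WIN && c == '\\');
}

/* Append text, rewriting separators to the path's own kind and
   collapsing runs of separators. Returns false if it does not fit. */
static bool path_push(Path *p, const char *text)
{
    for (; *text; text++) {
        char c = is_sep(p->kind, *text) ? path_sep(p->kind) : *text;
        if (c == path_sep(p->kind) && p->len > 0 && p->str[p->len - 1] == c)
            continue;
        if (p->len + 1 >= sizeof p->str) return false;
        p->str[p->len++] = c;
        p->str[p->len] = '\0';
    }
    return true;
}

bool path_init(Path *p, PathKind kind, const char *text)
{
    assert(p != NULL && text != NULL);
    p->kind = kind;
    p->len = 0;
    p->str[0] = '\0';
    return path_push(p, text);
}

/* Append one component, inserting a separator when needed. */
bool path_append(Path *p, const char *name)
{
    char sep[2] = { path_sep(p->kind), '\0' };
    if (p->len > 0 && !path_push(p, sep)) return false;
    return path_push(p, name);
}

/* Extension of the last component without the dot, or NULL if none. */
const char *path_extension(const Path *p)
{
    for (size_t i = p->len; i > 0; i--) {
        char c = p->str[i - 1];
        if (c == path_sep(p->kind)) return NULL;
        if (c == '.') {
            /* A leading dot (as in ".bashrc") is not an extension. */
            bool leading = (i == 1) || p->str[i - 2] == path_sep(p->kind);
            return leading ? NULL : &p->str[i];
        }
    }
    return NULL;
}
```

Because the kind is stored in the value, the same code can handle a Windows-style path on a POSIX host and vice versa; only the final call into the operating system needs the native encoding.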
I'll close this for now.