openFrameworks icon indicating copy to clipboard operation
openFrameworks copied to clipboard

ofToDataPath simplification

Open dimitre opened this issue 2 years ago • 26 comments

ofToDataPath can now be hugely simplified using filesystem functions.

dimitre avatar Mar 07 '23 04:03 dimitre

my angle on ofToDataPath() is that it should support searching in multiple places. right now the logic is to locate a dataPathRoot() (which wraps defaultDataPath()) and then construct file paths based on that unique root.

why multiple roots? the case of the .app bundle is one (that could be solved in different ways) but I see value for a system that searches for resources in different places, either automatically (OS-sanctioned areas, ofxiOSGetDocumentsDirectory(), shared directories, etc) or manually (ofSetDataPathRoot(path) becomes ofAddDataPathRoot(path)). the MaxMSP search paths are an inspiration for that, allowing dynamic (runtime) "pointing" to resources in addition to the "by default" adjacent space. this opens up another conversation about what constitutes a resource ("required" assets, shaders, etc vs My movie (1).mov), but in all cases I will advocate for anything that enhances malleability for unexpected future usage (moreover if we open up the mind to apps distributed at large to "users").

however to do that it means that ofToDataPath() moves from a path-building function to something that actually looks in the filesystem to find things (to an extent, it already does to find data/), and returns not only "legal paths" (from the point of view of std::filesystem::path) but paths actually backed by a file on the filesystem (overlapping with ofFile::exists()).

there is already a bit of confusion, for instance: https://github.com/openframeworks/openFrameworks/blob/f71884f12fde7dc6f9a3ea0fff2ebb33e3e60c4d/addons/ofxSvg/src/ofxSvg.cpp#L32-L52

  • indeed, .empty() should be used instead of .compare("") or =="" as it is guaranteed to be as efficient as possible, and also in line with the STL containers and most modern C++ libs. (related topic: initialization of empty strings should be done with {} and not by assigning an empty string ="" — that includes method parameters that default to empty string. 2 low-hanging PR's there...)

  • that being said, ofToDataPath() does not validate the existence of the underlying file, so this condition is not really useful (it will branch only of the user does .load("")).

  • (on top, there is error handling within loadFromString() that would catch a bad path, so redundancy)

where I'm heading is that ofToDataPath() should gain the responsibility of validating paths and return a valid path, or nullopt (yes I'm sneaking in std::optional here), since it changes the return type, maybe a new method ofConcreteDataPath(), so the above code could become:

if (auto concrete_file = ofConcreteDataPath(fileName)) {
    auto buffer = ofBufferFromFile(concrete_file);
    if (loadFromString(buffer.getText(), concrete_file)) isLoaded = true;
    // NB optional adopts pointer operator so perhaps needed here or adapted further in
} 
//else {
//  fileName was not found in any dataPaths...
//  but local code does not need to deal with path existence (good separation!)
//  (a "file not found" warning would already be emitted within ofConcreteDataPath())
//  nevertheless, if something meaningful has to happen here the nullopt `else` branch works
//}

note that that's actually independent of searching into multiple places... sorry for not being totally structured! so (a) make ofToDataPath poly-directory and (b) make ofToDataPath confirm paths.

and note that @arturoc mentions overhauling return types here https://github.com/openframeworks/openFrameworks/pull/4998#issuecomment-197460058 where std:expected would work fine, so maybe we skip optional altogether (it's so pre-pandemic anyway) and head straight into expected? (am I scaring you enough yet?)

(note that ofBufferFromFile() is incorrectly named and should be ofBufferFromPath() (as it actually takes a path and constructs an ofFile within))

artificiel avatar Mar 07 '23 07:03 artificiel

I totally agree with your view. maybe a vector of fs::path to look for, for all platforms, and macos has two paths already initialized. Thinking of this PR as a midway to get there, do you think its logic makes sense? most functions from ofFilePath can be substituted easily for of::filesystem ones, we don't need addtrailingslash anymore too. if we don't use declared separators as '/' we dont need to use path::make_preferred

dimitre avatar Mar 07 '23 11:03 dimitre

I will not be able to read carefully today, but one issue that has to be addressed is what happens when you construct a path for writing (as ofSaveImage(pix, "myFutureImage.png") calls ofToDataPath, and if the path is relative, build one into data().

So there needs to be a "deterministic" dataPath that is used to write to. If this idea of multidatapath is to go somewhere, maybe the data paths need to be qualified in some way? brainstorm: OF_DATA_INTERNAL, OF_DATA_USER, OF_DATA_GENERATED, so when asking for a write-destined dataPath the developer can "hint" which one. In the case the macOS bundle issue, it definitely makes no sense to write user-data inside the app bundle, so it's a case where the app must find at runtime things within the bundle, and write outside the bundle. I don't know what are the best practices in iOS OF but in my case I'm using ofxiOSGetDocumentsDirectory() with absolute paths do download ressources and basically skip ofToDataPath(). I don't find that "platform exception" to be a good thing. So basically unwrap the function of ofxiOSGetDocumentsDirectory() in a generalized pattern.

also I had this on my reading list specifically around https://github.com/openframeworks/openFrameworks/pull/1523#discussion_r1516796 as evidently considerable work is into ofToDataPath et al. and whatever "upgrade" to the interface has to be carefully designed.

and as far as the error checking, I think it is important to clearly define which step is responsible for the file existence or writability (I argue it is not the app or addon writer's responsibly), and from there look a a few concrete scenarios to test properly how a design would work in actual use. (I was a bit kidding about std::expected as a "valid path or nullopt" is exactly a use case for std::optional). [edit to add]: and in the case of an unfound file it is more a matter of handling a minor "disappointment" than an actuall "error" and should be processed as such.

artificiel avatar Mar 07 '23 15:03 artificiel

Thanks. I'll be reading carefully your comments later.

Just a quick note: I don't think Poco::path should be used now we have fs. The only place I've seen is in projectGenerator and it needs badly some love.

I appreciate you think some steps ahead. Anyway just to think of "here and now" I would like your opinion on this PR as it is, compared to the master code, if you see any downside on this change.

dimitre avatar Mar 07 '23 16:03 dimitre

oh no don't worry I'm not advocating poco at all! (that thread is from 2012) just pointing out the specific comment indicating there are tricky aspects to "low level" aspect dataPathRoot() and the evolution of the method should be reviewed in order not to miss anything that might have been solved before (in terms of design, not implementation)

artificiel avatar Mar 07 '23 16:03 artificiel

I was starting to go through the specific changes and was quickly "drifting out" towards "thinking steps ahead"...

ex: the section about finding the data/ folder in bundled macOS apps, I'm not sure which should be the default: checking first ../../.. means a user placing a bundled macOS app besides a folder name data/ (accidentally or not) will not get access to it's bundled ressources. perhaps the default should be bundled so it has precedence? but then ofSaveImage(ofToDataPath("screen.png")) ends up inside the app... this immediately leads to the idea of supporting more than one path, especially considering we need an implicit path for bundled ressources, and another one for user-produced stuff.

so what I mean is as far as "review" goes it helps to have specific goal or problems to check, so:

  • for the reduction of complexity of the try/catch stuff, if it is effectively resolved by the std::filesystem without changing the behaviour (I presume the unit tests are already setup for spotting that kind of thing? although the windows file dialog ) it's a no-brainer!

  • for the handling of macOS bundled data, as I mentioned in https://github.com/openframeworks/openFrameworks/pull/7361 an imperfect solution is better than none, but there is some urgency to think about what we want in terms of design, and while the specific problem stems from macOS (and iOS to an extent) it should prompt a wider look (especially considering "users" of "apps" (vs developers compiling along their data/). also I think we should pass ${UNLOCALIZED_RESOURCES_FOLDER_PATH} from Xcode as "../Resources" comes from there and it's were Xcode will actually copy data (we take it for granted but something might change it) -- but I don't know the correct way to do that (it's an Xcode template project thing). so maybe start by using a #define MACOS_UNLOCALIZED_RESOURCES_FOLDER_PATH "../Resources" and work from there?

also line 386 of ofCubemap.cpp, going from:

std::string encFolder = data->settings.cacheDirectory;

to

of::filesystem::path encFolder = data->settings.cacheDirectory;

is the perfect example of why auto is a better idiom...!

artificiel avatar Mar 07 '23 20:03 artificiel

Great, thanks @artificel Now I see why testing both paths could lead to unexpected results. Maybe we should left as it is default (../../../data in macOS) and the user has to define this change. A helper function can be useful too, like

ofSetDataPathToBundle(); // horrible name

dimitre avatar Mar 07 '23 20:03 dimitre

about the tests of this PR. I've noticed some tests fails on windows because it is expecting a trailing slash, but it is not needed in fs::path, so to fix this we have to update the tests. what do you think @ofTheo and @artificiel?

dimitre avatar Mar 07 '23 20:03 dimitre

ofSetDataPathToBundle(); // horrible name

this approach is not ideal as the fact that the app is bundled might be a "later" decision. if you write shader.load("dude") you want it in ../../../data until you enable OF_BUNDLE_DATA_FOLDER = 1 then it should mean ${UNLOCALIZED_RESOURCES_FOLDER_PATH}/data BUT ofSaveImage("capture.png") should never save an image within the app bundle. and as an app writer you certainly don't want if (appIsBundled()) { } else { } at user-level code.

hence the discussion about multiple dataPaths and ways of hinting/distinguishing them. maybe multiple data paths as free "search paths" is not the correct idea, and something like resourcesDataPath() vs userDataPath(), which by default could both overlap in ../../../data like currently (but userDataPath = ofxiOSGetDocumentsDirectory() in iOS, and the search strategy of macOS), and then the app or addon writer gets to decide which one to use at all times / all platforms? (I switch the same apps from Mac an linux and would hate to have platform-specific stuff like that to handle (I already dislike ofxiOSGetDocumentsDirectory() but iOS apps are inherently different it does not really count as "cross platform"). We can presume the code author knows if the ressource is an internal one like a font or shader, or a "media" that the user reads/writes.

(I would also jot here not to forget the case of a multi-user linux system with a common /opt/openFrameworks install, the "data" could also be separated between the common path (resources for the examples, whatever in apps/) and something in the user's directory if that common place is not writable (ex: ~/openFrameworks/$appname/data). I have often written tools that could be used by many users, and it's always tricky (of course simply compiling is tricky in a shared env, but I'm thinking towards people running a compiled binary that could be centrally installed)).

(and while we're taking about dataPath I cannot also omit my reflection of making the function a validator of the given path, hence std::optional<of::filesystem::path> return -- which has deeper ramifications);

artificiel avatar Mar 07 '23 21:03 artificiel

to complete the above:

shader.load("dude"); // must be found in bundle
ofLoadImage("test_pattern.png"); // must be found in bundle
ofSaveImage("capture.png"); // cannot save within bundle -- automatic deduction of writable place?
ofLoadImage("capture.png"); // must be found either in or out of bundle! 
                            // hence my initial "search path" idea" which is the transparentest

or:

shader.load(ofResourceData("dude"));
ofSaveImage(ofUserData("capture.png"));
ofLoadImage(ofUserData("capture.png"));
ofLoadImage("capture.png"); // defaults to user-space

ofLoadImage("relative/path/i.png") would default to be placed within ofUserData() which defaults to ../../../data (and is less specialized than ofResourceData(), which would be required by bundled apps authors (or addons authors providing assets))

artificiel avatar Mar 07 '23 21:03 artificiel

Lots of deep insights here. Thanks @artificel. I now understand what you mean and agree there are lots of subleties to be taken care of. I linked this discussion here https://github.com/openframeworks/openFrameworks/discussions/7042 so it doesn't get lost

Do you think code in this PR has equivalent functionality of openframeworks:master? In the case we don't close this PR, should it keep functionality of this PR? checking second folder? https://github.com/openframeworks/openFrameworks/pull/7361 or maybe postponing this decision after a more detailed discussion?

dimitre avatar Mar 07 '23 23:03 dimitre

yes! for the macOS bundle part, this is a good first step (i.e. it's not perfect but you can now bundle an app and have it running (combined with https://github.com/openframeworks/openFrameworks/pull/7358 you can even copy it around with custom dylibs -- I'm doing just that)).

for the general dataPath simplification, if the tests are good I don't see a problem but I'll let some other eyes with more experience with that code confirm.

artificiel avatar Mar 08 '23 00:03 artificiel

Great! I'm hoping https://github.com/openframeworks/openFrameworks/pull/7358 get merged soon and I love the idea of separating the script

dimitre avatar Mar 08 '23 01:03 dimitre

@dimitre Seems like a few of the tests are failing on Linux too:

image

I think to merge this we should try and get it to pass the tests as is, as some of the small differences could cause real world issues.

ofTheo avatar Mar 08 '23 17:03 ofTheo

Yes, I think there is more to it, and I am investigating the issue now

dimitre avatar Mar 08 '23 17:03 dimitre

I'm a bit confused by the GitHub thread my post seems stuck as "pending" anyhow where's what I wrote above:

image

artificiel avatar Mar 13 '23 01:03 artificiel

@artificiel yes I agree it has to be ported, but things can be done incrementally too. in this case this is my starting point for this PR now.

dimitre avatar Mar 13 '23 12:03 dimitre

Hello @artificel @ofTheo @2bbb @NickHardeman I would like some thought about some suggestions here.

After reading how OF deals with file paths from end to end I've noticed we always have the reference point relative to exePath, which is different in linux/windows from macos or macos bundled data.

My suggestion is we can simplify a lot all of this by only having the working directory as data. always. This is a working PR and it is passing all file tests tests/utils/fileUtils

I think this can save a lot of headaches from now on, once we always set current_path to data folder, so data folder is always our "zero" reference, and we can use ofToDataPath always like the last step, when opening a file for read or write.

what do you think?

dimitre avatar Mar 15 '23 19:03 dimitre

Hello @artificel @ofTheo @2bbb @NickHardeman I would like some thought about some suggestions here.

After reading how OF deals with file paths from end to end I've noticed we always have the reference point relative to exePath, which is different in linux/windows from macos or macos bundled data.

My suggestion is we can simplify a lot all of this by only having the working directory as data. always. This is a working PR and it is passing all file tests tests/utils/fileUtils

I think this can save a lot of headaches from now on, once we always set current_path to data folder, so data folder is always our "zero" reference, and we can use ofToDataPath always like the last step, when opening a file for read or write.

what do you think?

So the idea would be to set current working directory to the data folder and then just remove all path translations for relative paths?

If users change ofToDataPath location that would then change the working directory?

Just wondering what sort of weird edge cases this might bring up ( app translocation, or sandboxed filesystems etc ).

Def would want to test that heavily before bringing into master.

ofTheo avatar Mar 15 '23 20:03 ofTheo

Yes this is the idea. Yes this should be tested.

dimitre avatar Mar 15 '23 20:03 dimitre

Yes, users can change with ofSetDataPathRoot and it will change accordingly this is the idea

dimitre avatar Mar 15 '23 20:03 dimitre

It may make sense to map data as cwd for some reasons, but it does not solve the macOS bundle problem as you cannot consider the bundled data dir (containing “built in” resources that are provided by the app) to be a practical place for user assets. I’ve been absorbed elsewhere but i have an overhaul proposal that would enhance the current data path pattern in a multitude of contexts (including sandbox and shared environments), and at the same time simplify and strengthen the user API (along the lines of my posts above in this topic). I should be able to share the design and an implementation within a week…

artificiel avatar Mar 15 '23 23:03 artificiel

@artificiel and if we consider the macOS bundle as a separate struggle, do you see any downside of this change?

dimitre avatar Mar 15 '23 23:03 dimitre

in general in ofFileUtils there's a fair amount of bouncing around between dataPathRoot, defaultDataPath, of::filesystem::current_path, ofFilePath::getCurrentExeDir, ofFilePath::getCurrentExePath... beyond simplification it's difficult to grasp what is the real problem, especially knowing there is a need for multiple data directories! (heheh). all this to say I don't see a downside per se (as it is an invisible-to-the-user behaviour change) but the tests better be exhaustive (for instance, is there a test that simulates a windows file popup? (the kind of thing that affects CWD -- as they say The current path as returned by many operating systems is a dangerous global variable. It may be changed unexpectedly by third-party or system library functions, or by another thread.))

artificiel avatar Mar 16 '23 02:03 artificiel

Yes the windows cwd test is something to take in consideration. I've left the original code commented but it can be brought back with same funcionality.

to clear things this PR is all about ofToDataPath. simpler functionality and the proposition of using data folder as centric across different OS.

dimitre avatar Mar 16 '23 03:03 dimitre

I've thought a lot about this suggestion of moving CWD to data and I really think it can help with decades old headache.

we can stop using open / openFromCWD differentiation and we can stop using bool bRelativeToData in functions parameters.

dimitre avatar Mar 26 '23 12:03 dimitre