godirwalk
godirwalk copied to clipboard
Add helper function Dirent.FollowSymlink()
Added a helper function for handling symlinks to Dirent. This function can be used as follows:
err := godirwalk.Walk("./example", &godirwalk.Options{
FollowSymbolicLinks: true,
Callback: func(path string, de *godirwalk.Dirent) error {
// resolve any symlinks we encounter to their linked filepath
de, err := de.FollowSymlink()
if err != nil {
return err
}
log.WithFields(log.Fields{
"path": path,
"IsRegular": de.IsRegular(),
"IsDir": de.IsDir(),
"IsSymlink": de.IsSymlink(),
}).Info("")
return nil
},
})
This is useful because without following the symlink via filepath.EvalSymlinks() and creating a new Dirent, then the original Dirent on some systems will return de.IsSymlink() // true and de.IsDir() // false and de.IsRegular() // false, which can cause accidental mistakes. By using de.FollowSymlink() you can then be confident that de.IsDir() and de.IsRegular() will now return stat information about the followed symlink.
This PR also includes adding Dirent.Path()
and Dirent.path
to support absolute paths when following the symlink.
Note: There exists a subtle flaw in that you can't reassign and initialize a variable in the same := or = in golang, so instead these must be separated out. :/ The other option is to pre-declare both variables such as var err error or adding it to the function proto.
Can you provide a mini file system example that led you down the path of adding this functionality?
There is nothing in the new FollowSymLinks
method that requires it to have internal representation of the Dirent
structure. This method could very well be implemented in the callback:
Callback: func(osPathname string, de *godirwalk.Dirent) error {
// resolve any symlinks we encounter to their linked filepath
if de.IsSymlink() {
resolvedPath, err := filepath.EvalSymlinks(osPathname)
if err != nil {
return nil, err
}
absPath, err := filepath.Abs(resolvedPath)
if err != nil {
return nil, err
}
de, err = NewDirent(absPath)
if err != nil {
return nil, err
}
}
log.WithFields(log.Fields{
"path": path,
"IsRegular": de.IsRegular(),
"IsDir": de.IsDir(),
"IsSymlink": de.IsSymlink(),
}).Info("")
return nil
},
Or implemented as a function in a given program.
func NewDirentByFollowingSymlinks(osPathname string, de *godirwalk.Dirent) (*godirwalk.Dirent, error) {
resolvedPath, err := filepath.EvalSymlinks(osPathname)
if err != nil {
return nil, err
}
absPath, err := filepath.Abs(resolvedPath)
if err != nil {
return nil, err
}
return NewDirent(absPath)
}
Then in the Callback, you could call this function:
Callback: func(osPathname string, de *godirwalk.Dirent) error {
// resolve any symlinks we encounter to their linked filepath
if de.IsSymlink() {
var err error
de, err = NewDirentByFollowingSymlinks(osPathname, de)
if err != nil {
return nil, err
}
}
log.WithFields(log.Fields{
"path": path,
"IsRegular": de.IsRegular(),
"IsDir": de.IsDir(),
"IsSymlink": de.IsSymlink(),
}).Info("")
return nil
},
I have two primary and two minor concerns with adding this into godirwalk right now. My first concern is expanding the API without my clearly understanding the benefit of doing so, especially in light of the two alternate solutions I have provided above.
My second concern flows from the first. If we added FollowFromSymlink
to the API, it has limited use because not everyone would necessarily want to create a new Dirent
from the absolute pathname, but maybe simply the resolved path.
My third concern is minor, but storing the entire pathname in each Dirent
structure adds memory churn, when it might not always be required.
Finally my final concern is some of the API calling semantics of the proposed FollowSymlink
method. It calls by value rather than by pointer, making a copy of the Dirent
, only to return a pointer of itself if the entry is not a symlink. Then for all of the error conditions, it returns a copy of the copy of the entry along with the error. This is a problem because it allows sloppy programming, where a user might not bother checking the error and might just check the first return value for something. Finally, in the happy case, it calls NewDirent
which just allocates yet another chunk of RAM for the new Dirent
instance.
Thanks for the review.
I'll address the minor concerns first about memory and relative path:
- For the value receiver vs. pointer receiver, that can be changed and was simply a copy/paste from the other functions in
dirent.go
, I agree that a pointer receiver is better in this case. - For returning a new Dirent vs changing the underlying Dirent received via pointer, this is initially how I wanted to implement this but got different feedback from an private review first. I think that we could receive a pointer to
FollowSymlink()
and it would change the underlying dirent to receive information about the new node on the filesystem. Either way is fine with me. - For the extra pathname being stored, I found it strange that a Dirent doesn't know where it came from or how to represent itself, it's like it only stores the information related to some basic filesystem info. I think adding in path for Dirent in general would be a good choice to allow for more flexibility in this library
Why should we add this feature at all?
The use case that led me to this library is being able to walk filepaths in golang with the ability to interact with symlinks. This library is the most full featured and user friendly, and I think this additional API fits with godirwalk's calling card of "interacts with symlinks".
Edit: I also originally thought that FollowSymbolicLinks: true
would have automatically followed the symlinks and was surprised to see that godirwalk only just told me about the links and didn't actually follow them. This led me to think, "Well why couldn't godirwalk just follow the links automatically for me?".
I could write an internal function that extends this api only in my one project, but I want to share this ability with the world and share in the user friendly functionality. Just knowing that a path is a symlink is great, because now we can interact with it, and being able to directly interact with symlinks as a first class citizen of this library is a cherry on top in my opinion. And that's why I sought to add this feature to godirwalk.
Clarification:
If we added FollowFromSymlink to the API, it has limited use because not everyone would necessarily want to create a new Dirent from the absolute pathname, but maybe simply the resolved path.
What do you mean by this where someone would only want to visit the symlink on the resolved path? In my understanding symlinks (as they're soft links) represent a relative path to another path on the system that must have an absolute representation. These symlinks could be relative to another relative path to the originating path used in the call to Walk()
but it could also not be relative to that, so I'm a bit confused on what you mean by the resolved path as a resolved path may not exist relative to the original path for all symlinks we encounter while walking?