rfcs
rfcs copied to clipboard
RFC: proposal for filtering `WalkerHandler` directories
Have you considered changing the dir_entries
from an array to a list (in WalkerHandler
, Directory.entries
and other places) as an alternative?
@dipinhora using a list would get around the awkwardness around removing items from the array. Two things come to mind with that approach:
- What would the performance impact be? I think using an array is probably going to be faster than a list in terms of creation and traversal, though not necessarily in terms of removal.
- The idea of modifying a structure that is passed into the function rather than returning a value feels a little strange. As far as I can recall, this is the only place in the stdlib where we do something like that. So whether we're passing in an array or a list, I don't think the handler code should be modifying it.
@aturley
- re: performance, the list create code in
Directory.entries
isn't written with performance in mind currently as it is not pre-allocating an array of whatever number of entries it will need (https://github.com/ponylang/ponyc/blob/master/packages/files/directory.pony#L92). Given that, I don't think that the list creation using an array will necessarily be faster. Agreed that traversal using an array would be faster. Either way, I'm not sure that how performant this code needs to be but it would likely require thinking more holistically about it. - I find it amusing that you mentioned the performance concern in item 1 and then didn't want to modify the list in place for this item which would likely be more performant when using a
list
as compared to creating a newlist
orarray
. Regardless, I'm ambivalent to whether the handler code modifies it in place or has to make a copy but I would prefer to keep the option of being able to modify in place.
Probably a better option altogether would be to have a function that provides a true
/false
or similar for filtering each entry when passed to it instead of passing the full list to the handler
so there's no modification of the list nor the need to create a new list either. When the function returns a true
the walk
happens and when the function returns a false
the walk is skipped.
@dipin
There are tradeoffs here between performance and various abstractions. If we truly wanted the best performance we would write it in assembly. But, since we've chose to use Pony I'm guessing we're all at least implicitly willing to make certain performance tradeoffs. One of the questions here is "which tradeoffs between performance and ease of use and consistency are we willing to make?"
Can you provide an example of what you're thinking of with the true/false
filtering? I have an idea of how that might work but I'd like to see if we're thinking of the same thing.
@aturley Yes, of course. Sorry if my earlier comment came across as antagonistic. That wasn't my intent.
The following is something along the lines of what I meant by the true
/false
filtering:
class MyWalkerHandler is WalkerHandler
fun ref apply(dir_path: FilePath, dir_entry: String): Bool =>
let skip_directories_with_string = "IGNORE_ME"
if dir_entry.contains(skip_directories_with_string) then
false
else
true
end
with the following versions of WalkerHandler
interface:
interface WalkerHandler
"""
A handler for `FilePath.walk`.
"""
fun ref apply(dir_path: FilePath, dir_entry: String): Bool
and Filepath.walk
:
fun val walk(handler: WalkHandler ref, follow_links: Bool = false) =>
"""
Walks a directory structure starting at this.
`handler(dir_path, dir_entry)` will be called for each directory
starting with this one. The handler returns true/false for which subdirectories
to walk or skip.
"""
try
with dir = Directory(this)? do
var entries: Array[String] ref = dir.entries()?
for e in entries.values() do
if WalkHandler(this, e) then
let p = this.join(e)?
let info = FileInfo(p)?
if info.directory and (follow_links or not info.symlink) then
p.walk(handler, follow_links)
end
end
end
end
else
return
end
I've never used the API in question. I know that @aturley has used it a lot. I trust his judgment on the changes that are needed. I'm sure we have a lot of APIs in the standard library that could use some love. Particularly in the file related sections of the standard library.
@aturley are you interested in continuing with this RFC?