RoboSharp
RoboSharp copied to clipboard
parser query
@RFBomb
Hi
Any idea why the first one below can be parsed correctly, but the second one gives "'Source and Destination were unable to be parsed.'" error?
Dim a = RoboCommandParser.Parse("\\pc-01\c$\Software C:\test ""."" /E /ZB /MIR")
Dim b = RoboCommandParser.Parse("\\servername\share D:\ ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH")
Likely the $
if I had to guess. All the unc paths I was testing with had a $ to denote where the root folder started (which I believe is the standard for unc pathing?), and the second statement does not have a $
in the source.
I'd have to really look into it a bit more to verify this suspicion, but that's my first guess.
Thanks for replying. As far as I am aware no you don't have to have a $ in a UNC path, you would only use a $ if accessing a hidden share such as C$ or ADMIN$ or IPC$. The dollar just indicates a hidden share. Any other share you set up on a network would just be accessed using the name or IP address of the computer and the name of the shared folder e.g. \servername\sharename or \192.168.1.2\myfolder
I was unaware of that nuance, as every share at my company uses $
Alright, so the regex needs to be modified to accommodate this then.
Is it safe to assume shares/unc paths always start with //
or \\
? If so, the regex can likely be broken into share and local validators.
Alternatively we open it up to be less discriminatory [:$]
~> [:$\\/]
, but that will allow the following the be treated as valid "my\example" (whereas previous that is not a treated as valid path as it does not contain : or $ )
Sorry for delay in replying.
Yes it is safe to assume UNC paths / shares will always start with \\ or // e.g.
//servername/path - Unix
\\servername\path - DOS/Windows
Personally I think first suggestion above is the best way to go rather than opening it up to be less discriminatory.
Meant to say any testing I can do to help resolve this please let me know. I am not very good at RegEx so will rely on your skills for this if possible :)
I can likely rake a peek into this tomorrow, just create a collection of paths that should be considered valid, and a collection that are invalid. And I'll incorporate those examples into the unit tests.
As far as regex, I use Regex101 to validate everything
Excellent, thank you, much appreciated
Some examples:
Allowed (with and without quotes, with and without trailing slash)
C:
C:\SomeDirectory
//Server/Drive$/Folder
//Server/Folder
Disallowed: Folder Some/Folder /Folder (though this is technically valid on Linux, but that's not standard use case so it's outside scope of this.)
That cover the base?
For Allowed the slashes would need to be other way round - as // would be for Linux / Unix
e.g.
\\Server\Drive$\Folder \\Server\Folder
But yes that covers the base
I was going to treat them as equivalent, as one is the Path.AlternateDirectorySeperator
@RFBomb - thank you so much for looking at this.
The following example doesn't seem to work
Dim x = RoboCommandParser.TryParse("C:\Users\darren\OneDrive\Desktop\Working \\XPSDESKTOP\test ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)
Source is fine, but destination is found as \\XPSDESKTOP\t rather than \\XPSDESKTOP\test
EDIT: fixed GitHub formatting which had removed \ and * from pasted example
I had also wanted to update the logic to not create new strings every time it checks for a flag, so I used string builder instead here to accomplish it.
As for that example, /XPSDesktop
is Linux path formatting, I'm surprised it picked up anything for the destination. //XPSDESKTOP/test
should work, as the regex is specifically looking for //
~~Also, your file filter shows as "".""
, which doesn't go anything? Should it be "*.*"
?~~ it's early. File filter is .
Seems to be a GitHub formatting issue as I pasted it from Visual Studio and for some reason it changed it.... pasted again below as code snippet which displays correctly
\\XPSDESKTOP\test - is the correct format for a Windows UNC path, if it was linux it would be // rather than \\ - reference = https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#unc-paths
//XPSDESKTOP/test should work, as the regex is specifically looking for //
EDIT: and for info that doesn't work either - still get destination of //XPSDESKTOP/t rather than /test
Dim x = RoboCommandParser.TryParse("C:\Users\darren\OneDrive\Desktop\Working \\XPSDESKTOP\test ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)
Just in case it helps resolve it, if I reverse the source and destination e.g.
Dim x = RoboCommandParser.TryParse("\\XPSDESKTOP\test C:\Users\darre\OneDrive\Desktop\Working ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)
Then they parse correctly
Been "trying" to resolve this myself and I believe it maybe to do with the regex on line 42 of RoboCommandParserFunctions.cs, because when testing in Regex101 I get the same problem where it is only matching one character after the computer name in the UNC path, see screenshots below:
Further information from my testing:
It parses correctly if a destination UNC path is quoted, so below works fine, which would indicate second regex on line 42 is working correctly
Dim x = RoboCommandParser.TryParse("C:\Users\darre\OneDrive\Desktop\Working ""\\XPSDESKTOP\test"" ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)
Dim x = RoboCommandParser.TryParse("C:\Users\darre\OneDrive\Desktop\Working ""\\SERVER\c$\test"" ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)
Now I am definitely no expert on RegEx, but playing with Regex101 changing line 42 from
internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$\s]+?[$]?[\\\/][^*:?""<>|\s]+?) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";
to
internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|\s]*) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";
seems to work, but not sure of any other implications, so definitely need your input on it as I don't understand why it works or really what the changes I made have done....
@RFBomb - thank you very much for the last commits, initial testing appears that all is working well. Is there anything specific changed I need to know about or test for in these changes?
Not sure if any of my previous replies were any use to you or not?
Do we need to worry about the "All checks have failed - AppVeyor build failed" message or is that an expected issue now.
The examples were useful, as I was iterating the code I used them to create some other unit tests.
The tests failing at the moment will be fixed, just hadn't gotten to it yet. I changed how the source/dest directory path was cleansed when setting it in copy options, and a few other tweaks for minor optimizations/robustness improvements. So there's still things I am going to be looking at, including why some tests hang or fail currently.
I suspect some of the failing tests are due to string literal comparisons with quoted/unquoted paths, but again I'll have to investigate. Currently at a trade show so time to work on this is sparse.
Excellent, thank you, I will continue testing today and report back later.
internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$\s]+?[$]?[\\\/][^*:?""<>|\s]+?) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";
to
internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|\s]*) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";
Cursory glance, you changed [pattern]+?
to [pattern]*
at end of first group. I changed mine to ([pattern]+)?
which is effectively same thing (from "match as many characters as possible, not greedy, but must be at least one" to "as many as possible, but may not exist"
To break it down:
Greedy definition +
: Continue matching as much as possible.
Not Greedy +?
: if another subsection matches, this one ends, but this one will continue until that occurs or next character no longer matches this pattern. May lose last character in some instances.
Pattern 1 - no quotes
([\/]{2}[^:?""<>|$]+?[$]?[\/][^:?""<>|\s]*)
-
[\\\/]{2}
Must start with directory separation, count of 2. -
[^*:?""<>|$]+?
match characters that are NOT in this group, must be atleast 1, but not greedy. Minimum 1 match. -
[$]?
optional $ character -
[\\\/]
single directory separator that must exist -
[^*:?""<>|\s]*
match anything NOT in disallowed characters collection, as many times as possible (greedy). Stops when reaches a disallowed character. Minimum 0 matches. -
)
end of this pattern.
|
OR attempt to match next pattern which is for quoted paths.
Thank you for the explanation of the RegEx - really useful to understand it more :)
After running multiple tests I can confirm this definitely resolves my initial problem - so thank you.
Below I will list any related problems I have found during testing:
1
Not sure this is something we can account for as it would really be down to user providing a really badly typed command, but if you parse something like below examples ( which have multiple slashes between computer name and path) then it gets confused and uses part of the malformed destination path as the file filter.
I only found this after making a typo when testing commands and is so incorrect it would really be a users error, and I know we can't cover every eventuality, but mentioning it anyway for info.
c:\source \\servername\c$ /E /R:15
c:\source \\xpsdsktop\folder /E /R:15
So while technically the regex can accommodate those invalid inputs and cleansing them would likely be pretty trivial, my goal was to only cleanse and handle standard and proper inputs. Otherwise the possibilities are endless.
So while technically the regex can accommodate those invalid inputs and cleansing them would likely be pretty trivial, my goal was to only cleanse and handle standard and proper inputs. Otherwise the possibilities are endless.
That makes perfect sense and is exactly what I thought you would say, but just mentioning things as I find them.
2
Bearing in mind I totally agree with your previous response, but is there at least a simple way of just ensuring that a user doesn't try and parse a string which is completely invalid, because using TryParse you can pass something like the below (e.g. complete gobbledygook and it comes back as True!
Dim t As RoboCommand = Nothing
Dim x = RoboCommandParser.TryParse("gfgfgfgfdgfdge4t434rty436464646", t)
Okay I have spent several hours today fully testing parsing across all my usage cases and it is now working fine for all UNC path variations I have tried, so thanks again for looking at this, much appreciated.
Good feedback on the TryParse.
The reason this was successful is because current iteration ignores if both source and destination were not found, and only errors if one is found and other is not.
This can be modified to require source/destination arguments for TryParse, and ignore them (as currently does) for ParseOptions
@RFBomb - Hi Robert, hope you are well. When you have a few spare moments (I know you are busy), could you do whatever is required to #207 to allow me to merge it in if possible? Anything I can do to help just let me know, I have already extensively tested it and all seems okay so far.
I think it's mostly complete. But these changes were mostly thrown together, and I'm going to give them a polish/review of my own before I say go ahead.
I've been at a trade show for a few weeks, so haven't had much time to dedicate to any projects.