RoboSharp icon indicating copy to clipboard operation
RoboSharp copied to clipboard

parser query

Open PCAssistSoftware opened this issue 10 months ago • 28 comments

@RFBomb

Hi

Any idea why the first one below can be parsed correctly, but the second one gives "'Source and Destination were unable to be parsed.'" error?

Dim a = RoboCommandParser.Parse("\\pc-01\c$\Software C:\test ""."" /E /ZB /MIR")

Dim b = RoboCommandParser.Parse("\\servername\share D:\ ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH")

PCAssistSoftware avatar Apr 26 '24 23:04 PCAssistSoftware

Likely the $ if I had to guess. All the unc paths I was testing with had a $ to denote where the root folder started (which I believe is the standard for unc pathing?), and the second statement does not have a $ in the source.

I'd have to really look into it a bit more to verify this suspicion, but that's my first guess.

RFBomb avatar Apr 27 '24 00:04 RFBomb

Thanks for replying. As far as I am aware no you don't have to have a $ in a UNC path, you would only use a $ if accessing a hidden share such as C$ or ADMIN$ or IPC$. The dollar just indicates a hidden share. Any other share you set up on a network would just be accessed using the name or IP address of the computer and the name of the shared folder e.g. \servername\sharename or \192.168.1.2\myfolder

PCAssistSoftware avatar Apr 27 '24 00:04 PCAssistSoftware

I was unaware of that nuance, as every share at my company uses $

Alright, so the regex needs to be modified to accommodate this then.

Is it safe to assume shares/unc paths always start with // or \\? If so, the regex can likely be broken into share and local validators.

Alternatively we open it up to be less discriminatory [:$] ~> [:$\\/], but that will allow the following the be treated as valid "my\example" (whereas previous that is not a treated as valid path as it does not contain : or $ )

RFBomb avatar Apr 27 '24 01:04 RFBomb

Sorry for delay in replying.

Yes it is safe to assume UNC paths / shares will always start with \\ or // e.g.

//servername/path - Unix

\\servername\path - DOS/Windows

Personally I think first suggestion above is the best way to go rather than opening it up to be less discriminatory.

PCAssistSoftware avatar Apr 27 '24 11:04 PCAssistSoftware

Meant to say any testing I can do to help resolve this please let me know. I am not very good at RegEx so will rely on your skills for this if possible :)

PCAssistSoftware avatar Apr 28 '24 14:04 PCAssistSoftware

I can likely rake a peek into this tomorrow, just create a collection of paths that should be considered valid, and a collection that are invalid. And I'll incorporate those examples into the unit tests.

As far as regex, I use Regex101 to validate everything

RFBomb avatar Apr 28 '24 15:04 RFBomb

Excellent, thank you, much appreciated

PCAssistSoftware avatar Apr 28 '24 15:04 PCAssistSoftware

Some examples:

Allowed (with and without quotes, with and without trailing slash) C:
C:\SomeDirectory //Server/Drive$/Folder //Server/Folder

Disallowed: Folder Some/Folder /Folder (though this is technically valid on Linux, but that's not standard use case so it's outside scope of this.)

That cover the base?

RFBomb avatar Apr 28 '24 19:04 RFBomb

For Allowed the slashes would need to be other way round - as // would be for Linux / Unix

e.g.

\\Server\Drive$\Folder \\Server\Folder

But yes that covers the base

PCAssistSoftware avatar Apr 28 '24 20:04 PCAssistSoftware

I was going to treat them as equivalent, as one is the Path.AlternateDirectorySeperator

RFBomb avatar Apr 28 '24 20:04 RFBomb

@RFBomb - thank you so much for looking at this.

The following example doesn't seem to work

Dim x = RoboCommandParser.TryParse("C:\Users\darren\OneDrive\Desktop\Working \\XPSDESKTOP\test ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)

Source is fine, but destination is found as \\XPSDESKTOP\t rather than \\XPSDESKTOP\test

EDIT: fixed GitHub formatting which had removed \ and * from pasted example

PCAssistSoftware avatar May 01 '24 09:05 PCAssistSoftware

I had also wanted to update the logic to not create new strings every time it checks for a flag, so I used string builder instead here to accomplish it.

As for that example, /XPSDesktop is Linux path formatting, I'm surprised it picked up anything for the destination. //XPSDESKTOP/test should work, as the regex is specifically looking for //

~~Also, your file filter shows as ""."", which doesn't go anything? Should it be "*.*" ?~~ it's early. File filter is .

RFBomb avatar May 01 '24 10:05 RFBomb

Seems to be a GitHub formatting issue as I pasted it from Visual Studio and for some reason it changed it.... pasted again below as code snippet which displays correctly

\\XPSDESKTOP\test - is the correct format for a Windows UNC path, if it was linux it would be // rather than \\ - reference = https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#unc-paths

//XPSDESKTOP/test should work, as the regex is specifically looking for //

EDIT: and for info that doesn't work either - still get destination of //XPSDESKTOP/t rather than /test

Dim x = RoboCommandParser.TryParse("C:\Users\darren\OneDrive\Desktop\Working \\XPSDESKTOP\test ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)

PCAssistSoftware avatar May 01 '24 10:05 PCAssistSoftware

Just in case it helps resolve it, if I reverse the source and destination e.g.

Dim x = RoboCommandParser.TryParse("\\XPSDESKTOP\test C:\Users\darre\OneDrive\Desktop\Working ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)

Then they parse correctly

PCAssistSoftware avatar May 01 '24 21:05 PCAssistSoftware

Been "trying" to resolve this myself and I believe it maybe to do with the regex on line 42 of RoboCommandParserFunctions.cs, because when testing in Regex101 I get the same problem where it is only matching one character after the computer name in the UNC path, see screenshots below:

image

PCAssistSoftware avatar May 02 '24 09:05 PCAssistSoftware

Further information from my testing:

It parses correctly if a destination UNC path is quoted, so below works fine, which would indicate second regex on line 42 is working correctly

 Dim x = RoboCommandParser.TryParse("C:\Users\darre\OneDrive\Desktop\Working ""\\XPSDESKTOP\test"" ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)

 Dim x = RoboCommandParser.TryParse("C:\Users\darre\OneDrive\Desktop\Working ""\\SERVER\c$\test"" ""*.*"" /COPY:DAT /DCOPY:DA /E /SEC /MOVE /R:2 /W:2 /V /FP /BYTES /NJH", t)

PCAssistSoftware avatar May 02 '24 09:05 PCAssistSoftware

Now I am definitely no expert on RegEx, but playing with Regex101 changing line 42 from

internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$\s]+?[$]?[\\\/][^*:?""<>|\s]+?) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";

to

internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|\s]*) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";

seems to work, but not sure of any other implications, so definitely need your input on it as I don't understand why it works or really what the changes I made have done....

PCAssistSoftware avatar May 02 '24 13:05 PCAssistSoftware

@RFBomb - thank you very much for the last commits, initial testing appears that all is working well. Is there anything specific changed I need to know about or test for in these changes?

Not sure if any of my previous replies were any use to you or not?

Do we need to worry about the "All checks have failed - AppVeyor build failed" message or is that an expected issue now.

PCAssistSoftware avatar May 03 '24 09:05 PCAssistSoftware

The examples were useful, as I was iterating the code I used them to create some other unit tests.

The tests failing at the moment will be fixed, just hadn't gotten to it yet. I changed how the source/dest directory path was cleansed when setting it in copy options, and a few other tweaks for minor optimizations/robustness improvements. So there's still things I am going to be looking at, including why some tests hang or fail currently.

I suspect some of the failing tests are due to string literal comparisons with quoted/unquoted paths, but again I'll have to investigate. Currently at a trade show so time to work on this is sparse.

RFBomb avatar May 03 '24 09:05 RFBomb

Excellent, thank you, I will continue testing today and report back later.

PCAssistSoftware avatar May 03 '24 09:05 PCAssistSoftware

internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$\s]+?[$]?[\\\/][^*:?""<>|\s]+?) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";

to

internal const string uncRegex = @"([\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|\s]*) | (""[\\\/]{2}[^*:?""<>|$]+?[$]?[\\\/][^*:?""<>|]+?"")";

Cursory glance, you changed [pattern]+? to [pattern]* at end of first group. I changed mine to ([pattern]+)? which is effectively same thing (from "match as many characters as possible, not greedy, but must be at least one" to "as many as possible, but may not exist"

To break it down: Greedy definition +: Continue matching as much as possible.

Not Greedy +? : if another subsection matches, this one ends, but this one will continue until that occurs or next character no longer matches this pattern. May lose last character in some instances.

Pattern 1 - no quotes

([\/]{2}[^:?""<>|$]+?[$]?[\/][^:?""<>|\s]*)

  • [\\\/]{2} Must start with directory separation, count of 2.
  • [^*:?""<>|$]+? match characters that are NOT in this group, must be atleast 1, but not greedy. Minimum 1 match.
  • [$]? optional $ character
  • [\\\/] single directory separator that must exist
  • [^*:?""<>|\s]* match anything NOT in disallowed characters collection, as many times as possible (greedy). Stops when reaches a disallowed character. Minimum 0 matches.
  • ) end of this pattern.

| OR attempt to match next pattern which is for quoted paths.

RFBomb avatar May 03 '24 09:05 RFBomb

Thank you for the explanation of the RegEx - really useful to understand it more :)

PCAssistSoftware avatar May 03 '24 10:05 PCAssistSoftware

After running multiple tests I can confirm this definitely resolves my initial problem - so thank you.

Below I will list any related problems I have found during testing:

1

Not sure this is something we can account for as it would really be down to user providing a really badly typed command, but if you parse something like below examples ( which have multiple slashes between computer name and path) then it gets confused and uses part of the malformed destination path as the file filter.

I only found this after making a typo when testing commands and is so incorrect it would really be a users error, and I know we can't cover every eventuality, but mentioning it anyway for info.

c:\source \\servername\c$ /E /R:15
c:\source \\xpsdsktop\folder /E /R:15

PCAssistSoftware avatar May 03 '24 11:05 PCAssistSoftware

So while technically the regex can accommodate those invalid inputs and cleansing them would likely be pretty trivial, my goal was to only cleanse and handle standard and proper inputs. Otherwise the possibilities are endless.

RFBomb avatar May 03 '24 12:05 RFBomb

So while technically the regex can accommodate those invalid inputs and cleansing them would likely be pretty trivial, my goal was to only cleanse and handle standard and proper inputs. Otherwise the possibilities are endless.

That makes perfect sense and is exactly what I thought you would say, but just mentioning things as I find them.

PCAssistSoftware avatar May 03 '24 12:05 PCAssistSoftware

2

Bearing in mind I totally agree with your previous response, but is there at least a simple way of just ensuring that a user doesn't try and parse a string which is completely invalid, because using TryParse you can pass something like the below (e.g. complete gobbledygook and it comes back as True!

 Dim t As RoboCommand = Nothing
 Dim x = RoboCommandParser.TryParse("gfgfgfgfdgfdge4t434rty436464646", t)

PCAssistSoftware avatar May 03 '24 14:05 PCAssistSoftware

Okay I have spent several hours today fully testing parsing across all my usage cases and it is now working fine for all UNC path variations I have tried, so thanks again for looking at this, much appreciated.

PCAssistSoftware avatar May 03 '24 16:05 PCAssistSoftware

Good feedback on the TryParse.

The reason this was successful is because current iteration ignores if both source and destination were not found, and only errors if one is found and other is not.

This can be modified to require source/destination arguments for TryParse, and ignore them (as currently does) for ParseOptions

RFBomb avatar May 03 '24 16:05 RFBomb

@RFBomb - Hi Robert, hope you are well. When you have a few spare moments (I know you are busy), could you do whatever is required to #207 to allow me to merge it in if possible? Anything I can do to help just let me know, I have already extensively tested it and all seems okay so far.

PCAssistSoftware avatar May 15 '24 16:05 PCAssistSoftware

I think it's mostly complete. But these changes were mostly thrown together, and I'm going to give them a polish/review of my own before I say go ahead.

I've been at a trade show for a few weeks, so haven't had much time to dedicate to any projects.

RFBomb avatar May 15 '24 16:05 RFBomb