Regex - File Path - Windows
^(\w:\\$)|^([A-Za-z_\-\s0-9\.\$]+)$|^(?:[\w]\:|\\)(\\(?!\.)(?!\s[a-z_0-9-])[A-Za-z_\-\s0-9\.\$]+)+$

^(\w:\\$)|^(?:[\w]\:|\\)(\\(?!\.)(?!\s[a-z_0-9-])[A-Za-z_\-\s0-9\.\$]+)+$
NOTE: This is made up of:
- ^(\w:\\$): Caters for C:\.
- ^([A-Za-z_\-\s0-9\.\$]+)$: Caters for plain filenames, such as file.txt or file.
- ^(?:[\w]\:|\\)(\\(?!\.)(?!\s[a-z_0-9-])[A-Za-z_\-\s0-9\.\$]+)+$: Caters for UNC paths and drive-letter paths that include a directory name.
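The breakdown above can be tried out directly. The following is a minimal Python sketch, not part of the original page: it compiles each of the three alternatives separately so that the part responsible for a match can be reported. The names ALTERNATIVES and classify are hypothetical, chosen only for this illustration.

```python
import re

# The three alternatives listed above, kept separate so each can be tested alone.
# ALTERNATIVES and classify are illustrative names, not part of the original regex.
ALTERNATIVES = {
    "drive root": re.compile(r'^(\w:\\$)'),
    "plain file name": re.compile(r'^([A-Za-z_\-\s0-9\.\$]+)$'),
    "directory or UNC path": re.compile(
        r'^(?:[\w]\:|\\)(\\(?!\.)(?!\s[a-z_0-9-])[A-Za-z_\-\s0-9\.\$]+)+$'
    ),
}


def classify(path):
    """Return the label of the first alternative that matches, or None."""
    for label, pattern in ALTERNATIVES.items():
        if pattern.match(path):
            return label
    return None


if __name__ == "__main__":
    samples = [
        'c:\\',
        'file.txt',
        r'c:\folder\myfile.txt',
        r'\\server\share\myfile.txt',
        r'c:\my folder\another_folder\.docx',
    ]
    for sample in samples:
        print(sample, '->', classify(sample))
```

Trying the alternatives one at a time gives the same accept/reject result as the combined pattern, because each alternative carries its own ^ anchor.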
NOTE: This regex supports the following as valid.
c:\
c:\folder\myfile.txt
c:\folder\myfileWithoutExtension
c:\my folder\abc abc.docx
c:\my-folder\another_folder\abc.v2.docx
C:\pictures\holiday
C:\pictures\holiday\
\\192.168.0.1\folder\file.pdf
\\192.168.0.1\my folder\folder.2\file.gif
\\10.30.28.52\CDSDataCenter\Servers\Continuous\CDS20$
\\server\filename
\\server\filename with space
\\test\test$\TEST.xls
\\server\share\folder\myfile.txt
\\server\share\myfile.txt
\\123.123.123.123\share\folder\myfile.txt
NOTE: The following are not valid:
file
file.xls
c:\my folder\another_folder\.docx
c:\my folder\\another_folder\abc.docx
c:\my folder\another_folder\ab*c.v2.docx
c:\my?folder\another_folder\abc.v2.docx
C:\ pictures\holiday
C:\ pictures\holiday\
C:\pictures \ holiday
C:\pictures \ holiday\
C:\pictures\ holiday \
\\192.168.0.1\folder\fi<le.pdf
\\192.168.0.1\folder\\file.pdf
\\192.168.0.1\my folder\folder.2\.gif
\\server\filename\{token}\file
^(?:[\w]\:|\\)(\\(?!\.+)[A-Za-z_\-\s0-9\.\$]+)+$
Works….
(?:[\w]\: |
([a-z_\-\s\0-9\.\\]+)+([a-z_\-\s\0-9\.]+)(\\)([a-z_\-\s\0-9]+)$
^(?:[\w]\:|\\)(\\[a-z_\-\s0-9\.]+)+\.(txt|gif|pdf|doc|docx|xls|xlsx)$

^(?:[\w]\:|\\)(\\[a-z_\-\s\0-9\.]+)+(?:\\)([a-z_\-\s\0-9]+){1}$
(?:(?:[a-z]:|\\\\[a-z0-9_.$\●-]+\\[a-z0-9_.$\●-]+)\\| # Drive
\\?[^\\/:*?"<>|\r\n]+\\?) # Relative path
(?:[^\\/:*?"<>|\r\n]+\\)* # Folder
[^\\/:*?"<>|\r\n]* # File

^(?:[\w]\:\\|\\\\)([a-z0-9_.$\s-]+\\[a-z0-9_.$\.-]+\\|\\?[^\\/:*?"<>|\r\n]+\\?)(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*$
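The commented form of the pattern above (Drive, Relative path, Folder, File) is a free-spacing regex, which Python supports through re.VERBOSE. The sketch below is an illustration under two assumptions that are not stated on this page: the ● glyph is taken to stand for a literal space (written here as an escaped space), and matching is case-insensitive. FREE_SPACING_PATH is a hypothetical name.

```python
import re

# Assumption: "●" in the pattern above denotes a literal space, written "\ " here.
# Assumption: the pattern is meant to be case-insensitive, hence re.IGNORECASE.
# FREE_SPACING_PATH is an illustrative name only.
FREE_SPACING_PATH = re.compile(r'''
    (?:(?:[a-z]:|\\\\[a-z0-9_.$\ -]+\\[a-z0-9_.$\ -]+)\\|   # Drive
    \\?[^\\/:*?"<>|\r\n]+\\?)                                 # Relative path
    (?:[^\\/:*?"<>|\r\n]+\\)*                                 # Folder
    [^\\/:*?"<>|\r\n]*                                        # File
''', re.VERBOSE | re.IGNORECASE)

if __name__ == "__main__":
    for sample in (r'c:\folder\myfile.txt',
                   r'\\server\share\folder\myfile.txt',
                   r'folder\myfile.txt',
                   r'c:\folder\my*file.txt'):
        # fullmatch is used because the commented form carries no ^ or $ anchors.
        print(sample, '->', bool(FREE_SPACING_PATH.fullmatch(sample)))
```

With re.VERBOSE, whitespace outside character classes and everything after an unescaped # is ignored, so the comments can stay next to the parts they describe.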
Windows
(\\\\([a-z|A-Z|0-9|-|_|\s]{2,15}){1}(\.[a-z|A-Z|0-9|-|_|\s]{1,64}){0,3}){1}(\\[^\\|\/|\:|\*|\?|"|\<|\>|\|]{1,64}){1,}(\\){0,}
NOTE: Disallows a few characters: \/:*?"<>|.
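To see the character restriction in action, the pattern above can be applied with Python's re.fullmatch, which is needed because the pattern itself has no ^ or $ anchors. This is a small sketch rather than a recommended validator; UNC_PATH and is_unc_path are hypothetical names.

```python
import re

# The UNC pattern from above, copied unchanged. UNC_PATH and is_unc_path are
# illustrative names that do not appear in the original page.
UNC_PATH = re.compile(
    r'(\\\\([a-z|A-Z|0-9|-|_|\s]{2,15}){1}(\.[a-z|A-Z|0-9|-|_|\s]{1,64}){0,3}){1}'
    r'(\\[^\\|\/|\:|\*|\?|"|\<|\>|\|]{1,64}){1,}(\\){0,}'
)


def is_unc_path(path):
    """True when the whole string matches the UNC pattern."""
    return UNC_PATH.fullmatch(path) is not None


if __name__ == "__main__":
    for sample in (r'\\server\share\folder\myfile.txt',
                   r'\\192.168.0.1\folder\file.pdf',
                   r'\\192.168.0.1\folder\fi<le.pdf',
                   r'\\server\sha*re'):
        print(sample, '->', is_unc_path(sample))
```

Inside a character class | is a literal rather than an alternation, so the pipes in [a-z|A-Z|0-9|-|_|\s] also admit | in the server name.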
^((?:[a-z]:\\$|(?:[a-z]:|\\\\[a-z]+\\(?!\.)[^\r\n$<>]+\$?)))((\\|(\\(?!\.)[^\r\n<>\\]+)*)(?<!\\)$)
^[a-zA-Z]:\\(((?![<>:"/\\|?*]).)+((?<![ .])\\)?)*$
NOTE: This regex requires the path to conform to the NTFS standard (see the MSDN spec).
- ^[a-zA-Z]:\\ matches a single drive letter followed by a colon and a backslash.
- (?![<>:"/\\|?*]) is a negative lookahead that ensures the next character is not one of the invalid characters.
- ((?![<>:"/\\|?*]).)+ wraps that lookahead, followed by the next character, and repeats the pair one or more times.
- (?<![ .])\\ is a negative lookbehind that ensures the file/directory name doesn't end with a space or period. Please note: not every regex engine fully supports lookbehinds.
All of that is repeated zero or more times, with the last backslash optional.
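The pieces described above can be checked one behaviour at a time. The following Python sketch is only an illustration (NTFS_PATH and is_valid are hypothetical names); Python's re module supports the fixed-width lookbehind used here.

```python
import re

# The drive-letter pattern from above, copied unchanged.
# NTFS_PATH and is_valid are illustrative names only.
NTFS_PATH = re.compile(r'^[a-zA-Z]:\\(((?![<>:"/\\|?*]).)+((?<![ .])\\)?)*$')


def is_valid(path):
    """True when the path matches the drive-letter pattern."""
    return NTFS_PATH.match(path) is not None


if __name__ == "__main__":
    samples = [
        'C:\\',                    # drive root only
        r'C:\Windows\System32',    # plain directories
        'C:\\Windows\\',           # trailing backslash is optional
        r'C:\bad*name',            # * is rejected by the lookahead
        r'C:\folder \file.txt',    # directory ending in a space
        r'C:\folder.\file.txt',    # directory ending in a period
    ]
    for sample in samples:
        print(sample, '->', is_valid(sample))
```

Because a + sits inside a *, long inputs that fail to match can make a backtracking engine slow, so it may be worth checking the input length before applying the pattern.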
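For many use cases it may be best to restrict the path length to 256 characters.
- Replacing the final * with {0,256} caps the number of times the group repeats rather than the number of characters, because each repetition can match many characters. To cap the characters instead, add a lookahead such as (?=.{0,256}$) directly after ^[a-zA-Z]:\\.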
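As a hedged illustration of a length cap (an assumption of this sketch, not something the page spells out), a lookahead such as (?=.{0,256}$) placed after the drive prefix limits the number of characters that follow it. The sketch also shows that a {0,256} quantifier on the outer group bounds repetitions of the group rather than characters. LENGTH_CAPPED and GROUP_CAPPED are hypothetical names.

```python
import re

# Assumption: the 256-character limit applies to the part after the drive prefix.
# LENGTH_CAPPED and GROUP_CAPPED are illustrative names only.

# Lookahead variant: at most 256 characters may follow "X:\".
LENGTH_CAPPED = re.compile(
    r'^[a-zA-Z]:\\(?=.{0,256}$)(((?![<>:"/\\|?*]).)+((?<![ .])\\)?)*$'
)

# Quantifier variant: the outer group may repeat at most 256 times, but a single
# repetition can still consume an arbitrarily long run of characters.
GROUP_CAPPED = re.compile(
    r'^[a-zA-Z]:\\(((?![<>:"/\\|?*]).)+((?<![ .])\\)?){0,256}$'
)

if __name__ == "__main__":
    just_fits = 'C:\\' + 'a' * 256
    too_long = 'C:\\' + 'a' * 257
    very_long = 'C:\\' + 'a' * 1000
    print(bool(LENGTH_CAPPED.match(just_fits)))
    print(bool(LENGTH_CAPPED.match(too_long)))
    print(bool(GROUP_CAPPED.match(very_long)))
```

Checking len(path) before matching is a simpler alternative when the calling code can do so.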
regex/file_path/windows.1621552350.txt.gz · Last modified: 2021/05/20 23:12 by peter