I need to create folders and subfolders named after business entities, keeping the names as close as possible to the originals.
At the same time, I have to take operating system restrictions into account, which adds hassle: Linux and macOS impose virtually none, while Windows has more than enough.
This is the code I ended up with; invalid characters are replaced with a period.

    // Requires: System, System.Collections.Generic, System.IO, System.Text.RegularExpressions

    // Matches either trailing periods at the end of the string (optionally
    // preceded by invalid characters) or any run of invalid characters.
    private static readonly string NormalizationPattern = string.Format(
        @"([{0}]*\.+$)|([{0}]+)",
        Regex.Escape(string.Concat(new string(Path.GetInvalidPathChars()), "?", "/", "*", "\"")));

    private static readonly string[] DosReservedNames =
    {
        "CON", "PRN", "AUX", "NUL",
        "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    public static string NormalizePath(string name)
    {
        // No normalization is applied on Linux and macOS.
        if (Environment.OSVersion.Platform == PlatformID.Unix ||
            Environment.OSVersion.Platform == PlatformID.MacOSX)
            return name;

        const string replacement = ".";

        // Keep as many leftmost colons as there are ":\" sequences (drive
        // specifiers) and replace the rest, scanning from the right.
        var matchesCount = Regex.Matches(name, @":\\").Count;
        string correctName;
        if (matchesCount > 0)
        {
            var regex = new Regex(@":", RegexOptions.RightToLeft);
            correctName = regex.Replace(name, replacement, regex.Matches(name).Count - matchesCount);
        }
        else
            correctName = name.Replace(":", replacement);

        var replace = Regex.Replace(correctName, NormalizationPattern, replacement);

        // Prefix reserved DOS device names (CON, CON.txt, ...) with the replacement.
        foreach (var reservedName in DosReservedNames)
        {
            var builder = new List<string>();
            foreach (var folder in replace.Split(Path.DirectorySeparatorChar))
            {
                var changedName = folder;
                if (string.Equals(folder, reservedName, StringComparison.InvariantCultureIgnoreCase))
                    changedName = replacement + reservedName;

                var value = reservedName + '.';
                if (folder.StartsWith(value, StringComparison.InvariantCultureIgnoreCase))
                    changedName = replacement + value + folder.Remove(0, value.Length);

                builder.Add(changedName);
            }
            replace = string.Join<string>(Path.DirectorySeparatorChar.ToString(), builder);
        }

        // Trailing spaces and periods are trimmed only at the end of the whole path.
        return replace.TrimEnd(' ', '.');
    }

The root folder is usually selected in the system and already exists. All nested levels are then created through normalization. That is why, for example, trailing periods and spaces are trimmed only at the end of the whole path, not at each level. Maybe that is wrong, and the name of each folder should be trimmed as well.
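If trimming were done per folder instead, it could look roughly like the sketch below. `TrimEachSegment` is a hypothetical helper that is not part of the code above, and the backslash default for `separator` is an assumption made only so that the example behaves the same on any OS.

```csharp
using System;
using System.Linq;

public static class SegmentTrimSketch
{
    // Hypothetical helper: trims trailing spaces and periods from every
    // path segment instead of only from the end of the whole path.
    public static string TrimEachSegment(string path, char separator = '\\')
    {
        var segments = path.Split(separator).Select(segment =>
        {
            // Leave the relative markers untouched.
            if (segment == "." || segment == "..")
                return segment;
            return segment.TrimEnd(' ', '.');
        });
        return string.Join(separator.ToString(), segments);
    }
}
```

With this, `C:\some. \name. ` becomes `C:\some\name`, while `.` and `..` segments pass through unchanged. A segment made only of periods, such as `...`, still collapses to an empty name, so it would need separate handling.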
There are tests for it; the cases look roughly like this:

    [Test, Sequential]
    public void CheckNotAllowedNames([Values(
            "test"
            ,@"C:\somename\somename:name"
            ,@"usr\home\somename:name"
            ,@"start < > : "" / \ | ? * end"
            ,"\x15\x3D" // less than ASCII space
            ,"\x21\x3D" // HEX of !, valid
            ,"\x3F\x3D" // HEX of ?, not valid
            ,@"C:\somename\ trailing space "
            ,@"C:\somename\...trailing period..."
            ,@"C:\somename\CON"
            ,@"C:\somename\CON.txt"
            ,@"CON"
            ,@"C:\somename\con.txt\context"
            ,@"home\NUL.liza"
            ,@"home\ NUL.liza"
            ,@"C:\somename\..." // Bad name get the root folder, bug =_=
            ,@"root\..\sub"
            ,@"root\..\"
            ,@".\..\some?folder"
            ,@"root\.." // relative path trimmed, bug =_=
        )] string name, [Values(
            "test"
            ,@"C:\somename\somename.name"
            ,@"usr\home\somename.name"
            ,@"start . . . . . \ . . . end"
            ,".="
            ,"!="
            ,".="
            ,@"C:\somename\ trailing space"
            ,@"C:\somename\...trailing period"
            ,@"C:\somename\.CON"
            ,@"C:\somename\.CON.txt"
            ,@".CON"
            ,@"C:\somename\.CON.txt\context"
            ,@"home\.NUL.liza"
            ,@"home\ NUL.liza"
            ,@"C:\somename\"
            ,@"root\..\sub"
            ,@"root\..\"
            ,@".\..\some.folder"
            ,@"root\"
        )] string expected)
    {
        Assert.AreEqual(expected, NormalizePath(name));
    }

First of all, I would like someone to look this over and perhaps find errors I have missed.
Second, am I reinventing the wheel, and is there a ready-made normalization somewhere? I googled long and hard but may have missed it; at this point I am going in circles.
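For reference, the closest built-in building block I am aware of is `Path.GetInvalidFileNameChars()`. Below is a minimal sketch of using it on a single folder name; `ReplaceInvalidChars` is a hypothetical name, and the set of characters returned depends on the platform and is documented as not guaranteed to be complete. It does nothing about reserved names or trailing periods and spaces.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileNameCharsSketch
{
    // Replaces every character reported by Path.GetInvalidFileNameChars()
    // with the replacement. Intended for one folder name, not a whole path,
    // because directory separators are part of the invalid set.
    public static string ReplaceInvalidChars(string segment, char replacement = '.')
    {
        var invalid = Path.GetInvalidFileNameChars();
        var chars = segment
            .Select(c => Array.IndexOf(invalid, c) >= 0 ? replacement : c)
            .ToArray();
        return new string(chars);
    }
}
```

This covers only the character replacement step, so it would complement the code above rather than replace it.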
UPD1: I found a problem with relative paths and don't have any idea how to solve it yet; I added tests that capture the current behavior. The .NET API lets you request creation of root\folder\..... and returns the root folder. The MSDN documentation says that names ending in periods can be created through the API, but that you should not do so, to avoid problems. As a result, handling relative paths correctly is a separate question.
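One possible stopgap is to detect such segments before calling the API and let the caller decide whether to reject or rename them. The sketch below is only an illustration: `IsRelativeMarker` and `FindVanishingSegments` are hypothetical names, and the backslash default for `separator` is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelativePathSketch
{
    // True for the two segments that have a navigation meaning.
    public static bool IsRelativeMarker(string segment)
    {
        return segment == "." || segment == "..";
    }

    // Hypothetical helper: lists segments such as "....." that are neither
    // "." nor ".." yet become empty once trailing periods and spaces are
    // stripped. Empty segments (e.g. from a trailing separator) are ignored.
    public static List<string> FindVanishingSegments(string path, char separator = '\\')
    {
        return path.Split(separator)
            .Where(s => s.Length > 0 && !IsRelativeMarker(s))
            .Where(s => s.TrimEnd(' ', '.').Length == 0)
            .ToList();
    }
}
```

For `root\folder\.....` this reports the single segment `.....`, while `root\..\sub` yields an empty list, so genuine relative paths are left alone.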
Periods really are an awkward case. I have not seen any ready-made implementations. - rdorn