There is a list of directories (just strings) that need to be sorted. For example:
C: \ Program Files \ Microsoft
C: \ Program Files (x86)
C: \ Program Files

Standard sorting:
C: \ Program Files
C: \ Program Files (x86)
C: \ Program Files \ Microsoft

It is necessary that the list is sorted in order of nesting folders:
C: \ Program Files
C: \ Program Files \ Microsoft
C: \ Program Files (x86)

In reality, the paths are stored in the ObservableCollection<AnalyseObject> Pathes where the AnalyseObject class is:

 public class AnalyseObject { public ObjectType Type { get; set; } //enum: диск / папка / файл public string Path { get; set; } //содержит выше приведенные значения public bool IsAnalyzed { get ; set; } //true - анализировать, false - нет } 

The collection is bound to a table on a form (WPF) using:

 xmlns:scm="clr-namespace:System.ComponentModel;assembly=WindowsBase" <CollectionViewSource x:Key="pathesVS" Source="{Binding Pathes}"> <CollectionViewSource.SortDescriptions> <scm:SortDescription PropertyName="Path" /> </CollectionViewSource.SortDescriptions> </CollectionViewSource> 

How can such sorting be achieved?

  • What do you mean, after a everything in a displayed? - Andrey NOP
  • one
    If you store the folders in a tree structure, it will be enough to add something like IEnumerable<Entry> Enumerate() { yield return this; foreach (var child in Children) foreach (var item in child.Enumerate()) yield return item; } IEnumerable<Entry> Enumerate() { yield return this; foreach (var child in Children) foreach (var item in child.Enumerate()) yield return item; } IEnumerable<Entry> Enumerate() { yield return this; foreach (var child in Children) foreach (var item in child.Enumerate()) yield return item; } - Andrey NOP
  • one
    or, a little shorter: IEnumerable<Entry> Enumerate() => Children.SelectMany(child => child.Enumerate()).Prepend(this); - Andrey NOP
  • Folders are obtained either from json or from a text file in which they are simply stored in the list in the form indicated. They can be from several tens of thousands. I suppose it will be easier and faster to sort them than to collect a tree. Just do not know how to do it. - krok
  • @AndreyNOP, do not tell me in what way can the existing list be converted into a tree structure? - krok

3 answers 3

I went the other way. Complemented class AnalyseObject :

 public class AnalyseObject : IComparable<AnalyseObject> { public ObjectType Type { get; set; } //enum: диск / папка / файл public string Path { get; set; } //содержит выше приведенные значения public bool IsAnalyzed { get ; set; } //true - анализировать, false - нет public int CompareTo(AnalyseObject other) { if (this.Path == other.Path) return 0; int thisPathLength = this.Path.Length; int otherPathLength = other.Path.Length; int maxIndex = Math.Min(thisPathLength - 1, otherPathLength - 1); for (int i = 0; i <= maxIndex; i++) { if (this.Path[i] == other.Path[i]) continue; if (this.Path[i] == '\\') return -1; if (other.Path[i] == '\\') return 1; return this.Path[i] < other.Path[i] ? -1 : 1; } return thisPathLength < otherPathLength ? -1 : 1; } } 

Added by:

 public static class ObservableCollection { public static void Sort<TSource, TKey>(this ObservableCollection<TSource> source, Func<TSource, TKey> keySelector, bool byDescending = false) { if (!byDescending) { List<TSource> sortedList = source.OrderBy(keySelector).ToList(); source.Clear(); foreach (var sortedItem in sortedList) { source.Add(sortedItem); } } else { List<TSource> sortedList = source.OrderByDescending(keySelector).ToList(); source.Clear(); foreach (var sortedItem in sortedList) { source.Add(sortedItem); } } } } 

Sorting is forced:

 Pathes.Sort(p => p); 

Probably not the best option, but it seems to work as it should. I would appreciate if someone offers a better option.

  • Oh, yes, just about what I offered you. Now I am writing my answer. - Andrew NOP
  • one
    Nuance - it is better to work with the paths by means specially created for this purpose: Path , Directory , File classes, etc. - Andrey NOP

As a first approximation, I would do something like this:

Take the EnumerateParts method from here and write a comparer based on it:

 class PathComparer : IComparer<string> { public int Compare(string x, string y) { var xParts = EnumerateParts(x).Reverse().ToList(); var yParts = EnumerateParts(y).Reverse().ToList(); for (int i = 0; i < xParts.Count && i < yParts.Count; ++i) { var c = string.Compare(xParts[i], yParts[i]); if (c != 0) return c; } return xParts.Count.CompareTo(yParts.Count); } IEnumerable<string> EnumerateParts(string path) { var root = Path.GetPathRoot(path); while (path != root) { yield return Path.GetFileName(path); path = Path.GetDirectoryName(path); } yield return root; } } 

Now you can use this:

 var source = new[] { @"C:\Program Files\Microsoft", @"C:\Program Files (x86)", @"C:\Program Files" }; foreach (var p in source.OrderBy(s => s, new PathComparer())) Console.WriteLine(p); 

It works as it should, but if you put a breakpoint in EnumerateParts , then you can calculate that for 3 lines it is called as many as 10 times, which can be very inefficient, so I would rewrite the code so that EnumerateParts called only once for each line. For example:

 class PathParts : IComparable<PathParts> { readonly List<string> parts; public PathParts(string path) { parts = EnumerateParts(path).Reverse().ToList(); } public int CompareTo(PathParts other) { for (int i = 0; i < parts.Count && i < other.parts.Count; ++i) { var c = string.Compare(parts[i], other.parts[i]); if (c != 0) return c; } return parts.Count.CompareTo(other.parts.Count); } IEnumerable<string> EnumerateParts(string path) { var root = Path.GetPathRoot(path); while (path != root) { yield return Path.GetFileName(path); path = Path.GetDirectoryName(path); } yield return root; } } 

and now:

 var source = new[] { @"C:\Program Files\Microsoft", @"C:\Program Files (x86)", @"C:\Program Files" }; var ordered = source.Select(s => new { s, pp = new PathParts(s) }) .OrderBy(a => a.pp) .Select(a => as); foreach (var p in ordered) Console.WriteLine(p); 

Well, now you can make PathParts nested private class in your AnalyseObject , also create an instance of PathParts once at creation and then set the method:

 public int CompareTo(AnalyseObject other) => pathParts.CompareTo(other.pathParts); 
  • one
    Also an interesting option. I’ll stop on my own for the time being, but maybe where you use it. However, I noticed one flaw - if you add the same to the array of paths, but on disk D, then the paths will be mixed up - the folders will be sorted correctly, but the disks are mixed. Something like this: 1. "D: \ Program Files", 2. "C: \ Program Files", 3. "C: \ Program Files \ Microsoft", 4. "D: \ Program Files \ Microsoft", 5 . "C: \ Program Files (x86)", 6. "D: \ Program Files (x86)". - krok
  • Aha, sensible remark, the part with the name of the disc does not fall into the parts collection, now I will correct it - Andrey NOP
  • Fixed, now the name of the disk also falls into an array of pieces of the path - Andrey NOP

I will assume that you know how to get a list of directories (see questions like this ), so I’ll focus only on building a linq query that will sort your collection.

My test data will be as follows:

 private IEnumerable<string> GetDirectoriesStub() { return new string[] { @"C:\Documents and Settings" ,@"C:\Program Files" ,@"C:\Program Files (x86)" ,@"C:\Users" ,@"C:\Windows" ,@"C:\git\gitlab.com" ,@"C:\Program Files\Common Files" ,@"C:\git\github.com" ,@"C:\Program Files\dotnet" ,@"C:\Program Files\Git" ,@"C:\Program Files\Git\bin" ,@"C:\Program Files\Git\dev\shm" ,@"C:\git" ,@"C:\Program Files\nodejs" ,@"C:\Program Files\Git\dev\mqueue" }; } 

First of all, I count the number of \ characters in each line. The more such characters there are, the deeper the folder will lie, the more its level will be:

 var paths = this.GetDirectoriesStub(); paths.Select(x => new { Path = x, Level = x.Count(y => y == '\\')} ) .Dump(); 

We get the following data set:

enter image description here

Now sort the way we want. Judging by the voiced conditions, it is necessary first to sort by the level of nesting, and secondly - by the folder name:

 paths.Select(x => new { Path = x, Level = x.Count(y => y == '\\') }) .OrderBy(x => x.Level) .ThenBy(x => x.Path) .Dump(); 

It turns out like this:

enter image description here

Something ugly. Let's try changing the sorting: first by x.Path, then by x.Level:

enter image description here

It seems so much more logical and understandable.


Update

Very close, but not quite right. Note the "C: \ Program Files (x86)", which is located between the "C: \ Program Files" and "C: \ Program Files \ Common Files". And logically, it should be displayed after all nested in the "C: \ Program Files"

Here it is already possible to build a tree so that you can see which folders have which folders. But in principle, you can dodge and continue to do sorting.

Add some magic with weights:

 paths.Select(x => new { Path = x, PathForSorting = (x + '\x01').Replace('\\', '\x02'), Level = x.Count(y => y == '\\') }) .OrderBy(x => x.PathForSorting) .ThenBy(x => x.Path) .ThenBy(x => x.Level) .Dump(); 

enter image description here

In principle, it works and if you remove .Replace('\\', '\x02') - but for reliability left the full version. (In the full version it will be possible to take into account some additional conditions if necessary)

 paths.Select(x => new { Path = x, PathForSorting = (x + '\x01'),//.Replace('\\', '\x02'), Level = x.Count(y => y == '\\') }) .OrderBy(x => x.PathForSorting) .ThenBy(x => x.Path) .ThenBy(x => x.Level) .Dump(); 

enter image description here

The explanation of magic. Why was the first character entered? To be guaranteed to distinguish who is the parent and who is the descendant. If earlier it was enough to assume that "someone whose line length is shorter is the higher one in the sorting / hierarchy," then, taking into account the example, it is already necessary to determine more precisely.

I introduced the second symbol solely for clarity, choosing it more than the symbol1. Implicitly, the sorting rule "\ x01 show above \ x02" sounds like "the parent folder (this is the \ x01 symbol) is located above any subfolders in it (this is the symbol 2)". This rule is redundant and is introduced only for clarity, so that this rule can be verbalized (the explicit one is better than the implicit one).

In general, I believe that already in this place the amount of applied magic comes to the point where it is time to think about building a tree.

There are two reasons for this.

First, some non-obviousness of the algorithm (magic).

Secondly, it is quite possible that in your directory names there will be characters below the space (all kinds of curved folders) and in a real application it is better to avoid such hacks. Is it necessary for you, so that users of the program prefer other software that will not be buggy even in case of incorrect folders on the disk? )

  • Thanks for the option. Very close, but not quite right. Note the "C: \ Program Files (x86)", which is located between the "C: \ Program Files" and "C: \ Program Files \ Common Files". And according to the logic, it should be displayed after all nested in the "C: \ Program Files". - krok
  • one
    @krok This is due to the fact that the `` character is also a certain character and is sorted in some way. You can delete it beforehand, or replace it with a character that puts it at the beginning of the list (for example, 'a' or a space), or with a character that puts it at the end of the list (for example, 'Z'). - AK
  • @krok Updated the answer. In principle, if the coefficients do not fit (look, maybe you have forgotten to specify some other condition) - you can already think about building a tree, since with more complicated rules, it will be easier to see them clearly in the tree than to understand the magic used. - AK
  • For some reason, the magic does not work, the space is higher than the characters '\ x01' '\ x02'. The result is the same. Does not want "C: \ Program Files (x86)" to navigate below "C: \ Program Files \ Common Files" and other nested in "C: \ Program Files". - krok
  • @krok Ah, I get what you want. No, this is not done by sorting - you need to honestly build a tree, I at least do not see any options. I can leave for now the answer, suddenly someone will push me to the way of finishing, but as for me - I need to remove this branch as a dead end. - AK