How can I make awk read each new line, compare it with the lines it has already processed, and then either append it to an existing file if it matches, or create a new file if it does not, so that each file collects all lines sharing the same attribute?

The difficulty is that a large number of files has to be created, each with a new name ... the algorithm must either invent the names or (I think) name each file after the matching string; otherwise I cannot see how to generate names that are easy to navigate.

If anyone has run into a similar problem when splitting a large amount of data, I would be grateful for a hint on how to write this in awk syntax.

  • Why awk? If the task is that complicated, it will be quicker to write it in C++. Or in C#: IEnumerable<string> ReadConsoleLines() { string line; while ((line = Console.ReadLine()) != null) yield return line; } foreach (var group in ((filename != null) ? File.ReadLines(filename) : ReadConsoleLines()).GroupBy(l => selector)) File.WriteAllLines(FileNameFromKey(group.Key), group); In awk, you can keep an array of the strings already encountered and add a new string to it whenever one is detected. - VladD
  • Because I work on a server where all I have is a Linux terminal, and the tools available to me there are Octave and awk. I will try doing it with an array, although the problem is not in finding the lines by a particular attribute ... 35711 35 193 1170 455 456 457 458 459 476 477 34812 47 193 1170 455 456 190 191 1175 1172 1201 ... I have 2 gigabytes of data like this: the first field is the departure time in seconds, the second is the travel time, and the rest are node numbers. I need to distribute the paths across different files so that each file holds the lines whose third and last fields are the same (all lines in a file share the same third field and the same last field). - zhildemon
  • Solved it in one line: awk '{print $0 >> "Origin "$3" Destination "$NF}' filename - zhildemon
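The array idea from the comments can be sketched as follows. This is only an illustration, assuming the sample in the comment is two records (the record boundary is a guess) and that the grouping key is the third field plus the last field. The names `key`, `seen`, `n` and `count` are hypothetical. It counts the distinct keys, which is also the number of files the split would create.

```shell
# Two sample records taken from the comment above (record boundary assumed).
count=$(printf '%s\n' \
    '35711 35 193 1170 455 456 457 458 459 476 477' \
    '34812 47 193 1170 455 456 190 191 1175 1172 1201' |
awk '{
    key = $3 SUBSEP $NF        # grouping key: third field plus last field
    if (!(key in seen)) {      # this pair has not been encountered before
        seen[key] = 1          # remember it in the array
        n++
    }
}
END { print n }')

echo "$count distinct origin/destination pairs"
```

Running the same awk program over the real file before splitting it gives an idea of how many output files to expect.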

1 answer

Answer taken from the comments:

The problem was solved in one line:

 $ awk '{print $0 >> "Origin "$3" Destination "$NF}' filename
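With 2 gigabytes of input and many distinct origin/destination pairs, some awk implementations may run into the limit on simultaneously open files. A minimal sketch of one way around that, not the author's method: build the file name in a variable and close the previous file whenever the name changes, so at most one output file is open at a time. Because `>>` appends, reopening a file later keeps the lines already written. Building the name in a variable also avoids relying on unparenthesized concatenation after `>>`, which not every awk parses the same way. The directory `split_demo`, the variables `dir`, `out` and `prev`, and the third sample line are hypothetical.

```shell
# Hypothetical demo directory and sample input; the third record is invented
# so that one key occurs twice.
mkdir -p split_demo && rm -f split_demo/*
cat > split_demo/paths.txt <<'EOF'
35711 35 193 1170 455 456 457 458 459 476 477
34812 47 193 1170 455 456 190 191 1175 1172 1201
36000 40 193 1170 455 476 477
EOF

awk -v dir=split_demo '{
    # File name built from the third and last fields.
    out = dir "/Origin " $3 " Destination " $NF
    if (out != prev) {             # key changed since the previous line
        if (prev != "") close(prev)  # release the handle of the previous file
        prev = out
    }
    print $0 >> out                # ">>" appends, so earlier lines are kept
}' split_demo/paths.txt
```

The trade-off is a reopen each time the key changes between consecutive lines. Note also that `>>` appends to files left over from an earlier run, so the output directory should be empty before a rerun.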