Delete the line break character depending on the content of the next line

Question

Hello. I am asking for help implementing a function that removes, in a text file, the newline character at the end of lines that begin with "nazwa =" and "khnazwa =".

Example:

nazwa = Any
text
location = any

It should look like this:

nazwa = Any text
location = any

Example 2:

khnazwa = Any
text
khadres = any

It should look like this:

khnazwa = Any text
khadres = any

Part of the original document (as you can see, the "nazwa =" entry is split across two lines; after the program runs it should be on one line):

Kontrahent{ Notatka_Dl{ opis = } id =65 flag =0 subtyp =0 znacznik =68 info =N osoba = kod =Cerrad nazwa = Spolka z Ograniczona Odpowiedzialnoscia miejscowosc =Starachowice dom = lokal = imie = bnazwa = bkonto = negoc =N grupacen =5 typ_naliczania =netto typ_ceny =B upust =0 limit =0 limitkwota =0.00 limitwaluta = rejestr_platnosci =BANK forma_platnosci =przelew 14 dni stanpl =0.00 stannl =0.00 zapas = NazwaRodzaju =Kontrahenci? }

This is the logic I have in mind. Find a line that begins with "nazwa =" and check the beginning of the next line: if it begins with "location =", move on; if not, delete the line break at the end of the "nazwa =" line and move on.

Likewise, for a line that begins with "khnazwa =", check the beginning of the next line: if it begins with "khadres =", move on; if not, delete the newline character at the end of the "khnazwa =" line and move on.

Thanks to all who made efforts to solve the problem.

I have only just started learning C#, and what to do after reading the line with the first match is total darkness for me: moving on to the second line, checking (most importantly) that it does not begin with "khadres =" in the "khnazwa =" case, and then going back to the previous line to perform the removal.

Accepted Answer · 2015-07-19T14:05:39

TextFile1.txt

 Kontrahent{ Notatka_Dl{ opis = } id =65 flag =0 subtyp =0 znacznik =68 info =N osoba = kod =Cerrad nazwa = Spolka z Ograniczona Odpowiedzialnoscia miejscowosc =Starachowice dom = lokal = imie = bnazwa = bkonto = negoc =N grupacen =5 typ_naliczania =netto typ_ceny =B upust =0 limit =0 limitkwota =0.00 limitwaluta = rejestr_platnosci =BANK forma_platnosci =przelew 14 dni stanpl =0.00 stannl =0.00 zapas = NazwaRodzaju =Kontrahenci?

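    using System.Collections.Generic;
    using System.IO;

    class Program
    {
        static void Main(string[] args)
        {
            var strings = File.ReadAllLines("TextFile1.txt");
            var newStrings = new List<string>();
            for (int i = 0; i < strings.Length; i++)
            {
                string trimmed = strings[i].Trim();
                bool isName = trimmed.StartsWith("nazwa") || trimmed.StartsWith("khnazwa");

                // Merge only when a next line exists and it has no "=",
                // i.e. it continues the value instead of starting a new field.
                if (isName && i + 1 < strings.Length && !strings[i + 1].Contains("="))
                {
                    newStrings.Add(strings[i] + " " + strings[i + 1]);
                    i++; // the next line has been consumed
                }
                else
                {
                    // Every other line, including an unbroken name line, is kept as-is.
                    newStrings.Add(strings[i]);
                }
            }
            File.WriteAllLines("output.txt", newStrings.ToArray());
        }
    }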

output.txt

 Kontrahent{ Notatka_Dl{ opis = } id =65 flag =0 subtyp =0 znacznik =68 info =N osoba = kod =Cerrad nazwa = Spolka z Ograniczona Odpowiedzialnoscia miejscowosc =Starachowice dom = lokal = imie = bnazwa = bkonto = negoc =N grupacen =5 typ_naliczania =netto typ_ceny =B upust =0 limit =0 limitkwota =0.00 limitwaluta = rejestr_platnosci =BANK forma_platnosci =przelew 14 dni stanpl =0.00 stannl =0.00 zapas = NazwaRodzaju =Kontrahenci?

Thank you for the option, but the point is that the code has to work specifically with the lines "nazwa =" and "khnazwa =", because the document has 15k lines and almost every one of them contains the = sign.
If nothing fits, perhaps it makes sense to post the source file itself, rather than 4 lines from it, so that the whole task can be understood.
Well, you can add if ((s.StartsWith("nazwa") || s.StartsWith("khnazwa")) && s.Contains("="))
You are right. Done: I have posted part of the original document.

ApInvent · Answer 2 · 2015-07-19T10:30:06

Still, in general terms, you can do this:

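    using System;
    using System.Text.RegularExpressions;

    namespace ConsoleApplication1
    {
        public class Program
        {
            public static void Main()
            {
                // Explicit "\r\n" keeps the sample independent of the source file's line endings.
                const string text =
                    "nazwa = Zarzycki\r\n" +
                    "Help In Road\r\n" +
                    "miejscowosc = any\r\n" +
                    "khnazwa = Cerrad\r\n" +
                    "Community Health Systems\r\n" +
                    "khadres = any\r\n";

                // Join a line with the following one when the following line has no "=".
                // The lookahead (?=\r) checks the line ending without consuming it.
                var newText = Regex.Replace(text, "(.+)\\r\\n([^=]+)(?=\\r)", "$1 $2");
                Console.WriteLine(newText);
            }
        }
    }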

Answer 3 · 2015-07-19T11:02:21

Here is another option (updated for the revised conditions):

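    IEnumerable<string> ProcessText(IEnumerable<string> original)
    {
        string hold = null; // a "nazwa =" / "khnazwa =" line waiting for its continuation
        foreach (var line in original)
        {
            var s = line; // the foreach variable is read-only, so work on a copy

            // The field that follows the name ends the held line.
            bool endsName = s.StartsWith("location =") || s.StartsWith("khadres =");
            if (endsName && hold != null)
            {
                yield return hold;
                hold = null;
            }

            if (hold != null)
                s = hold + " " + s; // glue the continuation onto the held line

            if (s.StartsWith("nazwa =") || s.StartsWith("khnazwa ="))
                hold = s;
            else
                yield return s;
        }
        if (hold != null)
            yield return hold;
    }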

I cannot test your code. Please show how to use it to open the file c:\example.txt and save the result to c:\example2.txt.
@IgKos: File.WriteAllLines(@"c:\example2.txt", ProcessText(File.ReadLines(@"c:\example.txt")));

ixSci · Answer 4 · 2015-07-19T08:55:45

Your task is elementary.

1. Read the file line by line into an array of strings.
2. Go through the array and append the current line to the previous one if the current line does not contain the = sign. If a line was appended, delete it from the array.
3. Write the resulting array to the file.
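
The steps above could be sketched roughly as follows. This is only an illustration: the LineJoiner class and the JoinContinuations helper are hypothetical names, and the sample lines come from the question. Note that the rule "no = sign means continuation" would also glue lines such as a lone "}" onto the previous line, which is one reason the asker wants to limit the merge to "nazwa =" and "khnazwa =".

```csharp
using System;
using System.Collections.Generic;

static class LineJoiner
{
    // Hypothetical helper: appends every line without '=' to the line before it.
    public static List<string> JoinContinuations(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (var line in lines)
        {
            if (!line.Contains("=") && result.Count > 0)
                result[result.Count - 1] += " " + line.Trim(); // glue onto the previous line
            else
                result.Add(line);                               // a regular "key = value" line
        }
        return result;
    }

    static void Main()
    {
        var merged = JoinContinuations(new[] { "nazwa = Any", "text", "location = any" });
        foreach (var line in merged)
            Console.WriteLine(line);
    }
}
```

For a real file, the input could come from File.ReadAllLines and the result could be written with File.WriteAllLines.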

Anton Komyshan · Answer 5 · 2015-07-19T09:03:46

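    using System;
    using System.Collections.Generic;

    class P
    {
        private static void Main(string[] args)
        {
            string str = "khnazwa = Cerrad \n Community Health Systems \n khadres = any";
            Console.WriteLine(str);
            Console.WriteLine(PrepereSting(str));
        }

        private static string PrepereSting(string str)
        {
            string[] strings = str.Split('\n');
            var result = new List<string>();
            for (int i = 0; i < strings.Length; i++)
            {
                string current = strings[i];
                bool isName = current.StartsWith("nazwa") || current.StartsWith("khnazwa");

                // Merge only when the next line is a continuation (it has no "=").
                if (isName && i + 1 < strings.Length && !strings[i + 1].Contains("="))
                {
                    current = string.Concat(current, strings[i + 1]);
                    i++; // skip the line that was just merged
                }
                result.Add(current);
            }
            // Re-insert the separators that Split removed.
            return string.Join("\n", result);
        }
    }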

Instead of "\n" it is better to use Environment.NewLine, and to split the text into lines by passing the string to a StringReader and calling its ReadLine method.
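
A small sketch of what this comment suggests, offered as an illustration only. The LineSplitter class and the SplitLines helper are hypothetical names; StringReader.ReadLine treats "\n", "\r" and "\r\n" all as line terminators, so the input below deliberately mixes them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LineSplitter
{
    // Hypothetical helper: splits text into lines with StringReader.ReadLine.
    public static List<string> SplitLines(string text)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null) // null marks the end of the text
                lines.Add(line);
        }
        return lines;
    }

    static void Main()
    {
        var lines = SplitLines("khnazwa = Cerrad\r\nCommunity Health Systems\nkhadres = any");
        // Join with the platform's own line terminator.
        Console.WriteLine(string.Join(Environment.NewLine, lines));
    }
}
```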
The problem is that the document has many fields whose names contain nazwa (bnazwa, odnazwa, NazwaKatalogu, etc.).
Is it possible to open the text document, find the Kontrahent{...} section in it, and run the line-break removal code only inside that section?
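
One possible way to do this is sketched below, under assumptions that the flattened excerpt does not confirm: every section header such as "Kontrahent{" and every closing "}" is assumed to sit on its own line. The SectionJoiner class and the Process method are hypothetical names. Matching the exact prefixes "nazwa =" and "khnazwa =" on the trimmed line leaves bnazwa, odnazwa and NazwaKatalogu untouched.

```csharp
using System;
using System.Collections.Generic;

static class SectionJoiner
{
    static readonly string[] Prefixes = { "nazwa =", "khnazwa =" };

    // Hypothetical helper: merges broken name lines, but only inside Kontrahent{ ... }.
    public static IEnumerable<string> Process(IEnumerable<string> lines)
    {
        int depth = 0;          // current brace nesting level
        int sectionDepth = -1;  // level at which Kontrahent{ was opened; -1 when outside
        string hold = null;     // name line waiting for a possible continuation

        foreach (var line in lines)
        {
            string t = line.Trim();
            bool structural = t.EndsWith("{", StringComparison.Ordinal) || t == "}";

            if (hold != null)
            {
                if (!structural && !t.Contains("="))
                {
                    hold += " " + t;   // continuation of the held value
                    continue;
                }
                yield return hold;     // the next field or a brace ends the held line
                hold = null;
            }

            if (t.EndsWith("{", StringComparison.Ordinal))
            {
                if (sectionDepth < 0 && t.StartsWith("Kontrahent", StringComparison.Ordinal))
                    sectionDepth = depth;
                depth++;
            }
            else if (t == "}")
            {
                depth--;
                if (depth == sectionDepth)
                    sectionDepth = -1; // the Kontrahent section has been closed
            }

            if (sectionDepth >= 0 && StartsWithAny(t))
                hold = line;
            else
                yield return line;
        }
        if (hold != null)
            yield return hold;
    }

    static bool StartsWithAny(string t)
    {
        foreach (var p in Prefixes)
            if (t.StartsWith(p, StringComparison.Ordinal))
                return true;
        return false;
    }
}
```

Because the method is an iterator, it can be combined with File.ReadLines and File.WriteAllLines so that the file is processed line by line.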
