file pointer shift

Question

I had code that reversed each line of a file in a single-byte encoding. Now I need to do the same for UTF-8, so I rewrote it with wide characters (code below). It still reverses the lines, but some characters are lost. I understand that the problem is in the fseek calls at the end of the loop body, but I cannot solve it myself.

    #define NULL_TERMINATOR '\0'
    #define NEW_LINE '\n'

    int main()
    {
        size_t i = 0;
        file = _wfopen(L"rev.txt", L"r+, ccs=UTF-8");
        wchar_t b[4096]{ NULL_TERMINATOR };
        while ((fgetws(b, sizeof(b), file)) != NULL)
        {
            if (b[wcslen(b) - 1] == NEW_LINE)
            {
                b[wcslen(b) - 1] = NULL_TERMINATOR;
                wcsrev(b);
                b[wcslen(b)] = NEW_LINE;
            }
            else
                wcsrev(b);
            fseek(file, i, SEEK_SET);
            int t=fwrite(b, wcslen(b)*2, count, file);
            i += wcslen(b)*2+3;
            fseek(file, i, SEEK_SET);
        }
        return 0;
    }

@Kirill21 In UTF-8, only spaces, punctuation marks and Latin characters remain single-byte.
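To make the point about byte widths concrete, here is a minimal sketch of how many bytes a decoded line occupies in a UTF-8 file. The helper names `utf8_encoded_size` and `utf8_byte_length` are made up for illustration, and the sketch assumes the text is already decoded into valid code points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of bytes UTF-8 needs for one code point.
// Assumes cp is a valid Unicode scalar value (<= 0x10FFFF).
std::size_t utf8_encoded_size(char32_t cp)
{
    if (cp < 0x80)    return 1;  // ASCII: Latin letters, digits, spaces, punctuation
    if (cp < 0x800)   return 2;  // includes Cyrillic (U+0400..U+04FF)
    if (cp < 0x10000) return 3;  // rest of the Basic Multilingual Plane
    return 4;                    // supplementary planes
}

// Hypothetical helper: bytes a whole decoded line occupies in a UTF-8 file.
std::size_t utf8_byte_length(const std::u32string& line)
{
    std::size_t total = 0;
    for (char32_t cp : line)
        total += utf8_encoded_size(cp);
    return total;
}
```

For a line made of two Latin letters and one Cyrillic letter this gives 4 bytes, whereas counting two bytes per wide character, as `wcslen(b)*2` does, gives 6. An offset computed that way does not match the position in the file.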
Without reading the first byte of a character, it is impossible to predict how long it is.
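The first byte does tell you the length, though. As a rough sketch (not the asker's code, and assuming well-formed input), one can work on the raw UTF-8 bytes and reverse a line code point by code point. Both function names here are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Length of a UTF-8 sequence, derived from its first (lead) byte.
// Returns 0 for continuation bytes and for 0xF8..0xFF.
// It does not reject overlong or out-of-range forms.
std::size_t utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80)           return 1;  // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Reverse a UTF-8 string code point by code point.
// Assumes well-formed input; malformed bytes are copied one at a time.
std::string reverse_utf8(const std::string& s)
{
    std::string out;
    out.reserve(s.size());
    std::size_t pos = 0;
    while (pos < s.size()) {
        std::size_t len = utf8_sequence_length(static_cast<unsigned char>(s[pos]));
        if (len == 0 || pos + len > s.size())
            len = 1;  // fall back to a single byte on malformed input
        // Prepend this sequence so that later characters end up first.
        out.insert(0, s, pos, len);
        pos += len;
    }
    return out;
}
```

Because the reversed line has exactly the same number of bytes as the original, it could be written back over the same byte range of a file opened in binary mode, without guessing at offsets. Prepending is quadratic in the line length, which is acceptable for a sketch.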
Your plan is doomed to fail, because if a character consists of two code points (for example, a letter and a diacritic), their order must be preserved during reversal.
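A simplified sketch of what preserving that order could look like on an already decoded string. It only recognizes the Combining Diacritical Marks block (U+0300..U+036F), which is an assumption made to keep the example short. Real text needs the full Unicode grapheme cluster rules, usually through a library. The names `is_combining_mark` and `reverse_keep_marks` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Simplified check: only the Combining Diacritical Marks block (U+0300..U+036F).
// Real text needs the full Unicode grapheme cluster rules.
bool is_combining_mark(char32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

// Reverse a decoded string while keeping each base character
// together with the combining marks that follow it.
std::u32string reverse_keep_marks(const std::u32string& s)
{
    std::u32string out;
    out.reserve(s.size());
    std::size_t end = s.size();
    while (end > 0) {
        // Walk back over trailing marks to find the base character.
        std::size_t start = end - 1;
        while (start > 0 && is_combining_mark(s[start]))
            --start;
        // Copy base + marks in their original order.
        out.append(s, start, end - start);
        end = start;
    }
    return out;
}
```

With this, a letter followed by U+0306 (combining breve) stays a single unit after reversal instead of the breve landing on the neighbouring letter.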
@alexolut And it seems to me that UTF-8 places no limit on the size of a character at all.
From Wikipedia: The UTF-8 algorithm technically allows a code of any length to be written. However, for the algorithm to work efficiently and reliably, the code length must be limited. The current Unicode 6.x standard assumes codes of up to 21 bits, that is, up to four bytes in UTF-8.
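The link between 21 bits and four bytes can be seen in a small encoder: the four-byte form carries 3 + 6 + 6 + 6 = 21 payload bits. This is a sketch with a hypothetical name, `encode_utf8`, and it returns an empty string for values that UTF-8 must not encode.

```cpp
#include <string>

// Encode one code point as UTF-8.
// Returns an empty string for values above U+10FFFF
// or in the surrogate range U+D800..U+DFFF.
std::string encode_utf8(char32_t cp)
{
    std::string out;
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);                          // 7 bits
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));            // 5 bits
        out += static_cast<char>(0x80 | (cp & 0x3F));          // 6 bits
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));           // 4 bits
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));   // 6 bits
        out += static_cast<char>(0x80 | (cp & 0x3F));          // 6 bits
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));           // 3 bits
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));  // 6 bits
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));   // 6 bits
        out += static_cast<char>(0x80 | (cp & 0x3F));          // 6 bits
    }
    return out;
}
```

The largest valid code point, U+10FFFF, comes out as exactly four bytes, so no single code point in a valid file is longer than that.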
@Kirill21 By the way, Abyx is right: when you reverse, you may run into other problems, such as combined characters.
For example, the letter Й: stackoverflow.com/questions/481050/… And be thankful this is only Russian: UTF-8 lets you attach up to 3 combining characters to a character, and almost all libraries treat them as separate characters.
No need to look far: JavaScript reports a length of 2 characters for the combined й.
