I'm trying to write my own string class, and strlen is not behaving the way I expect. What am I doing wrong?

    #include <iostream>
    #include <cstring>
    #include <cstdlib>
    using namespace std;

    class my_str
    {
    private:
        char* s;
    public:
        my_str();
        my_str(const char *str);
        my_str(const my_str &ob);
        ~my_str();
        my_str &operator=(const my_str &ob);
        void my_strlen();
        void print_str();
    };

    my_str::my_str()
    {
        s = new char [1];
        strcpy (s, "");
        cout << "конструктор" << endl;
    }

    my_str::my_str(const char *str)
    {
        cout << "конструктор парметризованный1" << endl;
        s = new char [strlen(str)+1];
        strcpy (s, str);
        cout << strlen(s);
        cout << "конструктор парметризованный2" << endl;
    }

    my_str::my_str(const my_str &ob)
    {
        s = new char [strlen(ob.s)+1];
        strcpy (s, ob.s);
        cout << "конструктор копии" << endl;
    }

    my_str::~my_str()
    {
        if(s) delete [] s;
        cout << "деструктор" << endl;
    }

    my_str &my_str::operator=(const my_str &ob)
    {
        cout << "=" << endl;
        cout << ob.s << endl;
        cout << strlen(ob.s) << endl;
        delete [] s;
        s = new char[strlen(ob.s)+1];
        strcpy(s, ob.s);
        return *this;
    }

    void my_str::my_strlen()
    {
        cout << strlen(s) << endl;
    }

    void my_str::print_str()
    {
        if(s)
        {
            for(int i = 0; s[i]; i++)
            {
                cout << s[i] << " ";
            }
        }
        cout << "_" << endl;
    }

    int main()
    {
        my_str a("Привет "), b("всем!"), c;
        cout <<strlen("Привет ")<< endl;
        a.my_strlen();
        cout <<strlen("всем!")<<endl;
        b.my_strlen();
        c.my_strlen();
        c=a;
        c.my_strlen();
        c.print_str();
        a=b;
        a.my_strlen();
        a.print_str();
        return 0;
    }

When I run the program (compiled with g++), I get:

    конструктор парметризованный1
    13конструктор парметризованный2
    конструктор парметризованный1
    9конструктор парметризованный2
    конструктор
    13
    13
    9
    9
    0
    =
    Привет
    13
    13
    ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ _
    =
    всем!
    9
    9
    ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ! _
    деструктор
    деструктор
    деструктор

Why do I get 13 and 9 when the lengths should be 7 and 5? And where do these strange characters come from when I print the string character by character instead of as a whole?

I first changed the file's encoding to ASCII, then to Windows-1251. strlen now counts correctly. However, the Russian letters got even worse:

 ▒▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 1 7▒▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 2 ▒▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 1 5▒▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ 2 ▒▒▒▒▒▒▒▒▒▒▒ 7 7 5 5 0 = ▒▒▒▒▒▒ 7 7 ▒ ▒ ▒ ▒ ▒ ▒ _ = ▒▒▒▒! 5 5 ▒ ▒ ▒ ▒ ! _ ▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒ ▒▒▒▒▒▒▒▒▒▒ 

Now all the Russian text comes out as gibberish, and switching back to the previous encodings no longer helps ...

  • I think your source file is in UTF-8, where each Russian letter takes more than one byte ... So for 7 characters (6 Russian letters + 1 ASCII space) you get 6 * 2 + 1 = 13 bytes, and for 4 Russian letters + 1 ASCII character you get 4 * 2 + 1 = 9 ... - Harry
  • "Привет " in UTF-8 is D0 9F D1 80 D0 B8 D0 B2 D0 B5 D1 82 20. What is your OS, and which editor do you use? In VS it should be written as u8"Привет" - Pavel Gridin
  • I write the code in Notepad++ (yes, the encoding is UTF-8). I run the program through MINGW64 (Git Bash). OS: Windows 7. - Jane_Brown February
  • Notepad++ has no u8 option. It offers to convert to ASCII, so I'll try that. - Jane_Brown
  • To begin, run an experiment: use only characters from the lower half of ASCII, that is, spaces, digits, punctuation marks, and upper- and lower-case English letters. Most likely strlen will then work correctly. That would mean the editor saves the file in an encoding that is not single-byte. Set the encoding to CP1251 and run chcp 1251 at the command prompt; the output of your program will then be correct. - Mark Shevchenko
