I have a string and need a list or array of all the capital letters of the English alphabet that occur in it, with no letter repeated.

For example: str = "AvB ^ Cv (A ^ B)";

Expected result: ["A", "B", "C"]

    Since the question is tagged c++: unordered_set! :) Although I would find it simpler to scan the string and mark which letters are present in an array of 26 bool values ...
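
    The array-of-26-bool idea mentioned above could look roughly like this. It is only a minimal sketch: the helper name capital_letters is hypothetical, and the code assumes an encoding in which 'A'..'Z' are contiguous (as in ASCII).

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns the distinct capital
// letters of `s` in alphabetical order.
std::vector<char> capital_letters(const std::string& s) {
    std::array<bool, 26> seen{};  // value-initialized: every entry is false
    for (char c : s) {
        if (c >= 'A' && c <= 'Z') {
            seen[c - 'A'] = true;  // mark the letter as present
        }
    }
    std::vector<char> result;
    for (int i = 0; i < 26; ++i) {
        if (seen[i]) {
            result.push_back(static_cast<char>('A' + i));
        }
    }
    return result;
}
```

    Calling capital_letters("AvB ^ Cv (A ^ B)") should give the letters A, B and C. The string is scanned once and the fixed 26-entry table is scanned once, so the work is linear in the length of the string.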

    Here is the bitset option:

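     #include <bitset>
     #include <iostream>
     #include <string>
     #include <vector>
     using namespace std;

     string str = "csvCFCjgcgbcYcgmCUYGlhKBNJHulGCVcjgfD";
     bitset<26> b;
     for (auto c : str) {
         if ((c >= 'A') && (c <= 'Z')) b.set(c - 'A');  // mark the letter as seen
     }
     vector<char> v;
     for (size_t i = 0; i < b.size(); ++i) {
         if (b[i]) v.push_back('A' + i);  // collect seen letters in alphabetical order
     }
     for (auto c : v) { cout << c; }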

    Or

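     #include <unordered_set>

     unordered_set<char> u;
     for (auto c : str) {
         if ((c >= 'A') && (c <= 'Z')) u.insert(c);  // duplicates are ignored by the set
     }
     for (auto c : u) { cout << c; }  // iteration order is unspecified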

    Also O(n) on average.

      A straightforward solution (C++11):

       std::set<char> result;
       for (auto &c : str) {
           if (tolower(c) != c) {  // differs from its lowercase form, so it is a capital letter
               result.emplace(c);
           }
       }
      • That is O(n log n), although the task can be solved in O(n). - Harry
      • You cannot get pure O(n) with standard containers: it costs either additional memory or extra time. The duplicate check spoils everything. - SBKarr
      • I have posted O(n) code on standard containers. Yes, it uses an intermediate bitset; I was just not sure about unordered_set and had to look up the complexity of insertion. PS: I checked, insertion is O(1) on average, so that one is also O(n). - Harry

      We know two facts: the number of characters from 'A' to 'Z' (26) is less than the size of int in bits, and the numeric codes of these characters are consecutive in ASCII (and, by the way, in Unicode too). Using them, you can write a simple program in which an integer variable serves as the set of capital letters found.

      We scan the given string, and if the code of the current character lies in the range from 'A' to 'Z', we set the corresponding bit to 1 in an int variable (the set being built) that was initialized to zero.

      Then we walk through this set and put the letters it contains into an array whose maximum size we also know in advance.

       #include <stdio.h>
       #include <stdlib.h>

       int main (int ac, char *av[])
       {
         char upc['Z' - 'A' + 2], // the result goes here
              *src = av[1] ? av[1] : (char *)"AaBbcD...XyZ."; // default source string
         unsigned int set = 0, // temporary set, destroyed while producing the result
                      i,       // index for walking through the set
                      j;       // index of the next free slot in the result (upc[])

         for (; *src; src++)
           if (*src >= 'A' && *src <= 'Z')
             set |= (1u << (*src - 'A')); // a capital letter: add it to our set

         for (i = j = 0; set; i++, set >>= 1) // walk the set bit by bit while anything is left in it
           if (set & 1)
             upc[j++] = i + 'A'; // append the next capital letter to the result
         upc[j] = 0;

         return puts(upc) == EOF;
       }

      Both gcc and g++ accept the program as is.

      PS
      This is roughly what happens "under the hood" in a C++ program such as the one in @Harry's answer.

      UPDATE

      After thinking a bit about the comment calling this sad, I decided to offer a variant that preserves the accumulated set and also leaves the pointer to the source data intact.

        unsigned int set = 0, i, j;

        for (i = j = 0; src[i]; i++)
          if (src[i] >= 'A' && src[i] <= 'Z')
            if (!(set & (1u << (src[i] - 'A')))) { // first occurrence of this letter
              set |= (1u << (src[i] - 'A'));
              upc[j++] = src[i];
            }
        upc[j] = 0;

      Of course, the capital letters now come out in order of first appearance rather than in alphabetical order.

      • This is all very sad. - αλεχολυτ
      • @alexolut, I am also sure that the C++ compiler (unfortunately) has no such prior knowledge about the nature of the data. - avp
      • What nature are you talking about? If you mean the ability to pack several Boolean values into one byte, both bitset and vector<bool> provide that. Competently written C++ code is as efficient as C code, and here the program is written with higher-level abstractions, which reduces the chance of making a mistake. Of course, to make claims about efficiency you have to compare the assembly output, but I have seen many times that this approach holds, and this case is unlikely to be an exception. - αλεχολυτ
      • @alexolut, I mean that the codes from A to Z, after a small transformation, fit into an int. It is about the compiler "understanding" this transformation. How much (and in which situations) high-level abstractions reduce the number of errors is a separate question. - avp
      • In this case it is completely unclear to me why a pure C approach is needed for a problem tagged c++. Your comment seems to say: "the C++ compiler does not have ... unlike C". But maybe I misunderstood you. By the way, note that with a 16-bit int your code will fail. - αλεχολυτ
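
      The 16-bit int concern raised in the last comment can be avoided by using a fixed-width unsigned type for the mask. The following is only a sketch of that idea, not the answer author's code: it assumes the platform provides std::uint32_t and an ASCII-compatible encoding, and the helper name capital_letters_mask is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not from the thread): returns the distinct capital
// letters of `s` in alphabetical order, using a 32-bit mask as the set.
std::string capital_letters_mask(const std::string& s) {
    std::uint32_t set = 0;  // bit i is set when the letter 'A' + i has been seen
    for (char c : s) {
        if (c >= 'A' && c <= 'Z') {
            // The shifted operand is at least 32 bits wide whatever the width of int.
            set |= std::uint32_t{1} << (c - 'A');
        }
    }
    std::string result;
    for (int i = 0; i < 26; ++i) {
        if (set & (std::uint32_t{1} << i)) {
            result += static_cast<char>('A' + i);
        }
    }
    return result;
}
```

      For the default string of the C program above, "AaBbcD...XyZ.", this should return "ABDXZ". Unlike the C version, the mask is not consumed while the result is built.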

      Option with a functional approach:

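       #include <string>
       #include <set>
       #include <algorithm>
       #include <cctype>
       #include <iterator>

       std::string s = "csvCFCjgcgbcYcgmCUYGlhKBNJHulGCVcjgfD";
       std::set<char> r;
       // The lambda avoids depending on an unqualified isupper and keeps its argument non-negative.
       std::copy_if(begin(s), end(s), inserter(r, begin(r)),
                    [](unsigned char c) { return std::isupper(c) != 0; });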

        After reading a lot, I decided in the end to simply check each character of the string: str[i] >= 'A' && str[i] <= 'Z'. If the condition holds, I insert the character into a set, which by default does not add elements it already contains.

        Qt Creator Code:

        std::set<QString> letters;
        for (int i = 0; i < str.length(); i++) {
            if ((str[i] <= 'Z') && (str[i] >= 'A')) {
                letters.insert(QString(str[i]));
            }
        }