Suppose there are 3 vectors of the following form (the values ​​in them are not repeated, and they are already sorted):

std::vector<std::pair<int,string>> vector; 

Each contains the following values:

 1 вектор: {(1,a),(2,b),(3,c)}; 2 вектор: {(1,k),(5,q),(4,r)}; 3 вектор: {(1,e),(6,d),(7,v)}; 

It is necessary to leave only intersecting values ​​in each of the vectors, ie, it should remain:

 в 1 векторе:{(1,a)}; во 2 векторе {(1,k)}; в 3 векторе {(1,e)}; 

How can I do that?

  • You need std :: set_intersection and the right comparator for your data - KoVadim
  • @KoVadim A little bit wrong: he needs to remove from the vectors ... - Harry
  • if it can't, then cycles. But the elements are not deleted, but get new ones and later replace the old ones. In this case, the problem is solved easier. - KoVadim
  • By the way, what if, say, in the first vector there is also (1,z) ? Leave or delete? And if in the second - (2,q) - or only those that are in all three are needed? - Harry
  • @Harry, Only those that are in all three. There are no duplicate values ​​in the vectors. As for std :: set_intersection, I know about it. But, if you use it, as I understand it, you need it for the first vector with respect to all the others, for the second, with respect to all the others ... etc, right? - bronstein87

2 answers 2

You can find pairs common to all vectors ( common_pairs ), and then remove pairs from each vector that do not belong to the intersection using set_intersection() :

 #include <algorithm> // set_intersection #include <iostream> #include <iterator> // back_inserter #include <string> #include <type_traits> // remove_reference #include <utility> // pair #include <vector> int main() { using namespace std; typedef vector<pair<int, string>> V; V seqs[] = { {{1, "a"}, {2, "b"}, {3, "c"}}, {{1, "k"}, {5, "q"}, {4, "r"}}, {{1, "e"}, {6, "d"}, {7, "v"}}, }; // find intersection of k vectors V common_pairs; auto less_first = [](auto a, auto b) { return a.first < b.first; }; set_intersectionk(begin(seqs), end(seqs), back_inserter(common_pairs), less_first); // remove pairs that are not in the intersection for (auto&& v : seqs) { V temp; set_intersection(begin(v), end(v), begin(common_pairs), end(common_pairs), back_inserter(temp), less_first); v.swap(temp); } // print results for (auto&& v : seqs) { for (auto&& p : v) cout << "(" << p.first << ", " << p.second << ") "; cout << endl; } } 

Example:

 $ g++ -std=c++14 *.cc && ./a.out (1, a) (1, k) (1, e) 

Where set_intersectionk() simply finds in pairs the intersection of vectors:

 /// Find common elements among collections given in [first, last) range. template<class InputIt, class OutIt, class Compare> OutIt set_intersectionk(InputIt first, InputIt last, OutIt out, Compare comp) { if (first == last) // no collections return out; typedef std::vector<std::remove_reference_t<decltype(*begin(*first))>> V; V intersection(begin(*first), end(*first)); for (++first; first != last; ++first) { V temp; temp.swap(intersection); std::set_intersection(std::begin(*first), std::end(*first), std::begin(temp), std::end(temp), std::back_inserter(intersection), comp); } return std::copy(begin(intersection), end(intersection), out); } 

Input collections are not required to be vectors. In terms of the worst case ( k identical collections), the time complexity of the algorithm is already optimal O(k*n) . If the intersection is noticeably smaller, then you can improve the execution time, if you do not look at each (different) element, for example, using the doubling search suggested in the discussion of a similar issue for the implementation of set_intersection() :

 /// Return the first iterator such that !comp(*it, key) template<class It, class K = typename std::iterator_traits<It>::value_type, class Compare = std::less<K> > It doubling_lower_bound(It first, It last, K key, Compare comp = {}) { typename std::iterator_traits<It>::difference_type upper_bound = 1, count = std::distance(first, last); while (upper_bound < count && comp(*std::next(first, upper_bound), key)) upper_bound *= 2; return std::lower_bound(std::next(first, upper_bound / 2), std::next(first, std::min(upper_bound, count)), key, comp); } template<class It1, class It2, class OutIt, class Compare> OutIt set_intersection(It1 first1, It1 last1, It2 first2, It2 last2, OutIt out, Compare comp) { for ( ; first2 != last2; ++first2) { first1 = doubling_lower_bound(first1, last1, *first2, comp); if (first1 == last1) break; // all(first1s) < *first2 <= all(first2s) first2 = doubling_lower_bound(first2, last2, *first1, comp); if (first2 == last2) break; // all(first2s) < *first1 <= all(first1s) *out++ = *first1++; // !(a < b) && !(b < a) <=> a == b } return out; } 
  • Mark your decision as correct. it is, after all, more interesting) - bronstein87

Well, in general, something like that. I don’t answer for optimality, but he considers 4 vectors with dimensions of about 100 thousand quickly (vectors must be sorted in advance)

 void setIntersectionData(std::vector<std::vector<std::pair<int,string>>>& data) { if(data.size()==1) { return; } /*формируем первый вектор, как вектор без непересекающимеся значениями относительно остальных*/ for(int i=1;i<data.size();i++) { std::vector<std::pair<int,string>> intersectedData;// вектор, который будет содержать непересекающиеся значения, его будем присваивать первому вектору std::set_intersection (data[0].begin(), data[0].end(), data[i].begin(), data[i].end(),std::back_inserter(intersectedData), [](const auto& a, const auto& b){return a.first<b.first;}); data[0]=std::move(intersectedData); } /*убираем непересекающиеся значения из остальных векторов*/ for(int i=1;i<data.size();i++) { std::vector<std::pair<int,string>> intersectedData;// вектор, который будет содержать непересекающиеся значения, его будем присваивать оставшимся векторам std::set_intersection (data[i].begin(), data[i].end(), data[0].begin(), data[0].end(),std::back_inserter(intersectedData), [](const auto& a, const auto& b){return a.first<b.first}); data[i]=std::move(intersectedData); } } 
  • Does it work if vectors that later go make a smaller intersection? For example, if the last vector would not contain (1,e) pairs, would your solution make all the vectors empty? - jfs
  • @jfs, do you know the answer, or do you really ask? well, if I didn’t miss anything, then the absence of (1, e) pair will make the vector data [0] in the first cycle empty. and then applying it to the remaining vectors will make them empty ... - bronstein87
  • Understood: your data [0] plays the role of common_pairs in my code. - jfs