In my opinion, the bool MoveNext() method violates one of the SOLID principles, because it combines two things: advancing to the next element and reporting whether the end of the sequence has been reached. This causes problems when the iterator is not used in the simplest case, inside a single foreach loop, but is passed as an argument through nested method calls; that is, when an outer method hands it to an inner method that calls MoveNext(). Only the inner method then learns that the sequence has ended, and the outer method cannot find out except through a value returned by the inner one. But surely we are not going to return that bool from every method we call! That would make the code simply horrible. It would be much more logical for IEnumerator to have a separate property for the end-of-sequence flag. After all, the iterator class itself practically always has this information, since it returns it from the advancing method.
Example:
public class Parser
{
    void Parse(IEnumerator<char> enumerator)
    {
        // Suppose the input is runs of letters separated by spaces
        enumerator.Reset();
        while (enumerator.MoveNext())
        {
            if (!char.IsWhiteSpace(enumerator.Current)) // skip whitespace
                ParseWord(enumerator); // unclear what position the sequence is in now

            // Next we would process the iterator and analyze enumerator.Current
        }
    }

    void ParseWord(IEnumerator<char> enumerator)
    {
        // read while we see letters
        while (char.IsLetter(enumerator.Current))
        {
            // ...

            // if advancing hit the end of the sequence, leave
            if (!enumerator.MoveNext())
            {
                break;
                // This is where the problems start: once we leave the method,
                // the information about the end of the sequence is lost.
            }
        }
    }
}
As you can see, on return from ParseWord two situations are possible: either we are on the character that follows the word, or we have run past the end of the sequence. Parse cannot tell which. If it calls MoveNext(), it skips a character in the first case; if it reads enumerator.Current, it risks an exception in the second. One might think the exception could serve to detect the end of the sequence, but it cannot. First, an InvalidOperationException may be thrown for some other reason; second, it is thrown for an array, but not for a List<T> or a string. (The documentation for IEnumerator<T>.Current simply calls the value undefined once MoveNext() has returned false.)
I also want to note that in Java, as far as I understand, the situation is not much better. There, advancing is combined with fetching the element itself:
Interface Iterator<E>
- boolean hasNext()
- Returns true if the iteration has more elements. (In other words, returns true if next() would return an element rather than throwing an exception.)
- E next()
- Returns the next element in the iteration. Throws: NoSuchElementException - if the iteration has no more elements.
Accordingly, in a Java version of the example above we would know about the end of the sequence, but we would lose the ability to examine the element itself, since next() consumes it.
As a solution, I considered writing a generic wrapper class for an arbitrary iterator that would capture and store the value returned by MoveNext() and expose it as a property.
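Such a wrapper might look roughly like the sketch below. The names TrackingEnumerator<T>, HasValue and WordCounter are my own and hypothetical, not part of .NET. The wrapper forwards every call to the wrapped IEnumerator<T> and remembers the last result of MoveNext(), so a caller further up the call stack can ask whether the enumerator currently stands on an element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper (not part of .NET): remembers the last MoveNext() result.
public sealed class TrackingEnumerator<T> : IEnumerator<T>
{
    private readonly IEnumerator<T> inner;

    public TrackingEnumerator(IEnumerator<T> inner)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // True only after a MoveNext() call that returned true.
    public bool HasValue { get; private set; }

    public T Current
    {
        get
        {
            if (!HasValue)
                throw new InvalidOperationException("The enumerator is not positioned on an element.");
            return inner.Current;
        }
    }

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        HasValue = inner.MoveNext();
        return HasValue;
    }

    public void Reset()
    {
        inner.Reset();
        HasValue = false;
    }

    public void Dispose() => inner.Dispose();
}

public static class WordCounter
{
    // Counts runs of letters. The nested method advances the enumerator,
    // and the caller checks HasValue instead of receiving a bool back.
    public static int CountWords(string text)
    {
        using (var e = new TrackingEnumerator<char>(text.GetEnumerator()))
        {
            int words = 0;
            e.MoveNext();
            while (e.HasValue)
            {
                if (char.IsLetter(e.Current))
                {
                    SkipWord(e);
                    words++;
                }
                else
                {
                    e.MoveNext();
                }
            }
            return words;
        }
    }

    private static void SkipWord(TrackingEnumerator<char> e)
    {
        while (e.HasValue && char.IsLetter(e.Current))
            e.MoveNext();
    }
}
```

Here SkipWord may run into the end of the sequence, and CountWords still sees that through e.HasValue without any bool being passed back.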
Ideally, I would like to have this:
public interface IEnumerator
{
    object Current { get; }
    bool MoveNext();
    bool HasValue { get; } // same as the result of MoveNext, but without advancing
    void Reset();
}
But the question is: is there really an architectural problem in the interface, or is there a solution that lets you use an iterator across many nested methods without passing around the value obtained from MoveNext()?
@VladD: I'm sorry, but I didn't quite follow the point about caching. Rather than spend a long time trying to figure it out, and suspecting that we are talking about different things, I decided to spend the time on an example of my own. Two examples, to be precise: the first with an extended enumerator, the second with the standard one. To my surprise, they turned out almost identical in the end, with only 3 differences.
using System;
using System.Collections;
using System.Collections.Generic;

// Attention!
// Assumption: the nested enumerator must be driven only by the wrapper class's logic
// and must not take part in any other operations.

// Suppose the iterator interface had one more property
public interface IEnumeratorEx<out T> : IEnumerator<T>
{
    // End-of-sequence flag.
    // It is also what MoveNext() returns.
    // It also means that there is a value that can be read.
    bool HasValue { get; }
}

// Class with the extended enumerator (+HasValue)
public class FilteringEnumeratorEx<T> : IEnumeratorEx<T>
{
    IEnumeratorEx<T> wrapped;   // Original sequence to be filtered
    Func<T, bool> filter;       // Filter

    public T Current
    {
        get
        {
            // If the nested iterator has not been run to the end, the current element
            // was found, passed the filter and must be returned
            if (HasValue)
                return wrapped.Current;
            else
                throw new InvalidOperationException("The enumerator has reached the end. No value is available.");
        }
    }

    object IEnumerator.Current { get { return Current; } }

    public bool HasValue { get { return wrapped.HasValue; } }

    public FilteringEnumeratorEx(IEnumeratorEx<T> wrapped, Func<T, bool> filter)
    {
        this.wrapped = wrapped;
        this.filter = filter;
    }

    public bool MoveNext()
    {
        if (!HasValue)
            return false;
        while (wrapped.MoveNext() && !filter(wrapped.Current))
            ;
        return HasValue;
    }

    public void Reset() { wrapped.Reset(); }

    public void Dispose() { wrapped.Dispose(); }
}

// Class with the standard enumerator
public class FilteringEnumerator<T> : IEnumerator<T>
{
    IEnumerator<T> wrapped;     // Original sequence to be filtered
    Func<T, bool> filter;       // Filter

    // The nested enumerator has no separate end flag, so we cache the last MoveNext() result.
    // In other words, we are inventing a crutch.
    // Still, since such a useful property exists, why not expose it and let people use it.
    public bool HasValue { get; private set; }  // difference 1

    public T Current
    {
        get
        {
            // If the nested iterator has not been run to the end, the current element
            // was found, passed the filter and must be returned
            if (HasValue)
                return wrapped.Current;
            else
                throw new InvalidOperationException("The enumerator has reached the end. No value is available.");
        }
    }

    object IEnumerator.Current { get { return Current; } }

    public FilteringEnumerator(IEnumerator<T> wrapped, Func<T, bool> filter)
    {
        this.wrapped = wrapped;
        this.filter = filter;
        HasValue = true;    // position -1 must always allow a step forward  // difference 2
    }

    public bool MoveNext()
    {
        if (!HasValue)      // If the previous MoveNext returned false, we are past the end,
            return false;   // so report the end.
        while ((HasValue = wrapped.MoveNext()) && !filter(wrapped.Current))  // difference 3
            ;
        return HasValue;
    }

    public void Reset() { wrapped.Reset(); }

    public void Dispose() { wrapped.Dispose(); }
}
I can conclude that the lack of a HasValue property in the iterator being wrapped doesn’t really interfere with adding it to the wrapper. There is practically no difference in complexity.
I think it is worth explaining which of the principles is violated in my opinion.
Interface Segregation Principle (ISP). Clients should not depend on methods that they do not use. Many client-specific interfaces are better than one general-purpose interface. Interfaces that are too "fat" should be split into smaller, more specific ones, so that the clients of the small interfaces know only about the methods they need for their work. As a result, when an interface method changes, clients that do not use that method should not have to change.
I think this principle is usually read as being about interfaces in the programming-language sense. But if you think about it, the principle is much more general and concerns any interface of interaction, which includes every method, function, and even a whole system. It even seems to me that it is essentially the same thing as the single responsibility principle (SRP). Not only an object should have a single responsibility, but also its interaction interface, and each method of that interface should in turn have a single responsibility too. Any entity should be split as finely as possible, but no further.
The interface for interacting with an iterator should contain only the methods that relate to its job. For that, you need to be able to get elements, step through them (advance), and find out whether the traversal has finished or not. No further splitting is possible, because removing any one of these capabilities makes the interface unusable.
Now we need to design how these enumerator capabilities are exposed. Possible options:
// Simple enumerator
public interface ISimpleEnumerator
{
    // get the current element
    object Current { get; }
    // advance
    void MoveNext();
    // check whether there is a current element
    bool HasValue { get; }
}

// C# enumerator
public interface ICSharpEnumerator
{
    object Current { get; }
    // try to move to the next element
    bool MoveNext();
}

// Java enumerator
public interface IJavaEnumerator
{
    // are there any elements still ahead
    bool HasNext { get; }
    object Next();
}

// Super enumerator
public interface ISuperEnumerator
{
    // try to get the next element;
    // returns whether one was obtained, and the element itself
    bool MoveNext(out object current);
}

// All tests assume the iterator starts at position -1,
// i.e. before the first read attempt.
// In the initial state, accessing any method/property
// other than MoveNext is incorrect and should most likely throw.
public class TestClass
{
    void SimpleEnumeratorTest(ISimpleEnumerator e)
    {
        for (;;)
        {
            e.MoveNext();
            if (!e.HasValue)
                break;
            var x = e.Current;
        }
    }

    void CSharpEnumeratorTest(ICSharpEnumerator e)
    {
        while (e.MoveNext())
        {
            var x = e.Current;
        }
    }

    void JavaEnumeratorTest(IJavaEnumerator e)
    {
        while (e.HasNext)
        {
            var x = e.Next();
        }
    }

    void SuperEnumeratorTest(ISuperEnumerator e)
    {
        object x;
        while (e.MoveNext(out x))
        {
            // use x
        }
    }
}
As you can see, all 4 enumerators offer the same capabilities; the difference is in how the interaction interface of each one is designed. If you follow the single responsibility principle or the interface segregation principle, each method should also be as elementary as possible, but no more. So the simple enumerator with its 3 methods/properties satisfies these principles best, and ISuperEnumerator with its single method violates them most.
At first, having just one method may seem very convenient, but in practice problems will surely arise: different parts of the algorithm need different pieces of the data returned by MoveNext, so you have to cache them and drag them along everywhere. Within a single method this may cause no difficulty, but once the call stack is involved the problems show up in full, because the local variables go out of scope and the values cannot be obtained from the interface again. In the end you will build an adapter that reduces this super interface to the simplest one. In effect, that is a crutch.
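Such an adapter might look roughly like the sketch below. ISuperEnumerator and ISimpleEnumerator are the interfaces from the listing above, repeated here so the snippet stands alone (with Current exposed through a getter); SuperToSimpleAdapter and ListSuperEnumerator are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;

public interface ISuperEnumerator
{
    bool MoveNext(out object current);
}

public interface ISimpleEnumerator
{
    object Current { get; }
    void MoveNext();
    bool HasValue { get; }
}

// Hypothetical adapter: caches both results of the single "super" call
// so that they can be read separately, and more than once.
public sealed class SuperToSimpleAdapter : ISimpleEnumerator
{
    private readonly ISuperEnumerator source;
    private object current;

    public SuperToSimpleAdapter(ISuperEnumerator source)
    {
        this.source = source;
    }

    public bool HasValue { get; private set; }

    public object Current
    {
        get
        {
            if (!HasValue)
                throw new InvalidOperationException("No current element.");
            return current;
        }
    }

    public void MoveNext()
    {
        HasValue = source.MoveNext(out current);
    }
}

// Hypothetical ISuperEnumerator over a list, only to exercise the adapter.
public sealed class ListSuperEnumerator : ISuperEnumerator
{
    private readonly IList<object> items;
    private int index = -1;

    public ListSuperEnumerator(IList<object> items)
    {
        this.items = items;
    }

    public bool MoveNext(out object current)
    {
        if (index + 1 < items.Count)
        {
            current = items[++index];
            return true;
        }
        current = null;
        return false;
    }
}
```

The adapter does nothing except store what the single call returned, which is exactly the caching that would otherwise have to be repeated in every method that touches the enumerator.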
If you look at all kinds of APIs, you can see that most often each interaction is made as small and elementary as possible. This is exactly where the ISP shows itself (many client-specific interfaces are better than one general-purpose interface). It could equally be phrased as: many client-specific methods/functions are better than one general-purpose method/function.
Although SRP and ISP look different, they actually rest on the same basic idea: decomposing the system to achieve low coupling and high cohesion (see GRASP). The underlying goal is to simplify the system by reducing the number of links in it.
I don't claim to have the final truth; I have just expressed my opinion.
FilteringEnumeratorEx is incorrect. Imagine that your filter is such that the resulting sequence is empty. Then if you create a FilteringEnumeratorEx and query HasNext, you get true, which is wrong. - VladD
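The objection can be reproduced with a short sketch. No IEnumeratorEx<T> implementation is shown in the post, so the sketch uses a condensed copy of the standard-enumerator variant, FilteringEnumerator<T>, which starts with HasValue = true and therefore shows the same symptom; it is an assumption that a wrapped IEnumeratorEx<T> would follow the same initial convention. The Demo class and its method are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Condensed copy of the standard-enumerator variant from the post.
public class FilteringEnumerator<T> : IEnumerator<T>
{
    private readonly IEnumerator<T> wrapped;
    private readonly Func<T, bool> filter;

    public bool HasValue { get; private set; }

    public FilteringEnumerator(IEnumerator<T> wrapped, Func<T, bool> filter)
    {
        this.wrapped = wrapped;
        this.filter = filter;
        HasValue = true; // optimistic initial value, as in the post
    }

    public T Current
    {
        get
        {
            if (!HasValue) throw new InvalidOperationException("No value.");
            return wrapped.Current;
        }
    }

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (!HasValue) return false;
        while ((HasValue = wrapped.MoveNext()) && !filter(wrapped.Current)) { }
        return HasValue;
    }

    public void Reset() => wrapped.Reset();

    public void Dispose() => wrapped.Dispose();
}

public static class Demo
{
    // Returns HasValue before and after the first MoveNext()
    // for a filter that rejects every element of the source.
    public static (bool before, bool after) EmptyResultFlags()
    {
        var odds = new List<int> { 1, 3, 5 };
        using (var e = new FilteringEnumerator<int>(odds.GetEnumerator(), x => x % 2 == 0))
        {
            bool before = e.HasValue; // read before any scan has happened
            e.MoveNext();
            bool after = e.HasValue;  // read after the scan found nothing
            return (before, after);
        }
    }
}
```

Before the first MoveNext() the flag reports a value although the filtered sequence is empty; it becomes accurate only after the scan. A HasValue that is meant to be trustworthy at any moment would have to look ahead in the constructor, or the initial state would need to be distinguished from "positioned on an element".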