A simple way to predict the next value (Markov predictor, ant algorithm, Levinson-Darbin recursion)

Question

Hello!

There is a number series, programmatically some array of length N The simplest algorithm for predicting the next N+1 value, based on all previous values, is required. Please tell me this.

My series is represented by random positive fractional numbers. In general, any simple ideas I would like to consider.

UPD :: where do numbers come from?

There are 50 equidistant transceivers (PP), no matter how, but anyone can contact anyone ( fully connected topology ). There is an abstract source and receiver (which can also contact any of the PPs). The connection between the source and the receiver is always established after 5 PP (the source and the receiver are naturally not taken into account). At one point in time, for example, once a day, the connection (route / path) is re-formed according to some random (for us unknown, random (because, for example, some BCPs are currently unattainable for some reason) laws. It is necessary to predict which path (or 6-10 other path variants) will be chosen as follows based only on knowledge of previous system choices.

I have already tried many different algorithms. Since there is no statistical relationship between the two different choices of paths (and if there is, it is difficult to determine, I’m even afraid to say that the change of path occurs according to, for example, the normal probability distribution , but I think that this can be assumed for this task) , so now I try to approximate on the basis of the "weights" of the routes (the products of the "weights" of each pair of PP extracted from the transition matrix (weights are determined at the intersection of the row and column: the transition from the i-th PP to the j-th ), built on principle described @northerner , n more simplified).

Since the weight of the paths | 1,2,3,4,5 | 1,3,2,4,5 | 5,4,3,2,1 |, equal, simple calculations give a little more than 2 million. possible routes. For each of them, I calculate the weight (by the matrix). Similarly, we can calculate the weight of the route for each previous route already selected by the system (the period is a year, for each day of the year). Predicting the behavior of the weight value, we will choose from six million 6-10 options that are most similar to our predicted value.

At the moment I have only reached the point that, assuming a normal distribution, we can select the most rarely encountered pairs in the transition matrix. For example, there are pairs that have never been encountered in a year (in which in the transition matrix at the intersection, they have to replace zero because they spoil the picture, as I believe). However, these pairs do not begin to appear more often than others.

If there are alternative ideas / methods for predicting routes, I will be glad. If you have ideas on how to establish the relationship between changes of routes, distribution and other characteristics, I will also be happy.

If the next value is based on the previous ones, then this is already a random process.
@ karmadro4, not quite sure that my random values can be investigated for convergence.

Alexey Shimansky 36.9k five 47 115 · Accepted Answer · 2012-03-29T15:20:08

You can try to build an approximate variational series and create a new number as a random variable, distributed in the same way.

That is, split the set of possible values [0; 1] [0; 1] for n intervals (say, for 10), calculate how many values fall into each and create a random variable that has the same probability distribution.

If the numbers are random, then this seems like the only suitable method.

Answer 2 · 2012-03-29T18:11:31

The choice of a predictor is highly dependent on the sequence specificity. If the number of different numbers that may occur in a sequence is small, I suggest a simple Markov predictor.

We assume that the elements of the sequence take values from 1 to N (otherwise they can be renumbered and will be so).

We call the possible values of the elements states. Let the power of the set of states (the number of different values) be K. Define a matrix P of size KxK and two one-dimensional arrays R and C of size K. Having received the next element of the sequence X (N), increase by one the values of P (X (N - 1), X (N)), R (X (N - 1)) and C (X (N)). Thus, the values gradually accumulate: P (I, J) - the number of transitions from I to J, R (I) - the number of departures from I, C (J) - the number of arrivals in J. P (I, J) / R (I ) - estimate of probability (frequency), being in I go to J.

Having accumulated a sufficient number of transitions, we can begin to “predict”: let us be in the Z state. We play a random variable uniformly distributed in [0, 1]. We go along the line Z from left to right, subtracting P (Z, J) / R (Z). As soon as we get zero or less, the column in which we are and is the expected element.

The method works well with the following limitations:

the number of states is really small;
the process is stationary (homogeneous in time), that is, if somewhere at the beginning of a sequence of 15, it always turned into 42, and in the remaining part never, there is no point in using the method;
the next element essentially depends on the current one (and much less on the very old ones);
the transition probabilities are distributed rather unevenly (otherwise, we have white noise and no predictor will do anything).

In the described algorithm, C (J) turned out to be unnecessary, but sometimes required, so I will leave them as well.

Excellent algorithm, true for another subtask, thanks in a special form of reward.
@Dex, you just double-check, I write the patient, I think a little tight.
@northerner, I almost had already implemented it, except for some important little things, I could not find.

Answer 3 · 2015-11-21T19:54:24

UPD

One of the best prognostic methods is autoregressive. In any case, he will catch periodic bursts.

Signal model

The method is based on an autoregression model (AP) of order k for the sample i = 0, 1, ..., n-1 :

x _i + f ₀ x _i-1 + f ₁ x _i-2 + f ₂ x _i-3 + ... + f _k-1 x _ik = 0, where i = k, k + 1, ..., n-1.

The order of the model should roughly correspond to the complexity of the signal.

Toeplitz symmetry

According to the least squares method, the residual should be minimized:
EPS = <(x _i + f ₀ x _i-1 + f ₁ x _i-2 + f ₂ x _i-3 + ... + f _k x _ik ) ² > _{i = k..n-1} ,
for which, the partial derivatives of the residual with respect to the autoregression coefficients f<sub>s</sub> for s=0..k-1 should be equated to zero.
This leads to the following system of special equations:

a ₀ f ₀ + a ₁ f ₁ + a ₂ f ₂ + ... + a _k-2 f _k-2 + a _k-1 f _k-1 = -a ₁ ,
a ₁ f ₀ + a ₀ f ₁ + a ₁ f ₂ + ... + a _k-3 f _k-2 + a _k-2 f _k-1 = -a ₂ ,
a ₂ f ₀ + a ₁ f ₁ + a ₀ f ₂ + ... + a _k-4 f _k-2 + a _k-3 f _k-1 = -a ₃ ,
...
a _k-2 f ₀ + a _k-3 f ₁ + a _k-4 f ₂ + ... + a ₀ f _k-2 + a ₁ f _k-1 = -a _k-1 ,
a _k-1 f ₀ + a _k-2 f ₁ + a _k-3 f ₂ + ... + a ₁ f _k-2 + a ₀ f _k-1 = -a _k ,

where a is the autocorrelation function (ACF).

The first feature is the view of the matrix on the left side, when the coefficients on the main diagonal and all coefficients on the diagonals equidistant from the main one coincide. Matrices with symmetry of this type are called Toeplitz matrices, and Levinson's algorithm is applicable to them.

The second feature is that the column of free members is formed from the elements of the same matrix a, and the Levinson-Darbin algorithm is applicable to such a system.

The third feature is that actually the Toeplitz matrix is only approximate. For example, for a second-order model, the exact form of the equations is:

<x _i-1 ² > f ₀ + <x _i-1 x _i-2 > f ₁ = <x _i x _i-1 >,
<x _i-1 x _i-2 > f ₀ + <x _i-2 ² > f ₁ = <x _i x _i-2 >,
where the index i in each sum of the form <u _i > runs through the values from 2 to the maximum.
It is easy to notice that:

The elements on the main diagonal correspond to the values of the zero element of the autocorrelation function, the samples for which are calculated are shifted by one element.
The elements on the secondary diagonal and the first element in the column of free terms correspond to the values of the first element of the autocorrelation function, and the elements on the diagonal are equal, and the sample for the free term is shifted relative to them.

This can lead to unexpected effects for sequences with a strong increasing (decreasing) trend.

So, for a sample of the first 10 members of the Fibonacci sequence
x _i = {1, 1, 2, 3, 5, 8, 13, 21, 34, 55}
it turns out a system of equations of the form:
1869 f ₀ + 1155 f ₁ = - 3024,
1155 f ₀ + 714 f ₁ = - 1869
with the correct solution, f ₀ = -1, f ₁ = -1, but there is no expected Toeplitz symmetry of the problem.
In this case, the use of ACF gives the system
4893 f ₀ + 3024 f ₁ = - 3024,
3024 f ₀ + 4893 f ₁ = - 1869,
use of which in this case is illegal. The main method of dealing with these effects is sample centering. But the main thing is to check the possibility of a serious simplification of the model and the use of the Darbin algorithm.

Durbin's algorithm

The essence of Durbin's idea is the search for a solution for a matrix of dimension n + 1 in the form
(f ' ₀ , f' ₁ , ..., f ' _n-1 , f' _n ) = (f ₀ , f ₁ , ..., f _n-1 , 0) + beta (f _n-1 , f _n-2 , ..., f ₀ , 1).
Indeed, the substitution of this solution into a matrix of dimension n + 1 , taking into account a system of dimension n, gives:

-a ₁ - beta a _n + a _n beta = -a ₁ ,
-a ₂ - beta a _n-1 + a _n-1 beta = -a ₂ ,
...
-a _n-1 - beta a ₁ + a ₁ beta = -a _n ,
a _n f ₀ + a _n-1 f ₁ + a _n-2 f ₂ + ... + a ₂ f _n-2 + a ₁ f _n-1 + beta (a _n f _n-1 + a _n-1 f _n-2 + a _n-2 f _n-3 + ... + a ₂ f ₁ + a ₁ f ₀ + a ₀ ) = -a _{n + 1} .
The first n equations of the system are satisfied for any value of beta, the last for
beta = - (a _{n + 1} + a _n f ₀ + a _n-1 f ₁ + a _n-2 f ₂ + ... + a ₂ f _n-2 + a ₁ f _n-1 )
/ (a _n f _n-1 + a _n-1 f _n-2 + a _n-2 f _n-3 + ... + a ₂ f ₁ + a ₁ f ₀ + a ₀ ).

The durbin() function returns an array of autoregressive vectors for the entire range of n.
When n = 1, f = (-a ₁ / a ₀ ), the subsequent coefficient vectors are calculated recurrently (first beta , then f ).

Software implementation

The PHP program has the following functions:

Centering the center() array.
The scalar product of vectors with a shift of the second vector scalar_prod() .
Output array with text comment print_array() .
Derivation of a system of second-order linear equations and its solutions print_s() .
Calculation of the autocorrelation function (ACF) acf() .
Comparison of exact and second order Toeplitz systems and their solutions for a given sample of compare_s() .
Durbin's algorithm for solving a system of special equations durbin() .
Testing the Darbin algorithm test_durbin() .

Text of the program

 function center(&$arr){ $len = count($arr); $aver = array_sum($arr) / $len; foreach($arr as &$item){ $item -= $aver; } } function scalar_prod($a, $b, $shift = 0, &$c = null){ $scal = 0; if(is_null($c)) $cc = []; else $cc = &$c; foreach($a as $key => $item){ $cc[] = $item * $b[$key+$shift]; $scal += end($cc); } return $scal; } function print_array($arr, $str, $n = 11){ print $str."["; foreach($arr as $key => $item){ if(!(($key+1) % $n)) print "<br>"; printf ("\"%03d\" => %.3f,&ensp;", $key, $item); } print "]"; } function print_s($a, $b, $str){ print("<br><br>$str"); printf("<br> %.3f f0 + %.3f f1 = %.3f", $a[0][0], $a[0][1], $b[0]); printf("<br> %.3f f0 + %.3f f1 = %.3f", $a[1][0], $a[1][1], $b[1]); $det = $a[0][0]*$a[1][1] - $a[1][0]*$a[0][1]; $det0 = $b[0]*$a[1][1] - $b[1]*$a[0][1]; $det1 = $a[0][0]*$b[1] - $a[1][0]*$b[0]; printf("<br>Решение: f = [%f, %f]", (float)$det0 / $det, (float)$det1/$det); } function acf($flow, $k, $center = -1, $len = null){ if(is_null($len)){ $len = count($flow); } $slice = array_slice($flow, $k, $len-$k); for($lag = 0; $lag <= $k; $lag++){ $result[$lag] = scalar_prod($slice, $flow, $k-$lag); } if($center != -1){ $denom = 1.0/$result[0]; foreach($result as &$res){ $res *= $denom; } } return $result; } function compare_s($test){ $m = count($test); $acf2 = acf($test, 0, -1, $m-2); $acf1 = acf($test, 1, -1, $m-1); $acf = acf($test, 2); $a_exact = [ [$acf1[0],$acf1[1]], [$acf1[1],$acf2[0]] ]; $a = [ [$acf[0],$acf[1]], [$acf[1],$acf[0]] ]; $b = [-$acf[1], -$acf[2]]; print_s($a_exact, $b, "Контроль симметрии матрицы<br><br>Точная система (порядок 2):"); print_s($a, $b, "Тёплицева система (порядок 2):"); } function durbin($acf, $n){ $ff = []; $f = [-$acf[1]/$acf[0]]; print_array($f, "<br>f = "); $ff[] = $f; for($r = 1; $r < $n; $r++){ $acr = array_reverse(array_slice($acf, 0, $r+1)); $fr = array_reverse($f); $fr[] = 1; $f[] = 0; $beta = - ($acf[$r+1] + scalar_prod($f, $acr))/scalar_prod($fr, $acr); print("<br>beta[$r] = $beta "); $f = array_map(function($a,$b) use($beta){ return $a+$beta*$b; },$f,$fr); $ff[] = $f; print_array($f, "&emsp; f = "); } return $ff; } function test_durbin($arr, $n, $center=0){ $len = count($arr); if($center){ center($arr); } $eps_arr = 0; foreach($arr as $item){ $eps_arr += $item*$item; } printf("<br><br>Решение по Дарбину (порядок АР = %d, длина выборки = $len, СКО выборки = %f):", $n, sqrt($eps_arr/($len-1))); $a = acf($arr, $n); print_array($a, "<br><br>АКФ: "); $ff = durbin($a,$n); $c = array_reverse(end($ff)); $eps = 0; $brr = []; for($j=$n; $j<$len; $j++){ $brr[$j] = $arr[$j]+scalar_prod($c, $arr, $j-$n); $eps += pow($brr[$j],2); } $k = count($f)-1; printf("<br>СКО остатка = %f", sqrt($eps/($len-$n))); } $m = 10; $test = [1.0, 1,0]; for ($i = 2; $i < $m; $i++){ $test[$i] = $test[$i-1] + $test[$i-2]; } print("*** Фибоначчи - 10: ***"); compare_s($test); test_durbin($test, 2); print("<br><br>*** Фибоначчи-10 с центрированием: ***"); test_durbin($test, 2, 1); $m=2000; $dpi = 2*M_PI; for($j=0; $j<$m; $j++) $arr[$j] = sin($j);// - $dpi*floor($j/$dpi)); print("<br><br>*** Авторегрессия по Дарбину (синусная выборка): ***"); compare_s($arr); test_durbin($arr, 1); test_durbin($arr, 2); test_durbin($arr, 3); test_durbin($arr, 4); test_durbin($arr, 5); test_durbin($arr, 6);

Results:

 *** Fibonacci 10: ***

 Matrix symmetry control

 Exact system (order 2):
 1869.000 f0 + 1155.000 f1 = -3024.000
 1155.000 f0 + 714.000 f1 = -1869.000
 Solution: f = [-1.000000, -1.000000]

 Töplitz system (order 2):
 4893.000 f0 + 3024.000 f1 = -3024.000
 3024.000 f0 + 4893.000 f1 = -1869.000
 Solution: f = [-0.618007, -0.000030]

 Durbin solution (order of AP = 2, sample length = 10, MSE of sample = 23.321426):

 ACF: ["000" => 4893.000, "001" => 3024.000, "002" => 1869.000,]
 f = ["000" => -0.618,]
 beta [1] = -2.98035943134E-5 f = ["000" => -0.618, "001" => -0.000,]
 RMS residual = 15.285024

 *** Fibonacci-10 with centering: ***

 Durbin solution (order of AP = 2, sample length = 10, RMS sample = 17.795443):

 ACF: ["000" => 2496.320, "001" => 1399.520, "002" => 716.420,]
 f = ["000" => -0.561,]
 beta [1] = 0.0398418808262 f = ["000" => -0.583, "001" => 0.040,]
 RMS residual = 12.412102

 *** Durbin autoregression (sinus selection): ***

 Matrix symmetry control

 Exact system (order 2):
 999.018 f0 + 539.793 f1 = -539.750
 539.793 f0 + 999.016 f1 = 415.712
 Solution: f = [-1.080605, 1.000000]

 Töplitz system (order 2):
 998.969 f0 + 539.750 f1 = -539.750
 539.750 f0 + 998.969 f1 = 415.712
 Solution: f = [-1.080619, 1.000008]

 Durbin solution (order of AP = 1, sample length = 2000, RMS sample = 0.707169):

 ACF: ["000" => 999.677, "001" => 539.750,]
 f = ["000" => -0.540,]
 RMS residual = 0.595153

 Durbin solution (order of AP = 2, sample length = 2000, MSE of the sample = 0.707169):

 ACF: ["000" => 998.969, "001" => 539.750, "002" => -415.712,]
 f = ["000" => -0.540,]
 beta [1] = 1.00000781118 f = ["000" => -1.081, "001" => 1.000,]
 RMS residual = 0.000009

 Durbin solution (order of AP = 3, sample length = 2000, RMS sample = 0.707169):

 ACF: ["000" => 998.142, "001" => 538.985, "002" => -415.712, "003" => -988.206,]
 f = ["000" => -0.540,]
 beta [1] = 0.999521483275 f = ["000" => -1.080, "001" => 1.000,]
 beta [2] = 0.925724185475 f = ["000" => -0.154, "001" => 0.000, "002" => 0.926,]
 RMS residual = 0.000488

 Durbin solution (order of AP = 4, sample length = 2000, RMS sample = 0.707169):

 ACF: ["000" => 998.122, "001" => 538.857, "002" => -415.831, "003" => -988.206, "004" => -652.029,]
 f = ["000" => -0.540,]
 beta [1] = 0.999342177677 f = ["000" => -1.079, "001" => 0.999,]
 beta [2] = 0.925843078378 f = ["000" => -0.154, "001" => 0.000, "002" => 0.926,]
 beta [3] = 6.00603640925 f = ["000" => 5.406, "001" => 0.000, "002" => 0.000, "003" => 6.006,]
 RMS residual = 0.004358

 Durbin solution (order of AP = 5, sample length = 2000, RMS sample = 0.707169):

 ACF: ["000" => 997.550, "001" => 538.964, "002" => -415.143, "003" => -987.569, "004" => -652.029, "005" => 282.984,]
 f = ["000" => -0.540,]
 beta [1] = 0.999977661062 f = ["000" => -1.081, "001" => 1.000,]
 beta [2] = 0.925422594935 f = ["000" => -0.155, "001" => 0.000, "002" => 0.925,]
 beta [3] = 5.96426000454 f = ["000" => 5.364, "001" => 0.000, "002" => 0.000, "003" => 5.964,]
 beta [4] = -1.11184318911 f = ["000" => -1.267, "001" => -0.000, "002" => 0.000, "003" => 0.000, "004" => -1.112,]
 RMS residual = 0.000027

 Durbin solution (order of AP = 6, sample length = 2000, RMS sample = 0.707169):

 ACF: ["000" => 996.630, "001" => 538.238, "002" => -415.008, "003" => -986.697, "004" => -651.222, "005" => 282.984, "006 "=> 957.015,]
 f = ["000" => -0.540,]
 beta [1] = 0.999627453765 f = ["000" => -1.080, "001" => 1.000,]
 beta [2] = 0.925654012051 f = ["000" => -0.155, "001" => 0.000, "002" => 0.926,]
 beta [3] = 5.98719502289 f = ["000" => 5.387, "001" => 0.000, "002" => 0.000, "003" => 5.987,]
 beta [4] = -1.11131942365 f = ["000" => -1.266, "001" => -0.000, "002" => -0.000, "003" => 0.000, "004" => -1.111,]
 beta [5] = -0.877666481891 f = ["000" => -0.291, "001" => -0.000, "002" => -0.000, "003" => 0.000, "004" => -0.000, " 005 "=> -0.878,]
 RMS residual = 0.000360

The program showed high efficiency on a limited (sinus) sample with a large amount of input data.

There is an English translation: math.stackexchange.com/questions/1887709/…

Answer 4 · 2012-03-29T15:22:45

Your array of elements can be represented as the value of some function at points 1, 2, 3 ... N

Then, to search for a new value (with the argument of the function N + 1), you can use Newton interpolation . Programming is pretty simple.

So we find the polynomial P (x), which takes the value a [i] from the array in the cell with the subscript i (that is, P (i) = a [i]), then to find a [N + 1] value , just substitute in this polynomial instead of x, the value N + 1. Those. answer = P (N + 1).

Well, even if N = 10 ^ 6, then binary exponentiation can be used.

dzhioev dzhioev 10.8k sixteen 35 · Answer 5 · 2012-03-29T14:21:49

There is no definitive answer to this question, it all depends on the dependence your series represents. I recommend to try the least squares method - this is a simple method, especially if we assume that the dependence is linear.

Yes, you are right about this I forgot. Values are random, so I wrote a “prediction” question. - Dex

angry 5,855 15 56 171 · Answer 6 · 2012-03-30T22:58:53

In addition to the neural network, there is another interesting algorithm: ant :)

Applying to your system, it will work something like this: Suppose that the source of the signal when trying to connect can give the following answers: confirmation of transmission readiness or failure.
With the transfer - we increase the "attractiveness" of the PP by some value "A", in case of failure - we decrease it by "B".
"A", "B" - it is necessary to find experimentally for the optimal operation of the system. In addition, you will need to adjust the maximums for the "attractiveness" counters and the counter decreases with time.