enter image description here

There is such a set of points, each point represents the gps coordinate of the bus (x, y), each point has a timestamp. On the plotted graph visible measurement errors are visible. How can they be removed? The solution should be simple, since the total coordinates are about 100 thousand. Interested in the idea, but it is desirable that it can be implemented without any problems using Java tools.

Sample source data:

1447037729 <tab> 3054.619968 <tab> 2409.828279 <tab> 570d8 

The first field is UNIX time, the second and third are (x, y), respectively, the fourth is the bus identifier (buses around 50). Initial data: https://drive.google.com/file/d/0B4bA9d5B_O_BcVpPUXpYTmZBUFE/view

  • noise can be removed using OpenCV, threshold. example here - Stack
  • @Stack image is just as an illustration of what is happening. We need an algorithm on the set of points (x, y), so that after the operation of this algorithm, there is a set of relatively accurate pairs (x, y). - Simankov
  • Related question: stackoverflow.com/questions/478422/… - sercxjo
  • 2
    Calculate the distance between points, the maximum speed of the bus is known. (Can he move a kilometer in a second?) Throw out incredible movements from the input data. - Sergey
  • 2
    Can take into account the real coordinates of the roads on which the bus can go? - avp

5 answers 5

The reduced set of points is two dependencies: x(t) and y(t) , and for each there is impulse noise. Such data is ideally suited for a median processing in a sliding window of 7–9 elements, when the i-th element in time is replaced by the median value of the elements with numbers (ih, i + h) with h = 3 ... 4.
Processing for x(t) and y(t) should be carried out independently, after which they replace the original arrays.

The processing is effective at a high level of impulse noise (in the test example, the third part of the data is distorted). An additional plus is that the source data format is preserved. Edge processing is carried out on smaller windows.
The drawbacks of processing in a sliding window appear at the turn of the sequence, since the protrusions and dips less than h wide are flattened out.

The demo program presents recurrent array sorting in the window. For this, the points lying between the old (deleted) and the new (added) elements are shifted towards the old element, after which a new element is written in the place of the most recent duplicate. This dramatically reduces computational costs.

Demo program (PHP):

 function print_a($a, $name){ print("$name: "); foreach($a as $item){ printf("%2d, ",$item); } } function slide_median($h, $a){ $size = count($a); $result = []; $slide = []; array_push($slide, reset($a)); array_push($result,$slide[0]); print_a($slide, "&emsp;Сортировка в окне"); print_a($result, "<br>Массив результата"); for($i=1; $i<=$h; $i++){ array_push($slide, next($a), next($a)); sort($slide); array_push($result, $slide[$i]); print_a($slide, "&emsp;Сортировка в окне"); print_a($result, "<br>Массив результата"); } for($i=0; $i < $size-2*$h-1; $i++){ $old = $a[$i]; $new = $a[$i+2*$h+1]; if($old < $new){ for($key = 0; $key <= 2*$h; $key++){ if($new < $slide[$key]){ break; } if(($old <= $slide[$key])&&($slide[$key] < $new)) $slide[$key] = $slide[$key+1]; } $slide[$key-1] = $new; } if($old > $new){ for($key = 2*$h; $key >= 0; $key--){ if($new > $slide[$key]){ break; } if(($old >= $slide[$key])&&($slide[$key] > $new)) $slide[$key] = $slide[$key-1]; } $slide[$key+1] = $new; } array_push($result, $slide[$h]); print("&emsp;old = $old, new =$new"); print_a($slide, "&emsp;Сортировка в окне"); print_a($result, "<br>Массив результата"); } for($i = $h-1; $i > 0; $i--){ $slide = array_slice($a, $size-2*$i-1, 2*$i+1); sort($slide); array_push($result, $slide[$i]); print_a($slide, "&emsp;Сортировка в окне"); print_a($result, "<br>Массив результата"); } $slide = [$a[$size-1]]; array_push($result, $slide[0]); print_a([end($a)], "&emsp;Сортировка в окне"); print_a($a, "<br><br>Исходный массив: "); print_a($result, "<br>Массив результата"); return $result; }; $a = range(20, 40); foreach($a as &$item){ $item += 5*mt_rand(-1,1)*(int)(mt_rand(0,199)/100); } print_a($a, "Исходный массив: "); slide_median(3, $a); 

Results (impulse noise, amplitude 5):

 Source array:: 20, 21, 22, 23, 19, 20, 21, 27, 28, 29, 30, 31, 32, 28, 29, 35, 36, 42, 43, 39, 35, Sort in the window: 20, 
 Result array: 20, Sort on window: 20, 21, 22, 
 Result array: 20, 21, Sorted in the window: 19, 20, 21, 22, 23, 
 Array of result: 20, 21, 21, Sort on the window: 19, 20, 20, 21, 21, 22, 23, 
 Result array: 20, 21, 21, 21, old = 20, new = 27 Sort in a window: 19, 20, 21, 21, 22, 23, 27, 
 Result array: 20, 21, 21, 21, 21, old = 21, new = 28 Sort in the window: 19, 20, 21, 22, 23, 27, 28, 
 Result array: 20, 21, 21, 21, 21, 22, old = 22, new = 29 Sort in the window: 19, 20, 21, 23, 27, 28, 29, 
 The result array: 20, 21, 21, 21, 21, 22, 23, old = 23, new = 30 Sort in the window: 19, 20, 21, 27, 28, 29, 30, 
 Result array: 20, 21, 21, 21, 21, 22, 23, 27, old = 19, new = 31 Sort in the window: 20, 21, 27, 28, 29, 30, 31, 
 Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, old = 20, new = 32 Sort in the window: 21, 27, 28, 29, 30, 31, 32, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, old = 21, new = 28 Sort in the window: 27, 28, 28, 29, 30, 31, 32, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, old = 27, new = 29 Sort in the window: 28, 28, 29, 29, 30, 31, 32, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, old = 28, new = 35 Sort in the window: 28, 29, 29, 30, 31, 32, 35, 
 Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, old = 29, new = 36 Sort in a window: 28, 29, 30, 31, 32, 35, 36, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, old = 30, new = 42 Sort by window: 28, 29, 31, 32, 35, 36, 42, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, old = 31, new = 43 Sort in a window: 28, 29, 32, 35, 36, 42, 43, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, old = 32, new = 39 Sort in a window: 28, 29, 35, 36, 39, 42, 43, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, old = 28, new = 35 Sort in the window: 29, 35, 35, 36, 39, 42, 43, 
 Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, Sort in the window: 35, 36, 39, 42 , 43, 
 Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39, Sort in the window: 35, 39, 43 , 
 Array of result: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39, 39, Sort in the window: 35, 

 Source array:: 20, 21, 22, 23, 19, 20, 21, 27, 28, 29, 30, 31, 32, 28, 29, 35, 36, 42, 43, 39, 35, 
 The result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39, 39, 35,

Comparison of the moving median and the moving average for intense impulse noise was performed using the following program:

 function print_a($a, $name){ print("$name: "); foreach($a as $item){ printf("%3d, ",$item); } } function slide_median($h, $a){ $size = count($a); $result = []; $slide = []; array_push($slide, reset($a)); array_push($result,$slide[0]); for($i=1; $i<=$h; $i++){ array_push($slide, next($a), next($a)); sort($slide); array_push($result, $slide[$i]); } for($i=0; $i < $size-2*$h-1; $i++){ $old = $a[$i]; $new = $a[$i+2*$h+1]; if($old < $new){ for($key = 0; $key <= 2*$h; $key++){ if($new < $slide[$key]){ break; } if(($old <= $slide[$key])&&($slide[$key] < $new)) $slide[$key] = $slide[$key+1]; } $slide[$key-1] = $new; } if($old > $new){ for($key = 2*$h; $key >= 0; $key--){ if($new > $slide[$key]){ break; } if(($old >= $slide[$key])&&($slide[$key] > $new)) $slide[$key] = $slide[$key-1]; } $slide[$key+1] = $new; } array_push($result, $slide[$h]); } for($i = $h-1; $i > 0; $i--){ $slide = array_slice($a, $size-2*$i-1, 2*$i+1); sort($slide); array_push($result, $slide[$i]); } $slide = [$a[$size-1]]; array_push($result, $slide[0]); print_a($a, "<br><br>Исходный массив "); print_a($result, "<br>Массив медиан &emsp;"); return $result; }; function slide_average($h, $a){ $size = count($a); $b = array_merge([0], $a); $sum = reset($a); $result = [$sum]; for($i=1; $i<=$h; $i++){ $sum += next($a)+next($a); $average = (int)($sum/(2*$i+1)+.5); array_push($result, $average); } reset($b); for($i=0; $i < $size-2*$h-1; $i++){ $sum += next($a) - next($b); $average = (int)($sum/(2*$h+1)+.5); array_push($result, $average); } for($i = $h-1; $i >=0; $i--){ $sum -= (next($b) + next($b)); $average = (int)($sum/(2*$i+1)+.5); array_push($result, $average); } print_a($a, "<br><br>Исходный массив "); print_a($result, "<br>Массив средних &ensp;"); return $result; }; $a = range(200, 240); foreach($a as &$item){ $item += 50*mt_rand(-1,1)*(int)(mt_rand(0,149)/100); } slide_median(3, $a); slide_average(3, $a); 

Results:

 Source array: 200, 201, 202, 203, 204, 155, 206, 207, 208, 209, 210, 211, 212, 213, 264, 265, 216, 217, 218, 219, 220, 221, 222, 223 , 224, 225, 226, 227, 228, 229, 230, 231, 232, 183, 234, 235, 236, 287, 238, 239, 240, 
 Array of medians: 200, 201, 202, 202, 203, 204, 206, 207, 208, 209, 210, 211, 212, 213, 216, 217, 218, 219, 219, 219, 220, 221, 222, 223 , 224, 225, 226, 227, 228, 229, 229, 230, 231, 232, 234, 235, 236, 238, 239, 239, 240, 

 Source array: 200, 201, 202, 203, 204, 155, 206, 207, 208, 209, 210, 211, 212, 213, 264, 265, 216, 217, 218, 219, 220, 221, 222, 223 , 224, 225, 226, 227, 228, 229, 230, 231, 232, 183, 234, 235, 236, 287, 238, 239, 240, 
 Array of medium: 200, 201, 202, 196, 197, 198, 199, 200, 201, 209, 210, 218, 226, 227, 228, 229, 230, 231, 225, 219, 220, 221, 222, 223 , 224, 225, 226, 227, 228, 229, 223, 224, 225, 226, 234, 235, 236, 244, 248, 239, 240, 

It can be seen that the moving median smoothes intensive random outliers of data better.

For data from a real array (x, y were rounded to integers):



 X80c48 array processing

 Source Array: 10790, 10728, 10565, 10228, 10148, 9911, 9861, 9880, 9887, 9894, 9907, 9910, 9917, 9932, 9937, 9925, 9900, 9684, 9579, 9446, 9040, 8912, 8703, 8457 , 8350, 8338, 8129, 8040, 7900, 7836, 7731, 7490, 7271, 7250, 7165, 7013, 6912, 6848, 6823, 6857, 6868, 6894, 6902, 6903, 6904, 7114, 7067, 7047, 6994 , 6883, 6848, 6787, 6722, 6412, 6236, 6136, 6012, 5991, 5659, 5648, 5595, 5327, 5079, 5015, 4678, 4639, 4569, 4294, 4241, 4164, 4137, 3948, 3905, 3771 , 3731, 3664, 3577, 3422, 3304, 3086, 3053, 2977, 2967, 3248, 3257, 3202, 2954, 2834, 2594, 2574, 2611, 2730, 2766, 2514, 2387, 2368, 2365, 2344, 2312 , 2098, 1905, 1722, 1579, 1233, 1206, 963, 815, 825, 
 Median array: 10790, 10728, 10565, 10228, 10148, 9911, 9894, 9894, 9894, 9894, 9907, 9910, 9917, 9917, 9917, 9917, 9900, 9684, 9579, 9446, 9040, 8912, 8703, 8457 , 8350, 8338, 8129, 8040, 7900, 7836, 7731, 7490, 7271, 7250, 7165, 7013, 6912, 6868, 6868, 6868, 6868, 6894, 6902, 6903, 6904, 6994, 6994, 6994, 6994 , 6883, 6848, 6787, 6722, 6412, 6236, 6136, 6012, 5991, 5659, 5648, 5595, 5327, 5079, 5015, 4678, 4639, 4569, 4294, 4241, 4164, 4137, 3948, 3905, 3771 , 3731, 3664, 3577, 3422, 3304, 3086, 3086, 3086, 3086, 3053, 2977, 2967, 2954, 2834, 2730, 2730, 2611, 2594, 2574, 2514, 2387, 2368, 2365, 2344, 2312 , 2098, 1905, 1722, 1579, 1233, 1206, 963, 825, 825, 

 Source Array: 10790, 10728, 10565, 10228, 10148, 9911, 9861, 9880, 9887, 9894, 9907, 9910, 9917, 9932, 9937, 9925, 9900, 9684, 9579, 9446, 9040, 8912, 8703, 8457 , 8350, 8338, 8129, 8040, 7900, 7836, 7731, 7490, 7271, 7250, 7165, 7013, 6912, 6848, 6823, 6857, 6868, 6894, 6902, 6903, 6904, 7114, 7067, 7047, 6994 , 6883, 6848, 6787, 6722, 6412, 6236, 6136, 6012, 5991, 5659, 5648, 5595, 5327, 5079, 5015, 4678, 4639, 4569, 4294, 4241, 4164, 4137, 3948, 3905, 3771 , 3731, 3664, 3577, 3422, 3304, 3086, 3053, 2977, 2967, 3248, 3257, 3202, 2954, 2834, 2594, 2574, 2611, 2730, 2766, 2514, 2387, 2368, 2365, 2344, 2312 , 2098, 1905, 1722, 1579, 1233, 1206, 963, 815, 825, 
 Array of medium: 10790, 10694, 10492, 10319, 10189, 10069, 9973, 9927, 9893, 9894, 9904, 9912, 9917, 9918, 9886, 9839, 9772, 9644, 9498, 9323, 9117, 8927, 8749, 8561 , 8418, 8274, 8150, 8046, 7923, 7771, 7645, 7520, 7394, 7262, 7136, 7040, 6981, 6927, 6888, 6872, 6871, 6879, 6920, 6950, 6976, 6990, 6987, 6980, 6963 , 6907, 6813, 6697, 6575, 6450, 6328, 6167, 6013, 5897, 5767, 5616, 5473, 5286, 5140, 4986, 4800, 4645, 4514, 4389, 4285, 4180, 4066, 3985, 3903, 3819 , 3717, 3625, 3508, 3258, 3198, 3151, 3127, 3113, 3094, 3063, 3008, 2952, 2861, 2786, 2723, 2660, 2597, 2564, 2534, 2496, 2437, 2341, 2254, 2159 , 2046, 1885, 1722, 1529, 1346, 1192, 1008, 868, 825, 

 Array Processing y80c48

 The original array: 7289, 7275, 7243, 7178, 7163, 7119, 7109, 6903, 6785, 6680, 6448, 6379, 6264, 5869, 5709, 5448, 5353, 5083, 5081, 5083, 5063, 5062, 5042, 5166 , 5186, 5172, 4916, 4925, 4982, 5009, 5025, 5007, 4993, 4991, 4986, 4976, 4968, 4964, 4962, 4636, 4488, 4148, 4038, 4029, 4010, 3960, 3798, 3709, 3462 , 3475, 3480, 3489, 3542, 3567, 3581, 3598, 3586, 3647, 3649, 3656, 3693, 3727, 3736, 3782, 3788, 3797, 3831, 3837, 3743, 3710, 3550, 3511, 3406 , 3379, 3435, 3520, 3455, 3367, 3204, 3179, 3123, 3115, 2534, 2519, 2472, 2351, 2281, 2141, 2129, 2066, 1862, 1801, 1522, 1302, 1218, 1200, 1038, 851 , 812, 778, 746, 721, 621, 529, 420, 386, 279, 
 Median array: 7289, 7275, 7243, 7178, 7163, 7119, 7109, 6903, 6785, 6680, 6448, 6379, 6264, 5869, 5709, 5448, 5353, 5083, 5083, 5081, 5081, 5081, 5083, 5063 , 5062, 5042, 5009, 5009, 5007, 4993, 4993, 4993, 4993, 4991, 4986, 4976, 4968, 4964, 4962, 4636, 4488, 4148, 4038, 4029, 4010, 3960, 3798, 3709, 3489 , 3489, 3489, 3489, 3598, 3567, 3581, 3586, 3598, 3647, 3649, 3656, 3693, 3727, 3736, 3782, 3788, 3788, 3788, 3788, 3743, 3710, 3550, 3511, 3511 , 3455, 3435, 3406, 3379, 3367, 3204, 3179, 3123, 3115, 2534, 2519, 2472, 2351, 2281, 2141, 2129, 2066, 1862, 1801, 1522, 1302, 1218, 1200, 1038, 851 , 812, 778, 746, 721, 621, 529, 420, 386, 279, 

 The original array: 7289, 7275, 7243, 7178, 7163, 7119, 7109, 6903, 6785, 6680, 6448, 6379, 6264, 5869, 5709, 5448, 5353, 5083, 5081, 5083, 5063, 5062, 5042, 5166 , 5186, 5172, 4916, 4925, 4982, 5009, 5025, 5007, 4993, 4991, 4986, 4976, 4968, 4964, 4962, 4636, 4488, 4148, 4038, 4029, 4010, 3960, 3798, 3709, 3462 , 3475, 3480, 3489, 3542, 3567, 3581, 3598, 3586, 3647, 3649, 3656, 3693, 3727, 3736, 3782, 3788, 3797, 3831, 3837, 3743, 3710, 3550, 3511, 3406 , 3379, 3435, 3520, 3455, 3367, 3204, 3179, 3123, 3115, 2534, 2519, 2472, 2351, 2281, 2141, 2129, 2066, 1862, 1801, 1522, 1302, 1218, 1200, 1038, 851 , 812, 778, 746, 721, 621, 529, 420, 386, 279, 
 Array of mediums: 7289, 7269, 7230, 7197, 7141, 7071, 6991, 6887, 6775, 6653, 6475, 6305, 6114, 5924, 5729, 5544, 5375, 5260, 5168, 5110, 5083, 5098, 5111, 5087 , 5067, 5056, 5051, 5031, 5005, 4980, 4990, 4999, 4998, 4992, 4984, 4977, 4926, 4854, 4735, 4601, 4466, 4330, 4187, 4067, 3956, 3858, 3778, 3699, 3625 , 3559, 3522, 3502, 3519, 3536, 3552, 3574, 3596, 3612, 3630, 3651, 3671, 3699, 3719, 3740, 3765, 3785, 3788, 3784, 3751, 3711, 3655, 3591, 3533, 3502 , 3465, 3439, 3395, 3363, 3326, 3280, 3140, 3006, 2878, 2756, 2628, 2488, 2347, 2280, 2186, 2090, 1972, 1832, 1700, 1567, 1420, 1276, 1135, 1028, 949 , 878, 795, 723, 661, 600, 529, 447, 362, 279,

Compared to the moving average algorithm, the moving median handles the data much more carefully.

There is a translation of the article into English .

  • The coordinates should not be considered independent, there is a connection between them - these are the laws of physics and they should be taken into account. But the normal and tangential accelerations are independent and you can try to smooth them. - sercxjo
  • @sercxjo I spoke only about the independent handling of two dependencies. But about the acceleration at such noises will have to forget. - Yuri Negometyanov

For processing noisy data, you should think about the Kalman filter . Since you have a bus here - just a question - does this bus move along a previously known route?

  • No, the route is unknown, but you can roughly sketch it out - Simankov
  • Kalman filtering is a recurrent implementation of the least squares method. Tell me what and how you are going to optimize. - Yuri Negometyanov
  • @Yuri Negometyanov - a textbook example of the use of this filter - just getting data on the speed and position of the object based on incoming noisy position data. - gbg
  • @gbg In anthologies, they write about normally distributed noise and a model with stable parameters. In this case, there are impulse noise and high dynamics of movement, which will aggravate the initial instability of the filter (visually - the processing results will be depressing to follow the jumps in the original data). Therefore, you should think about non-parametric processing methods. - Yuri Negometyanov

If you need to smooth the graph - there are filtering and approximation algorithms. First, it is a moving average method , which can be slightly adjusted input data for their greater smoothness. Further, you can break the graph into small pieces and approximate them, say, by parabolas using the least squares method (I told about his idea here ). You can also use the Fourier transform of the input data, remove the high frequencies from there (which will be the noise), and then use the inverse transform.

    The graph shows both obvious and implicit errors. There are turns, plus at every turn there are dubious points that you cannot tell right away, this is a mistake, or a bus like this squiggle was driving. That is, simple options such as dropping points for "incredible" movements in sections of turns are unlikely to help.

    I can suggest using local regression smoothing, then build a confidence interval (or rather, a confidence region), not for regression, but for values (see here ), and discard points beyond it. The 95% interval can be very wide, you may need to take 90%, 80%, etc.

    As for the 100 thousand observations, there is no big data here, on a quite ordinary computer one can try a variety of methods.

      I propose two criteria for discarding bad points.

      1. Zigzags - two turns in different directions more than 90 degrees in a row. This is actually possible when rolling back and slipping, but if it happens on one line, then the appearance of the chart will not hurt, and if with a shift, then it is most likely the result of an error. However, at rare points it is possible that the bus did indeed go like this, here the criterion is not clear to me. Another often observed erroneous drift during parking . If you find a zigzag at low speed, you can copy the adjacent point instead of deleting it. It is also possible to consider the criterion of parking constant very low speed for a long time, but it can be traffic in a traffic jam.
      2. Too much normal acceleration with alternating sign (the bus is not lucky) - here I suggest a threshold of 0.4 m / s ^ 2 relative to the previous value.

      Program

       #include <stdio.h> #include <string.h> #include <math.h> #define ZZMAXLEN 40 #define MAXDA 0.4 int main(int argn, char **argv) { double x[4], y[4]; // кольцевые буферы данных int t[4]={0,0,0,0}, n=0, h=0; char id[9]; if(argn!=2) { fprintf(stderr, "Usage:\n\t%s bus-ID\n", argv[0]); return 1; } while(scanf("%d %lf %lf %8s", t+h, x+h, y+h, id)==4) { if(strcmp(id, argv[1])) continue; h= h+1 & 3; if(++n < 4) continue; int dt1= t[h+1&3]-t[h]; int dt2= t[h+2&3]-t[h+1&3]; int dt3= t[h+3&3]-t[h+2&3]; double dx1= x[h+1&3]-x[h]; double dy1= y[h+1&3]-y[h]; double dx2= x[h+2&3]-x[h+1&3]; double dy2= y[h+2&3]-y[h+1&3]; double dx3= x[h+3&3]-x[h+2&3]; double dy3= y[h+3&3]-y[h+2&3]; int filter=0; if(dx1*dx2+dy1*dy2 < 0 && dx2*dx3+dy2*dy3 < 0 && dx1*dx3+dy1*dy3 > 0 && dx2*dx2+dy2*dy2 < ZZMAXLEN*ZZMAXLEN) { // разворот на угол более 90 градусов 2 раза подряд if((dx2*dx2+dy2*dy2)>0 && (dx2*dx2+dy2*dy2)/(dt2*dt2)<0.05) { fprintf(stderr, "D %d\n", n-1); // дрейф filter=2; } else { fprintf(stderr, "Z %d\n", n-2); filter=1; } } // скорости dx1/=dt1; dx2/=dt2; dx3/=dt3; dy1/=dt1; dy2/=dt2; dy3/=dt3; double vx12= (dx1+dx2)/(dt1+dt2); double vy12= (dy1+dy2)/(dt1+dt2); double vx23= (dx2+dx3)/(dt2+dt3); double vy23= (dy2+dy3)/(dt2+dt3); double v12= sqrt(vx12*vx12+vy12*vy12); double v23= sqrt(vx23*vx23+vy23*vy23); // ускорения double ax1= (dx2-dx1)*2/(dt1+dt2); double ay1= (dy2-dy1)*2/(dt1+dt2); double ax2= (dx3-dx2)*2/(dt2+dt3); double ay2= (dy3-dy2)*2/(dt2+dt3); // нормальные ускорения double an1= v12? (vx12*ay1-vy12*ax1)/v12 : 0; double an2= v23? (vx23*ay2-vy23*ax2)/v23 : 0; if(fabs(an1-an2) > MAXDA && an1*an2<0) { fprintf(stderr, "N %d %f %f\n", n-2, an1, an2); // два резких поворота filter=1; } if(filter) { // определяем какую точку удалить if(fabs(an1) > fabs(an2)) if(filter==2) { x[h+1&3]=x[h+2&3]; y[h+1&3]=y[h+2&3]; } else { x[h+1&3]=x[h+2&3]; y[h+1&3]=y[h+2&3]; t[h+1&3]=t[h+2&3]; } else if(filter==2) { x[h+2&3]=x[h+1&3]; y[h+2&3]=y[h+1&3]; } else { x[h+2&3]=x[h+1&3]; y[h+2&3]=y[h+1&3]; t[h+2&3]=t[h+1&3]; } } // тут можно пропустить точки с поворяющимся временем, но я оставил для построения графиков printf("%d\t%lf\t%lf\t%s\n", t[h], x[h], y[h], id); } t[h]=0; do { h=h-1&3; } while (t[h-1&3]<t[h] && t[h-1&3]>0); do { printf("%d\t%lf\t%lf\t%s\n", t[h], x[h], y[h], argv[1]); h= h+1&3; } while (t[h-1&3]<t[h] && t[h]>0); return 0; } 

      The results of the work are obtained on the standard output, and in the standard error output, the list of remote points and the cause (N is normal acceleration, Z is zigzag, D is zigzag in drift):

       $ ./a.out 80c48 < data.txt > 80c48.txt N 154 -0.163844 0.490944 D 249 D 252 Z 303 Z 444 Z 567 N 631 0.283293 -0.136238 Z 636 N 727 0.321036 -0.103197 N 984 -0.366378 0.309231 Z 991 N 1082 0.414078 -0.203378 N 1199 -0.020139 0.572870 D 1213 Z 1414 D 1507 D 1515 D 1517 D 1538 

      And an illustration of the deletion point 154:

      part of the result graphically

      Apparently, the source data has already been processed by some kind of filter, since the time counts are uneven.

      This method cuts out single errors, but GPS data can often accumulate errors (and the Kalman filter too), and to combat this, we can suggest attracting too deflected points to the road grid, which can be obtained by averaging many trajectories.