
The classification task through the eyes of a high-school student: determining whether a car is in a parking space from surveillance-camera frames

Hello, I am an 11th-grade student interested in programming and IT-related topics.

I am writing this post to share a project that took ten hours of my weekends and was done to get a feel for what modern data analysis methods can do. For readers unfamiliar with this field, the post can serve as an example of a working implementation; from knowledgeable readers, I would ask that you point out my mistakes.

Given: a video stream from a surveillance camera, containing a 100x50-pixel fragment that shows a particular parking space, in which one specific car may either be present or absent.

Camera Image

Parking space image


Task: determine whether the car is present in the parking space.

Obtaining images from the camera


I use the OpenCV library for image retrieval and preprocessing.
To build the dataset for training the neural network, I used the following code: it photographs the parking space once an hour, and after collecting 60 photos, I manually sorted them into photos with and without the car.

dataminer.py
import cv2 as cv
import numpy as np
import time

cap = cv.VideoCapture()
r = 0
while r <= 100:
    cap.open('http://**.**.***.***:***/*****CG?container=mjpeg&stream=main')  # URL of the video stream
    hasFrame, frame = cap.read()  # read a frame from the stream
    frame = frame[100:200, 300:750]
    box = [0, 335, 100, 385]
    quantframe = frame[box[0]:box[2], box[1]:box[3]]  # keep only the part of the frame showing the parking space
    r += 1
    cv.imwrite(str(r) + '.png', quantframe)  # save the parking-space image to a file
    print('saved')
    cap.release()
    time.sleep(3600)  # wait an hour before taking the next photo
    key = cv.waitKey(1)
    if key & 0xFF == ord('q'):
        cv.destroyAllWindows()
        break


Image processing


I thought the right decision was to train the neural network not on the original images, but on images of the car's outline produced with the cv2.findContours(...) function.

Here is the code that converts the source image into a contour image:

Finding contours
def contoursfinder(image):
    img = image.copy()
    # Since contours are found from color differences between parts of the picture,
    # the threshold values have to be chosen to match the picture's color range
    hsv_min = np.array((0, 0, 0), np.uint8)
    hsv_max = np.array((255, 255, 60), np.uint8)
    hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
    thresh = cv.inRange(hsv, hsv_min, hsv_max)
    contours, _ = cv.findContours(thresh, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    # Draw the found contours in white on a black 100x50 canvas
    img = np.zeros((100, 50, 3), np.uint8)
    cv.drawContours(img, contours, -1, (255, 255, 255), 1, cv.LINE_AA)
    return cv.cvtColor(img, cv.COLOR_BGR2GRAY)


The result of the function:

Contour image
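Before training, the saved photos have to be assembled into the arrays trainingimagesarr and trainingimagesarrlabel used below. The post does not show this step, so here is a minimal sketch of how it could look, assuming the manually sorted photos live in folders named 1 (no car) and 2 (car); the folder layout and label values are my assumptions:

import glob
import cv2 as cv
import numpy as np

trainingimagesarr = []
trainingimagesarrlabel = []
for label in (1, 2):  # assumed folders: '1' = no car, '2' = car
    for path in glob.glob(str(label) + '/*.png'):
        img = cv.imread(path)                          # read a saved 100x50 crop
        trainingimagesarr.append(contoursfinder(img))  # convert it to a contour image
        trainingimagesarrlabel.append(label)

trainingimagesarr = np.array(trainingimagesarr)  # shape: (number of photos, 100, 50)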

Neural network


I used the tensorflow (keras) library.

The network architecture is copied from an example found on the Internet; why it works is not obvious to me. If knowledgeable people can explain, or point me to reading on, why this architecture is effective or why some other one would be more effective, I will be immensely grateful.
The model is sequential and consists of an input layer, two dense hidden layers of 256 and 128 neurons, and an output layer.

Code
import tensorflow as tf
from tensorflow import keras

# Judging by the name, tbCallBack is a TensorBoard logging callback; the log directory is my choice
tbCallBack = keras.callbacks.TensorBoard(log_dir='./logs')

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(100, 50)),     # flatten the 100x50 contour image into a vector
    keras.layers.Dense(256, activation=tf.nn.relu),  # first hidden layer
    keras.layers.Dense(128, activation=tf.nn.relu),  # second hidden layer
    keras.layers.Dense(2, activation=tf.nn.softmax)  # two classes: car / no car
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(trainingimagesarr, trainingimagesarrlabel, epochs=1, callbacks=[tbCallBack])


Before training, the entire numpy array was divided by 255 so that the inputs to the neural network are numbers in the range from 0 to 1.

# Scale pixel values from [0, 255] to [0, 1]
trainingimagesarr = trainingimagesarr / 255.0
# Shift the labels down by one so the classes are indexed from 0
trainingimagesarrlabel = np.array(trainingimagesarrlabel) - 1
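With only 60 photos, it is worth checking the model on images it was not trained on. The post does not include such a check, so here is a minimal sketch, assuming a few examples are set aside before calling model.fit (the split and its size are my additions):

# Set aside the last 10 examples before training (the split size is arbitrary)
testimages, testlabels = trainingimagesarr[-10:], trainingimagesarrlabel[-10:]
trainingimagesarr = trainingimagesarr[:-10]
trainingimagesarrlabel = trainingimagesarrlabel[:-10]

# After model.fit(...) has run, measure accuracy on the held-out examples
loss, acc = model.evaluate(testimages, testlabels)
print('held-out accuracy:', acc)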

Now I can call the function:

def realtest():
    cap = cv.VideoCapture()
    cap.open('http://**.**.***.***:***/*****CG?container=mjpeg&stream=main')
    hasFrame, frame = cap.read()
    cap.release()
    frame = frame[100:200, 300:750]
    quantframe = frame[0:100, 275:325]       # cut out the parking-space region
    quantframe = contoursfinder(quantframe)  # convert it to a contour image
    # Class 1 is "car present"; scale pixels to [0, 1] to match the training preprocessing
    return model.predict(np.array([quantframe]) / 255.0)[0][1] > 0.60

and obtain information about whether the car is present in the parking space.
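For example, a simple polling loop on top of realtest can report the status of the space (the one-minute interval is an arbitrary choice of mine):

import time

while True:
    status = 'occupied' if realtest() else 'free'
    print('parking space is', status)
    time.sleep(60)  # check once a minute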

Don't be too harsh on me, though a little criticism is welcome :-)

Thanks!

Source: https://habr.com/ru/post/440608/