
FunTech ML-meetup



Recently, FunCorp joined the wonderful world of machine learning: our backend engineer taught search engines to read memes. On this occasion, we decided to hold an ML meetup to share our work and, at the same time, learn from more experienced specialists at companies where machine learning is already an important part of the business. We decided, and we did. The meetup will take place on February 9th. The program is under the cut.

Program


“The experience of launching Discover for 90 million users: five recommendations for ML developers”, Andrey Zakonov, vk.com


About the report




“Production in ML”, Mark Andreev, Conundrum.ai


About the report




“How to teach search engines to read memes”, Grigory Kuzovnikov, FunCorp


About the report


iFunny is an application with funny pictures and videos. The only textual content available is user comments, and that is not enough to attract traffic from search engines, so it was decided to extract the text from the images and place it on the pages. A dedicated service was created for this:


The service is written in Python using TensorFlow. Nobody on the team had any experience developing ML services, so we went through all the steps ourselves:

  1. Defining the task.
  2. First experiments: trying to get something that somehow works while experimenting with neural network architectures.
  3. Building a training set.
  4. Training the model and selecting its coefficients.
  5. Creating a service around the trained model and wrapping it in a Docker container.
  6. Deploying the service and connecting it to our PHP monolith. A dry run.
  7. First results and feedback from the initial runs.
  8. Using the recognition results in production.
  9. Analyzing the results.
  10. This is where we are now. We still need to rework and retrain the models to increase the number of correctly recognized memes.


“Machine learning in Yandex.Taxi”, Roman Khalkechev, Yandex.Taxi


About the report


The report will discuss how Yandex.Taxi works under the hood.



“Getting rid of the curse of Sklearn: writing XGBoost from scratch”, Artyom Khapkin, Mail.ru Group


About the report


A story about boosting: what you need to know to write it yourself, what the pitfalls are, and how you can improve its performance.

These days it is hard to imagine a domain where boosted ensembles of decision trees are not used: search engines, recommendation ranking, Kaggle competitions, and much more.

There are many ready-made implementations of the algorithm: CatBoost, LightGBM, XGBoost, and others. However, there are cases when using a ready-made solution out of the box is not a good idea: the understanding of how the algorithm works gets lost, and for certain tasks such implementations are simply not a good fit.

In this report we will analyze the principles of the algorithm and, moving from simple to complex, implement our own XGBoost-style algorithm, which can then be adapted to any machine learning task: classification, regression, ranking, and so on.
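To give a flavor of the "from simple to complex" path the talk describes, here is a minimal gradient boosting regressor built from scratch on NumPy, with depth-1 trees (stumps) as weak learners. This is the core idea behind XGBoost-style boosting for squared loss, without the second-order and regularization machinery of the real library; all names and parameters are illustrative, not from the talk itself.

```python
import numpy as np

class Stump:
    """Depth-1 regression tree: a single threshold split on one feature."""
    def fit(self, X, y):
        # Fallback: predict the overall mean if no split improves on it.
        self.j, self.t, self.lv, self.rv = 0, np.inf, y.mean(), y.mean()
        best = ((y - y.mean()) ** 2).sum()
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                left = X[:, j] <= t
                if left.all():
                    continue  # right leaf would be empty
                lv, rv = y[left].mean(), y[~left].mean()
                err = ((y[left] - lv) ** 2).sum() + ((y[~left] - rv) ** 2).sum()
                if err < best:
                    best, self.j, self.t, self.lv, self.rv = err, j, t, lv, rv
        return self

    def predict(self, X):
        return np.where(X[:, self.j] <= self.t, self.lv, self.rv)

class GradientBoosting:
    """Boosting for squared loss: each stump is fit to the residuals
    (the negative gradient), then added with a small learning rate."""
    def __init__(self, n_rounds=50, lr=0.3):
        self.n_rounds, self.lr = n_rounds, lr

    def fit(self, X, y):
        self.f0 = y.mean()                # initial constant prediction
        pred = np.full(len(y), self.f0)
        self.stumps = []
        for _ in range(self.n_rounds):
            s = Stump().fit(X, y - pred)  # fit the current residuals
            pred = pred + self.lr * s.predict(X)
            self.stumps.append(s)
        return self

    def predict(self, X):
        pred = np.full(len(X), self.f0)
        for s in self.stumps:
            pred = pred + self.lr * s.predict(X)
        return pred
```

Swapping in a different loss only changes what "residual" means (the negative gradient of that loss), which is exactly the kind of adaptation to classification or ranking that the report mentions.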

More information in Telegram.
You can register on Timepad. The number of seats is limited.

For those who cannot come or do not manage to register in time, there will be a broadcast on our channel.

Source: https://habr.com/ru/post/436900/