What is Google Cloud AutoML

AutoML - what do the clouds have to offer?

Author: Floarea Serban

The various AutoML services automatically build and evaluate numerous ML models on a previously defined dataset. The generated models can be deployed in the public cloud or in a container and later integrated into applications via an API.

What is AutoML?

AutoML is a process that automates the repetitive, reusable tasks of the data science workflow. It allows data engineers, data scientists, analysts and developers to build models with greater scalability, speed, efficiency and productivity while maintaining good model quality.

According to Gartner, AutoML has attracted great interest in recent years, for use cases such as "Sales lead scoring, risk assessment and next-best-action recommendation". In addition, one of Gartner's strategic hypotheses is that by 2022 the share of applications using AutoML will increase from 1% to 25%.

The traditional development of ML models requires considerable resources and a range of specialist roles. On the one hand, data engineers have to obtain and provide large amounts of data from various sources. On the other hand, it is up to data scientists and business analysts to understand, aggregate and transform the data. At the same time, ML researchers are constantly developing and optimizing new algorithms and model structures. Software developers integrate these into reusable libraries, which ultimately end up in production applications. Only then is the loop closed, and the optimized models can (at best) generate business value.

AutoML aims to automate the entire data science process - from data cleansing to parameter optimization. This process goes through the following steps:

  1. Data cleansing
  2. Feature pre-processing, selection and construction
  3. Model selection
  4. Parameter optimization

So far, most AutoML tools have focused on model selection and parameter optimization.
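
To make the four steps more concrete, here is a minimal sketch of what an AutoML tool automates, written with the open-source scikit-learn library rather than any of the cloud services. The dataset (Iris), the two candidate models and the small parameter grid are assumptions chosen purely for illustration; real AutoML services search far larger spaces with smarter strategies than an exhaustive grid.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption; any tabular dataset would do).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Steps 1 and 2: data cleansing and feature pre-processing/selection.
pipeline = Pipeline([
    ("clean", SimpleImputer(strategy="median")),      # fill missing values
    ("scale", StandardScaler()),                      # normalize features
    ("select", SelectKBest(score_func=f_classif)),    # keep the k best features
    ("model", LogisticRegression()),                  # placeholder, swapped by the search
])

# Steps 3 and 4: candidate models, each with its own parameter grid.
search_space = [
    {
        "select__k": [2, 4],
        "model": [LogisticRegression(max_iter=1000)],
        "model__C": [0.1, 1.0, 10.0],
    },
    {
        "select__k": [2, 4],
        "model": [RandomForestClassifier(random_state=0)],
        "model__n_estimators": [50, 100],
    },
]

# Cross-validated search over every model/parameter combination.
search = GridSearchCV(pipeline, search_space, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_model_name = type(search.best_estimator_.named_steps["model"]).__name__
test_accuracy = search.score(X_test, y_test)
print(best_model_name, search.best_params_, round(test_accuracy, 3))
```

The search treats the model itself as just another parameter, which is why model selection and parameter optimization are often handled together in one loop.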