Skip to content

Latest commit

 

History

History
184 lines (135 loc) · 6.53 KB

File metadata and controls

184 lines (135 loc) · 6.53 KB
title Retrain a Model
sidebarTitle Retrain a Model

Description

The RETRAIN statement is used to retrain the already trained predictors with the new data. The predictor is updated to leverage the new data in optimizing its predictive capabilities.

Retraining takes at least as much time as the training process of the predictor did because now the dataset used to retrain has new or updated data in addition to the old data.

Syntax

Here is the syntax:

RETRAIN [MODEL] project_name.predictor_name
[FROM [integration_name | project_name]
    (SELECT column_name, ...
     FROM [integration_name. | project_name.]table_name)
PREDICT target_name
USING engine = 'engine_name',
      tag = 'tag_name',
      active = 0/1];

On execution, we get:

Query OK, 0 rows affected (0.058 sec)

Where:

Expressions Description
project_name Name of the project where the model resides.
predictor_name Name of the model to be retrained.
integration_name Optional. Name of the integration created using the CREATE DATABASE statement or file upload.
(SELECT column_name, ... FROM table_name) Optional. Selecting data to be used for training and validation.
target_column Optional. Column to be predicted.
engine_name You can optionally provide an ML engine, based on which the model is retrained.
tag_name You can optionally provide a tag that is visible in the training_options column of the mindsdb.models table.
active Optional. Setting it to 0 causes the retrained version to be inactive. And setting it to 1 causes the retrained version to be active.
**Model Versions** Every time the model is retrained, its new version is created with the incremented version number.
You can query for all model versions like this:

```sql
SELECT *
FROM project_name.models;
```

For more information on managing model versions, check out our [docs here](/sql/api/manage-models-versions/).

When to RETRAIN the Model?

It is advised to RETRAIN the predictor whenever the update_status column value from the mindsdb.models table is set to available.

Here is when the update_status column value is set to available:

  • When the new version of MindsDB is available that causes the predictor to become obsolete.
  • When the new data is available in the table that was used to train the predictor.

To find out whether you need to retrain your model, query the mindsdb.models table and look for the update_status column.

Here are the possible values of the update_status column:

Name Description
available It indicates that the model should be updated.
updating It indicates that the retraining process of the model takes place.
up_to_date It indicates that your model is up to date and does not need to be retrained.

Let's run the query.

SELECT name, update_status
FROM mindsdb.models
WHERE name = 'predictor_name';

On execution, we get:

+------------------+---------------+
| name             | update_status |
+------------------+---------------+
| predictor_name   | up_to_date    |
+------------------+---------------+

Where:

Name Description
predictor_name Name of the model to be retrained.
update_status Column informing whether the model needs to be retrained.

Alternatively, use the DESCRIBE command as below:

DESCRIBE MODEL predictor_name;

Example

Let's look at an example using the home_rentals_model model.

First, we check the status of the predictor.

SELECT name, update_status
FROM mindsdb.models
WHERE name = 'home_rentals_model';

On execution, we get:

+--------------------+---------------+
| name               | update_status |
+--------------------+---------------+
| home_rentals_model | available     |
+--------------------+---------------+

Alternatively, use the DESCRIBE command as below:

DESCRIBE MODEL home_rentals_model;

The available value of the update_status column informs us that we should retrain the model.

RETRAIN mindsdb.home_rentals_model;

On execution, we get:

Query OK, 0 rows affected (0.058 sec)

Now, let's check the status again.

SELECT name, update_status
FROM mindsdb.models
WHERE name = 'home_rentals_model';

On execution, we get:

+--------------------+---------------+
| name               | update_status |
+--------------------+---------------+
| home_rentals_model | updating      |
+--------------------+---------------+

And after the retraining process is completed:

SELECT name, update_status
FROM mindsdb.models
WHERE name = 'home_rentals_model';

On execution, we get:

+--------------------+---------------+
| name               | update_status |
+--------------------+---------------+
| home_rentals_model | up_to_date    |
+--------------------+---------------+