Best practices

A machine learning project allows a great deal of flexibility and user control, but it is generally accepted that an ML workflow follows the life cycle below.


Business Case Definition

Before starting a machine learning project, it is best to step back and define the business problem at hand. It takes some finesse to match a business problem with the appropriate algorithm and the appropriate data. Often, business problems fall into one of the following categories.

[Figure: ML Business Cases]


Fetching Data

An enterprise can produce enormous amounts of data from many different systems. For use in the Infor AI platform, that data should be managed and stored in the data lake.

Data Preparation

Data preparation tasks can be numerous and time-consuming. It is recommended that you be familiar with tidy data standards before cleaning data. It is also important to understand any caveats of the particular algorithm that you plan to use to process the data. Infor AI has the following built-in tools for data manipulation and preparation, as well as the ability to run Python scripts for complete customization when manipulating data.

| Data Block | Description |
| --- | --- |
| Select Columns | Select or exclude a subset of columns from the current dataset. |
| Remove Duplicates | Remove duplicates in selected features. |
| Construct Feature | Create a new feature from existing ones by using mathematical, logical, or casting operations. |
| Index Data | Transform categorical values into numeric values for the selected columns. Each category is assigned a number according to its occurrence in the data, with the most frequent category assigned 0. |
| Smooth Data | Remove noise from a dataset to allow natural patterns to stand out. |
| Split Data | Split the dataset into training data and test data by specifying the split ratio for the training dataset. |
| Scripting | Execute a customized Python script to perform an activity that is not available in the catalog. |
| Ingest to Data Lake | Ingest data to Infor Data Lake. |
| One Hot Encoder | Transform categorical features into a binary matrix (vectors) to distinguish each categorical label. Each vector consists of 0s in all cells except for a single 1 in the cell that uniquely identifies the label. |
| Feature Scaling | Scale features with varying magnitudes, units, and ranges into normalized values. |
| Handle Missing Data | Replace missing values in selected features (with the mean, mode, a constant value, or interpolation), or remove entire rows that exceed a selected ratio of missing data. |
| Target Encoder | Convert categorical variables to numeric values using the target. Each categorical variable is replaced with a single new numerical variable, in which each category is replaced by its corresponding probability of the target (if the target is categorical) or the average of the target (if the target is numerical). |
| Edit Metadata | Select the Target label. Edit the metadata of the selected features by changing their data type, tagging categorical values, changing the variable name, or defining their machine learning type. |
| Balance Data | Balance the dataset using undersampling or oversampling methods. |
| Execute SQL | Run SQL operations (filter data, join datasets, aggregate data, and so on). |
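
The three encoding blocks described above (Index Data, One Hot Encoder, and Target Encoder) are easy to confuse, so the following minimal sketch shows what each transformation produces on a toy column. It is plain Python written for illustration only, not the Infor AI implementation. The function names, the sample data, the alphabetical tie-breaking, and the alphabetical ordering of one-hot columns are all assumptions.

```python
from collections import Counter, defaultdict


def index_by_frequency(values):
    """Map each category to an integer; the most frequent category gets 0."""
    counts = Counter(values)
    # Assumption: ties are broken alphabetically (the source does not say how).
    ordered = sorted(counts, key=lambda c: (-counts[c], c))
    mapping = {category: i for i, category in enumerate(ordered)}
    return [mapping[v] for v in values], mapping


def one_hot(values):
    """Return one binary vector per value, with a single 1 marking its label."""
    labels = sorted(set(values))  # assumption: columns in alphabetical order
    position = {label: i for i, label in enumerate(labels)}
    vectors = [
        [1 if position[v] == i else 0 for i in range(len(labels))]
        for v in values
    ]
    return vectors, labels


def target_encode(values, targets):
    """Replace each category with the mean of the target within that category.

    For a 0/1 target, this mean is the observed probability of the target.
    """
    sums = defaultdict(float)
    counts = Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {c: sums[c] / counts[c] for c in counts}
    return [means[v] for v in values], means


# Hypothetical sample data: a categorical feature and a binary target.
colors = ["red", "blue", "red", "green", "red", "blue"]
churned = [1, 0, 1, 0, 0, 1]

indexed, index_map = index_by_frequency(colors)
encoded, labels = one_hot(colors)
target_values, target_map = target_encode(colors, churned)
```

Indexing and target encoding each yield one numeric column, while one-hot encoding yields one column per category, which matters when a feature has many distinct values.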


Model Training

Training a model requires the prepared dataset and the algorithm to train with. Supervised algorithms can be scored for accuracy by using the train/test split functionality together with the score and evaluate model blocks. The compare model block trains multiple models so that their performance statistics can be compared.
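
The split, score, and compare steps described above can be sketched in plain Python as follows. This only illustrates the concept and is not how the Infor AI blocks are implemented. The two toy models, the helper names, the seeded shuffle, and the generated data are all assumptions.

```python
import random


def split_data(rows, train_ratio, seed=0):
    """Shuffle rows (seeded for repeatability) and split into train and test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def train_majority(train):
    """Baseline model: always predict the most common training label."""
    labels = [label for _, label in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


def train_threshold(train):
    """Pick the cut-off on x that gives the best training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted(x for x, _ in train):
        acc = sum(int(x >= t) == y for x, y in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: int(x >= best_t)


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Hypothetical data: one numeric feature; the label is 1 when it is at least 50.
rows = [(x, int(x >= 50)) for x in range(100)]
train, test = split_data(rows, train_ratio=0.8)

# Train several models on the same training set and score each on the test set.
models = {"majority": train_majority(train), "threshold": train_threshold(train)}
scores = {name: accuracy(model, test) for name, model in models.items()}
```

Scoring on the held-out test rows, rather than on the training rows, is what makes the comparison between models meaningful.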


Model Fitting & Tuning

Algorithm hyperparameters are available in each of the algorithm blocks. The specific parameters differ depending on the algorithm selected, and details for each hyperparameter can be found in the documentation of the chosen algorithm.
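
As a rough illustration of what tuning a hyperparameter means, the sketch below tries several values of one parameter and keeps the one that scores best on held-out data. It assumes a tiny k-nearest-neighbors classifier in plain Python, with `k` as the hyperparameter. The classifier, the candidate values, and the data are hypothetical, and the parameters exposed by an Infor AI algorithm block will differ.

```python
def knn_predict(train, x, k):
    """Predict the majority 0/1 label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)


def accuracy(train, rows, k):
    """Fraction of rows classified correctly for a given value of k."""
    return sum(knn_predict(train, x, k) == y for x, y in rows) / len(rows)


# Hypothetical data: labels switch from 0 to 1 above x = 5,
# except for one noisy training point at x = 4.
train = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
validation = [(1.5, 0), (3.6, 0), (4.4, 0), (6.5, 1), (8.5, 1)]

# Evaluate each candidate value and keep the best-scoring one.
results = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(results, key=results.get)
```

With `k = 1` the noisy point misleads its neighbors, while larger values outvote it, which is the kind of trade-off a hyperparameter typically controls.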


Model Deployment

In the quest, select the checkbox on each activity that you want in the deployed model, then push those activities to the production quest. The production quest can be deployed as an endpoint that is accessible through the ION API gateway.


Model Maintenance

These best practices offer essential guidance to enhance your processes. For a personalized and thorough implementation tailored to your needs, reach out to Infor Professional Services. Their expertise ensures optimal results for your unique challenges.
