Creating a Model from Tabular Data - Create ML - Core ML

Hamed Hashemi

Hi devs, 👋🏻 This tutorial shows how to create a Core ML model from your tabular data, based on Apple's sample code in the documentation. The problem with the original sample code is that it uses the outdated MLDataTable, which has since been replaced by DataFrame. You can go to the original documentation page from this link.

Download the training file from the original documentation linked above. You can also click here to download the file. The resources folder contains a CSV file of Martian housing data: a data table whose columns describe each habitat on Mars:

  • Price

  • Size (area in acres)

  • Number of greenhouses

  • Number of solar panels

  • Primary purpose

Once you have the training data, you can follow along with me.

The playground trains the regressor and classifier models, each with a group of columns relevant to that model. Once trained, the regressor is ready to predict the price of a habitat, and the classifier is ready to predict the purpose of a habitat. The playground concludes by saving each model to a file, ready for integration into an app.

First, we need a playground file. You can delete the code in the playground downloaded from the documentation, or create a new one: with Xcode open, go to File -> New -> Playground. Either way, we need to import some dependencies.

import Foundation
import CreateML
import TabularData

Import the data

// Locating our CSV training data.
let csvFile = Bundle.main.url(forResource: "MarsHabitats", withExtension: "csv")!
// Initialize the DataFrame with the training data.
let dataFrame = try DataFrame(contentsOfCSVFile: csvFile)

print(dataFrame)
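
If you want to see how DataFrame parses CSV before downloading anything, here is a minimal sketch that loads a tiny in-memory CSV instead of the real file. The two rows are made-up values that only mimic the column layout described above; they are not taken from MarsHabitats.csv.

```swift
import Foundation
import TabularData

// Hypothetical sample rows; the values are invented for illustration only.
let sampleCSV = """
price,size,greenhouses,solarPanels,purpose
9000,750,1,2,residence
12000,1200,3,4,farm
"""

// Parse the CSV text directly from memory; the first line is read as the header row.
let sampleFrame = try DataFrame(csvData: Data(sampleCSV.utf8))

// Inspect the column names and the table shape.
let sampleColumnNames = sampleFrame.columns.map(\.name)
print(sampleColumnNames)
print("rows: \(sampleFrame.shape.rows), columns: \(sampleFrame.shape.columns)")
```

Printing the column names is a quick way to check that the names you pass later (such as "price" or "purpose") match the CSV header exactly.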

Isolate the relevant model data

To predict the price, we don't need the purpose column, so we generate a new table from the main data frame.

let regressorColumns = ["price", "solarPanels", "greenhouses", "size"]
let regressorTable = dataFrame[regressorColumns]

To predict the purpose of a habitat, the classifier needs a similar group of columns.

let classifierColumns = ["purpose", "solarPanels", "greenhouses", "size"]
let classifierTable = dataFrame[classifierColumns]
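
To make the column selection concrete, here is a small sketch on a hand-built data frame. It uses the `selecting(columnNames:)` method from TabularData as one way to pick a subset of columns; the frame and its values are hypothetical and exist only for this illustration.

```swift
import TabularData

// Hypothetical values, typed explicitly so each Column's element type is unambiguous.
let demoPrices: [Double] = [9000, 12000]
let demoSizes: [Double] = [750, 1200]
let demoPurposes: [String] = ["residence", "farm"]

var demoHabitats = DataFrame()
demoHabitats.append(column: Column(name: "price", contents: demoPrices))
demoHabitats.append(column: Column(name: "size", contents: demoSizes))
demoHabitats.append(column: Column(name: "purpose", contents: demoPurposes))

// Build a new data frame that keeps only the columns relevant to price prediction.
let demoPriceTable = demoHabitats.selecting(columnNames: ["price", "size"])
print(demoPriceTable.columns.map(\.name))
```

The original frame is left untouched, so you can derive as many column subsets from it as you have models.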

Divide the Data for Training and Evaluation

In this example, the playground sets aside 20% of each model’s data rows for evaluation, leaving the remaining 80% for training.

// Data for Regression
let (regressorEvaluationTable, regressorTrainingTable) = regressorTable.randomSplit(by: 0.20, seed: 5)
// Data for Classification
let (classifierEvaluationTable, classifierTrainingTable) = classifierTable.randomSplit(by: 0.20, seed: 5)
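
Here is a rough sketch of what the split does, using a hypothetical ten-row frame. The exact rows that land in each part depend on the seed, so the example only prints the row counts rather than assuming a particular outcome.

```swift
import TabularData

// A hypothetical single-column frame with ten rows.
let demoSplitSizes: [Double] = (1...10).map { Double($0) * 100 }
var demoSplitFrame = DataFrame()
demoSplitFrame.append(column: Column(name: "size", contents: demoSplitSizes))

// The first slice receives roughly the requested proportion (20%); the second gets the rest.
let (demoEvaluationSlice, demoTrainingSlice) = demoSplitFrame.randomSplit(by: 0.20, seed: 5)

// Slices are views into the original frame; wrap them in DataFrame to get standalone tables.
let demoEvaluationFrame = DataFrame(demoEvaluationSlice)
let demoTrainingFrame = DataFrame(demoTrainingSlice)

print("evaluation rows: \(demoEvaluationFrame.shape.rows)")
print("training rows: \(demoTrainingFrame.shape.rows)")
```

Passing the same seed on every run should give you the same split, which makes training results easier to compare.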

Train the regressor

Notice that we have to convert the type to DataFrame again: after dividing the data into training and evaluation tables, we are left with the 'DataFrame.Slice' type, and we need the 'DataFrame' type.

let regressorTrainingFrame = DataFrame(regressorTrainingTable)
let regressor = try MLLinearRegressor(trainingData: regressorTrainingFrame, targetColumn: "price")

/// The largest distances between predictions and the expected values
let worstTrainingError = regressor.trainingMetrics.maximumError
let worstValidationError = regressor.validationMetrics.maximumError
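
To get a feel for what a "maximum error" means, here is a plain-Swift sketch that computes it by hand, together with a root-mean-square error for comparison. The prices are invented, and this only illustrates the definition of the metrics; it is not how Create ML computes them internally.

```swift
import Foundation

// Hypothetical actual and predicted prices for three habitats.
let demoActualPrices: [Double] = [100, 200, 300]
let demoPredictedPrices: [Double] = [110, 190, 330]

// Absolute distance between each prediction and its expected value.
let demoAbsoluteErrors = zip(demoPredictedPrices, demoActualPrices).map { abs($0 - $1) }

// The maximum error is the single worst prediction.
let demoMaximumError = demoAbsoluteErrors.max() ?? 0

// The root-mean-square error summarizes the typical size of the errors.
let demoSquaredErrorSum = demoAbsoluteErrors.map { $0 * $0 }.reduce(0, +)
let demoRMSE = (demoSquaredErrorSum / Double(demoAbsoluteErrors.count)).squareRoot()

print("maximum error: \(demoMaximumError)")
print("RMSE: \(demoRMSE)")
```

A maximum error that is far larger than the typical error usually points at a few outlier rows rather than a poor model overall.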

Evaluate the regressor

let regressorEvaluationFrame = DataFrame(regressorEvaluationTable)
let regressorEvaluation = regressor.evaluation(on: regressorEvaluationFrame)
let worstEvaluationError = regressorEvaluation.maximumError

Train the classifier

let classifierTrainingFrame = DataFrame(classifierTrainingTable)
let classifier = try MLClassifier(trainingData: classifierTrainingFrame, targetColumn: "purpose")

/// Classifier training accuracy as a percentage
let trainingError = classifier.trainingMetrics.classificationError
let trainingAccuracy = (1.0 - trainingError) * 100
/// Classifier validation accuracy as a percentage
let validationError = classifier.validationMetrics.classificationError
let validationAccuracy = (1.0 - validationError) * 100
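
The error-to-accuracy conversion above can be checked by hand. This plain-Swift sketch uses four invented labels to show how a classification error (the fraction of wrong predictions) turns into an accuracy percentage; it illustrates the arithmetic only, not Create ML's internals.

```swift
// Hypothetical expected and predicted purposes for four habitats.
let demoActualPurposes = ["farm", "residence", "lab", "farm"]
let demoPredictedPurposes = ["farm", "residence", "farm", "farm"]

// Count the predictions that do not match the expected label.
let demoMisclassified = zip(demoPredictedPurposes, demoActualPurposes)
    .filter { $0 != $1 }
    .count

// Classification error is the fraction of wrong predictions.
let demoClassificationError = Double(demoMisclassified) / Double(demoActualPurposes.count)

// Same conversion as in the tutorial code: accuracy as a percentage.
let demoAccuracy = (1.0 - demoClassificationError) * 100

print("error: \(demoClassificationError), accuracy: \(demoAccuracy)%")
```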

Evaluate the classifier

let classifierEvaluationFrame = DataFrame(classifierEvaluationTable)
let classifierEvaluation = classifier.evaluation(on: classifierEvaluationFrame)

/// Classifier evaluation accuracy as a percentage
let evaluationError = classifierEvaluation.classificationError
let evaluationAccuracy = (1.0 - evaluationError) * 100

Save the Models

// Regressor
let regressorMetadata = MLModelMetadata(author: "Hamed Hashemi", shortDescription: "Predicts the price of a habitat on Mars.", version: "1.0")
/// Save the trained regressor model to the Desktop. Replace "your user" with your Mac's username.
try regressor.write(to: URL(filePath: "/Users/\("your user")/Desktop/MarsHabitatPricer.mlmodel"), metadata: regressorMetadata)

// Classifier
let classifierMetadata = MLModelMetadata(author: "Hamed Hashemi", shortDescription: "Predicts the purpose of a habitat on Mars.", version: "1.0")
/// Save the trained classifier model to the Desktop. Replace "your user" with your Mac's username.
/// Use a different file name than the regressor, otherwise the second write overwrites the first file.
try classifier.write(to: URL(filePath: "/Users/\("your user")/Desktop/MarsHabitatPurposeClassifier.mlmodel"), metadata: classifierMetadata)
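
If you would rather not type your username into the path, one possible alternative is to ask FileManager for the Desktop folder. This is a sketch under a few assumptions: the file name is only an example, and the fallback to the temporary directory is my own choice for environments where no Desktop folder is reported.

```swift
import Foundation

// Ask the system for the current user's Desktop folder.
// Falling back to the temporary directory is an assumption made for this sketch.
let demoDesktopURL = FileManager.default
    .urls(for: .desktopDirectory, in: .userDomainMask)
    .first ?? FileManager.default.temporaryDirectory

// Example file name; pick a distinct name for each model you save.
let demoModelURL = demoDesktopURL.appendingPathComponent("MarsHabitatPricer.mlmodel")

print(demoModelURL.path)
```

You could then pass a URL built this way as the `to:` argument of the `write(to:metadata:)` calls above.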

Congrats! You have a model on your desktop now.
