Creating a Model from Tabular Data - Create ML - Core ML
Hi devs, 👋🏻 This is a tutorial about how to create a Core ML model from your tabular data, based on Apple's sample code in the documentation. The problem with the original sample code is that it uses the outdated MLDataTable, which has now been replaced by DataFrame. You can go to the original documentation page from this link.
Download the training file from the original documentation linked above. You can also click here to download the file. The resources folder contains a CSV file of Martian housing data: a data table with columns of information about each habitat on Mars:
Price
Size (area in acres)
Number of greenhouses
Number of solar panels
Primary purpose
Once you have the training data, you can follow along with me.
The playground trains the regressor and classifier models, each with a group of columns relevant to that model. Once trained, the regressor is ready to predict the price of a habitat, and the classifier is ready to predict the purpose of a habitat. The playground concludes by saving each model to a file, ready for integration into an app.
First, we need a playground file. You can either delete the code in the playground downloaded from the documentation or create a new one: with Xcode open, go to File -> New -> Playground. Either way, we need to import some dependencies.
import Foundation
import CreateML
import TabularData
Import the data
// Locating our CSV training data.
let csvFile = Bundle.main.url(forResource: "MarsHabitats", withExtension: "csv")!
// Initializing the DataFrame with the training data.
let dataFrame = try DataFrame(contentsOfCSVFile: csvFile)
print(dataFrame)
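If you want to experiment with the loading step before wiring up the downloaded file, a DataFrame can also be built from in-memory CSV data. The snippet below is only a sketch: the three rows are made up, and the column names are assumed to match the ones used in the rest of this tutorial. Passing explicit column types is optional, but it avoids surprises from type inference.

```swift
import Foundation
import TabularData

// Made-up sample rows; column names are assumed from the rest of the tutorial.
let sampleCSV = """
price,size,greenhouses,solarPanels,purpose
1000,500,2,4.0,residence
2500,1200,5,8.5,farm
1800,900,3,6.0,residence
"""

// Parse the CSV text from memory and state the type of each column explicitly.
let sampleFrame = try DataFrame(
    csvData: Data(sampleCSV.utf8),
    types: [
        "price": .double,
        "size": .double,
        "greenhouses": .integer,
        "solarPanels": .double,
        "purpose": .string
    ]
)

let sampleRowCount = sampleFrame.shape.rows
let sampleColumnNames = sampleFrame.columns.map(\.name)
print(sampleRowCount, sampleColumnNames)
```

Printing the shape and the column names is a quick way to confirm the file was parsed the way you expect before training anything.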
Isolate the relevant model data
To predict the price, we don't need the purpose column, so we generate a new table from the main data frame.
let regressorColumns = ["price", "solarPanels", "greenhouses", "size"]
let regressorTable = dataFrame[regressorColumns]
To predict the purpose of a habitat, the classifier needs a similar group of columns.
let classifierColumns = ["purpose", "solarPanels", "greenhouses", "size"]
let classifierTable = dataFrame[classifierColumns]
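As an alternative way to pick a subset of columns, TabularData also offers selecting(columnNames:). Here is a small sketch on made-up data (the rows are invented and the column names are assumed from this tutorial), showing that the purpose column is left out of the regressor's table.

```swift
import Foundation
import TabularData

// Made-up rows using the tutorial's column names.
let habitatsCSV = """
price,size,greenhouses,solarPanels,purpose
1000,500,2,4.0,residence
2500,1200,5,8.5,farm
"""
let habitats = try DataFrame(csvData: Data(habitatsCSV.utf8))

// Keep only the columns the price regressor needs.
let regressorOnly = habitats.selecting(
    columnNames: ["price", "solarPanels", "greenhouses", "size"]
)
let selectedNames = regressorOnly.columns.map(\.name)
print(selectedNames)
```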
Divide the Data for Training and Evaluation
In this example, the playground sets aside 20% of each model’s data rows for evaluation, leaving the remaining 80% for training.
// Data for Regression
let (regressorEvaluationTable, regressorTrainingTable) = regressorTable.randomSplit(by: 0.20, seed: 5)
// Data for Classification
let (classifierEvaluationTable, classifierTrainingTable) = classifierTable.randomSplit(by: 0.20, seed: 5)
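To get a feel for what the split produces, here is a minimal sketch on ten made-up rows in a hypothetical "size" column. It assumes, as the code above does, that the first element of the returned tuple holds the requested proportion. Fixing the seed makes the split repeatable between runs.

```swift
import Foundation
import TabularData

// Ten made-up rows in a single hypothetical "size" column.
let tenRowsCSV = "size\n" + (1...10).map { String($0 * 100) }.joined(separator: "\n")
let tenRows = try DataFrame(csvData: Data(tenRowsCSV.utf8))

// Roughly 20% of the rows go to the first slice, the rest to the second.
let (evaluationSlice, trainingSlice) = tenRows.randomSplit(by: 0.20, seed: 5)

// Each half is a DataFrame.Slice; wrap it to get a DataFrame back.
let evaluationFrame = DataFrame(evaluationSlice)
let trainingFrame = DataFrame(trainingSlice)

let evaluationCount = evaluationFrame.shape.rows
let trainingCount = trainingFrame.shape.rows
print(evaluationCount, trainingCount)
```

Every row ends up in exactly one of the two halves, so the two counts always add up to the original number of rows.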
Train the regressor
Notice that we have to convert the type to DataFrame again. After dividing the data into training and evaluation tables, we are left with values of type 'DataFrame.Slice', but the regressor needs the 'DataFrame' type.
let regressorTrainingFrame = DataFrame(regressorTrainingTable)
let regressor = try MLLinearRegressor(trainingData: regressorTrainingFrame, targetColumn: "price")
/// The largest distances between predictions and the expected values
let worstTrainingError = regressor.trainingMetrics.maximumError
let worstValidationError = regressor.validationMetrics.maximumError
Evaluate the regressor
let regressorEvaluationFrame = DataFrame(regressorEvaluationTable)
let regressorEvaluation = regressor.evaluation(on: regressorEvaluationFrame)
let worstEvaluationError = regressorEvaluation.maximumError
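To make the maximumError metric less abstract, here is a hand-rolled sketch of what it measures, next to the root-mean-squared error that regression metrics usually report as well. The predicted and actual prices are made up, and this is plain Swift for illustration, not Create ML's own implementation.

```swift
import Foundation

// Made-up predicted and actual prices, purely for illustration.
let predictedPrices: [Double] = [100, 250, 180]
let actualPrices: [Double] = [110, 240, 200]

// Absolute distance between each prediction and its expected value.
let absoluteErrors = zip(predictedPrices, actualPrices).map { abs($0 - $1) }

// The single worst miss across all rows.
let maximumError = absoluteErrors.max() ?? 0

// Square root of the average squared miss.
let meanSquaredError = absoluteErrors.map { $0 * $0 }.reduce(0, +) / Double(absoluteErrors.count)
let rootMeanSquaredError = meanSquaredError.squareRoot()

print(maximumError, rootMeanSquaredError)
```

Here the misses are 10, 10, and 20, so the maximum error is 20 while the root-mean-squared error is the square root of 200, about 14.14. A large gap between the two hints that a few rows are predicted much worse than the rest.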
Train the classifier
let classifierTrainingFrame = DataFrame(classifierTrainingTable)
let classifier = try MLClassifier(trainingData: classifierTrainingFrame, targetColumn: "purpose")
/// Classifier training accuracy as a percentage
let trainingError = classifier.trainingMetrics.classificationError
let trainingAccuracy = (1.0 - trainingError) * 100
/// Classifier validation accuracy as a percentage
let validationError = classifier.validationMetrics.classificationError
let validationAccuracy = (1.0 - validationError) * 100
Evaluate the classifier
let classifierEvaluationFrame = DataFrame(classifierEvaluationTable)
let classifierEvaluation = classifier.evaluation(on: classifierEvaluationFrame)
/// Classifier evaluation accuracy as a percentage
let evaluationError = classifierEvaluation.classificationError
let evaluationAccuracy = (1.0 - evaluationError) * 100
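The accuracy arithmetic above can be checked by hand. The sketch below uses made-up predicted and actual labels (the label values are hypothetical) to show how a classification error turns into an accuracy percentage.

```swift
import Foundation

// Made-up labels, purely for illustration.
let predictedPurposes = ["farm", "residence", "farm", "lab", "residence"]
let actualPurposes = ["farm", "residence", "lab", "lab", "residence"]

// Count the rows where the prediction differs from the expected label.
let misclassified = zip(predictedPurposes, actualPurposes).filter { $0 != $1 }.count

// Classification error is the fraction of wrong predictions.
let classificationError = Double(misclassified) / Double(actualPurposes.count)

// Same conversion as in the tutorial: accuracy as a percentage.
let accuracyPercentage = (1.0 - classificationError) * 100
print(classificationError, accuracyPercentage)
```

One wrong label out of five gives a classification error of 0.2, which corresponds to 80% accuracy.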
Save the Models
// Regressor
let regressorMetadata = MLModelMetadata(author: "Hamed Hashemi", shortDescription: "Predicts the price of a habitat on Mars.", version: "1.0")
/// Save the trained regressor model to the Desktop. Replace "your user" with your Mac's username.
try regressor.write(to: URL(filePath: "/Users/\("your user")/Desktop/MarsHabitatPricer.mlmodel"), metadata: regressorMetadata)
// Classifier
let classifierMetadata = MLModelMetadata(author: "Hamed Hashemi", shortDescription: "Predicts the purpose of a habitat on Mars.", version: "1.0")
/// Save the trained classifier model to the Desktop. Replace "your user" with your Mac's username.
/// Use a different file name for each model so the second write doesn't overwrite the first.
try classifier.write(to: URL(filePath: "/Users/\("your user")/Desktop/MarsHabitatPurposeClassifier.mlmodel"), metadata: classifierMetadata)
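If you'd rather not type your username into the path, one option is to ask FileManager for the Desktop folder. This is a sketch, not the only way to do it: the file names are just examples, and the fallback to the temporary directory is my own assumption for environments where no Desktop folder is reported.

```swift
import Foundation

// Ask the system for the current user's Desktop folder;
// fall back to the temporary directory if none is reported.
let desktopURL = FileManager.default
    .urls(for: .desktopDirectory, in: .userDomainMask)
    .first ?? FileManager.default.temporaryDirectory

// Example file names; give each model its own file.
let regressorURL = desktopURL.appendingPathComponent("MarsHabitatPricer.mlmodel")
let classifierURL = desktopURL.appendingPathComponent("MarsHabitatPurposeClassifier.mlmodel")

print(regressorURL.path)
print(classifierURL.path)
```

You could then pass regressorURL and classifierURL to the write(to:metadata:) calls instead of the hand-built paths.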
Congrats! You have a model on your desktop now.