Human Activity Recognition Using Logistic Regression


1. Introduction
Overview of Human Activity Recognition with Smartphones
Human Activity Recognition (HAR) is an exciting field in machine learning that focuses on identifying physical activities based on sensor data collected from smartphones, smartwatches, and other wearable devices. With advancements in mobile technology, HAR has found applications in health monitoring, fitness tracking, and human-computer interaction.
The Objective of This Project
In this project, we aim to predict one of six human activities based on motion sensor data from smartphones:
Walking
Walking Upstairs
Walking Downstairs
Sitting
Standing
Laying
By analyzing the sensor readings, our model will classify each observation into one of these activity categories.
Why Logistic Regression?
Logistic Regression is a widely used classification algorithm that predicts the probability of an instance belonging to a particular class. Since our problem involves multi-class classification (six activity labels), we will use the One-vs-Rest (OvR) approach, where a separate logistic regression model is trained for each class.
Dataset: Human Activity Recognition with Smartphones
We will use the Human Activity Recognition with Smartphones dataset, which contains:
561 sensor features extracted from accelerometer and gyroscope data.
A target variable (Activity) representing the six activity categories.
Data collected from 30 volunteers performing daily activities while carrying a smartphone.
Now, let's begin by loading and exploring the dataset.
2. Data Import and Exploration
Importing the Data
We start by loading the dataset and inspecting its structure.
Load the Dataset
import pandas as pd

# Load the dataset
data = pd.read_csv('Human_Activity_Recognition_Using_Smartphones_Data.csv')

# Display the first few rows
print(data.head())
tBodyAcc-mean()-X tBodyAcc-mean()-Y tBodyAcc-mean()-Z tBodyAcc-std()-X \
0 0.288585 -0.020294 -0.132905 -0.995279
1 0.278419 -0.016411 -0.123520 -0.998245
2 0.279653 -0.019467 -0.113462 -0.995380
3 0.279174 -0.026201 -0.123283 -0.996091
4 0.276629 -0.016570 -0.115362 -0.998139
tBodyAcc-std()-Y tBodyAcc-std()-Z tBodyAcc-mad()-X tBodyAcc-mad()-Y \
0 -0.983111 -0.913526 -0.995112 -0.983185
1 -0.975300 -0.960322 -0.998807 -0.974914
2 -0.967187 -0.978944 -0.996520 -0.963668
3 -0.983403 -0.990675 -0.997099 -0.982750
4 -0.980817 -0.990482 -0.998321 -0.979672
tBodyAcc-mad()-Z tBodyAcc-max()-X ... fBodyBodyGyroJerkMag-skewness() \
0 -0.923527 -0.934724 ... -0.298676
1 -0.957686 -0.943068 ... -0.595051
2 -0.977469 -0.938692 ... -0.390748
3 -0.989302 -0.938692 ... -0.117290
4 -0.990441 -0.942469 ... -0.351471
fBodyBodyGyroJerkMag-kurtosis() angle(tBodyAccMean,gravity) \
0 -0.710304 -0.112754
1 -0.861499 0.053477
2 -0.760104 -0.118559
3 -0.482845 -0.036788
4 -0.699205 0.123320
angle(tBodyAccJerkMean),gravityMean) angle(tBodyGyroMean,gravityMean) \
0 0.030400 -0.464761
1 -0.007435 -0.732626
2 0.177899 0.100699
3 -0.012892 0.640011
4 0.122542 0.693578
angle(tBodyGyroJerkMean,gravityMean) angle(X,gravityMean) \
0 -0.018446 -0.841247
1 0.703511 -0.844788
2 0.808529 -0.848933
3 -0.485366 -0.848649
4 -0.615971 -0.847865
angle(Y,gravityMean) angle(Z,gravityMean) Activity
0 0.179941 -0.058627 STANDING
1 0.180289 -0.054317 STANDING
2 0.180637 -0.049118 STANDING
3 0.181935 -0.047663 STANDING
4 0.185151 -0.043892 STANDING
[5 rows x 562 columns]
Feature Overview:
The dataset consists of 561 sensor-based numerical features derived from smartphone accelerometer and gyroscope readings.
Feature names follow a pattern, such as tBodyAcc-mean()-X or tBodyAcc-std()-Y, indicating time-domain (tBodyAcc) and frequency-domain (fBodyAcc, fBodyGyro) features.
Features represent statistical metrics (mean, standard deviation, skewness, kurtosis) and angles derived from motion sensor signals.
Feature Scaling:
Values range between -1 and 1, confirming that the dataset is already normalized.
This ensures compatibility with machine learning models like logistic regression, which benefits from scaled input features.
Target Variable (Activity):
The Activity column is categorical, representing six human activity types.
In this sample, all five rows belong to the STANDING class, but further exploration will confirm the dataset balance.
Potential Preprocessing Steps:
Convert Activity into numerical labels using LabelEncoder.
No missing values were detected, so imputation is not needed (verified in the quick check below).
Features may exhibit high correlations, requiring feature selection techniques.
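Two of these observations are easy to verify directly. The following quick check (a minimal sketch, assuming data is loaded as above) confirms the [-1, 1] feature range and the absence of missing values:
# Verify the documented feature range and check for missing values
features = data.iloc[:, :-1]  # all columns except the Activity label
print("Feature range:", features.min().min(), "to", features.max().max())
print("Missing values:", data.isnull().sum().sum())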
Check Dataset Structure
# Count the column data types
print(data.dtypes.value_counts())
float64 561
object 1
dtype: int64
All sensor features are floating-point numbers, while the target variable Activity is a categorical object.
Activity Label Distribution
Before training the model, we analyze the distribution of activity labels to check if the dataset is balanced.
Compute Class Distribution
# Count occurrences of each activity
activity_counts = data['Activity'].value_counts()
# Display distribution
print(activity_counts)
LAYING 1944
STANDING 1906
SITTING 1777
WALKING 1722
WALKING_UPSTAIRS 1544
WALKING_DOWNSTAIRS 1406
Name: Activity, dtype: int64
Visualize Class Distribution
import matplotlib.pyplot as plt
import seaborn as sns
# Plot activity distribution
plt.figure(figsize=(8, 5))
sns.barplot(x=activity_counts.index, y=activity_counts.values, palette="viridis")
plt.xticks(rotation=45)
plt.xlabel("Activity")
plt.ylabel("Count")
plt.title("Distribution of Activity Labels")
plt.show()
The dataset is fairly balanced, meaning we don't need to apply any resampling techniques.
Encoding Activity Labels
Since scikit-learn classifiers require numerical target values, we need to convert the categorical Activity column into numerical form. Label encoding is a straightforward choice here: it assigns a unique integer to each activity category.
Why Label Encoding Won't Affect the Model
A common concern with Label Encoding is that it introduces an ordinal relationship between the classes (e.g., 0 < 1 < 2). However, this is not a problem in our case because:
Scikit-learn’s Logistic Regression does not assume order in categorical labels. It treats them as distinct classes in a One-vs-Rest (OvR) or Softmax (multinomial mode) framework.
Unlike ordinal regression, multi-class logistic regression learns separate decision boundaries for each category, meaning the numerical values assigned to labels do not impose a ranking.
One-hot encoding is unnecessary for the target variable (y) in classification tasks, because LogisticRegression expects a single column of integer labels. A quick toy check of the first point follows below.
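As a sanity check of this claim, here is a small toy experiment (illustrative only, using synthetic data rather than our dataset): arbitrarily permuting the integer labels leaves the classifier's accuracy unchanged, because logistic regression treats the labels as unordered class identities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data with three classes labeled 0, 1, 2
X, y = make_classification(n_samples=300, n_classes=3, n_informative=5, random_state=0)

# Remap labels (0, 1, 2) -> (2, 0, 1) and refit
remap = np.array([2, 0, 1])
acc_original = LogisticRegression(max_iter=1000).fit(X, y).score(X, y)
acc_remapped = LogisticRegression(max_iter=1000).fit(X, remap[y]).score(X, remap[y])
print(acc_original, acc_remapped)  # the two accuracies match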
Encode Activity Labels with LabelEncoder
from sklearn.preprocessing import LabelEncoder
# Initialize LabelEncoder
le = LabelEncoder()
# Encode the Activity column
data["Activity"] = le.fit_transform(data["Activity"])
# Display mapping of labels to numerical values
label_mapping = dict(zip(le.classes_, le.transform(le.classes_)))
print("Activity Label Mapping:", label_mapping)
Activity Label Mapping: {'LAYING': 0, 'SITTING': 1, 'STANDING': 2, 'WALKING': 3, 'WALKING_DOWNSTAIRS': 4, 'WALKING_UPSTAIRS': 5}
This ensures that our Activity labels are numeric without introducing unintended ordering effects that could mislead the model.
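A convenient by-product of the fitted encoder: inverse_transform maps encoded predictions back to the original activity names, which will be handy when reporting results. A quick illustration:
# Map encoded labels back to their original activity names
print(le.inverse_transform([0, 3, 5]))
# ['LAYING' 'WALKING' 'WALKING_UPSTAIRS']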
3. Feature Analysis and Correlation
Understanding the relationships between features is crucial for building a strong and interpretable model. In this section, we perform a correlation analysis to identify any redundant features that may affect the performance of our logistic regression model.
Visualize the Correlation Distribution
We begin by computing the correlation matrix for all features and visualizing it using a histogram. The correlation matrix shows how each feature is related to others, with values ranging from -1 to 1. A correlation close to 1 or -1 indicates a strong relationship, while a value near 0 suggests no significant relationship.
# Compute the correlation matrix for all features
correlation_matrix = data.iloc[:, :-1].corr()
# Plot a histogram of the absolute correlation values
import matplotlib.pyplot as plt
import seaborn as sns
# Get the absolute values of the correlations for the histogram
abs_correlation = correlation_matrix.abs().stack().reset_index(name="correlation")
plt.figure(figsize=(10, 6))
sns.histplot(abs_correlation["correlation"], bins=30, kde=True)
plt.title("Histogram of Absolute Correlation Values")
plt.xlabel("Correlation Value")
plt.ylabel("Frequency")
plt.show()
The histogram of absolute correlation values shows that most feature pairs have a low to moderate correlation, with a significant peak at 0. This indicates that the majority of features in the dataset are weakly correlated with each other. As the correlation value increases toward 1, the frequency of feature pairs decreases, suggesting that strong correlations (either positive or negative) are less common. The plot suggests that most features provide unique information with minimal redundancy, while a small portion of features may be highly correlated and potentially redundant.
Identify Highly Correlated Features
Next, we identify feature pairs that have a high correlation (greater than 0.8). These features may be redundant, and keeping both in the model can lead to multicollinearity, which affects the stability and interpretability of the regression model. Note that the raw table below includes each feature paired with itself (correlation exactly 1) and lists every pair in both orders; these trivial entries can be filtered out before acting on the results.
# Build a tidy table of pairwise correlations
corr_values = correlation_matrix.stack().reset_index()
corr_values.columns = ['feature1', 'feature2', 'correlation']
corr_values['abs_correlation'] = corr_values['correlation'].abs()

# Sort the correlation values and filter for those above 0.8
highly_correlated = corr_values.sort_values('correlation', ascending=False).query('abs_correlation > 0.8')
# Display the most highly correlated feature pairs
print(highly_correlated)
feature1 feature2 correlation \
0 tBodyAcc-mean()-X tBodyAcc-mean()-X 1.000000
114648 tBodyAccMag-min() tBodyAccMag-min() 1.000000
114086 tBodyAccMag-max() tBodyAccMag-max() 1.000000
113537 tBodyAccMag-mad() tGravityAccMag-mad() 1.000000
113524 tBodyAccMag-mad() tBodyAccMag-mad() 1.000000
... ... ... ...
151896 fBodyAcc-std()-Z fBodyGyro-std()-X 0.800002
114712 tBodyAccMag-min() fBodyAcc-std()-X 0.800001
150565 fBodyAcc-std()-X tGravityAccMag-min() 0.800001
122005 tGravityAccMag-min() fBodyAcc-std()-X 0.800001
150552 fBodyAcc-std()-X tBodyAccMag-min() 0.800001
abs_correlation
0 1.000000
114648 1.000000
114086 1.000000
113537 1.000000
113524 1.000000
... ...
151896 0.800002
114712 0.800001
150565 0.800001
122005 0.800001
150552 0.800001
[46191 rows x 4 columns]
Implications of Correlated Features in Logistic Regression
In logistic regression, multicollinearity occurs when two or more predictors are highly correlated with each other. This can cause unstable coefficients, making it difficult to interpret the model. Multicollinearity can also lead to overfitting, where the model memorizes the training data and performs poorly on unseen data.
By identifying and analyzing the highly correlated features, we can ensure that the features chosen for the logistic regression model are independent and informative. This step helps to improve model stability and avoid issues with overfitting.
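One common remedy, shown here as an optional sketch rather than a step we apply in this walkthrough, is to drop one feature from each highly correlated pair. The snippet builds on the corr_values table computed above:
# Greedy filter: for each pair with |correlation| > 0.8 (excluding self-pairs),
# mark the second feature of the pair for removal
pairs = corr_values.query('abs_correlation > 0.8 and feature1 != feature2')
to_drop = set(pairs['feature2'])
print(f"Candidate features to drop: {len(to_drop)} of {data.shape[1] - 1}")
# data_reduced = data.drop(columns=list(to_drop))  # optional reduced dataset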
4. Splitting Data for Model Training
Properly splitting the data into training and testing sets is a crucial step in model training. It ensures that the model is evaluated on unseen data, providing an unbiased assessment of its performance. In this section, we use StratifiedShuffleSplit to maintain the class balance, ensuring that the proportion of each activity class is preserved in both the training and testing sets.
Train-Test Split
To split the dataset, we use StratifiedShuffleSplit from Scikit-learn. This technique ensures that each class is represented proportionally in both the training and test sets, which is especially important when dealing with imbalanced datasets. In this case, since the activities are balanced, it will help ensure the proportions are consistent between the splits.
# Import StratifiedShuffleSplit from sklearn
from sklearn.model_selection import StratifiedShuffleSplit

# Initialize StratifiedShuffleSplit with a test size of 30%
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.3, random_state=42)

# Create train-test splits
for train_index, test_index in sss.split(data, data['Activity']):
    train_set, test_set = data.iloc[train_index], data.iloc[test_index]

# Define the features and target variable
X_train = train_set.drop("Activity", axis=1)
y_train = train_set["Activity"]
X_test = test_set.drop("Activity", axis=1)
y_test = test_set["Activity"]

# Display class distribution in the training set
print("Training Set Class Distribution:")
print(y_train.value_counts() / len(y_train))

# Display class distribution in the testing set
print("\nTesting Set Class Distribution:")
print(y_test.value_counts() / len(y_test))
Explanation:
StratifiedShuffleSplit is used to split the data into one training set (70%) and one test set (30%) while maintaining the relative distribution of the target variable (activity classes).
StratifiedShuffleSplit is used to split the data into one training set (70%) and one test set (30%) while maintaining the relative distribution of the target variable (activity classes).
We then define the features (X_train, X_test) by dropping the target variable Activity, and the targets (y_train, y_test) by extracting the Activity column.
We verify that the distribution of classes in both the training and testing sets is consistent with the original dataset.
Verify the Class Distribution in Train and Test Sets
After the split, we display the class distribution of the training and testing sets to ensure that the proportion of each activity is maintained across both sets.
Output:
Training Set Class Distribution:
0 0.188792
2 0.185046
1 0.172562
3 0.167152
5 0.149951
4 0.136496
Name: Activity, dtype: float64
Testing Set Class Distribution:
0 0.188673
2 0.185113
1 0.172492
3 0.167314
5 0.149838
4 0.136570
Name: Activity, dtype: float64
This confirms that the class distribution is maintained in both the train and test sets.
5. Logistic Regression Model Training
Training a Baseline Logistic Regression Model
We begin by training a baseline logistic regression model with scikit-learn's default settings. (Note that LogisticRegression applies a mild L2 penalty with C=1.0 by default, so this baseline is not literally unregularized; it simply does not tune the regularization.) The model uses the liblinear solver and is fitted on the training data (X_train, y_train). This solver is suitable for small to medium datasets and handles multi-class problems via a One-vs-Rest scheme.
from sklearn.linear_model import LogisticRegression
# Standard Logistic regression
lr = LogisticRegression(solver='liblinear').fit(X_train, y_train)
This model fits the training data, providing coefficients for each feature that the model uses to classify the data into one of the activity categories.
One-vs-Rest (OvR) Approach for Multi-Class Classification
Logistic Regression is naturally a binary classifier, but in the case of multi-class classification, such as the one in this project with six possible activities (labels), a strategy called One-vs-Rest (OvR) is used. In the OvR approach, a separate binary classifier is trained for each class, where the class is treated as the positive class, and all other classes are treated as the negative class. This method ensures that the model can classify multiple classes.
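To see what OvR looks like in the fitted model, note that lr stores one weight vector per class, and prediction picks the class whose binary classifier returns the highest score. A small illustration (assuming the lr model and the X_test split defined earlier):
import numpy as np

# One OvR weight vector per activity class: shape (6, 561)
print("Coefficient matrix shape:", lr.coef_.shape)

# decision_function yields one score per class; predict takes the argmax
scores = lr.decision_function(X_test.iloc[:5])
print("Argmax of scores:", np.argmax(scores, axis=1))
print("predict() output: ", lr.predict(X_test.iloc[:5]))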
Hyperparameter Tuning with Cross-Validation
Next, we apply hyperparameter tuning using LogisticRegressionCV from sklearn. This estimator uses cross-validation internally to determine the best hyperparameters (such as the regularization strength, C).
We will fit models with both L1 (Lasso) and L2 (Ridge) regularization:
from sklearn.linear_model import LogisticRegressionCV
# L1 regularized Logistic regression
lr_l1 = LogisticRegressionCV(Cs=10, penalty='l1', solver='liblinear', cv=4).fit(X_train, y_train)
# L2 regularized Logistic regression
lr_l2 = LogisticRegressionCV(Cs=10, penalty='l2', solver='liblinear', cv=4).fit(X_train, y_train)
Comparing L1 and L2 Regularization
L1 Regularization (Lasso): This approach tends to shrink some coefficients to zero, effectively performing feature selection. It's particularly useful when dealing with high-dimensional datasets.
L2 Regularization (Ridge): L2 regularization, on the other hand, does not shrink coefficients to zero but rather penalizes large coefficients, helping to prevent overfitting.
The regularization strength parameter C is the inverse of the regularization penalty: smaller values indicate stronger regularization. The best value for C is selected using cross-validation, ensuring that the model neither overfits nor underfits the data.
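LogisticRegressionCV exposes the cross-validated choice through its C_ attribute, one selected value per class under the OvR scheme. A minimal inspection (assuming the lr_l1 and lr_l2 models fitted above):
# Regularization strengths chosen by cross-validation, per activity class
print("Best C per class (L1):", lr_l1.C_)
print("Best C per class (L2):", lr_l2.C_)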
The next steps would involve comparing the performance of these models on the test data, as well as analyzing their coefficients, which reflect the importance of each feature in predicting the activity classes.
6. Model Interpretation and Coefficients
After training the logistic regression models with and without tuned regularization, we examine their coefficients. Comparing the coefficients across models reveals how each regularization scheme shapes the solution and which features drive the activity predictions; the performance metrics themselves (accuracy, precision, recall, F1-score) are covered in the next section.
Comparing the Magnitudes of the Coefficients
The following code compares the magnitude of the coefficients for the baseline logistic regression model (lr), L1 regularized model (lr_l1), and L2 regularized model (lr_l2). By examining the coefficients for each model, we can understand the contribution of each feature to the predictions.
# Combine all the coefficients into a dataframe
coefficients = list()

coeff_labels = ['lr', 'l1', 'l2']
coeff_models = [lr, lr_l1, lr_l2]

for lab, mod in zip(coeff_labels, coeff_models):
    coeffs = mod.coef_  # shape: (6 classes, 561 features)
    coeff_label = pd.MultiIndex.from_product([[lab], range(6)],
                                             names=['model', 'class'])
    coefficients.append(pd.DataFrame(coeffs.T, columns=coeff_label))

coefficients = pd.concat(coefficients, axis=1)
coefficients.sample(10)
The coefficients highlight the relationship between the features and the activity classes. Notably, the L1 regularization shrinks some coefficients towards zero, which helps in feature selection, while the L2 regularization smooths the coefficients to prevent overfitting.
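To make the sparsity effect concrete, we can count how many coefficients each model drives exactly to zero (a small check over the three fitted models):
import numpy as np

# Fraction of coefficients that are exactly zero in each model;
# L1 should zero out many weights, L2 and the baseline almost none
for name, model in [('lr', lr), ('l1', lr_l1), ('l2', lr_l2)]:
    print(f"{name}: {np.mean(model.coef_ == 0):.1%} of coefficients are zero")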
Visualizing the Coefficients
To better understand how the coefficients of the models differ, we can plot them:
fig, axList = plt.subplots(nrows=3, ncols=2)
axList = axList.flatten()
fig.set_size_inches(10, 10)

for loc, ax in enumerate(axList):
    # Coefficients of activity class `loc` across all three models
    coef_set = coefficients.xs(loc, level=1, axis=1)
    coef_set.plot(marker='o', ls='', ms=2.0, ax=ax, legend=False)

    if ax is axList[0]:
        ax.legend(loc=4)

    ax.set(title='Coefficient Set ' + str(loc))
    ax.set_xlabel("Feature Index")
    ax.set_ylabel("Coefficient Value")

plt.tight_layout()
This plot shows the coefficient values for each feature, one panel per activity class, for the three approaches: standard logistic regression (lr), L1 regularization (l1), and L2 regularization (l2). With matplotlib's default color cycle and the column order used above, the lr points appear in blue, l1 in orange, and l2 in green.
From the plots, we can observe that the magnitude and distribution of the coefficients vary depending on the regularization method. L1 regularization tends to shrink some coefficients to exactly zero, making it useful for feature selection, while L2 regularization generally results in smaller, more evenly distributed coefficients. The standard logistic regression model shows more variability in coefficient magnitudes. These plots are essential for comparing the effects of regularization and understanding how each method affects the model's complexity.
7. Model Evaluation and Predictions
Making Predictions
In this part, we use the trained models to predict the activity labels on the test set. Additionally, we generate the probability scores for each activity class. These predictions and probabilities are stored for later evaluation.
# Predict the class and the maximum class probability for each model
y_pred = list()
y_prob = list()

coeff_labels = ['lr', 'l1', 'l2']
coeff_models = [lr, lr_l1, lr_l2]

for lab, mod in zip(coeff_labels, coeff_models):
    y_pred.append(pd.Series(mod.predict(X_test), name=lab))
    y_prob.append(pd.Series(mod.predict_proba(X_test).max(axis=1), name=lab))

y_pred = pd.concat(y_pred, axis=1)
y_prob = pd.concat(y_prob, axis=1)
print(y_pred.head())
lr l1 l2
0 3 3 3
1 5 5 5
2 3 3 3
3 1 1 1
4 0 0 0
The output above shows the predicted class for each model across the test samples. Alongside the class predictions, we stored the maximum predicted probability for each sample; inspecting y_prob.head() gives:
lr l1 l2
0 0.998939 0.998996 0.999998
1 0.988165 0.999799 0.999477
2 0.987592 0.995806 0.999697
3 0.981381 0.999181 0.999865
4 0.998277 0.999921 0.999997
Evaluating Model Performance
To assess the model's performance, we calculate various classification metrics, including precision, recall, F1-score, accuracy, and ROC-AUC. This code calculates each of the metrics:
from sklearn.metrics import precision_recall_fscore_support as score
from sklearn.metrics import confusion_matrix, accuracy_score, roc_auc_score
from sklearn.preprocessing import label_binarize

metrics = list()
cm = dict()

for lab in coeff_labels:
    # Precision, recall, f-score from the multi-class support function
    precision, recall, fscore, _ = score(y_test, y_pred[lab], average='weighted')

    # The usual way to calculate accuracy
    accuracy = accuracy_score(y_test, y_pred[lab])

    # ROC-AUC scores can be calculated by binarizing the data
    auc = roc_auc_score(label_binarize(y_test, classes=[0,1,2,3,4,5]),
                        label_binarize(y_pred[lab], classes=[0,1,2,3,4,5]),
                        average='weighted')

    # Last, the confusion matrix
    cm[lab] = confusion_matrix(y_test, y_pred[lab])

    metrics.append(pd.Series({'precision': precision, 'recall': recall,
                              'fscore': fscore, 'accuracy': accuracy,
                              'auc': auc},
                             name=lab))

metrics = pd.concat(metrics, axis=1)
print(metrics)
lr l1 l2
precision 0.984144 0.983514 0.984477
recall 0.984142 0.983495 0.984466
fscore 0.984143 0.983492 0.984464
accuracy 0.984142 0.983495 0.984466
auc 0.990384 0.989949 0.990553
Let’s visualize the results using a bar chart:
metrics.plot(kind='bar', figsize=(12, 6), rot=0, legend=False)
plt.title("Comparison of Evaluation Metrics with Different Regularization Methods")
plt.xlabel("Metrics")
plt.ylabel("Score")
plt.legend(loc='upper center', bbox_to_anchor=(0.5, 1.3), ncol=3)
plt.tight_layout()
plt.show()
The evaluation metrics for the three models (lr, l1, and l2) are very similar, with precision, recall, F1-score, and accuracy all between 0.9835 and 0.9845. The AUC values are also high, indicating strong model performance. The lr and l2 models show slightly better results than l1, with l2 achieving the highest AUC (0.990553). Overall, all models perform similarly well, with minimal differences in the metrics.
8. Confusion Matrix for Each Model
Displaying the Confusion Matrix
A confusion matrix is a useful tool to evaluate the performance of a classification model by showing the counts of actual versus predicted labels for each class. It helps us understand how well the model is performing in terms of its predictions.
We can display the confusion matrix for each model: logistic regression (lr), L1-regularized logistic regression (l1), and L2-regularized logistic regression (l2). The following code generates and plots the confusion matrix for each model:
# Plot the confusion matrix for each model
fig, axList = plt.subplots(nrows=2, ncols=2)
axList = axList.flatten()
fig.set_size_inches(12, 10)
axList[-1].axis('off')

# Confusion matrix labels
labels = ['0', '1', '2', '3', '4', '5']  # Adjust if you have more/fewer classes

for ax, lab in zip(axList[:-1], coeff_labels):
    # Reuse the confusion matrices computed in the evaluation step
    sns.heatmap(cm[lab], ax=ax, annot=True, fmt='d', cmap='Blues',
                xticklabels=labels, yticklabels=labels)
    ax.set_title(f'Confusion Matrix - {lab}')
    ax.set_xlabel('Predicted')
    ax.set_ylabel('True')

plt.tight_layout()
plt.show()
The confusion matrices for the three models (logistic regression lr, L1-regularized logistic regression l1, and L2-regularized logistic regression l2) show that the models perform similarly, with high counts on the diagonal (correct predictions). A concise analysis:
lr model:
The majority of the predictions are correct (high values on the diagonal).
Misclassifications are minimal but present in some classes, with classes 1 and 5 occasionally confused with others (e.g., 21 samples misclassified as class 1 and 22 as class 2).
l1 model:
Also performs well, with high diagonal values, though it has slightly more misclassifications, especially for class 1: 506 samples are predicted correctly, but 27 are misclassified as class 2.
l2 model:
Shows minimal misclassifications, similar to the lr model. Class 2 has a few more errors (e.g., 20 samples predicted as class 3), but otherwise the performance is strong.
Overall, all three models perform similarly well, with the lr and l2 models exhibiting slightly fewer misclassifications than the l1 model. The confusion matrices indicate strong behavior across the classes, with only minor errors that could be addressed through further tuning or analysis.
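For a per-class view that complements these matrices, scikit-learn's classification_report summarizes precision, recall, and F1-score per activity. A short sketch (assuming y_test, y_pred, and the label encoder le from earlier sections):
from sklearn.metrics import classification_report

# Per-class metrics for the L2-regularized model, with readable activity names
print(classification_report(y_test, y_pred['l2'], target_names=le.classes_))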
9. Conclusion
In this article, we successfully built and evaluated a multi-class logistic regression model for predicting human activity using smartphone data. We started by preparing the data, encoding activity labels, and performing feature analysis and correlation. The results indicated that the dataset contained many highly correlated features, which we carefully considered during model training.
We then trained baseline models using standard logistic regression, L1 (Lasso) regularization, and L2 (Ridge) regularization. We evaluated these models using accuracy, precision, recall, F1-score, and AUC, finding that all models performed similarly, with marginal differences in their ability to classify the activities correctly.
In addition, we visualized the confusion matrix for each model, which showed the number of correct and incorrect predictions across different activity classes. The model performance was high across the board, demonstrating the effectiveness of logistic regression for this classification task.
Finally, we observed that all models performed well with similar metrics, and the regularization did not drastically affect the model's ability to predict activity. Regularization could potentially help prevent overfitting when applied to more complex datasets. For future work, exploring other models, like random forests or gradient boosting machines, might provide further improvements in accuracy.
In summary, logistic regression, with or without regularization, proved to be a solid choice for activity recognition, achieving high accuracy and strong performance across multiple metrics.