Human Action Recognition with Convolutional Neural Networks Using Accelerometer Data

Environment setup / Project requirements:
  1. Python
  2. NumPy, Pandas, TensorFlow, Matplotlib, SciPy, and scikit-learn libraries need to be installed.
  3. Jupyter Notebook
Note: I don't recommend using the PyCharm or Atom environments, because preprocessing and visualizing the data is harder there than in a notebook.
Workflow:
  1. Importing Data
  2. Organizing data
  3. Creating a data frame
  4. Pre-Processing data 
  5. Visualizing data using matplotlib
  6. Creating a time frame
  7. Dividing training and testing data
  8. Convolutional neural network model
  9. Checking accuracy through a confusion matrix
Step 1: Importing Data
First, we import the required modules and read the data. Since the raw data is in an irregular text format, we cannot load it directly with pandas; instead, we read it using Python's file handling.
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.layers import Conv2D, MaxPool2D
from tensorflow.keras.optimizers import Adam
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats   # stats.mode is used later when framing the data
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder

file = open('C:/Users/saidh/Downloads/WISDM_ar_v1.1/WISDM_ar_v1.1_raw.txt')
lines = file.readlines()
Step 2: Organizing Data
The raw data is neither clean nor organized properly, so before we can perform any operations on it we have to build a data frame. To do that, every record must first be organized into a fixed number of columns.
processedList = []

for i, line in enumerate(lines):
    try:
        line = line.split(',')
        last = line[5].split(';')[0]
        last = last.strip()
        if last == '':
            break
        temp = [line[0], line[1], line[2], line[3], line[4], last]
        processedList.append(temp)
    except Exception:
        print('Error at line number: ', i)
Step 3: Creating a DataFrame
To pre-process the data it needs to be in a clean, tabular form, so we load the organized list into a pandas DataFrame.
# Column order of the raw WISDM file: user, activity, timestamp, x, y, z
columns = ['user', 'activity', 'time', 'x', 'y', 'z']
data = pd.DataFrame(data = processedList, columns = columns)
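Because every value was parsed out of a text file, the accelerometer columns are still strings at this point. Here is a minimal sketch of the clean-up I assume before moving on (the df name and the dropped columns are my choice, and the later steps reuse them):
# Cast the accelerometer readings from strings to floats
data['x'] = data['x'].astype('float')
data['y'] = data['y'].astype('float')
data['z'] = data['z'].astype('float')

# Keep only the activity label and the three axes for the rest of the pipeline
df = data.drop(['user', 'time'], axis = 1).copy()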
Step 4: Preprocessing the data
In order to build a good model we have to feed it clean data, so we need to do some preprocessing. The given data has an uneven number of samples per activity, which may cause skewness. So, to remove that skew and decrease variance, we create a balanced dataset and standardize it using "StandardScaler". Since a machine learning model cannot understand string labels, we also have to encode them in number format using "LabelEncoder()".
Walking = df[df['activity']=='Walking'].head(3555).copy()
Jogging = df[df['activity']=='Jogging'].head(3555).copy()
Upstairs = df[df['activity']=='Upstairs'].head(3555).copy()
Downstairs = df[df['activity']=='Downstairs'].head(3555).copy()
Sitting = df[df['activity']=='Sitting'].head(3555).copy()
Standing = df[df['activity']=='Standing'].copy()  # assumed to already be the smallest class, which sets the 3555 cap

# Stack the equally sized activity slices into one balanced DataFrame
balanced_data = pd.concat([Walking, Jogging, Upstairs, Downstairs, Sitting, Standing])

label = LabelEncoder()
balanced_data['label'] = label.fit_transform(balanced_data['activity'])
balanced_data.head()
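The text above also mentions standardization. Here is a minimal sketch of that step, assuming we scale only the three axis columns and carry the encoded label along (the scaled_df name is mine and is reused in the later steps):
# Standardize the accelerometer axes to zero mean and unit variance
X = balanced_data[['x', 'y', 'z']]
y = balanced_data['label']

scaler = StandardScaler()
X = scaler.fit_transform(X)

scaled_df = pd.DataFrame(data = X, columns = ['x', 'y', 'z'])
scaled_df['label'] = y.values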
Step 5: Visualizing the data using matplotlib.
Our data is in the form of signals (x-axis, y-axis, z-axis), so we visualize it to see the difference between sitting, standing, walking, jogging, going upstairs, and going downstairs.
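One possible way to produce those plots (a sketch, not necessarily the exact plotting code used for the figures) is to draw a short window of the three axes for each activity:
def plot_activity(activity, df, samples = 180):
    # Plot the first few seconds of x, y, z for one activity
    subset = df[df['activity'] == activity][['x', 'y', 'z']].head(samples)
    fig, axes = plt.subplots(3, 1, figsize = (12, 6), sharex = True)
    for ax, axis_name in zip(axes, ['x', 'y', 'z']):
        ax.plot(subset[axis_name].values)
        ax.set_ylabel(axis_name)
    fig.suptitle(activity)
    plt.show()

for activity in ['Walking', 'Jogging', 'Upstairs', 'Downstairs', 'Sitting', 'Standing']:
    plot_activity(activity, df)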

Step 6: Creating a time frame
In order to train the model, we split the signal into fixed-length time frames. We take a frame of 4 seconds; since the data is sampled at 20 Hz, that gives 20 × 4 = 80 samples per frame (samples per frame = sampling rate × frame duration). To slide the frame along the signal we use a hop size of 2 seconds, which is 40 samples.
def get_frames(df, frame_size, hop_size):
    N_FEATURES = 3
    frames = []
    labels = []
    for i in range(0, len(df) - frame_size, hop_size):
        x = df['x'].values[i: i + frame_size]
        y = df['y'].values[i: i + frame_size]
        z = df['z'].values[i: i + frame_size]

        # Label the frame with the most frequent activity inside it
        # (on SciPy >= 1.9 you may need stats.mode(..., keepdims=True) here)
        label = stats.mode(df['label'][i: i + frame_size])[0][0]
        frames.append([x, y, z])
        labels.append(label)

    # Return arrays shaped (num_frames, frame_size, N_FEATURES) and (num_frames,)
    frames = np.asarray(frames).transpose(0, 2, 1)
    labels = np.asarray(labels)
    return frames, labels
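With the helper defined, we can build the frame arrays. A minimal sketch, assuming the standardized scaled_df from Step 4 and the 80-sample frame / 40-sample hop described above:
frame_size = 20 * 4   # 20 Hz × 4 s = 80 samples per frame
hop_size = 20 * 2     # slide the window forward by 2 s = 40 samples

X, y = get_frames(scaled_df, frame_size, hop_size)
# X has shape (num_frames, 80, 3); y holds one activity label per frame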

Step 7: Dividing training and testing sets
Generally, we divide our data into training, validation, and testing sets. In this case I divided the data only into training and testing sets because of the limited amount of data available. If you want, you can split off a validation set as well.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0, stratify = y)
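Because the model in the next step uses 2-D convolutions, each frame also needs a trailing channel dimension. This reshape is my assumption of the missing step:
# Give each sample the shape (80, 3, 1) expected by Conv2D
X_train = X_train.reshape(X_train.shape[0], frame_size, 3, 1)
X_test = X_test.reshape(X_test.shape[0], frame_size, 3, 1)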

Step 8: Convolutional neural networks
Most of the time, when data is in signal or text form, people use recurrent neural networks such as LSTMs (Long Short-Term Memory). But convolutional networks are very good at extracting features, so I'm using a CNN here.
model = Sequential()
# First convolution block over the (80, 3, 1) input frames
model.add(Conv2D(16, (2, 2), activation = 'relu', input_shape = X_train[0].shape))
model.add(Dropout(0.5))

# Second convolution block with more filters
model.add(Conv2D(32, (2, 2), activation='relu'))
model.add(Dropout(0.2))

# Flatten the feature maps for the dense classifier
model.add(Flatten())

model.add(Dense(64, activation = 'relu'))
model.add(Dropout(0.5))

# One output per activity class, with softmax probabilities
model.add(Dense(6, activation='softmax'))
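The model still has to be compiled and trained before we can evaluate it. A minimal sketch, using the Adam optimizer imported above; the learning rate and epoch count here are my assumptions:
model.compile(optimizer = Adam(learning_rate = 0.001),
              loss = 'sparse_categorical_crossentropy',   # labels are integer-encoded
              metrics = ['accuracy'])

history = model.fit(X_train, y_train, epochs = 10,
                    validation_data = (X_test, y_test), verbose = 1)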
Step 9: Checking accuracy through a confusion matrix
If we could deploy this code on an actual device we could measure the model's accuracy in the field, but we cannot do that here. So, to test the model, we use a confusion matrix, which lets us measure the accuracy for each and every class.
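A minimal sketch of building that matrix with scikit-learn (the plotting of it is left out):
from sklearn.metrics import confusion_matrix

# Predicted class = index of the highest softmax probability
y_pred = np.argmax(model.predict(X_test), axis = 1)

cm = confusion_matrix(y_test, y_pred)
print(cm)               # rows are true labels, columns are predicted labels
print(label.classes_)   # activity name corresponding to each label index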
You can access the complete code on my GitHub.

Note: If you are interested, you can extend the project by deploying this code on a real device.
Note: Thanks to every open source contributor and community (StackOverflow, Github, Answersmind, Starhub).

Sources: 
1) Human Activity Recognition using Accelerometer and Gyroscope Data from Smartphones, International Conference on Emerging Trends in Communication, Control and Computing, IEEE

Expected Viva:

1) Why convolutional neural networks? Why not LSTMs?
A) Firstly, CNNs are very good at extracting features from the data. LSTMs are mainly used to predict the future from previous data, but here we are predicting the person's current action, so there is no particular need for an LSTM.

2) Is it necessary to balance the data?
A) I think it is necessary because of skewness: if we do not balance the data, there is a high chance the model becomes skewed towards the majority classes, and the results will not be accurate.

3) What is the project used for?
A) This project can be used to detect the activities performed by a person, which is helpful for things like estimating the calories burned so that people can stay healthy.

4) What is the need for this algorithm instead of using some accelerometer or other sensor?
A) Standalone sensors and accelerometers are expensive components, so people below the middle class may not be able to afford them, and there is also a high chance of damaging the sensor. If we can put this algorithm into a web application or a mobile app, everyone can install the app and get fairly accurate results.

5) What are the frame size and hop size? (In step 6)
A) Basically, we classify over fixed time intervals rather than single readings, because a person performs an activity for some duration (they cannot jog for just one second and sit in the next second). So we divide the signal into frames of a fixed number of samples (the frame size) and slide that window forward by the hop size.

6) Why didn't you use a validation set?
A) I felt the available data was too little to train a neural network, since they need a lot of data for a particular task, so I decided there was no need for a validation set. Even now, if the accuracy had come out low, I would have used the K-fold cross-validation method.

7) What is the confusion matrix? Why do we use it?
A) Basically, the confusion matrix compares the true labels with the predicted labels, and from it we can read off the accuracy for each particular class.

8) What is the technique you used in this project?
A) This was a classification problem, solved with the help of convolutional neural networks.

made with 💓 by G. Sai Dheeraj
