Chapter 1 Introduction

Malaria is a deadly disease predominantly caused by the parasite Plasmodium falciparum, which is transmitted by mosquitoes. Its symptoms include high fever and chills, and it can be fatal if not treated in time. The disease is diagnosed by capturing images of blood cells; the presence of parasitic stains in a cell image indicates malaria.

1.1 Dataset

The Kaggle dataset consists of two folders, one containing uninfected and one containing parasitized cell images. The images are color images, resized to \(128 \times 128\) pixels. The original dataset provides around 25,000 images in total. However, for running our simple DL algorithm we use only 100 images from each class.
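The selection of 100 images per class can be sketched as follows. This is a minimal sketch, not the code used for this report: the folder names `Uninfected` and `Parasitized`, the helper name `sample_image_paths`, and the use of a seeded random sample (rather than, say, the first 100 files) are all assumptions made for illustration.

```python
import random
from pathlib import Path

# File extensions treated as images (an assumption; adjust to the dataset).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def sample_image_paths(root, class_names=("Uninfected", "Parasitized"),
                       per_class=100, seed=0):
    """Return {class_name: [paths]} with `per_class` image paths per class.

    `root` is the dataset directory holding one sub-folder per class.
    The folder names are hypothetical placeholders.
    """
    rng = random.Random(seed)  # seeded so the subset is reproducible
    selected = {}
    for name in class_names:
        files = sorted(
            p for p in (Path(root) / name).iterdir()
            if p.suffix.lower() in IMAGE_SUFFIXES
        )
        if len(files) < per_class:
            raise ValueError(f"{name}: only {len(files)} images available")
        selected[name] = rng.sample(files, per_class)
    return selected
```

Fixing the seed keeps the 200-image subset identical between runs, which makes later accuracy figures comparable.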

1.2 Python Programming Environment

This section describes the Python libraries we use for our malaria detection method. As we are developing in Google Colab, we do not need to install any modules; they are already configured in the Colab environment, so we only have to import the libraries our application requires. The Python code is grouped into separate code chunks for easy understanding.
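Outside Colab the libraries may not be preinstalled, so it can help to check for them before importing. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the module list is our reading of the libraries named in this chapter plus `tqdm` and `tensorflow`, which appear in the chunk output.

```python
import importlib.util

# Import names of the libraries this chapter relies on (assumed list).
REQUIRED_MODULES = [
    "numpy", "pandas", "matplotlib", "seaborn",
    "cv2",         # OpenCV
    "sklearn",     # Scikit-learn
    "tqdm", "tensorflow",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```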

1.2.1 Importing of common array and plotting libraries

NumPy and pandas are used for importing and manipulating data. Matplotlib and Seaborn are used for plotting graphs and images.

1.2.2 Importing of OpenCV and Scikit-learn

OpenCV is used for image processing and Scikit-learn for machine learning.

1.3 Data Preprocessing

Here we initialize two lists, X and Z, to hold the images and the labels respectively.

1.3.1 Creation of lists for the images and the labels

Here we define two functions that build the image list and the label list.
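A minimal sketch of two such functions is given below. The names `assign_label`, `make_train_data` and `read_and_resize` are assumptions for illustration, as is the pluggable `reader` argument, which exists only so the list-building logic can be tried without OpenCV installed. The lists X and Z and the image size of 128 follow the text.

```python
import os

IMG_SIZE = 128   # images are resized to 128 x 128
X = []           # image arrays
Z = []           # class labels, one per image


def read_and_resize(path, size=IMG_SIZE):
    """Read a color image with OpenCV and resize it to size x size."""
    import cv2  # imported lazily so the rest of the sketch runs without it
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    if img is None:
        raise ValueError(f"could not read image: {path}")
    return cv2.resize(img, (size, size))


def assign_label(cell_type):
    """Return the label stored for every image of the given class."""
    return str(cell_type)


def make_train_data(cell_type, folder, limit=100, reader=read_and_resize):
    """Append up to `limit` images from `folder` to X and their labels to Z.

    Returns the total number of images collected so far.
    """
    for name in sorted(os.listdir(folder))[:limit]:
        X.append(reader(os.path.join(folder, name)))
        Z.append(assign_label(cell_type))
    return len(X)
```

Calling `make_train_data` once per class folder with `limit=100` would give running totals of 100 and then 200 images.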

Here we call the functions to fill the image and label lists. The printed counts show 100 images after the first class has been loaded and 200 after the second.

## 100%|##########| 100/100 [00:00<00:00, 1217.99it/s]
## 100
## 100%|##########| 100/100 [00:00<00:00, 1520.84it/s]
## 200

1.6 Deep Learning Model based on CNN

This section describes our deep learning model, which is based on a CNN. We use 4 convolutional blocks, each followed by a max-pooling block. The final classification layer uses the softmax function to classify the cell images as uninfected or parasitized.
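One way such a model could be written in Keras is sketched below. The filter counts, the dense layer of 512 units and the 5 x 5 kernel in the first block are inferred from the layer summary in Section 1.7; the `same` padding, the ReLU activations and the function name `build_cnn` are assumptions, so this should be read as a reconstruction rather than the exact code behind this report.

```python
# (filters, kernel size) for the four convolutional blocks, inferred from
# the output shapes and parameter counts in the model summary.
CONV_BLOCKS = [(32, 5), (64, 3), (96, 3), (96, 3)]


def build_cnn(input_shape=(128, 128, 3), num_classes=2):
    """Build a CNN with four conv + max-pool blocks and a softmax output."""
    # Imported lazily so that the block can be loaded without TensorFlow.
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for filters, kernel in CONV_BLOCKS:
        # "same" padding keeps height and width; pooling then halves them.
        model.add(layers.Conv2D(filters, (kernel, kernel),
                                padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation="relu"))
    # Two output units, one per class, normalized by softmax.
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```

Calling `build_cnn().summary()` prints a layer table that can be compared against the one in Section 1.7.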

1.7 Deep learning model architecture

This section shows the architecture of the CNN-based DL model for cell image classification. The number of parameters in each layer is also listed.

## Model: "sequential"
## _________________________________________________________________
## Layer (type)                 Output Shape              Param #   
## =================================================================
## conv2d (Conv2D)              (None, 128, 128, 32)      2432      
## _________________________________________________________________
## max_pooling2d (MaxPooling2D) (None, 64, 64, 32)        0         
## _________________________________________________________________
## conv2d_1 (Conv2D)            (None, 64, 64, 64)        18496     
## _________________________________________________________________
## max_pooling2d_1 (MaxPooling2 (None, 32, 32, 64)        0         
## _________________________________________________________________
## conv2d_2 (Conv2D)            (None, 32, 32, 96)        55392     
## _________________________________________________________________
## max_pooling2d_2 (MaxPooling2 (None, 16, 16, 96)        0         
## _________________________________________________________________
## conv2d_3 (Conv2D)            (None, 16, 16, 96)        83040     
## _________________________________________________________________
## max_pooling2d_3 (MaxPooling2 (None, 8, 8, 96)          0         
## _________________________________________________________________
## flatten (Flatten)            (None, 6144)              0         
## _________________________________________________________________
## dense (Dense)                (None, 512)               3146240   
## _________________________________________________________________
## dense_1 (Dense)              (None, 2)                 1026      
## =================================================================
## Total params: 3,306,626
## Trainable params: 3,306,626
## Non-trainable params: 0
## _________________________________________________________________
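The parameter counts in the summary can be checked by hand. A convolutional layer with a k x k kernel, c_in input channels and c_out filters has (k*k*c_in + 1)*c_out parameters, and a dense layer has (n_in + 1)*n_out. The sketch below recomputes the table under the assumption of a 5 x 5 kernel in the first convolution and 3 x 3 kernels in the rest, which is the combination that reproduces the listed counts; the function names are ours.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus one bias per filter for a kernel x kernel convolution."""
    return (kernel * kernel * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a dense layer."""
    return (n_in + 1) * n_out


def trace_cnn(height=128, width=128, channels=3):
    """Return (layer, output shape, parameter count) rows for the model."""
    rows = []
    # (filters, kernel size) per block; kernel sizes are inferred values.
    for filters, kernel in [(32, 5), (64, 3), (96, 3), (96, 3)]:
        rows.append(("conv2d", (height, width, filters),
                     conv_params(kernel, channels, filters)))
        channels = filters
        height, width = height // 2, width // 2   # 2 x 2 max pooling
        rows.append(("max_pooling2d", (height, width, channels), 0))
    flat = height * width * channels
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense", (512,), dense_params(flat, 512)))
    rows.append(("dense_1", (2,), dense_params(512, 2)))
    return rows


if __name__ == "__main__":
    for name, shape, params in trace_cnn():
        print(f"{name:15s} {str(shape):18s} {params:>9,d}")
    print("Total params:", f"{sum(r[2] for r in trace_cnn()):,d}")
```

The total comes to 3,306,626, matching the summary. Almost all of it (3,146,240) sits in the first dense layer, because the flattened feature map has 8 x 8 x 96 = 6144 values.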

1.8 Model Compile

Here we compile the CNN model, which also checks it for errors, and initialize the Adam optimizer. We use cross-entropy as our loss function.
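To make the loss concrete, the sketch below computes softmax probabilities and the cross-entropy for a single two-class example in plain Python. It assumes one-hot encoded labels, which is the convention of the Keras `categorical_crossentropy` loss; whether this report used that exact variant is not stated, and the function names here are illustrative.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                       # subtract max for stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # eps guards against log(0) when a probability underflows to zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


if __name__ == "__main__":
    probs = softmax([0.0, 0.0])               # an undecided prediction
    loss = categorical_cross_entropy([1, 0], probs)
    print(probs, loss)                        # loss equals ln(2) here
```

A model that assigns equal probability to both classes has a loss of ln 2, about 0.693, so training losses well below that value indicate the network is learning to separate the classes.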

1.10 Plotting Variables of Training Sequence

## dict_keys(['loss', 'acc', 'val_loss', 'val_acc'])
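The four keys above pair naturally into training and validation curves. The sketch below groups them and plots one panel per metric; it assumes `history` is a dictionary of per-epoch lists such as the `history` attribute of the object returned by Keras `fit`, and the helper names `paired_curves` and `plot_history` are hypothetical.

```python
def paired_curves(history):
    """Map each training metric to its (train_key, validation_key) pair."""
    pairs = {}
    for key in history:
        val_key = "val_" + key
        if not key.startswith("val_") and val_key in history:
            pairs[key] = (key, val_key)
    return pairs


def plot_history(history):
    """Plot training and validation curves, one panel per metric."""
    import matplotlib.pyplot as plt  # imported lazily; needed only here

    pairs = paired_curves(history)
    if not pairs:
        raise ValueError("no train/validation metric pairs found")
    fig, axes = plt.subplots(1, len(pairs),
                             figsize=(5 * len(pairs), 4), squeeze=False)
    for ax, (name, (train_key, val_key)) in zip(axes[0], pairs.items()):
        epochs = range(1, len(history[train_key]) + 1)
        ax.plot(epochs, history[train_key], label="train")
        ax.plot(epochs, history[val_key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(name)
        ax.legend()
    return fig
```

A widening gap between the training and validation curves is the usual sign of overfitting, which is a real risk with only 200 images.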
