Deep Learning to detect cracks in walls.
Classification is one of the most common uses of deep learning: given an input, the model decides which of a set of classes it belongs to. In this article we will build a binary classifier that decides whether an image shows a cracked wall or a normal wall with no crack. The data lives in a folder named data with two subfolders, crack and non-crack, one per class. Let's start coding.
pip install tensorflow
pip install opencv-python
pip install matplotlib
The TensorFlow library will power the deep learning pipeline, OpenCV will be used to read and clean the images, and matplotlib will be used to visualize them.
!pip list
pip list shows whether the libraries we installed are available. TensorFlow, OpenCV, and matplotlib should all appear in the list.
import tensorflow as ts
import os
gpus = ts.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    ts.config.experimental.set_memory_growth(gpu, True)
The lines above tell TensorFlow to grow its GPU memory usage as needed instead of allocating all of it up front, which keeps the GPU from running out of memory.
import cv2
import imghdr
data_dir = '/content/sample_data/Data'
data_dir points to the folder that holds the two class subfolders.
image_exts = ['png','jpeg','jpg','bmp']
While working with the images, we only keep files with the extensions png, jpeg, jpg, or bmp. Any file in another format will be discarded.
for image_class in os.listdir(data_dir):
    for image in os.listdir(os.path.join(data_dir, image_class)):
        image_path = os.path.join(data_dir, image_class, image)
        try:
            img = cv2.imread(image_path)
            tip = imghdr.what(image_path)
            if tip not in image_exts:
                print('Image not in ext list {}'.format(image_path))
                os.remove(image_path)
        except Exception as e:
            print('Issue with image {}'.format(image_path))
The loop above removes any file that is not an image in PNG, JPG, BMP, or JPEG format.
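Note that cv2.imread does not raise an exception for files it cannot read; it simply returns None. If you want a stricter cleanup pass, the following sketch (assuming the same data_dir/class/image layout as above) also removes files OpenCV cannot decode:
for image_class in os.listdir(data_dir):
    for image in os.listdir(os.path.join(data_dir, image_class)):
        image_path = os.path.join(data_dir, image_class, image)
        if cv2.imread(image_path) is None:
            print('Unreadable image, removing {}'.format(image_path))
            os.remove(image_path)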
import numpy as np
from matplotlib import pyplot as plt
data = ts.keras.utils.image_dataset_from_directory(data_dir)
image_dataset_from_directory is a TensorFlow utility that builds a tf.data.Dataset from the image files in a directory, labelling each image automatically based on the subdirectory it sits in (here crack and non-crack). The resulting data object is a tf.data.Dataset, which makes it easy to load and preprocess the images in batches for training.
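The call above relies on the function's defaults. If you prefer to spell them out, and to check which labels were inferred, the following sketch makes the default image_size and batch_size explicit and prints the class names:
data = ts.keras.utils.image_dataset_from_directory(data_dir, image_size=(256, 256), batch_size=32)
print(data.class_names)  # inferred from the subfolder names, e.g. ['crack', 'non-crack']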
data_iterator = data.as_numpy_iterator()
batch = data_iterator.next()
The as_numpy_iterator method converts a tf.data.Dataset into an iterator that yields NumPy arrays, and next() pulls one batch at a time so we never hold the whole dataset in memory. Each batch is a tuple: the first element holds the images and the second holds the corresponding labels, which image_dataset_from_directory inferred from the subdirectory names.
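To see what the iterator actually returns, you can print the shapes (the numbers shown assume the defaults of 32 images per batch and a 256x256 image size):
print(batch[0].shape)  # the images, e.g. (32, 256, 256, 3)
print(batch[1])        # the labels, an array of 0s and 1s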
fig, ax = plt.subplots(ncols=4, figsize=(20,20))
for idx, img in enumerate(batch[0][:4]):
    ax[idx].imshow(img.astype(int))
    ax[idx].title.set_text(batch[1][idx])
This code shows the first four images of the batch in a row of subplots, with each image's label as its title. plt.subplots creates a figure of size (20, 20) with four columns: fig is the whole figure and ax is the array of axes. Because the dataset is shuffled by default, fetching a new batch with data_iterator.next() and rerunning this cell shows a different set of four images.
data = data.map(lambda x,y: (x/255, y))
data.as_numpy_iterator().next()
This snippet normalizes the pixel values by dividing them by 255, a common preprocessing step for image data, and then fetches the first batch of the transformed dataset to confirm the change. Dividing by 255 works because standard 8-bit grayscale and RGB images store pixel values in the range 0 to 255, so the scaled values fall between 0 and 1.
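A quick sanity check on the result of the map (a small sketch): the pixel values of a fetched batch should now fall inside [0, 1].
scaled_batch = data.as_numpy_iterator().next()
print(scaled_batch[0].min(), scaled_batch[0].max())  # should print values between 0.0 and 1.0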
train_size = int(len(data)*.7)
val_size = int(len(data)*.2)
test_size = int(len(data)*.1)
train_size
We split the data into training, validation, and test sets. The 70/20/10 split used here is a common convention in machine learning; the exact percentages are often adjusted to the problem and the amount of data available.
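Keep in mind that len(data) counts batches (by default 32 images each), not individual images, so these sizes are also measured in batches, and because of the int() truncation a few batches at the end may be left unused. A quick check:
print(len(data), train_size, val_size, test_size)  # total number of batches and the 70/20/10 split in batches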
train = data.take(train_size)
val = data.skip(train_size).take(val_size)
test = data.skip(train_size+val_size).take(test_size)
The take method selects the specified number of batches from the beginning of the dataset, so train holds the first train_size batches. skip(train_size) skips past the training batches and take(val_size) grabs the next val_size batches for validation; the test set is taken after skipping both. With the data split, we can build the deep learning model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
model = Sequential()
The Sequential model is a linear stack of layers, and you can add layers to it sequentially.
model.add(Conv2D(16, (3,3), 1, activation='relu', input_shape=(256,256,3)))
model.add(MaxPooling2D())
Here we add two layers. The first is a convolution layer with 16 filters of size (3,3) and a stride of 1 (the filter moves one pixel at a time), with ReLU activation, applied to input images of size (256, 256) with 3 colour channels. With the default 'valid' padding, the (3,3) kernel shrinks the 256x256 input to 254x254. The second layer is a max-pooling layer: a downsampling operation that reduces the spatial dimensions while retaining the strongest activations. By default it uses a pool size of (2, 2), halving the feature map to 127x127.
model.add(Conv2D(32, (3,3), 1, activation='relu'))
model.add(MaxPooling2D())
Another convolution layer, this time with 32 filters of size (3,3) and a stride of 1, followed by another max-pooling layer to reduce the spatial dimensions further.
model.add(Conv2D(16, (3,3), 1, activation='relu'))
model.add(MaxPooling2D())
A third convolution layer with 16 filters of size (3,3) and a stride of 1, and one more max-pooling layer to shrink its output.
model.add(Flatten())
This layer is used to flatten the output from the previous layer into a one-dimensional array. It prepares the data for the fully connected layers.
model.add(Dense(256, activation='relu'))
A fully connected layer with 256 units and ReLU activation is added.
model.add(Dense(1, activation='sigmoid'))
The final layer is a fully connected layer with a single unit and a sigmoid activation. Sigmoid is the usual choice for binary classification, because its output can be read as the probability of one of the two classes.
model.compile('adam', loss=ts.losses.BinaryCrossentropy(), metrics=['accuracy'])
model.compile configures the model for training. 'adam' selects the Adam optimizer, a popular gradient-based algorithm that adapts the learning rate during training; the optimizer's job is to adjust the network's weights and biases so that the loss decreases. loss=ts.losses.BinaryCrossentropy() sets the loss function to binary cross-entropy, the standard choice when the model predicts between two classes, and metrics=['accuracy'] tells Keras to report accuracy during training.
model.summary()
model.summary() prints a concise overview of the network: the layer types, their output shapes, and the number of trainable parameters per layer (roughly 3.7 million in total here, most of them in the first Dense layer).
logdir='/content/sample_data/logs'
tensorboard_callback = ts.keras.callbacks.TensorBoard(log_dir=logdir)
This line creates an instance of the TensorBoard callback, and tensorboard_callback is a variable that holds this instance. TensorBoard is useful for visualizing various aspects of the model training process, including metrics, loss curves, and histograms of weights.
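Once training has written logs into logdir, you can inspect them inline in Colab or Jupyter with the TensorBoard magics (assuming a notebook environment):
%load_ext tensorboard
%tensorboard --logdir /content/sample_data/logs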
hist = model.fit(train, epochs=20, validation_data=val, callbacks=[tensorboard_callback])
This line calls the fit method to train the model on the training set (train) for 20 epochs, evaluating it against the validation set (val) at the end of each epoch and logging the run through the TensorBoard callback.
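The returned hist object records the per-epoch metrics in hist.history, which is what the plots below read from. With accuracy as the compiled metric and a validation set supplied, the dictionary should contain these keys:
print(hist.history.keys())  # dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])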
fig = plt.figure()
plt.plot(hist.history['loss'], color='teal', label='loss')
plt.plot(hist.history['val_loss'], color='orange', label='val_loss')
fig.suptitle('Loss', fontsize=20)
plt.legend(loc="upper left")
plt.show()
The plot provides a visual representation of how the training and validation loss change over the epochs: the x-axis shows the epoch number and the y-axis the loss values. The training loss curve shows how well the model fits the training data, while the validation loss curve gives insight into how well it generalizes to unseen data; if the validation loss starts rising while the training loss keeps falling, the model is beginning to overfit.
fig = plt.figure()
plt.plot(hist.history['accuracy'], color='teal', label='accuracy')
plt.plot(hist.history['val_accuracy'], color='orange', label='val_accuracy')
fig.suptitle('Accuracy', fontsize=20)
plt.legend(loc="upper left")
plt.show()
from tensorflow.keras.metrics import Precision, Recall, BinaryAccuracy
pre = Precision()
re = Recall()
acc = BinaryAccuracy()
for batch in test.as_numpy_iterator():
    X, y = batch
    yhat = model.predict(X)
    pre.update_state(y, yhat)
    re.update_state(y, yhat)
    acc.update_state(y, yhat)
This loop runs the trained model over every batch of the test set and accumulates precision, recall, and binary accuracy across all of them. Since image_dataset_from_directory assigns labels alphabetically, class 1 corresponds to the non-crack folder, so precision and recall here are measured with respect to that class.
print(f'precision is {pre.result().numpy()}')
print(f'recall is {re.result().numpy()}')
print(f'accuracy is {acc.result().numpy()}')
We now want to check whether a single new image contains a crack. For that, we will test the model on one image.
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/content/sample_data/crack.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV reads images as BGR; convert to RGB
plt.imshow(img)
plt.show()
This reads the test image and displays it. OpenCV loads images in BGR channel order, so we convert to RGB before displaying it and before feeding it to the model, which was trained on RGB images.
resize = ts.image.resize(img, (256,256))
plt.imshow(resize.numpy().astype(int))
plt.show()
This resizes the image to (256, 256), the input size the model expects.
yhat = model.predict(np.expand_dims(resize/255, 0))
yhat
np.expand_dims adds a batch dimension, since the model expects a batch of images rather than a single one, and dividing by 255 applies the same scaling used during training. The prediction yhat is a probability between 0 and 1, not a hard 0/1 label.
if yhat > 0.5:
    print('Predicted class is Crack free')
else:
    print('Predicted class is Crack')
Any probability above 0.5 is treated as class 1 and anything below as class 0. Because image_dataset_from_directory assigned labels alphabetically, class 0 is crack and class 1 is non-crack, so a prediction above 0.5 means the wall is crack free.
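If you want to test several images, it can help to wrap these steps in a small helper. The function below is a sketch (the name predict_crack is made up here) that simply repeats what we did above: read, convert to RGB, resize, scale, predict, and apply the 0.5 threshold.
def predict_crack(image_path):
    # Hypothetical helper: classify a single image file as 'Crack' or 'Crack free'.
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # match the RGB images used during training
    resized = ts.image.resize(img, (256, 256))
    yhat = model.predict(np.expand_dims(resized / 255, 0))
    return 'Crack free' if float(yhat[0][0]) > 0.5 else 'Crack'

print(predict_crack('/content/sample_data/crack.jpg'))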
from tensorflow.keras.models import load_model
os.makedirs('/content/sample_data/Models', exist_ok=True)  # create the folder if it does not exist yet
model.save(os.path.join('/content/sample_data/Models','classifier.h5'))
new_model = load_model('/content/sample_data/Models/classifier.h5')
new_model.predict(np.expand_dims(resize/255, 0))
Once we have verified the model, we save it to disk, load it back with load_model, and run a prediction with the reloaded model to confirm it behaves the same. The .h5 extension saves in the legacy HDF5 format; newer Keras versions also offer the native .keras format.
I hope this article helps clarify how to build a simple crack classifier with deep learning.
Credits: https://www.youtube.com/watch?v=jztwpsIzEGc
The code can be found at https://github.com/adityakumar529/Crack-detection/tree/main.