
I recently started the fastai deep learning course.

I had previously done a course on neural networks and deep learning, and I was really intimidated by the mathematical approach it took. I couldn’t follow a lot of the proofs, coding things up felt completely disconnected from the theory, and after 4 weeks of video lectures and assignments I still couldn’t build a decent pet classifier by myself.

And that’s where I have to thank fastai, since its top-down approach really helped me dive right in. I recommend the course to everyone, regardless of your background. Your background will only prove useful for the awesome projects you’ll be able to build.

I’m only two weeks into the course and I’m already on my 4th or 5th project. To test my skills, I decided to take part in an online hackathon, and with just a few lines of code I was in the top 2% of the competition, as shown below. As of writing this article, my rank on the leaderboard is 37 out of 2,276 participants. So how did I do it? Let’s find out.



Hackathon Page


Hackathon Rankings

The Jupyter notebook for this solution can be found on Kaggle.

Problem description and class mappings



Hackathon problem statement

My solution:

Import the library

from fastai import *
from fastai.vision import *

We start by importing the fastai library. Once it is imported, we load our dataset. There are various ways of loading a dataset with fastai (from a folder, from a URL, etc.). Here I’m using the data block API instead of the factory methods because I find it more readable and intuitive.

df = pd.read_csv('../input/scene_classification/scene_classification/train.csv')
df.head()
Read data from CSV

We have a CSV file with the names of the images and their labels, and we will use it to create our data bunch. Sometimes we might use folder names as the labels instead: for example, there would be two folders called cats and dogs, with the cat images in the cats folder and so on, as in the sketch below.
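As a rough sketch of that alternative (assuming a hypothetical path with train/cats and train/dogs subfolders; this is not the approach used in this solution), folder-based labelling with the same data block API would look like:

tfms = get_transforms(do_flip=False)
data = (ImageItemList.from_folder(path/'train')   # collect the images from the folder tree
        .random_split_by_pct(0.2)                 # hold out 20% of them for validation
        .label_from_folder()                      # use the parent folder name (cats or dogs) as the label
        .transform(tfms, size=128)                # same augmentations and resizing as before
        .databunch())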

tfms = get_transforms(do_flip=False)
data = (ImageItemList.from_csv(path, csv_name='../train.csv')  #Where to find the data? -> image names listed in the CSV, relative to path
        .random_split_by_pct()              #How to split in train/valid? -> randomly, with the default 20% held out for validation
        .label_from_df()            #How to label? -> from the label column of the dataframe/CSV
        .add_test_folder(test_folder = '../test')              #Optionally add a test set (here from the test folder)
        .transform(tfms, size=128)       #Data augmentation? -> use tfms and resize the images to 128
        .databunch())                    #Finally, wrap everything in a data bunch

The way to read this code would be:

  1. There is a list of images in the given path, with their corresponding labels in a CSV file.
  2. Split these images into a training and a validation set randomly.
  3. Label the data using the CSV file shown above (df = data frame).
  4. Add a test set. (This step is optional.)
  5. Set the size of the images and apply the transforms.
  6. Finally, create a data bunch. (This can be thought of as the format in which fastai stores the images.)

Viewing our data

data.show_batch(rows=3, figsize=(8,10))
View data

As we can see, our data includes various scenes, with their labels shown above them. Now we need to train our model to recognize these scenes. This problem is known as image classification.

Training the model

learn = create_cnn(data, models.resnet34, metrics=[error_rate, accuracy], model_dir="/tmp/model/")
learn.fit_one_cycle(4)

We download a pre-trained model called resnet34, because starting from a pre-trained model is generally a much better idea than training from scratch. The model has already been trained to identify thousands of categories of images, and its initial weights give us a head start. We take this model and train it for a number of cycles. We might then want to find a good learning rate, check various metrics, and train some more, and we can interpret the results by plotting a confusion matrix. Fastai also has a great method called .most_confused that helps us find out what our model is most confused about.
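Here is a minimal sketch of how that interpretation step might look; the learning-rate finder call and the extra fine-tuning cycles are a typical workflow rather than the exact code used for this submission:

learn.lr_find()                                   # sweep learning rates to find a sensible range
learn.recorder.plot()                             # plot loss against learning rate
learn.unfreeze()                                  # let the earlier layers train as well
learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-4))  # fine-tune for a couple more cycles

interp = ClassificationInterpretation.from_learner(learn)  # build the interpretation object
interp.plot_confusion_matrix(figsize=(8, 8))               # confusion matrix on the validation set

With interp in hand, we can ask which pairs of classes were mixed up most often: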

interp.most_confused(min_val=2)

.most_confused matrix
This shows that our model confused 2 (glacier) with 3 (mountain) 87 times, and 0 (buildings) with 5 (street) 36 times, which makes sense because these categories are easy to mix up.

Once we think we’ve trained enough without overfitting, we make predictions on the test set and submit our results, as in the sketch below.
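A minimal sketch of that final step is shown here; the submission column names (image_name, label) and the file name are assumptions about the hackathon’s format rather than details from the competition page:

preds, _ = learn.get_preds(ds_type=DatasetType.Test)    # class probabilities for every test image
labels = preds.argmax(dim=1)                            # most likely class index per image

# Hypothetical submission format: one row per test image with its predicted class
sub = pd.DataFrame({'image_name': [item.name for item in learn.data.test_ds.items],
                    'label': labels.numpy()})
sub.to_csv('submission.csv', index=False)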

Some tips for getting started:

If you don’t come from a programming background, getting things running can be a little difficult, but keep trying, and get help from a friend if required. Also, when your code keeps giving errors it’s very easy to blame it and give up. But if you cannot create a data bunch with one method, try another one. Try reading the docs and the source code to make sense of things. The key is to remember that you will not know everything at once, but you will know a lot if you just keep trying and keep pushing yourself.
