Toy example: dog’s breed detection

A toy example to identify Dog’s breed, “Dogs breed detector” [*], images for training come from dog dataset.

Neural Network type CNN
Deep Learning Framework(s) Tensorflow, Keras
Programming language Python
GPU version yes
CPU version yes
DEEPaaS API yes
DEEP DS template yes
DEEP-Nextcloud access yes

Keywords: image classification, CNN, transfer learning, Tensorflow

DEEP-OC Dockerfile: https://github.com/indigo-dc/DEEP-OC-dogs_breed_det

App source code: https://github.com/indigo-dc/dogs_breed_det

Pre-trained weights: https://nc.deep-hybrid-datacloud.eu/s/D7DLWcDsRoQmRMN

Description

The project applies Transfer learning for dog’s breed identification, implemented with Tensorflow and Keras: From a pre-trained model (VGG16 | VGG19 | Resnet50 | InceptionV3) the last layer is removed, then a new Fully Connected (FC) classification layer is added, which is trained. All images first pass through the pre-trained network and converted into the tensor with the shape of the ‘before-last’ layer of the pre-trained network, into so-called ‘bottleneck_features’. These bottleneck_features are used then as input for the FC classification network.

Local Workflow

The described workflow supposes usage of downloaded from DEEP Open Catalog Docker images, i.e. you need either docker or udocker tool.

1. Workflow intro

a. DEEPaaS API uses port 5000 for access, one therefore has to map the container and host ports, see Examples.

  1. Following two directories inside the docker container are used for input and output:
Directory inside container Description
/srv/dogs_breed_det/data raw data, ready-for-training data, etc
/srv/dogs_breed_det/models for the trained weights

If you want to perform full training, then you need to mount your data into /srv/dogs_breed_det/data.

If you want to keep trained weights in the persistant place, then you have to mount /srv/dogs_breed_det/models to a persistent volume. You can either use your local directories or connect your remote storage, see Examples.

2. Data input

Original dataset consists of dog’s images for 133 breeds. The images are compressed in one dogImages.zip file. The archive contains three directories for train, test, and valid datasets:

test/
    001.Affenpinscher/
    002.Afghan_hound/
    ...
train/
    001.Affenpinscher/
    002.Afghan_hound/
    ...
valid/
    001.Affenpinscher/
    002.Afghan_hound/
    ...

These directories are automatically de-archived in /srv/dogs_breed_det/data/dogImages/.

Training labels are also created automatically based on the directory names, truncating leading numbers, e.g. ‘002.’.

The minimum requirement for training is to make dogImages.zip available in /srv/dogs_breed_det/data/raw/ directory.

If you want to use your own dataset then it has to follow similar structure.

If local directories are mounted into the container, the following directory structure is suggested:

data/
    dogImages/
    raw/
models/
Local dir to mount Corresponding place in the container
LOCAL_DIR/data /srv/dogs_breed_det/data
LOCAL_DIR/models /srv/dogs_breed_det/models

In the ‘local’ case, you place dogImages.zip in LOCAL_DIR/data/raw, which makes it available in /srv/dogs_breed_det/data/raw.

If you connect a remote storage, the following directories have to be created there:

/Datasets/dogs_breed/data
/Datasets/dogs_breed/data/dogImages
/Datasets/dogs_breed/data/raw
/Datasets/dogs_breed/models

In the ‘remote’ case, you place dogImages.zip in /Datasets/dogs_breed/data/raw, which makes it available in /srv/dogs_breed_det/data/raw.

3. Accessing application

  • In a minimum case to classify images with already trained Resnet50 model, start the container as:

    docker run -ti -p 5000:5000 deephdc/deep-oc-dogs_breed_det:cpu deepaas-run --listen-ip=0.0.0.0
    
  • In more advanced cases (see Examples) you may need to mount various directories or pass environment settings.

  • Direct your web browser to http://127.0.0.1:5000

4. Test the classifier

  • Go to /models/{model_name}/predict , click “Try it out” button

  • Choose an image file for dog’s breed identification (N.B. “URL to retrieve data” is not (yet) implemented)

  • Type model_name, one of the Dogs_Resnet50, Dogs_InceptionV3, Dogs_VGG16, Dogs_VGG19

  • The equivalent API call is:

    curl -X POST "http://127.0.0.1:5000/models/Dogs_Resnet50/predict" -H "accept: application/json" -H "Content-Type: multipart/form-data" -F "[email protected];type=image/jpeg"
    

Note

By default only weigths for Dogs_Resnet50 are available (automatically downloaded from the shared link, see above “Pre-trained weights” URL), all other models have to be trained first!

5. Train the classifier

  • Connect your data storage with the corresponding directory inside the container (see “Data input” above and Examples below)

  • Go to /models/{model_name}/train , click “Try it out” button

  • Type model_name, one of the Dogs_Resnet50, Dogs_InceptionV3, Dogs_VGG16, Dogs_VGG19

  • Execute training

  • The equivalent API call is:

    curl -X PUT "http://127.0.0.1:5000/models/Dogs_Resnet50/train" -H "accept: application/json"
    

DEEP Pilot infrastructure submission

Please, refer to Quickstart Guide, section “Run model on DEEP Pilot infrastructure”, on what is required to start the application on DEEP Pilot infrastructure.

Examples

Mount local host directories

Example 1 (GPU, default):

docker run -ti -p 5000:5000 -v ~/data:/srv/dogs_breed_det/data \
-v ~/models:/srv/dogs_breed_det/models \
deephdc/deep-oc-dogs_breed_det deepaas-run --listen-ip=0.0.0.0

Example 2 (CPU):

docker run -ti -p 5000:5000 -v ~/data:/srv/dogs_breed_det/data \
-v ~/models:/srv/dogs_breed_det/models \
deephdc/deep-oc-dogs_breed_det:cpu deepaas-run --listen-ip=0.0.0.0

Connecting remote storage by using rclone.conf from your host

rclone tool allows to connect to a plenty of remote storages. The tool is already installed in the Docker image and expects your data/ and models/ sub-directories to be under deepnc:/Datasets/dogs_breed/. If no data found in your container, rclone attempts to connect to deepnc:/ and download necessary data from there.

If you are familiar with the rclone tool, you probably have rclone.conf file on your host. You can rename one of the pre-configured remote storages to deepnc, then mount host directory with your rclone.conf file into the container:

Example 3: using in the container rclone.conf from your host

docker run -ti -p 5000:5000 -v $HOSTDIR_WITH_RCLONE_CONF:/srv/rclone \
-e RCLONE_CONFIG=/srv/rclone/rclone.conf \
deephdc/deep-oc-dogs_breed_det:cpu deepaas-run --listen-ip=0.0.0.0

dogImages.zip file is expected to be in /Datasets/dogs_breed/data/raw

Example 4: rclone.conf with DEEP-Nextcloud configured as deepnc remote storage:

[deepnc]
type = webdav
url = https://nc.deep-hybrid-datacloud.eu/remote.php/webdav/
vendor = nextcloud
user = DEEP-IAM-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
pass = YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

Example 5: rclone.conf with Google Drive configured as deepnc remote storage:

[deepnc]
type = drive
scope = drive
token = {"access_token":"ya29.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX","token_type":"Bearer","refresh_token":"1/-XXXXXXXXXXXXXXXXXXXX","expiry":"2019-01-14T20:26:13.21767343Z"}

Note

Check rclone documentation on how to configure different types of remote storage.

Connecting remote storage by passing rclone configuration as environment settings

It is also possible to pass necessary rclone configuration parameters as environment settings during instantiation of the container, best is to create a runnable bash script:

Example 6: connecting DEEP-Nextcloud remote storage

#!/bin/bash

rclone_conf="/srv/.rclone.conf"
rclone_url=https://nc.deep-hybrid-datacloud.eu/remote.php/webdav/
rclone_vendor=nextcloud
rclone_user=DEEP-IAM-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
rclone_pass=YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

docker run -ti -p 5000:5000 -e RCLONE_CONFIG=$rclone_conf \
   -e RCLONE_CONFIG_DEEPNC_TYPE="webdav" \
   -e RCLONE_CONFIG_DEEPNC_VENDOR="nextcloud" \
   -e RCLONE_CONFIG_DEEPNC_URL=$rclone_url \
   -e RCLONE_CONFIG_DEEPNC_USER=$rclone_user \
   -e RCLONE_CONFIG_DEEPNC_PASS=$rclone_pass \
   deephdc/deep-oc-dogs_breed_det:cpu deepaas-run --listen-ip=0.0.0.0
[*]Dogs breed detector is originally forked from udacity/dogs-project