Flink with AI – how to use Flink with MLflow model in Jupyter Notebook

squirrel

In this article I will show how to process streams with Apache Flink and MLflow model

Before you will continue reading please watch short introduction:

Apache Flink allows for an efficient and scalable way of processing streams. It is a distributed processing engine which supports multiple sources like: Kafka, NiFi and many others (if we need custom, we can create them ourselves).

Apache Flink also provides the framework for defining streams operations in languages like: Java, Scala, Python and SQL.

To simplify the such definitions we can use Jupyter Notebook as a interface. Of course we can write in Python using PyFlink library but we can make it even easier using writing jupyter notebook extension (“magic words”).

Using Flink extension (magic.ipynb) we can simply use Flink SQL sql syntax directly in Jupyter Notebook.

To use the extesnions we need to load it:

%reload_ext flinkmagic

Then we need to initialize the Flink StreamEnvironment:

%flink_init_stream_env

Now we can use the SQL code for example:

FileSystem connector:

%%flink_execute_sql
CREATE TABLE MySinkTable (
    word varchar,
    cnt bigint) WITH (
        'connector.type' = 'filesystem',
        'format.type' = 'csv',
        'connector.path' = '/opt/flink/notebooks/data/word_count_output1')

MySQL connector:

%%flink_execute_sql
CREATE TABLE MySinkDbSmsTable (
    smstext varchar,
    smstype varchar) WITH (
        'connector.type' = 'jdbc',
        'connector.url' = 'jdbc:mysql://mysql:3306/test',
        'connector.table' = 'sms',
        'connector.driver' = 'com.mysql.jdbc.Driver',
        'connector.write.flush.interval' = '10',
        'connector.username' = 'root',
        'connector.password' = 'my-secret-pw')

Kafka connector:

%%flink_execute_sql
CREATE TABLE MySourceKafkaTable (word varchar) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'test',
    'connector.startup-mode' = 'latest-offset',
    'connector.properties.bootstrap.servers' = 'kafka:9092',
    'connector.properties.group.id' = 'test',
    'format.type' = 'csv'
        )

The magic keyword will automatically execute SQL in existing StreamingEnvironment.

Now we can apply the Machine Learning model. In plain Flink we can use UDF function defined in python but we will use MLflow model which wraps the ML frameworks (like PyTorch, Tensorflow, Scikit-learn etc.). Because MLflow expose homogeneous interface we can create another “jupyter magic” which will automatically load MLflow model as a Flink function.

%flink_mlflow "SPAM_CLASSIFIER" "/mlflow/mlruns/2/64a89b0a6b7346498316bfae4c298535/artifacts/model" "[DataTypes.STRING()]" "DataTypes.STRING()"

Now we can simply write Flink SQL query:

%%flink_sql_query
SELECT word as smstext, SPAM_CLASSIFIER(word) as smstype FROM MySourceKafkaTable

which in our case will fetch kafka events and classify it using MLflow spam classifier. The results will be displayed in the realtime in the Jupyter Notebook as a events DataFrame.

If we want we can simply use other python libraries (like matplotlib and others) to create graphical representation of the results eg. pie chart.

You can find the whole code including: Flink examples, extension and Dockerfiles here: https://github.com/qooba/flink-with-ai.

You can also use docker image: qooba/flink:dev to test and run notebooks inside. Please check the run.sh where you have all components (Kafka, MySQL, Jupyter with Flink, MLflow repository).

Animated Art with AI – face reeanactment in action

faces

In this article I will show how to use artificial intelligence to add motion to the images and photos.

Before you will continue reading please watch short introduction:

Face reenactment

To bring photos to life we can use the face reenactment algorithm designed to transfer the facial movements in the video to another image.

face reenactment diagram

In this project I have used github implementation: https://github.com/AliaksandrSiarohin/first-order-model. Where the extensive description of the neural network architecture can be found in this paper. The solution contains of two parts: motion module and generation module. The motion module at the first stage extracts the key points from the source and target image. In fact in the solution we assume that reference image which we can to the source and target image exists and at the first stage the transformations from reference image to source ([latex]T_{S \leftarrow R} (p_k)[/latex]) and target ([latex]T_{T \leftarrow R} (p_k)[/latex]) image is calculated respectively. Then the first order Taylor expansions [latex]\frac{d}{dp}T_{S \leftarrow R} (p)| {p=p_k}[/latex] and [latex]\frac{d}{dp}T_{T \leftarrow R} (p)| {p=p_k}[/latex] is used to calculate dense motion field. The generation module use calculated dense motion field and source image to generate new image that will resemble target image.

face reenactment diagram

The whole solution is packed into docker image thus we can simply reproduce the results using command:

docker run -it --rm --gpus all -v $(pwd)/torch_models:/root/.torch/models -v $(pwd)/checkpoints:/ai/checkpoints -v $(pwd)/test:/ai/test qooba/aifacereeanactment python3 ./prepare.py --source_image /ai/test/test.jpg --driving_video /ai/test/test.mp4 --output /ai/test/test_generated.mp4

NOTE: additional volumes (torch_models and checkpoints) are mount because during first run the trained neural networks are downloaded.

To reproduce the results we need to provide two files motion video and source image. In above example I put them into test directory and mount it into docker container (-v $(pwd)/test:/ai/test) to use them into it.

Below you have all command line options:

usage: prepare.py [-h] [--config CONFIG] [--checkpoint CHECKPOINT]
                  [--source_image SOURCE_IMAGE]
                  [--driving_video DRIVING_VIDEO] [--crop_image]
                  [--crop_image_padding CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...]]
                  [--crop_video] [--output OUTPUT] [--relative]
                  [--no-relative] [--adapt_scale] [--no-adapt_scale]
                  [--find_best_frame] [--best_frame BEST_FRAME] [--cpu]

first-order-model

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       path to config
  --checkpoint CHECKPOINT
                        path to checkpoint to restore
  --source_image SOURCE_IMAGE
                        source image
  --driving_video DRIVING_VIDEO
                        driving video
  --crop_image, -ci     autocrop image
  --crop_image_padding CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...], -cip CROP_IMAGE_PADDING [CROP_IMAGE_PADDING ...]
                        autocrop image paddings left, upper, right, lower
  --crop_video, -cv     autocrop video
  --output OUTPUT       output video
  --relative            use relative or absolute keypoint coordinates
  --no-relative         don't use relative or absolute keypoint coordinates
  --adapt_scale         adapt movement scale based on convex hull of keypoints
  --no-adapt_scale      no adapt movement scale based on convex hull of
                        keypoints
  --find_best_frame     Generate from the frame that is the most alligned with
                        source. (Only for faces, requires face_aligment lib)
  --best_frame BEST_FRAME
                        Set frame to start from.
  --cpu                 cpu mode.

New Face with AI

faces

In this article I will show how to use artificial intelligence to generate human faces.

Before you will continue reading please watch short introduction:

Generative adversarial network

To generate realistic human faces, we can use neural networks with GAN (Generative adversarial network) architecture.

neural network architecture

The GaN network consists of two parts of the Generator whose task is to generate the image from random input and a discriminator that checks if the generated image is realistic.

training progress

During training, the networks compete with each other, the generator tries to generate better and better images and thereby deceive the Discriminator. On the other hand, the Discriminator learns to distinguish between real and generated photos.

To train the discriminator, we use both real photos and those generated by the generator.

Finally, we can achieve the following results using DCGAN network. As you can see some faces look realistic while some are distorted, additionally the network can only generate low resolution images.

training results

We can achieve much better results using the StyleGaN (arxiv article) network, which, among other things, differs in that the next layers of the network are progressively added during training.

I generated the images using pretrained networks and the effect is really amazing.

results stylegan

Unblur low quality face images with AI

clay

In this article I will show how to improve the quality of blurred face images using artificial intelligence. For this purpose I will use neural networks and FastAI library (ver. 1)

The project code is available on my github: https://github.com/qooba/aiunblur You can also use ready docker image: https://hub.docker.com/repository/docker/qooba/aiunblur

Before you will continue reading please watch short introduction:

I have based o lot on the fastai course thus I definitely recommend to go through it.

Data

To train neural network how to rebuild the face images we need to provide the faces dataset which will show how low quality and blurred images should be reconstructed. Thus we need pairs of low and high quality images.

To prepare the data set we can use available fases dataset eg. FFHQ, Tufts Face Database, CelebA

We will treat the original images as a high resolution data and rescale them to prepare low resolution input:

import fastai
from fastai.vision import *
from fastai.callbacks import *
from fastai.utils.mem import *
from torchvision.models import vgg16_bn
from pathlib import Path

path = Path('/opt/notebooks/faces')
path_hr = path/'high_resolution'
path_lr = path/'small-96'

il = ImageList.from_folder(path_hr)

def resize_one(fn, i, path, size):
    dest = path/fn.relative_to(path_hr)
    dest.parent.mkdir(parents=True, exist_ok=True)
    img = PIL.Image.open(fn)
    targ_sz = resize_to(img, size, use_min=True)
    img = img.resize(targ_sz, resample=PIL.Image.BILINEAR).convert('RGB')
    img.save(dest, quality=60)

sets = [(path_lr, 96)]
for p,size in sets:
    if not p.exists(): 
        print(f"resizing to {size} into {p}")
        parallel(partial(resize_one, path=p, size=size), il.items)

Now we can create data bunch for training:

bs,size=32,128
arch = models.resnet34
src = ImageImageList.from_folder(path_lr).split_by_rand_pct(0.1, seed=42)

def get_data(bs,size):
    data = (src.label_from_func(lambda x: path_hr/x.name)
           .transform(get_transforms(max_zoom=2.), size=size, tfm_y=True)
           .databunch(bs=bs,num_workers=0).normalize(imagenet_stats, do_y=True))

    data.c = 3
    return data

data = get_data(bs,size)

Training

In this solution we will use a neural network with UNET architecture.

neural network architecture

The UNET neural network contains two parts Encoder and Decoder which are used to reconstruct the face image. During the first stage Encoder fetch the input, extracts and aggregates the image features. At each stage the features maps are donwsampled. Then Decoder uses extracted features and tries to rebuild the image upsampling it at each decoding stage. Finally we get regenerated images.

Additionally we need to define the Loss Function which will tell the model if the image was rebuilt correctly and allow to train the model.

To do this we will use additional neural network VGG-16. We will put Generated image and Original image (which is our target) to the network input. Then will compare the features extracted for both images at selected layers and according to this calculated the loss.

Finally we will use Adam optmizer to minimize the loss and achieve better result.

def gram_matrix(x):
    n,c,h,w = x.size()
    x = x.view(n, c, -1)
    return (x @ x.transpose(1,2))/(c*h*w)

base_loss = F.l1_loss

vgg_m = vgg16_bn(True).features.cuda().eval()
requires_grad(vgg_m, False)

blocks = [i-1 for i,o in enumerate(children(vgg_m)) if isinstance(o,nn.MaxPool2d)]

class FeatureLoss(nn.Module):
    def __init__(self, m_feat, layer_ids, layer_wgts):
        super().__init__()
        self.m_feat = m_feat
        self.loss_features = [self.m_feat[i] for i in layer_ids]
        self.hooks = hook_outputs(self.loss_features, detach=False)
        self.wgts = layer_wgts
        self.metric_names = ['pixel',] + [f'feat_{i}' for i in range(len(layer_ids))
              ] + [f'gram_{i}' for i in range(len(layer_ids))]

    def make_features(self, x, clone=False):
        self.m_feat(x)
        return [(o.clone() if clone else o) for o in self.hooks.stored]
    
    def forward(self, input, target):
        out_feat = self.make_features(target, clone=True)
        in_feat = self.make_features(input)
        self.feat_losses = [base_loss(input,target)]
        self.feat_losses += [base_loss(f_in, f_out)*w
                             for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
        self.feat_losses += [base_loss(gram_matrix(f_in), gram_matrix(f_out))*w**2 * 5e3
                             for f_in, f_out, w in zip(in_feat, out_feat, self.wgts)]
        self.metrics = dict(zip(self.metric_names, self.feat_losses))
        return sum(self.feat_losses)
    
    def __del__(self): self.hooks.remove()

feat_loss = FeatureLoss(vgg_m, blocks[2:5], [5,15,2])

learn = unet_learner(data, arch, wd=wd, loss_func=feat_loss, callback_fns=LossMetrics,
                     blur=True, norm_type=NormType.Weight)

Results

After training we can use the model to regenerate the images:

results

Application

Finally we can export the model and create the drag and drop application which fix the face images in web application.

results

The whole solution is packed into docker images thus you can simply start it using commands:

# with GPU
docker run -d --gpus all --rm -p 8000:8000 --name aiunblur qooba/aiunblur

# without GPU
docker run -d --rm -p 8000:8000 --name aiunblur qooba/aiunblur

To use GPU additional nvidia drivers (included in the NVIDIA CUDA Toolkit) are needed.

Fly AI with Tello drone

balloons

The popularity of drones and the area of their application is becoming greater each year.

In this article I will show how to programmatically control Tello Ryze drone, capture camera video and detect objects using Tensorflow. I have packed the whole solution into docker images (the backend and Web App UI are in separate images) thus you can simply run it.

The project code is available on my github https://github.com/qooba/aidrone You can also use ready docker image: https://hub.docker.com/repository/docker/qooba/aidrone

Before you will continue reading please watch short introduction: https://youtu.be/g8oZ8ltRArY

Architecture

architecture diagram

The application will use two network interfaces. The first will be used by the python backend to connect the the Tello wifi to send the commands and capture video stream. In the backend layer I have used the DJITelloPy library which covers all required tello move commands and video stream capture. To efficiently show the video stream in the browser I have used the WebRTC protocol and aiortc library. Finally I have used the Tensorflow 2.0 object detection with pretrained SSD ResNet50 model.

The second network interface will be used to expose the Web Vue application. I have used nginx to serve the frontend application

Application

drone controls

Using Web interface you can control the Tello movement where you can:

  • start video stream
  • stop video stream
  • takeoff - which starts Tello flight
  • land
  • up
  • down
  • rotate left
  • rotate right
  • forward
  • backward
  • left
  • right

In addition using draw detection switch you can turn on/off the detection boxes on the captured video stream (however this introduces a delay in the video thus it is turned off by default). Additionally I send the list of detected classes through web sockets which are also displayed.

drone detection

As mentioned before I have used the pretrained model thus It is good idea to train your own model to get better results for narrower and more specific class of objects.

Finally the whole solution is packed into docker images thus you can simply start it using commands:

docker network create -d bridge app_default
docker run --name tello --network app_default --gpus all -d --rm -p 8890:8890 -p 8080:8080 -p 8888:8888 -p 11111:11111/udp  qooba/aidrone /bin/bash -c "python3 drone.py"
docker run -d --rm --network app_default --name nginx -p 80:80 -p 443:443 qooba/aidrone:front

To use GPU additional nvidia drivers (included in the NVIDIA CUDA Toolkit) are needed.