Bored with classical computers? – Quantum AI with OpenFermion

In this article I will show how we can prepare and perform calculations on quantum computers using OpenFermion, Cirq and PySCF.

Before you continue reading, please watch this short introduction:

Currently, there are many supercomputing centers where we can run complicated simulations. However, there are still problems beyond the capabilities of classical computers, and these can be addressed by quantum computers.

Quantum chemistry and materials science problems, which are governed by the laws of quantum mechanics, can be mapped onto quantum computers and encoded into qubits.

OpenFermion is a library that helps to perform such calculations on a quantum computer.

Additionally, we will use the PySCF package to perform the initial structure optimization (if you are interested in the PySCF package, I have shared an example DFT-based band structure calculation of a single-layer graphene structure: pyscf_graphene.ipynb).

In our example we will investigate the [latex]H_2[/latex] molecule for simplicity. We will use the PySCF package to find the optimal bond length of the molecule.

Thanks to the OpenFermion-PySCF plugin, we can smoothly use the initial molecule state obtained from PySCF in the OpenFermion library (openfermionpyscf_h2.ipynb).

from openfermion.chem import MolecularData
from openfermionpyscf import run_pyscf

# Helper returning the H2 geometry for a given bond length (in angstroms)
def create_molecule(bond_length):
    return [('H', (0., 0., 0.)), ('H', (0., 0., bond_length))]

bond_length = 0.74  # e.g. the optimal bond length found with PySCF
geometry = create_molecule(bond_length)
basis = 'sto-3g'
multiplicity = 1

run_scf = 1
run_mp2 = 1
run_cisd = 0
run_ccsd = 0
run_fci = 1

molecule = MolecularData(geometry, basis, multiplicity)
 
# Run pyscf.
molecule = run_pyscf(molecule,
                     run_scf=run_scf,
                     run_mp2=run_mp2,
                     run_cisd=run_cisd,
                     run_ccsd=run_ccsd,
                     run_fci=run_fci)
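
After the run, the molecule object holds the computed results, and we can quickly inspect the energies, for example:

# Compare the energies computed by the different methods
print(f"HF energy: {molecule.hf_energy}")
print(f"MP2 energy: {molecule.mp2_energy}")
print(f"FCI energy: {molecule.fci_energy}")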

Now it is time to compile the molecule to a representation readable by the quantum computer using the OpenFermion and Cirq libraries. Currently you can use several methods to achieve this:

Using one of these methods we get an optimized quantum circuit. In our case the quantum circuit for the [latex]H_2[/latex] system will be represented by 4 qubits and the operations that act on them (a moment is a collection of operations that act in the same abstract time slice).

Finally, we can use the quantum circuit to run the calculations on the Cirq simulator or on a real quantum computer.
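
Below is a minimal, illustrative sketch (not the compiled ansatz itself): we prepare the 4-qubit Hartree-Fock state of [latex]H_2[/latex] and sample it on the Cirq simulator:

import cirq

qubits = cirq.LineQubit.range(4)

# Occupy the two lowest spin-orbitals (the Hartree-Fock state) and measure
circuit = cirq.Circuit([
    cirq.X(qubits[0]),
    cirq.X(qubits[1]),
    cirq.measure(*qubits, key='result')
])

simulator = cirq.Simulator()
result = simulator.run(circuit, repetitions=100)
print(result.histogram(key='result'))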

TinyMLOps with Arduino

In this article I will show how to build an MLOps process for TinyML on the Arduino Nano 33 BLE Sense.

Before you continue reading, please watch this short introduction:

In the last article (TinyML with Arduino) I showed an example TinyML model that classifies jelly bears using the RGB sensor. The next step is to build a process that simplifies model version management and deployment.

The MLflow project is prepared in a Jupyter Notebook. Then we can convert the notebook to Python code using the nbdev library and version it in a Git repository.
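
For example, with the nbdev v1 API the conversion can be triggered directly from the notebook (a sketch assuming the exported cells are tagged with #export):

# nbdev v1: write the #export-tagged cells to Python modules
from nbdev.export import notebook2script
notebook2script()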

Now we are ready to run the MLflow project using the command:

mlflow run https://git_repository.git#path --no-conda --experiment-name="arduino"

The model is saved in the MLflow registry and the model version is associated with the git commit version.

The MLflow model contains additional artifacts:

  • artifacts.ino - the Arduino code which loads and uses the model
  • model.h - the TensorFlow Lite model encoded as hex
  • requirements.ino.txt - the list of Arduino dependencies required by the Arduino code

Example requirements.ino.txt file:

Arduino_TensorFlowLite@2.4.0-ALPHA
Arduino_APDS9960@1.0.3
Arduino_HTS221@1.0.0
Arduino_LPS22HB@1.0.1
Arduino_LSM9DS1@1.1.0
arduinoFFT@1.5.6

Finally we can run the command:

docker run -it --network app_default --device=/dev/ttyACM0:/dev/ttyACM0 -e AWS_ACCESS_KEY_ID=minio -e AWS_SECRET_ACCESS_KEY=minio123 -e MLFLOW_S3_ENDPOINT_URL=http://minio:9000 -e MLFLOW_TRACKING_URI=http://mlflow:5000 qooba/tinyml-arduino:mlops ./mlops.sh -r ${RUN_ID}

where:

  • --device=/dev/ttyACM0 - the Arduino device connected via USB
  • AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY - the MinIO credentials
  • MLFLOW_S3_ENDPOINT_URL - the MinIO URL
  • MLFLOW_TRACKING_URI - the MLflow URL
  • ${RUN_ID} - the run id of the model saved in the MLflow registry

Additionally we have several command line options:

ARDUINO MLOPS

Syntax: docker run -it qooba/tinyml-arduino:mlops -h [-r MLFLOW_RUN_ID] [-s ARDUINO_SERIAL] [-c ARDUINO_CORE] [-m ARDUINO_MODEL]
options:
-h|--help     Print help
-r|--run      MLflow run id
-s|--serial   Arduino device serial (default: /dev/ttyACM0)
-c|--core     Arduino core (default: arduino:mbed_nano)
-m|--model    Arduino model (default: arduino:mbed_nano:nano33ble)

After running the code, the docker image qooba/tinyml-arduino:mlops will fetch the model for the indicated RUN_ID from MLflow. Then it will install the required dependencies using the requirements.ino.txt file.

It will compile the model and the Arduino code, and finally upload them to the device.

Thanks to this, we can more easily manage subsequent model versions and automate the deployment process.

TinyML with Arduino

In this article I will show how to build a TensorFlow Lite based jelly bear classifier using the Arduino Nano 33 BLE Sense.

Before you continue reading, please watch this short introduction:

Currently a machine learning solution can be deployed not only on very powerful machines with GPU cards but also on really small devices. Of course such devices have some limitations, e.g. memory. To deploy an ML model we need to prepare it first: the TensorFlow framework allows you to convert neural networks to TensorFlow Lite, which can be installed on edge devices, e.g. the Arduino Nano.

The Arduino Nano 33 BLE Sense is equipped with many sensors that allow for the implementation of many projects, e.g.:

  • Digital microphone
  • Digital proximity, ambient light, RGB and gesture sensor
  • 3D magnetometer, 3D accelerometer, 3D gyroscope
  • Capacitive digital sensor for relative humidity and temperature

Examples which I have used in this project can be found here.

To simplify device usage I have built an Arduino Lab project where you can test and investigate the listed sensors directly in the web browser.

The project dependencies are packed into a Docker image to simplify usage.

Before you start the project you will need to connect the Arduino through USB (the Arduino will communicate with the Docker container through /dev/ttyACM0):

git clone https://github.com/qooba/tinyml-arduino.git
cd tinyml-arduino
./run.server.sh
# in another terminal tab
./run.nginx.sh
# go inside server container 
docker exec -it arduino /bin/bash
./start.sh

For each sensor type you can click the Prepare button, which will build and deploy the appropriate Arduino code.


NOTE: Sometimes you will have to deploy to the Arduino manually; to do this you will need to go into the arduino container:

docker exec -it arduino /bin/bash
cd /arduino
make rgb

Here you have the complete Makefile with all implemented sensor types.


You can start observations using the Watch button.

Now we will build the TinyML solution. In the first step we will capture the training data.

The training data will be saved in CSV format. You will need to repeat the process for each class you want to detect.

The captured data will be uploaded to the Colab notebook. Here I fully base on the project Fruit identification using Arduino and TensorFlow. In the notebook we train the model using TensorFlow, then convert it to TensorFlow Lite, and finally encode it in hex format (the model.h header file), which is readable by the Arduino.
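
The conversion step looks roughly like the following sketch (assuming model is the trained Keras model):

import tensorflow as tf

# Convert the trained Keras model to TensorFlow Lite with default optimizations
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# The binary model is then encoded as a C header, e.g.: xxd -i model.tflite > model.h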

Now we compile and upload the model.h header file using the drag-and-drop mechanism.

Finally we can classify the jelly bears by color.

Feast with AI – feed your MLflow models with feature store

In this article I will show how to prepare a complete MLOps solution based on the Feast feature store and the MLflow platform.

Before you continue reading, please watch this short introduction:

The whole solution will be deployed on Kubernetes (mlflow_feast.yaml).

We will use:

  • Feast - as the feature store
  • MLflow - as the model registry and serving platform
  • MinIO - as the S3-compatible data storage
  • Redis - as the Feast online feature store
  • Jupyter Notebook - as the experimentation environment

To better visualize the whole process we will use a propensity-to-buy example, where I base on Kaggle examples and data.

We start in a Jupyter Notebook where we prepare the Feast feature store schema, which is kept in S3.
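
A hypothetical sketch of what the propensity_data feature view definition could look like (Feast 0.11-style API; the S3 path and the abbreviated feature list are assumptions):

from datetime import timedelta
from feast import Entity, Feature, FeatureView, FileSource, ValueType

# Entity used to join and look up features
user = Entity(name="UserID", value_type=ValueType.STRING, description="user id")

# Offline data kept as Parquet in S3
propensity_source = FileSource(
    path="s3://propensity/training_sample.parquet",
    event_timestamp_column="event_timestamp",
)

propensity_data = FeatureView(
    name="propensity_data",
    entities=["UserID"],
    ttl=timedelta(days=365),
    features=[
        Feature(name="basket_icon_click", dtype=ValueType.FLOAT),
        Feature(name="saw_homepage", dtype=ValueType.FLOAT),
        Feature(name="ordered", dtype=ValueType.FLOAT),
        # ... remaining feature columns omitted for brevity
    ],
    online=True,
    batch_source=propensity_source,
)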

We can simply inspect the Feast schema in Jupyter Notebook:

from feast import FeatureStore
from IPython.core.display import display, HTML
import json
from json2html import *
import warnings
warnings.filterwarnings('ignore')

class FeastSchema:
    def __init__(self, repo_path: str):
        self.store = FeatureStore(repo_path=repo_path)

    def show_schema(self, skip_meta: bool = False):
        feast_schema = self.__project_show_schema(skip_meta)
        display(HTML(json2html.convert(json=feast_schema)))

    def show_table_schema(self, table: str, skip_meta: bool = False):
        feature_tables_dictionary = self.__project_show_schema(skip_meta)
        display(HTML(json2html.convert(json={table: feature_tables_dictionary[table]})))

    def __project_show_schema(self, skip_meta: bool = False):
        entities_dictionary = {}
        feast_entities = self.store.list_entities()
        for entity in feast_entities:
            entity_dictionary = entity.to_dict()
            entity_spec = entity_dictionary['spec']
            entities_dictionary[entity_spec['name']] = entity_spec

        feature_tables_dictionary = {}
        feast_feature_tables = self.store.list_feature_views()
        for feature_table in feast_feature_tables:
            feature_table_dict = json.loads(str(feature_table))
            feature_table_spec = feature_table_dict['spec']
            feature_table_name = feature_table_spec['name']
            feature_table_spec.pop('name', None)
            if 'entities' in feature_table_spec:
                feature_table_entities = []
                for entity in feature_table_spec['entities']:
                    feature_table_entities.append(entities_dictionary[entity])
                feature_table_spec['entities'] = feature_table_entities

            if not skip_meta:
                feature_table_spec['meta'] = feature_table_dict['meta']
            else:
                feature_table_spec.pop('input', None)
                feature_table_spec.pop('ttl', None)
                feature_table_spec.pop('online', None)

            feature_tables_dictionary[feature_table_name] = feature_table_spec

        return feature_tables_dictionary


FeastSchema(".").show_schema()
#FeastSchema(".").show_schema(skip_meta=True)
#FeastSchema(".").show_table_schema('driver_hourly_stats')

In our case we store the data in Apache Parquet files in an S3 bucket. Using Feast we can fetch the historical features and train the model using the scikit-learn library:

bucket_name="propensity"
filename="training_sample"

store = FeatureStore(repo_path=".")

s3 = fs.S3FileSystem(endpoint_override=os.environ.get("FEAST_S3_ENDPOINT_URL"))
entity_df=pd.read_parquet(f'{bucket_name}/{filename}_entities.parquet', filesystem=s3)
entity_df["event_timestamp"]=datetime.now()


training_df = store.get_historical_features(
    entity_df=entity_df, 
    feature_refs = [
        'propensity_data:basket_icon_click',
        'propensity_data:basket_add_list',
        'propensity_data:basket_add_detail',
        'propensity_data:sort_by',
        'propensity_data:image_picker',
        'propensity_data:account_page_click',
        'propensity_data:promo_banner_click',
        'propensity_data:detail_wishlist_add',
        'propensity_data:list_size_dropdown',
        'propensity_data:closed_minibasket_click',
        'propensity_data:checked_delivery_detail',
        'propensity_data:checked_returns_detail',
        'propensity_data:sign_in',
        'propensity_data:saw_checkout',
        'propensity_data:saw_sizecharts',
        'propensity_data:saw_delivery',
        'propensity_data:saw_account_upgrade',
        'propensity_data:saw_homepage',
        'propensity_data:device_mobile',
        'propensity_data:device_computer',
        'propensity_data:device_tablet',
        'propensity_data:returning_user',
        'propensity_data:loc_uk',
        'propensity_data:ordered'
    ],
).to_df()

predictors = training_df.drop(['propensity_data__ordered','UserID','event_timestamp'], axis=1)
targets = training_df['propensity_data__ordered']

X_train, X_test, y_train, y_test = train_test_split(predictors, targets, test_size=.3)

# input_params holds the training parameters, e.g. {'var_smoothing': 1e-9}
classifier=GaussianNB(var_smoothing=input_params['var_smoothing'])
classifier=classifier.fit(X_train,y_train)

predictions=classifier.predict(X_test)

conf_matrix=sklearn.metrics.confusion_matrix(y_test,predictions)
ac_score=sklearn.metrics.accuracy_score(y_test, predictions)

propensity_model_path = 'propensity.joblib'
joblib.dump(classifier, propensity_model_path)

artifacts = {
    "propensity_model": propensity_model_path,
    "feature_store": "feature_store.yaml"
}

The model will use online Feast features from Redis as well as additional features from the request, thus we need to wrap the model using the MLflow pyfunc interface:

import mlflow.pyfunc
class PropensityWrapper(mlflow.pyfunc.PythonModel):
    
    def load_context(self, context):
        import joblib
        from feast import FeatureStore
        import pandas as pd 
        import os
        
        self.model = joblib.load(context.artifacts["propensity_model"])
        self.store = FeatureStore(repo_path=os.environ.get("FEAST_REPO_PATH"))
        
    def predict(self, context, model_input):
        users=list(model_input.to_dict()["UserID"].values())
        
        feature_vector = self.store.get_online_features(
            feature_refs=[
                'propensity_data:basket_icon_click',
                'propensity_data:basket_add_list',
                'propensity_data:basket_add_detail',
                'propensity_data:sort_by',
                'propensity_data:image_picker',
                'propensity_data:account_page_click',
                'propensity_data:promo_banner_click',
                'propensity_data:detail_wishlist_add',
                'propensity_data:list_size_dropdown',
                'propensity_data:closed_minibasket_click',
                'propensity_data:checked_delivery_detail',
                'propensity_data:checked_returns_detail',
                'propensity_data:sign_in',
                'propensity_data:saw_checkout',
                'propensity_data:saw_sizecharts',
                'propensity_data:saw_delivery',
                'propensity_data:saw_account_upgrade',
                'propensity_data:saw_homepage',
                'propensity_data:returning_user',
                'propensity_data:loc_uk'
            ],
            entity_rows=[{"UserID": uid} for uid in users]
        ).to_dict()
        
        data=pd.DataFrame.from_dict(feature_vector)
        merged_data = pd.merge(model_input,data, how="inner", on=["UserID"], suffixes=('_x', '')).drop(['UserID'], axis=1)
        return self.model.predict(merged_data)

Now we can log the MLflow model to the repository:

import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
import mlflow.pyfunc

conda_env=mlflow.pyfunc.get_default_conda_env()

with mlflow.start_run():
    
    #mlflow.log_param("var_smoothing", input_params['var_smoothing'])
    mlflow.log_metric("accuracy_score", ac_score)
    
    tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme

    if tracking_url_type_store != "file":
        mlflow.pyfunc.log_model("model",
                                 registered_model_name="propensity_model",
                                 python_model=PropensityWrapper(),
                                 artifacts=artifacts,
                                 conda_env=conda_env)
    else:
        mlflow.pyfunc.log_model("model",
                                 python_model=PropensityWrapper(),
                                 artifacts=artifacts,
                                 conda_env=conda_env)

We can export the code and run it using the MLflow CLI:

mlflow run . --no-conda --experiment-name="propensity" -P var_smoothing=1e-9

Now we need to materialize the features to Redis, i.e. load them from the offline store into the Feast online store for the given time range:

feast materialize 2021-03-22T23:42:00 2021-06-22T23:42:00

Using MLflow we can simply deploy the model as a microservice in Kubernetes. In our case we want to deploy the model models:/propensity_model/Production, which is currently assigned to the Production stage. On startup, MLflow will automatically fetch the proper model from S3:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-serving
  namespace: qooba
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow-serving
      version: v1
  template:
    metadata:
      labels:
        app: mlflow-serving
        version: v1
    spec:
      containers:
      - image: qooba/mlflow:serving
        imagePullPolicy: IfNotPresent
        name: mlflow-serving
        env:
        - name: MLFLOW_TRACKING_URI
          value: http://mlflow.qooba.svc.cluster.local:5000
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: minio-auth
              key: username
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: minio-auth
              key: password
        - name: MLFLOW_S3_ENDPOINT_URL
          value: http://minio.qooba.svc.cluster.local:9000
        - name: FEAST_S3_ENDPOINT_URL
          value: http://minio.qooba.svc.cluster.local:9000
        - name: REDIS_TYPE
          value: REDIS
        - name: REDIS_CONNECTION_STRING
          value: redis.qooba.svc.cluster.local:6379,db=0
        - name: FEAST_TELEMETRY
          value: "false"
        - name: FEAST_REPO_PATH
          value: /feast_repository
        - name: PORT
          value: "5000"
        - name: MODEL
          value: models:/propensity_model/Production
        ports:
        - containerPort: 5000
        volumeMounts:
          - mountPath: /feast_repository
            name: config
      volumes:
        - name: config
          configMap:
            name: mlflow-serving
            items:
            - key: feature_store
              path: feature_store.yaml

Now we can call the model with an HTTP request:

import requests
import json

url="http://mlflow-serving.qooba.svc.cluster.local:5000/invocations"

headers={
    'Content-Type': 'application/json; format=pandas-records'
}

data=[
    {"UserID": "a720-6b732349-a720-4862-bd21-644732",
     'propensity_data:device_mobile': 1.0,
     'propensity_data:device_computer': 0.0,
     'propensity_data:device_tablet': 0.0
    }
]

response=requests.post(url, data=json.dumps(data), headers=headers)
response.text

The model will fetch the client features (based on UserID) from Redis, combine them with the features from the HTTP request, and generate a prediction.

Flink with AI – how to use Flink with MLflow model in Jupyter Notebook

In this article I will show how to process streams with Apache Flink and an MLflow model.

Before you continue reading, please watch this short introduction:

Apache Flink allows for an efficient and scalable way of processing streams. It is a distributed processing engine which supports multiple sources like Kafka, NiFi and many others (if we need a custom source, we can create one ourselves).

Apache Flink also provides a framework for defining stream operations in languages like Java, Scala, Python and SQL.

To simplify such definitions we can use Jupyter Notebook as an interface. Of course we can write in Python using the PyFlink library, but we can make it even easier by writing a Jupyter Notebook extension ("magic words").

Using the Flink extension (magic.ipynb) we can use Flink SQL syntax directly in the Jupyter Notebook.

To use the extension we need to load it:

%reload_ext flinkmagic

Then we need to initialize the Flink StreamEnvironment:

%flink_init_stream_env

Now we can use SQL code directly. For example:

FileSystem connector:

%%flink_execute_sql
CREATE TABLE MySinkTable (
    word varchar,
    cnt bigint) WITH (
        'connector.type' = 'filesystem',
        'format.type' = 'csv',
        'connector.path' = '/opt/flink/notebooks/data/word_count_output1')

MySQL connector:

%%flink_execute_sql
CREATE TABLE MySinkDbSmsTable (
    smstext varchar,
    smstype varchar) WITH (
        'connector.type' = 'jdbc',
        'connector.url' = 'jdbc:mysql://mysql:3306/test',
        'connector.table' = 'sms',
        'connector.driver' = 'com.mysql.jdbc.Driver',
        'connector.write.flush.interval' = '10',
        'connector.username' = 'root',
        'connector.password' = 'my-secret-pw')

Kafka connector:

%%flink_execute_sql
CREATE TABLE MySourceKafkaTable (word varchar) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'test',
    'connector.startup-mode' = 'latest-offset',
    'connector.properties.bootstrap.servers' = 'kafka:9092',
    'connector.properties.group.id' = 'test',
    'format.type' = 'csv'
        )

The magic keyword will automatically execute the SQL in the existing StreamEnvironment.
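
Under the hood such a magic can be defined in a few lines. A simplified sketch (the real implementation lives in magic.ipynb; st_env is assumed to be the environment created by %flink_init_stream_env):

from IPython.core.magic import register_cell_magic

@register_cell_magic
def flink_execute_sql(line, cell):
    # execute the cell content as Flink SQL in the existing environment
    return st_env.execute_sql(cell)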

Now we can apply the machine learning model. In plain Flink we can use a UDF defined in Python, but instead we will use an MLflow model, which wraps the underlying ML framework (like PyTorch, TensorFlow, scikit-learn, etc.). Because MLflow exposes a homogeneous interface, we can create another "Jupyter magic" which will automatically load an MLflow model as a Flink function.

%flink_mlflow "SPAM_CLASSIFIER" "/mlflow/mlruns/2/64a89b0a6b7346498316bfae4c298535/artifacts/model" "[DataTypes.STRING()]" "DataTypes.STRING()"
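
Under the hood this roughly corresponds to wrapping the loaded MLflow model in a PyFlink scalar function. A hedged sketch (PyFlink 1.11-style API; the input column name is an assumption):

import mlflow.pyfunc
import pandas as pd
from pyflink.table import DataTypes
from pyflink.table.udf import udf

model = mlflow.pyfunc.load_model(
    "/mlflow/mlruns/2/64a89b0a6b7346498316bfae4c298535/artifacts/model")

@udf(input_types=[DataTypes.STRING()], result_type=DataTypes.STRING())
def spam_classifier(text):
    # score a single value with the MLflow pyfunc model
    return model.predict(pd.DataFrame({"text": [text]}))[0]

st_env.register_function("SPAM_CLASSIFIER", spam_classifier)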

Now we can simply write Flink SQL query:

%%flink_sql_query
SELECT word as smstext, SPAM_CLASSIFIER(word) as smstype FROM MySourceKafkaTable

which in our case will fetch the Kafka events and classify them using the MLflow spam classifier. The results will be displayed in real time in the Jupyter Notebook as a DataFrame of events.

If we want, we can simply use other Python libraries (like matplotlib) to create a graphical representation of the results, e.g. a pie chart.
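
For example, a quick sketch (assuming results_df is a DataFrame with the classified events collected from the query above):

import matplotlib.pyplot as plt

# Share of each predicted class as a pie chart
counts = results_df['smstype'].value_counts()
plt.pie(counts.values, labels=counts.index, autopct='%1.1f%%')
plt.title('SMS classification results')
plt.show()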

You can find the whole code, including the Flink examples, the extension and the Dockerfiles, here: https://github.com/qooba/flink-with-ai.

You can also use the docker image qooba/flink:dev to test and run the notebooks inside. Please check run.sh, where you have all the components (Kafka, MySQL, Jupyter with Flink, the MLflow repository).