APE: Aligning Pretrained Encoders to Quickly Learn Aligned Multimodal Representations

This paper was accepted at the workshop “Has It Trained Yet?” at NeurIPS.
Recent advances in learning aligned multimodal representations have been primarily driven by training large neural networks on massive, noisy paired-modality datasets. In this work, we ask whether it is possible to achieve similar results with substantially less training time and data. We achieve this by taking advantage of existing pretrained unimodal encoders and careful curation of alignment data relevant to the downstream task of interest. We study a natural approach to aligning existing encoders via small auxiliary… (Apple Machine Learning Research)

TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation

Denoising Diffusion models have demonstrated their proficiency for generative sampling. However, generating good samples often requires many iterations. Consequently, techniques such as binary time-distillation (BTD) have been proposed to reduce the number of network calls for a fixed architecture. In this paper, we introduce TRAnsitive Closure Time-distillation (TRACT), a new method that extends BTD. For single-step diffusion, TRACT improves FID by up to 2.4x on the same architecture, and achieves new single-step Denoising Diffusion Implicit Models (DDIM) state-of-the-art FID (7.4 for… (Apple Machine Learning Research)

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

This work studies the use of attention masking in transformer transducer based speech recognition for building a single configurable model for different deployment scenarios. We present a comprehensive set of experiments comparing fixed masking, where the same attention mask is applied at every frame, with chunked masking, where the attention mask for each frame is determined by chunk boundaries, in terms of recognition accuracy and latency. We then explore the use of variable masking, where the attention masks are sampled from a target distribution at training time, to build models that can… (Apple Machine Learning Research)

Maximize performance and reduce your deep learning training cost with AWS Trainium and Amazon SageMaker

Today, tens of thousands of customers are building, training, and deploying machine learning (ML) models using Amazon SageMaker to power applications that have the potential to reinvent their businesses and customer experiences. These ML models have been increasing in size and complexity over the last few years, which has led to state-of-the-art accuracies across a range of tasks and has also pushed the time to train from days to weeks. As a result, customers must scale their models across hundreds to thousands of accelerators, which makes them more expensive to train.

SageMaker is a fully managed ML service that helps developers and data scientists easily build, train, and deploy ML models. SageMaker already provides the broadest and deepest choice of compute offerings featuring hardware accelerators for ML training, including G5 (Nvidia A10G) instances and P4d (Nvidia A100) instances.

Growing compute requirements call for faster and more cost-effective processing power. To further reduce model training times and enable ML practitioners to iterate faster, AWS has been innovating across chips, servers, and data center connectivity. The new Trn1 instances powered by AWS Trainium chips offer the best price-performance and the fastest ML model training on AWS, providing up to 50% lower cost to train deep learning models over comparable GPU-based instances without any drop in accuracy.

In this post, we show how you can maximize your performance and reduce cost using Trn1 instances with SageMaker.

Solution overview

SageMaker training jobs support ml.trn1 instances, powered by Trainium chips, which are purpose built for high-performance ML training applications in the cloud. You can use ml.trn1 instances on SageMaker to train natural language processing (NLP), computer vision, and recommender models across a broad set of applications, such as speech recognition, recommendation, fraud detection, image and video classification, and forecasting. The ml.trn1 instances feature up to 16 Trainium chips; Trainium is the second-generation ML chip built by AWS, after AWS Inferentia. ml.trn1 instances are the first Amazon Elastic Compute Cloud (Amazon EC2) instances with up to 800 Gbps of Elastic Fabric Adapter (EFA) network bandwidth. For efficient data and model parallelism, each ml.trn1.32xl instance has 512 GB of high-bandwidth memory, delivers up to 3.4 petaflops of FP16/BF16 compute power, and features NeuronLink, an intra-instance, high-bandwidth, nonblocking interconnect.

Trainium is available in two configurations and can be used in the US East (N. Virginia) and US West (Oregon) Regions.

The following table summarizes the features of the Trn1 instances.

| Instance Size | Trainium Accelerators | Accelerator Memory (GB) | vCPUs | Instance Memory (GiB) | Network Bandwidth (Gbps) | EFA and RDMA Support |
| --- | --- | --- | --- | --- | --- | --- |
| trn1.2xlarge | 1 | 32 | 8 | 32 | Up to 12.5 | No |
| trn1.32xlarge | 16 | 512 | 128 | 512 | 800 | Yes |
| trn1n.32xlarge (coming soon) | 16 | 512 | 128 | 512 | 1600 | Yes |

Let’s walk through how to use Trainium with SageMaker using a simple example. We will train a text classification model with SageMaker Training and PyTorch using the Hugging Face Transformers library.

We use the Amazon Reviews dataset, which consists of reviews from amazon.com. The data spans a period of 18 years, comprising approximately 35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. The following code is an example from the AmazonPolarity test set:

{
  'title': 'Great CD',
  'content': "My lovely Pat has one of the GREAT voices of her generation. I have listened to this CD for YEARS and I still LOVE IT. When I'm in a good mood it makes me feel better. A bad mood just evaporates like sugar in the rain. This CD just oozes LIFE. Vocals are jusat STUUNNING and lyrics just kill. One of life's hidden gems. This is a desert isle CD in my book. Why she never made it big is just beyond me. Everytime I play this, no matter black, white, young, old, male, female EVERYBODY says one thing \"Who was that singing ?\"",
  'label': 1
}

For this post, we only use the content and label fields. The content field is a free text review, and the label field is a binary value containing 1 or 0 for positive or negative reviews, respectively.
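
As a side note, one convenient way to pull this dataset in Python is through the Hugging Face datasets library; the sketch below assumes the public amazon_polarity dataset ID on the Hugging Face Hub.

Python code:

# Quick sketch: load the Amazon reviews polarity dataset with the Hugging Face
# datasets library (the "amazon_polarity" dataset ID is an assumption here).
from datasets import load_dataset

dataset = load_dataset("amazon_polarity")       # splits: "train" and "test"
sample = dataset["test"][0]                     # dict with "title", "content", "label"
print(sample["content"][:80], sample["label"])  # free-text review and its 0/1 label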

For our algorithm, we use BERT, a transformer model pre-trained on a large corpus of English data in a self-supervised fashion. This model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification, or question answering.

Implementation details

Let’s begin by taking a closer look at the different components involved in training the model:

  • AWS Trainium – At its core, each Trainium instance has Trainium devices built into it. Trn1.2xlarge has 1 Trainium device, and Trn1.32xlarge has 16 Trainium devices. Each Trainium device consists of compute (2 NeuronCore-v2 cores), 32 GB of HBM device memory, and NeuronLink for fast inter-device communication. Each NeuronCore-v2 is a fully independent heterogeneous compute unit with separate engines (Tensor/Vector/Scalar/GPSIMD). The GPSIMD engines are fully programmable general-purpose processors that you can use to implement custom operators and run them directly on the NeuronCore engines.
  • Amazon SageMaker Training – SageMaker provides a fully managed training experience to easily train models without having to worry about infrastructure. When you use SageMaker Training, it runs everything needed for a training job, such as code, container, and data, in a compute infrastructure separate from the invocation environment. This allows us to run experiments in parallel and iterate fast. SageMaker provides a Python SDK to launch training jobs. The example in this post uses the SageMaker Python SDK to trigger the training job using Trainium.
  • AWS Neuron – Because the Trainium NeuronCores have their own compute engines, we need a mechanism to compile our training code for them. The AWS Neuron compiler takes code written in PyTorch/XLA and optimizes it to run on Neuron devices. The Neuron compiler is integrated as part of the Deep Learning Container we will use for training our model.
  • PyTorch/XLA – This Python package uses the XLA deep learning compiler to connect the PyTorch deep learning framework and cloud accelerators like Trainium. Building a new PyTorch network or converting an existing one to run on XLA devices requires only a few lines of XLA-specific code. We will see for our use case what changes we need to make.
  • Distributed training – To run the training efficiently on multiple NeuronCores, we need a mechanism to distribute the training across the available NeuronCores. SageMaker supports torchrun with Trainium instances, which can be used to launch a number of processes equal to the number of NeuronCores in the cluster. This is done by passing the distribution parameter to the SageMaker estimator as follows, which starts data parallel distributed training where the same model is loaded onto different NeuronCores that each process separate data batches:
distribution={"torch_distributed": {"enabled": True}}
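
For context, launching such a job with the SageMaker Python SDK might look like the following sketch; the entry point script name, framework and Python versions, role, and S3 paths are placeholders to adapt to your own setup.

Python code:

# Sketch only: launch a Trainium training job with torch_distributed enabled.
# The script name, versions, and S3 paths below are illustrative placeholders.
import sagemaker
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                    # your PyTorch/XLA training script
    role=sagemaker.get_execution_role(),
    instance_type="ml.trn1.32xlarge",
    instance_count=1,
    framework_version="1.13.1",                # check the Neuron DLC for supported versions
    py_version="py39",
    distribution={"torch_distributed": {"enabled": True}},
)
estimator.fit({"train": "s3://your-bucket/path/to/training-data"})  # placeholder S3 path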

Script changes needed to run on Trainium

Let’s look at the code changes needed to adapt a regular GPU-based PyTorch script to run on Trainium. At a high level, we need to make the following changes (a consolidated sketch follows the list):

  1. Replace GPU devices with Pytorch/XLA devices. Because we use torch distribution, we need to initialize the training with XLA as the device as follows:
    device = "xla"
    torch.distributed.init_process_group(device)

  2. We use the PyTorch/XLA distributed backend to bridge the PyTorch distributed APIs to XLA communication semantics.
  3. We use PyTorch/XLA MpDeviceLoader for the data ingestion pipelines. MpDeviceLoader helps improve performance by overlapping three steps: tracing, compilation, and data batch loading to the device. We need to wrap the PyTorch dataloader with MpDeviceLoader as follows:
    import torch_xla.distributed.parallel_loader as pl
    train_device_loader = pl.MpDeviceLoader(train_loader, "xla")

  4. Run the optimization step using the XLA-provided API as shown in the following code. This consolidates the gradients between cores and issues the XLA device step computation.
    torch_xla.core.xla_model.optimizer_step(optimizer)

  5. Map CUDA APIs (if any) to generic PyTorch APIs.
  6. Replace CUDA fused optimizers (if any) with generic PyTorch alternatives.
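
Taken together, these changes might look like the following minimal training-loop sketch; the tiny model and random data below are placeholders standing in for the BERT fine-tuning script in the notebook.

Python code:

# Minimal PyTorch/XLA training-loop sketch combining the changes above.
# The model and data are placeholders, not the notebook's actual fine-tuning code.
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, TensorDataset
import torch_xla.core.xla_model as xm
import torch_xla.distributed.parallel_loader as pl
import torch_xla.distributed.xla_backend  # registers the "xla" torch.distributed backend

dist.init_process_group("xla")            # step 1: initialize training with the XLA backend
device = xm.xla_device()                  # the XLA (NeuronCore) device for this process

model = torch.nn.Linear(128, 2).to(device)            # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
train_device_loader = pl.MpDeviceLoader(train_loader, device)   # step 3: overlap loading with compute

model.train()
for inputs, labels in train_device_loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    loss.backward()
    xm.optimizer_step(optimizer)          # step 4: consolidate gradients across NeuronCores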

The entire example, which trains a text classification model using SageMaker and Trainium, is available in the following GitHub repo. The notebook file Fine tune Transformers for building classification models using SageMaker and Trainium.ipynb is the entrypoint and contains step-by-step instructions to run the training.

Benchmark tests

In the test, we ran two training jobs: one on ml.trn1.32xlarge, and one on ml.p4d.24xlarge with the same batch size, training data, and other hyperparameters. During the training jobs, we measured the billable time of the SageMaker training jobs, and calculated the price-performance by multiplying the time required to run the training jobs in hours by the price per hour for the instance type. We selected the best result for each instance type out of multiple job runs.

The following table summarizes our benchmark findings.

| Model | Instance Type | Price ($ per node-hour) | Throughput (iterations/sec) | Validation Accuracy | Billable Time (sec) | Training Cost ($) |
| --- | --- | --- | --- | --- | --- | --- |
| BERT base classification | ml.trn1.32xlarge | 24.725 | 6.64 | 0.984 | 6033 | 41.47 |
| BERT base classification | ml.p4d.24xlarge | 37.69 | 5.44 | 0.984 | 6553 | 68.6 |

The results showed that the Trainium instance costs less than the P4d instance while providing comparable throughput and identical accuracy when training the same model with the same input data and training parameters. This means that the Trainium instance delivers better price-performance than the GPU-based P4d instance. In this simple example, Trainium delivers about 22% higher throughput and roughly 40% lower training cost than P4d, in line with the up to 50% cost savings cited earlier.

Deploy the trained model

After we train the model, we can deploy it to various instance types such as CPU, GPU, or AWS Inferentia. The key point to note is that the trained model isn’t dependent on specialized hardware for deployment and inference. SageMaker provides mechanisms to deploy a trained model using either real-time or batch mechanisms. The notebook example in the GitHub repo contains code to deploy the trained model as a real-time endpoint using an ml.c5.xlarge (CPU-based) instance.
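
As a rough sketch (not the exact notebook code), deploying the trained artifact to a CPU-backed real-time endpoint with the SageMaker Python SDK could look like this; the S3 path, inference script, and versions are placeholders.

Python code:

# Sketch only: host the trained model on a CPU instance; paths and versions are placeholders.
import sagemaker
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="s3://your-bucket/path/to/model.tar.gz",  # artifact produced by the training job
    role=sagemaker.get_execution_role(),
    entry_point="inference.py",                          # placeholder inference handler
    framework_version="1.13.1",
    py_version="py39",
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
print(predictor.predict({"inputs": "This album is fantastic, I listen to it every day."}))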

Conclusion

In this post, we looked at how to use Trainium and SageMaker to quickly set up and train a classification model that gives up to 50% cost savings without compromising on accuracy. You can use Trainium for a wide range of use cases that involve pre-training or fine-tuning Transformer-based models. For more information about support of various model architectures, refer to Model Architecture Fit Guidelines.


About the Authors

Arun Kumar Lokanatha is a Senior ML Solutions Architect with the Amazon SageMaker Service team. He focuses on helping customers build, train, and migrate ML production workloads to SageMaker at scale. He specializes in deep learning, especially in the areas of NLP and CV. Outside of work, he enjoys running and hiking.

Mark Yu is a Software Engineer in AWS SageMaker. He focuses on building large-scale distributed training systems, optimizing training performance, and developing high-performance ML training hardware, including SageMaker Trainium. Mark also has in-depth knowledge of machine learning infrastructure optimization. In his spare time, he enjoys hiking and running.

Omri Fuchs is a Software Development Manager at AWS SageMaker. He is the technical leader responsible for the SageMaker training job platform, focusing on optimizing SageMaker training performance and improving the training experience. He has a passion for cutting-edge ML and AI technology. In his spare time, he likes cycling and hiking.

Gal Oshri is a Senior Product Manager on the Amazon SageMaker team. He has 7 years of experience working on Machine Learning tools, frameworks, and services.


Learning from deep learning: a case study of feature discovery and validation in pathology

When a patient is diagnosed with cancer, one of the most important steps is examination of the tumor under a microscope by pathologists to determine the cancer stage and to characterize the tumor. This information is central to understanding clinical prognosis (i.e., likely patient outcomes) and for determining the most appropriate treatment, such as undergoing surgery alone versus surgery plus chemotherapy. Developing machine learning (ML) tools in pathology to assist with the microscopic review represents a compelling research area with many potential applications.

Previous studies have shown that ML can accurately identify and classify tumors in pathology images and can even predict patient prognosis using known pathology features, such as the degree to which gland appearances deviate from normal. While these efforts focus on using ML to detect or quantify known features, alternative approaches offer the potential to identify novel features. The discovery of new features could in turn further improve cancer prognostication and treatment decisions for patients by extracting information that isn’t yet considered in current workflows.

Today, we’d like to share progress we’ve made over the past few years towards identifying novel features for colorectal cancer in collaboration with teams at the Medical University of Graz in Austria and the University of Milano-Bicocca (UNIMIB) in Italy. Below, we will cover several stages of the work: (1) training a model to predict prognosis from pathology images without specifying the features to use, so that it can learn what features are important; (2) probing that prognostic model using explainability techniques; and (3) identifying a novel feature and validating its association with patient prognosis. We describe this feature and evaluate its use by pathologists in our recently published paper, “Pathologist validation of a machine-learned feature for colon cancer risk stratification”. To our knowledge, this is the first demonstration that medical experts can learn new prognostic features from machine learning, a promising start for the future of this “learning from deep learning” paradigm.

Training a prognostic model to learn what features are important

One potential approach to identifying novel features is to train ML models to directly predict patient outcomes using only the images and the paired outcome data. This is in contrast to training models to predict “intermediate” human-annotated labels for known pathologic features and then using those features to predict outcomes.

Initial work by our team showed the feasibility of training models to directly predict prognosis for a variety of cancer types using the publicly available TCGA dataset. It was especially exciting to see that for some cancer types, the model’s predictions were prognostic after controlling for available pathologic and clinical features. Together with collaborators from the Medical University of Graz and the Biobank Graz, we subsequently extended this work using a large de-identified colorectal cancer cohort. Interpreting these model predictions became an intriguing next step, but common interpretability techniques were challenging to apply in this context and did not provide clear insights.

Interpreting the model-learned features

To probe the features used by the prognostic model, we used a second model (trained to identify image similarity) to cluster cropped patches of the large pathology images. We then used the prognostic model to compute the average ML-predicted risk score for each cluster.
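
In pseudocode terms, that recipe can be sketched roughly as follows; the embed_patch and predict_risk functions and the patch list are placeholders standing in for the image-similarity and prognostic models used in the study.

Python code:

# Simplified sketch of the probing recipe: cluster patch embeddings from an
# image-similarity model, then rank clusters by the prognostic model's average
# predicted risk. embed_patch and predict_risk are placeholders, not the actual models.
import numpy as np
from sklearn.cluster import KMeans

embeddings = np.stack([embed_patch(p) for p in patches])     # image-similarity features per patch
risk_scores = np.array([predict_risk(p) for p in patches])   # prognostic model output per patch

cluster_ids = KMeans(n_clusters=50, random_state=0).fit_predict(embeddings)
mean_risk = {c: risk_scores[cluster_ids == c].mean() for c in np.unique(cluster_ids)}

# Clusters with the highest average predicted risk (such as the tumor adipose
# feature described below) are the ones surfaced for pathologist review.
top_clusters = sorted(mean_risk, key=mean_risk.get, reverse=True)[:5]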

One cluster stood out for its high average risk score (associated with poor prognosis) and its distinct visual appearance. Pathologists described the images as involving high grade tumor (i.e., least-resembling normal tissue) in close proximity to adipose (fat) tissue, leading us to dub this cluster the “tumor adipose feature” (TAF); see next figure for detailed examples of this feature. Further analysis showed that the relative quantity of TAF was itself highly and independently prognostic.

A prognostic ML model was developed to predict patient survival directly from unannotated giga-pixel pathology images. A second image similarity model was used to cluster cropped patches of pathology images. The prognostic model was used to compute the average model-predicted risk score for each cluster. One cluster, dubbed the “tumor adipose feature” (TAF) stood out in terms of its high average risk score (associated with poor survival) and distinct visual appearance. Pathologists learned to identify TAF and pathologist scoring for TAF was shown to be prognostic.
 
Left: H&E pathology slide with an overlaid heatmap indicating locations of the tumor adipose feature (TAF). Regions highlighted in red/orange are considered to be more likely TAF by the image similarity model, compared to regions highlighted in green/blue or regions not highlighted at all. Right: Representative collection of TAF patches across multiple cases.

Validating that the model-learned feature can be used by pathologists

These studies provided a compelling example of the potential for ML models to predict patient outcomes and a methodological approach for obtaining insights into model predictions. However, there remained the intriguing question of whether pathologists could learn and score the feature identified by the model while maintaining its demonstrable prognostic value.

In our most recent paper, we collaborated with pathologists from the UNIMIB to investigate these questions. Using example images of TAF from the previous publication to learn and understand this feature of interest, UNIMIB pathologists developed scoring guidelines for TAF. If TAF was not seen, the case was scored as “absent”, and if TAF was observed, then “unifocal”, “multifocal”, and “widespread” categories were used to indicate the relative quantity. Our study showed that pathologists could reproducibly identify the ML-derived TAF and that their scoring for TAF provided statistically significant prognostic value on an independent retrospective dataset. To our knowledge, this is the first demonstration of pathologists learning to identify and score a specific pathology feature originally identified by an ML-based approach.

Putting things in context: learning from deep learning as a paradigm

Our work is an example of people “learning from deep learning”. In traditional ML, models learn from hand-engineered features informed by existing domain knowledge. More recently, in the deep learning era, a combination of large-scale model architectures, compute, and datasets has enabled learning directly from raw data, but this is often at the expense of human interpretability. Our work couples the use of deep learning to predict patient outcomes with interpretability methods, to extract new knowledge that could be applied by pathologists. We see this process as a natural next step in the evolution of applying ML to problems in medicine and science, moving from the use of ML to distill existing human knowledge to people using ML as a tool for knowledge discovery.

Traditional ML focused on engineering features from raw data using existing human knowledge. Deep learning enables models to learn features directly from raw data at the expense of human interpretability. Coupling deep learning with interpretability methods provides an avenue for expanding the frontiers of scientific knowledge by learning from deep learning.

Acknowledgements

This work would not have been possible without the efforts of coauthors Vincenzo L’Imperio, Markus Plass, Heimo Muller, Nicolò Tamini, Luca Gianotti, Nicola Zucchini, Robert Reihs, Greg S. Corrado, Dale R. Webster, Lily H. Peng, Po-Hsuan Cameron Chen, Marialuisa Lavitrano, David F. Steiner, Kurt Zatloukal, Fabio Pagni. We also appreciate the support from Verily Life Sciences and the Google Health Pathology teams – in particular Timo Kohlberger, Yunnan Cai, Hongwu Wang, Kunal Nagpal, Craig Mermel, Trissia Brown, Isabelle Flament-Auvigne, and Angela Lin. We also appreciate manuscript feedback from Akinori Mitani, Rory Sayres, and Michael Howell, and illustration help from Abi Jones. This work would also not have been possible without the support of Christian Guelly, Andreas Holzinger, Robert Reihs, Farah Nader, the Biobank Graz, the efforts of the slide digitization team at the Medical University Graz, the participation of the pathologists who reviewed and annotated cases during model development, and the technicians of the UNIMIB team.


TensorFlow with MATLAB

Posted by Sivylla Paraskevopoulou, Product Marketing Manager at MathWorks

In this blog post I will show you how to use TensorFlow™ with MATLAB® for deep learning applications. More specifically, I will show you how to convert pretrained TensorFlow models to MATLAB models, convert models from MATLAB to TensorFlow, and use MATLAB and TensorFlow together.

These interoperability features, offered by MATLAB, enable collaboration between colleagues, teams, and communities that work on different platforms. Today’s post will show you how to use these features, and give you examples of when you might want to use them and how they connect the work of AI developers and engineers to enable domain-specific AI system design.

Introduction

What is MATLAB?

MATLAB is a computing platform tailored for engineering and scientific applications like data analysis, signal and image processing, control systems, wireless communications, and robotics. MATLAB includes a programming language, interactive apps, and tools for automatically generating embedded code. MATLAB is also the foundation for Simulink®, a block diagram environment for simulating complex multi-domain systems.

Similarly to Python® libraries, MATLAB provides toolboxes for achieving different goals. More specifically, MATLAB provides the Deep Learning Toolbox™ for deep learning workflows. Deep Learning Toolbox provides a framework for designing and implementing deep neural networks with algorithms, pretrained models, and apps. It can be combined with domain-specific toolboxes in areas such as computer vision, signal processing, and audio applications.

Figure: Python and MATLAB are programming languages; Python can leverage the TensorFlow library for deep learning workflows, while MATLAB provides the Deep Learning Toolbox.

Why TensorFlow and MATLAB?

Both TensorFlow and MATLAB are widely used for deep learning. Many MATLAB customers are interested in integrating TensorFlow models into their AI design, for creating customized tools, simulating complex systems, or optimizing data modeling. TensorFlow users can also leverage MATLAB to generate, analyze, and visualize training data, post-process model output, and deploy trained neural networks to desktop, web apps, or embedded hardware.

For example, engineers have integrated TensorFlow models into Simulink (the MATLAB simulation environment) to develop a battery state-of-charge estimator for an electric vehicle, and scientists have used MATLAB with TensorFlow to build a custom toolbox for reading climate data. For more details on these examples, see Integrate TensorFlow Model into Simulink for Simulation and Code Generation and Climate Data Store Toolbox for MATLAB.

What’s Next?

Now you have started to see the benefits of using TensorFlow with MATLAB. Let’s get into more of the technical details on how to use TensorFlow with MATLAB in the following three sections.

You will see how straightforward it is to use TensorFlow with MATLAB and why I (and other engineers) like having the option to combine them for deep learning applications. Why choose when you don’t have to?

Convert Model from TensorFlow to MATLAB

Figure: Converting a pretrained model from TensorFlow to MATLAB with importTensorFlowNetwork

You can convert a pretrained model from TensorFlow to MATLAB by using the MATLAB function importTensorFlowNetwork. One scenario where this function is useful: a data scientist creates a model in TensorFlow, and then an engineer integrates this model into an AI system created in MATLAB.

We will show you here how to import an image classification TensorFlow model into MATLAB and (1) use it for prediction and (2) integrate it into an AI system.

Convert model from TensorFlow to MATLAB
Before importing a pretrained TensorFlow model into MATLAB, you must save the TensorFlow model in the SavedModel format.

Python code:

import tensorflow as tf
tf.saved_model.save(model, modelFolder)
Then, you can import the TensorFlow model into MATLAB by using the MATLAB function importTensorFlowNetwork. You only need one line of code!

MATLAB code:

modelFolder = "EfficientNetV2L";
net = importTensorFlowNetwork(modelFolder, OutputLayerType="classification")

Classify Image

Read the image you want to classify. Resize the image to the input size of the network.

MATLAB code:

Im = imread("mydoc.jpg");
InputSize = net.Layers(1).InputSize;
Im = imresize(Im, InputSize(1:2));

Before you classify the image, the image might require further preprocessing or changing the dimension ordering from TensorFlow to MATLAB. To learn more and get answers to common questions about importing models, see Tips on Importing Models from TensorFlow.

Predict and plot the image with the classified label.

MATLAB code:

label = classify(net, Im);
imshow(Im)
title("Predicted label: " + string(label));

Image of a pomeranian with text 'Predicted label: Pomeranian'

To see the full example on how to import an image classification TensorFlow model into MATLAB and use the model for prediction, see Image Classification in MATLAB Using Converted TensorFlow Model. To learn more on importing TensorFlow models into MATLAB, check out the blog post Importing Models from TensorFlow, PyTorch, and ONNX.

Transfer Learning
A common reason to import a pretrained TensorFlow model into MATLAB is to perform transfer learning. Transfer learning is the process of taking a pretrained deep learning model and fine-tuning it to fit a new problem. For example, you are doing object detection in MATLAB, and you find a TensorFlow model that can improve the detection accuracy, but you need to retrain the model with your data. Using transfer learning is usually faster and easier than training a network from scratch.

In MATLAB, you can perform transfer learning programmatically or interactively by using the Deep Network Designer (DND) app. It’s easy to do model surgery (prepare a network to train on new data) with a few lines of MATLAB code by using built-in functions that replace, remove, or add layers at any part of the network architecture. For an example, see Train Deep Learning Network to Classify New Images. With DND, you can interactively prepare the network for training, train the network, export the retrained network, and then use it for the new task. For an example, see Transfer Learning with Deep Network Designer.

Figure: Edit a pretrained model with a low-code app (Deep Network Designer) for transfer learning.

AI System Design in Simulink
Simulink is a block diagram environment used to design systems with multi-domain models, simulate systems before moving to hardware, and deploy without writing code. Simulink users have expressed interest in the ability to bring in AI models and simulate entire systems. In fact, this is very easy to do with Simulink blocks.

In the following figure, you can see a very simple AI system that reads and classifies an image using an imported TensorFlow model. Essentially, the Simulink system executes the same workflow shown above. To learn more about how to design and simulate such a system, see Classify Images in Simulink with Imported TensorFlow Network.

Figure: Simple Simulink system (image_classifier) for predicting an image label

Of course, Simulink capabilities extend far beyond classifying an image of my dog after I gave him a bad haircut and trying to predict his breed. For example, you can use deep neural networks inside a Simulink model to perform lane and vehicle detection. To learn more, see Machine Learning with Simulink and NVIDIA Jetson.

Figure: Lane and vehicle detection in Simulink using deep learning

Convert Model from MATLAB to TensorFlow

Figure: Converting a model from MATLAB to TensorFlow with exportNetworkToTensorFlow

You can convert a trained or untrained model from MATLAB to TensorFlow by using the MATLAB function exportNetworkToTensorFlow. In MATLAB, we refer to trained models as networks and to untrained models as layer graphs. The Pretrained Deep Neural Networks documentation page shows you all the options of how to get a pretrained network. You can alternatively create your own network.

Create Untrained Model

Create a bidirectional long short-term memory (BiLSTM) network to classify sequence data. An LSTM network takes sequence data as input and makes predictions based on the individual time steps of the sequence data.

Figure: Architecture of the LSTM model

MATLAB code:

inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [
    sequenceInputLayer(inputSize)
    bilstmLayer(numHiddenUnits, OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer];

lgraph = layerGraph(layers);

To learn how to create the training data set for this model, see Export Untrained Layer Graph to TensorFlow. An important step is to permute the sequence data from the Deep Learning Toolbox ordering (CSN) to the TensorFlow ordering (NSC), where C is the number of features of the sequence, S is the sequence length, and N is the number of sequence observations. To learn more about the dimension ordering of the input data for different deep learning platforms, see Input Dimension Ordering.
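
If you prefer to do that reordering on the Python side after loading the data, a minimal NumPy sketch (assuming XTrain has already been loaded as a C-by-S-by-N array) would be:

Python code:

# Sketch: permute sequence data from Deep Learning Toolbox ordering (C, S, N)
# to TensorFlow ordering (N, S, C); assumes XTrain is a C x S x N NumPy array.
import numpy as np

XTrain = np.transpose(XTrain, (2, 1, 0))   # (C, S, N) -> (N, S, C)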

Export Model to TensorFlow

Export the layer graph to TensorFlow. The exportNetworkToTensorFlow function saves the TensorFlow model in the Python package myModel.

MATLAB code:

exportNetworkToTensorFlow(lgraph, "myModel")

Train TensorFlow Model

Run the following code in Python to load the exported model from the Python package myModel. You can also compile and train the exported model in Python. To train the model, use the training data in training_data.mat that you previously created.

Python code:

import myModel

model = myModel.load_model()

Load training data.

Python code:

import scipy.io as sio

data = sio.loadmat("training_data.mat")
XTrain = data["XTrain"]
YTrain = data["TTrain"]

Compile and train model.

Python code:

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
r = model.fit(XTrain, YTrain, epochs=100, batch_size=27)

To learn more on how to export MATLAB models to TensorFlow, check out our blog post.

Figure: Export an untrained model from MATLAB to TensorFlow and train on Google Colab

Run TensorFlow and MATLAB Together


You’ve seen so far how to convert models between TensorFlow and MATLAB. You also have the option to use TensorFlow and MATLAB together (run them from the same environment) by either calling Python from MATLAB or calling MATLAB from Python. This way you can take advantage of the best capabilities of each environment by creating an integrated workflow.

For example, TensorFlow might offer newer models, but you like MATLAB apps for labeling data, or you might want to train your TensorFlow model under multiple initial conditions using the Experiment Manager app (see example).

Call Python from MATLAB

Instead of importing a TensorFlow model into MATLAB you have the option to directly use the TensorFlow model in your MATLAB workflow by calling Python from MATLAB. You can access Python libraries by adding the py. prefix and execute any Python statement from MATLAB by using the pyrun function. For an example that shows how to call a TensorFlow model in MATLAB, see Image Classification in MATLAB Using TensorFlow.

A use case where this option might be useful is the following: you have created an object detection workflow in MATLAB and want to quickly compare TensorFlow models to find the one best suited for your task before importing it into MATLAB. Calling TensorFlow from MATLAB lets you run such an inference test quickly.

Call MATLAB from Python

You can use the MATLAB Engine API to call MATLAB from a Python environment and thus integrate MATLAB tools and apps into your existing Python workflow. MATLAB is convenient for labeling and exploring data for domain-specific (e.g., radar, wireless, audio, and biomedical) signal processing using low-code apps. For an example, see our GitHub repo Co-Execution for Training a Speech Command Recognition System.
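
As a minimal illustration (assuming a local MATLAB installation with the Engine API for Python installed), starting an engine and calling a MATLAB function from Python looks like this:

Python code:

# Sketch: start a MATLAB engine from Python and call a MATLAB function.
# Requires MATLAB with the MATLAB Engine API for Python installed.
import matlab.engine

eng = matlab.engine.start_matlab()
print(eng.sqrt(4.0))   # call a built-in MATLAB function from Python; prints 2.0
eng.quit()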

Conclusion

The bottom line is that both TensorFlow and MATLAB offer excellent tools that enable applying deep learning to your application. MATLAB integrates with TensorFlow to take full advantage of these tools and enable access to hundreds of deep learning models. Choose between the interoperability features (convert models between TensorFlow and MATLAB, or use TensorFlow and MATLAB together) to create a deep learning workflow that bridges platforms and teams.

If you have questions about how, when, and why to use the described interoperability, email me at sparaske@mathworks.com. I would love to hear more about your workflow and discuss how working across deep learning platforms accelerates the application of deep learning to your domain.


GPT-4

We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. (OpenAI Blog)