Deep Learning on your phone: PyTorch Lite Interpreter for mobile platforms

The PyTorch Deep Learning framework supports seamlessly running server trained CNN and other ML models on mobile devices. This article describes how to optimize and run your server trained models on mobile devices.


PyTorch is a Deep Learning framework for training and running Machine Learning (ML) models, accelerating the path from research to production.

Typically, one trains a model (on CPU or GPU) on a powerful server, then takes the pre-trained model and deploys it on a mobile platform, which is usually more resource-constrained. This article focuses on the support PyTorch provides for running server-trained models on a mobile device/platform.

PyTorch Mobile currently supports deploying pre-trained models for inference on both Android and iOS platforms.

Steps at a high level

  1. Train your model on the server (either on CPU or GPU). This article won’t discuss the details of training a model; see the references at the end for material on training and getting started with PyTorch.
  2. [Optional] Optimize your trained model for mobile inference
  3. Save your model in the lite interpreter format
  4. Deploy in your mobile app using PyTorch Mobile API
  5. Profit!

Steps in Detail

The rest of this article assumes you have a pre-trained .pt model file; the examples below use a dummy model to walk through the code and workflow for running deep learning models with the PyTorch Lite Interpreter on mobile platforms.

Create/Train a model on the server
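
The original embedded snippet isn't reproduced here; the following is a minimal sketch of a dummy model consistent with the rest of the article: it has a tensor attribute t1 and a helper() method called from forward(). The class name and tensor values are assumptions, chosen so the output matches the result shown below.

```python
import torch
import torch.nn as nn

class DummyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # t1 is a plain tensor attribute; mobile optimization later promotes it to a constant.
        self.t1 = torch.tensor([1.0, 2.0, 3.0])

    def helper(self, x):
        # Kept as a separate method so we can observe function inlining after optimization.
        return x * self.t1

    def forward(self, x):
        return self.helper(x) + self.t1

model = DummyModel().eval()  # eval mode is required for the freezing pass used later
```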

Check that it runs, and produces a reasonable output
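
For example, with a sample input chosen (as an assumption) to produce the output below:

```python
x = torch.tensor([3.0, 4.5, 5.0])
print(model(x))  # computes x * t1 + t1
```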

Should print:

tensor([ 4., 11., 18.])

Generate a scripted model
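
A minimal sketch, continuing with the dummy model above:

```python
# torch.jit.script compiles forward() and, recursively, any methods it calls (here, helper()).
scripted = torch.jit.script(model)
```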

[Optional] Optimize the model for mobile inference

See the PyTorch Mobile Optimizer documentation (reference 9 below) for details about the mobile optimizer.
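
A sketch of this optional step, using torch.utils.mobile_optimizer:

```python
from torch.utils.mobile_optimizer import optimize_for_mobile

# Applies the mobile-specific passes described later in this article
# (module freezing, inlining, operator fusion, dropout removal, ...).
optimized_model = optimize_for_mobile(scripted)
```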

Save the model for mobile inference
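
For example (the .ptl file name is a hypothetical choice):

```python
# Writes a TorchScript archive with the additional lite-interpreter bytecode embedded.
optimized_model._save_for_lite_interpreter("dummy_model.ptl")
```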

Load and run the mobile model for inference (on a mobile device)

The code below uses the C++ API directly. Depending on your needs, you may choose to use the Android (Java) or iOS APIs instead. You can also use the C++ API to run model inference on your development machine in a cross-platform manner.
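
A rough C++ sketch, assuming the hypothetical dummy_model.ptl saved above; exact headers and build setup depend on your libtorch/PyTorch Mobile version:

```cpp
#include <torch/csrc/jit/mobile/import.h>  // torch::jit::_load_for_mobile
#include <torch/csrc/jit/mobile/module.h>  // torch::jit::mobile::Module
#include <torch/script.h>

#include <iostream>
#include <vector>

int main() {
  // Load the lite-interpreter model saved earlier.
  torch::jit::mobile::Module module =
      torch::jit::_load_for_mobile("dummy_model.ptl");

  // Same sample input as used on the server.
  std::vector<c10::IValue> inputs{torch::tensor({3.0f, 4.5f, 5.0f})};
  at::Tensor output = module.forward(inputs).toTensor();

  std::cout << output << std::endl;  // expect 4, 11, 18
  return 0;
}
```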

What does optimize_for_mobile(model) do exactly?

The mobile optimization pass makes a number of changes to the model’s graph; the source code for the pass is in the PyTorch repository. Below is a non-comprehensive list of these optimizations.

  1. Module Freezing: attributes that won’t be mutated by the model are promoted to constants, which enables further optimizations in subsequent passes.
  2. Function Inlining: methods called from the model’s forward() method are inlined into the optimized module to eliminate function call overhead.
  3. Operator Fusion: for example, fusing conv-batchnorm or conv-relu. This allows intermediate data to be reused while it is still in the CPU caches.
  4. Remove Dropout: dropout is a regularization method that approximates training a large number of neural networks with different architectures in parallel. Once a model is put in eval() mode, dropout is no longer needed, and removing it improves inference performance.

Comparing The Model Graph Before and After Mobile Optimization

Before the Optimization

The original model has two methods, forward() and helper(), and forward() calls helper(). The tensor t1 is a model attribute.

Printing scripted.forward.graph and scripted.helper.graph shows the TorchScript IR for the two methods: forward()’s graph contains an explicit prim::CallMethod call into helper(), and t1 is read through a prim::GetAttr on the module.

After the Optimization

The optimized model has just one method, forward(). Also, the tensor t1 is a model constant.

Printing optimized_model.forward.graph shows that the call to helper() has been inlined into forward(), and that t1 now appears as a prim::Constant rather than a module attribute.

How is a lite interpreter model different from a TorchScript model?

A lite interpreter model file is a regular TorchScript model file with mobile-specific bytecode added to it. You can load a mobile model as a normal PyTorch TorchScript model, or load it as a lite-interpreter model.

When saving for the lite interpreter (i.e., for mobile platforms), PyTorch adds bytecode for the model’s graph, which is more efficient to execute on device than the TorchScript representation. The lite interpreter also requires less binary size in the compiled app than the full TorchScript runtime.
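
As a quick illustration on a development machine (using the hypothetical dummy_model.ptl from earlier), the same file can be loaded both ways:

```python
import torch
from torch.jit.mobile import _load_for_lite_interpreter

# Load as a regular TorchScript model...
ts_model = torch.jit.load("dummy_model.ptl")

# ...or through the lite-interpreter loader, the same path mobile runtimes use.
lite_model = _load_for_lite_interpreter("dummy_model.ptl")

x = torch.tensor([3.0, 4.5, 5.0])
print(ts_model(x), lite_model(x))  # both should print tensor([ 4., 11., 18.])
```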

Here’s how to view the generated bytecode.pkl file.
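
One simple way, sketched here for the hypothetical dummy_model.ptl, is to treat the file as the zip archive it is:

```python
import zipfile

# A .ptl file is a zip archive: the usual TorchScript contents plus bytecode.pkl.
with zipfile.ZipFile("dummy_model.ptl") as archive:
    print([name for name in archive.namelist() if name.endswith(".pkl")])
```

PyTorch also ships a small helper, python -m torch.utils.show_pickle, that can pretty-print the pickle files inside such archives.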

Conclusion

PyTorch now supports inference on mobile platforms. It is easy to take a server-trained model and convert it to run optimized inference on a mobile platform. Model semantics are preserved in both the TorchScript and Lite Interpreter formats.

References

  1. PyTorch Mobile
  2. Understanding PyTorch with an example: a step-by-step tutorial
  3. Introduction to PyTorch For Deep Learning
  4. Getting Started With PyTorch
  5. The Most Important Fundamentals of PyTorch you Should Know
  6. PyTorch for Deep Learning: Get Started
  7. What is dropout?
  8. PyTorch Operator Fusion
  9. PyTorch Mobile Optimizer
