
6 Most Exciting New Updates in PyTorch 2.1 

PyTorch 2.1 brings a host of updates and improvements to the library, including support for training and inference of Llama 2 models powered by AWS Inferentia2.


PyTorch recently released a new update, PyTorch 2.1. The update brings automatic dynamic shape support in torch.compile, distributed checkpointing for saving and loading distributed training jobs across multiple ranks in parallel, and torch.compile support for the NumPy API.

In addition, beta updates have been released for the PyTorch domain libraries TorchAudio and TorchVision. Lastly, the community has added support for training and inference of Llama 2 models powered by AWS Inferentia2.

This makes running Llama 2 models on PyTorch faster, cheaper, and more efficient. The release is the work of 784 contributors across 6,682 commits.

New Features of PyTorch 2.1 

  • The new feature updates include the addition of AArch64 wheel builds, which allow devices with 64-bit Arm architecture to install PyTorch from prebuilt wheels. 
  • PyTorch can now be compiled natively on M1 instead of being cross-compiled from x86, which caused performance issues. Native compilation improves performance and makes PyTorch easier to use directly on Apple M1 processors. 

Improvements 

  • Python Frontend: torch.device can now be used as a context manager to change the default device. This is a simple but powerful feature that makes code more concise and readable.
  • Optimisation: NAdamW, an improved variant of AdamW, is now supported. It stands out for its stability and efficiency, making it a superior choice for faster and more accurate model training.
  • Sparse Frontend: Semi-structured sparsity is a new type of sparsity that can be more efficient than traditional sparsity patterns on NVIDIA Ampere and newer architectures. 
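The device context manager mentioned above can be sketched as follows (a minimal example assuming PyTorch 2.1 is installed; "cpu" here is just a stand-in for any device string):

```python
import torch

# torch.device can now act as a context manager: tensors created
# inside the block default to that device, with no explicit device= argument.
with torch.device("cpu"):
    x = torch.ones(3)

print(x.device)  # cpu
```

Outside the `with` block, the previous default device is restored, so the change is scoped rather than global.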

PyTorch’s TorchAudio v2.1 Library

The new update has introduced key features like the AudioEffector API for audio waveform enhancement and Forced Alignment for precise transcript-audio synchronisation. The addition of TorchAudio-Squim models allows estimation of speech quality metrics, while a CUDA-based CTC decoder improves automatic speech recognition efficiency. 

In the realm of AI music, new utilities enable music generation using AI techniques, and updated training recipes enhance model training for specific tasks. However, users need to adapt to changes like updated FFmpeg support (versions 6, 5, 4.4) and libsox integration, impacting audio file handling.

These updates expand PyTorch’s capabilities, making audio processing and AI music generation more efficient and precise. With enhanced alignment, speech quality assessment, and faster speech recognition, TorchAudio v2.1 is a valuable upgrade. 

TorchRL Library 

PyTorch has enhanced TorchRL's RLHF components, making it easy for developers to build an RLHF training loop with limited RL knowledge. TensorDict enables easy interaction between datasets (say, Hugging Face datasets) and RL models. New offline RL algorithms have also been added, offering a wide range of solutions for more data-efficient training. 

Plus, TorchRL can now work directly with hardware, like robots, for seamless training and deployment. It has added essential algorithms and expanded its supported environments for faster data collection and value-function execution.

TorchVision Library

The TorchVision library is now 10%-40% faster. “This is mostly achieved thanks to 2X-4X improvements made to v2.Resize(), which now supports native uint8 tensors for Bilinear and Bicubic mode. Output results are also now closer to PIL’s!,” reads the blog. 

Additionally, TorchVision now supports CutMix and MixUp augmentations. The previous beta transforms are now stabilised, offering improved performance for tasks like segmentation and detection. 

Llama 2 Deployment with AWS Inferentia2 using TorchServe

For the first time, PyTorch has deployed the Llama 2 model for inference with Transformers Neuron using TorchServe. This is done through Amazon SageMaker on EC2 Inferentia2 instances, which feature 3x higher compute and 4x more accelerator memory, resulting in up to 4x higher throughput and up to 10x lower latency. 

The optimisation techniques from the AWS Neuron SDK enhance performance while keeping costs low. The PyTorch blog on the Llama deployment also shares benchmarking results. 

The framework is integrated with Llama 2 through AWS Transformers Neuron, enabling seamless usage of Llama-2 models for optimised inference on Inf2 instances.
