
Now You Can Run Llama 3.1 405B on Your Computer Using Peer-to-Peer Network

Nidum.AI plans to use 2,000+ Apple computers to run Llama 3.1 on a P2P network.



Illustration by Nikhil Kumar

Not everyone has access to highly spec’d machines capable of running LLMs locally; these models often demand substantial computational power and memory. 

“GPUs like H100s, which are essential to train and run LLMs efficiently on a large scale, are beyond the budgets of most startups. And running models like Llama 3.1 405B is unthinkable for regular people. 

“Renting GPUs and running them on a single cluster or using peer-to-peer connections is one of the easiest ways to do it,” Arjun Reddy, the co-founder of Nidum.AI, told AIM.

P2P technology already underpins blockchains, a testament to how secure such networks can be. It first came into the limelight in 1999, when Napster used it to decentralise music, letting users download and host music files from their own computers.

Reddy further explained the approach Nidum.AI follows for its P2P technology. It starts with fine-tuning an existing model for specific needs; the model is then divided into hundreds of small parts and distributed across the P2P network. 

A layer of encryption is used to safeguard data. 
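In broad strokes, that pipeline amounts to sharding a checkpoint, encrypting each shard, and handing the shards out to peers. Below is a minimal sketch of the idea; the shard size, peer names, and use of Fernet symmetric encryption are illustrative assumptions, not Nidum.AI's actual implementation.

```python
# Hypothetical sketch: split a fine-tuned checkpoint into encrypted shards
# and assign each shard to a peer. Shard size, peer addressing, and the use
# of Fernet are assumptions for illustration, not Nidum.AI's actual design.
from pathlib import Path
from cryptography.fernet import Fernet

SHARD_SIZE = 64 * 1024 * 1024          # 64 MB per shard (assumption)
key = Fernet.generate_key()            # one symmetric key for the whole model
cipher = Fernet(key)

def shard_and_encrypt(checkpoint: Path, out_dir: Path) -> list[Path]:
    """Split the checkpoint into fixed-size chunks and encrypt each one."""
    out_dir.mkdir(parents=True, exist_ok=True)
    shards = []
    with checkpoint.open("rb") as f:
        index = 0
        while chunk := f.read(SHARD_SIZE):
            shard_path = out_dir / f"shard_{index:05d}.bin"
            shard_path.write_bytes(cipher.encrypt(chunk))
            shards.append(shard_path)
            index += 1
    return shards

def assign_to_peers(shards: list[Path], peers: list[str]) -> dict[str, list[Path]]:
    """Round-robin shards across peers; a real network would also replicate them."""
    assignment = {peer: [] for peer in peers}
    for i, shard in enumerate(shards):
        assignment[peers[i % len(peers)]].append(shard)
    return assignment

if __name__ == "__main__":
    shards = shard_and_encrypt(Path("finetuned-model.bin"), Path("shards"))
    plan = assign_to_peers(shards, peers=["mac-001.local", "mac-002.local"])
    print({peer: len(paths) for peer, paths in plan.items()})
```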

To showcase the flexibility of P2P technology, Reddy is about to host what he describes as the largest decentralised AI event later this week, where hundreds of Apple computers will run Llama 3.1 over the P2P network. The idea is to demonstrate the viability of decentralised networks for running LLMs. 

The Promise of Peer-to-Peer Networks

P2P networks, popularised by file-sharing systems like BitTorrent, distribute tasks across multiple nodes, each contributing a portion of the overall workload. 

Applying this concept to AI, a P2P network could theoretically distribute the training of an LLM across numerous consumer-grade GPUs, making it possible for individuals and smaller organisations to participate in AI development.

A research paper titled ‘A Peer-to-Peer Decentralised Large Language Models’ discusses a provably guaranteed federated learning (FL) algorithm designed for training adversarial deep neural networks, highlighting the potential of decentralised approaches for LLMs.

A study by Robert Šajina et al. explored multi-task peer-to-peer learning using an encoder-only Transformer model. This approach demonstrated that collaborative training in a P2P network could effectively handle multiple NLP tasks, highlighting the versatility of such systems.

Another significant contribution comes from Sree Bhargavi Balija and colleagues, who investigated building communication-efficient asynchronous P2P federated LLMs with blockchain technology. Their work emphasises the importance of minimising communication overhead and ensuring data integrity in decentralised networks.

But There Are Challenges…

Despite the promise, significant challenges hinder the practical implementation of P2P networks for LLMs. One major issue is the bandwidth and latency required for efficient training. 

Training LLMs involves transferring vast amounts of data between nodes, which can be prohibitively slow on consumer-grade networks. One Reddit user pointed out that even on a 10-gigabit network, the data transfer rates would be insufficient compared to the high-speed interconnects used in dedicated GPU clusters.
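A back-of-the-envelope calculation makes the gap concrete. Assuming 16-bit weights and ideal link utilisation (both simplifications), moving even a single copy of Llama 3.1 405B over consumer networking takes minutes, while a datacentre-class interconnect does it in about a second:

```python
# Rough bandwidth arithmetic (assumptions: 16-bit weights, ideal link utilisation).
params = 405e9                          # Llama 3.1 405B parameters
bytes_per_param = 2                     # fp16/bf16
model_bytes = params * bytes_per_param  # ~810 GB of weights

consumer_link = 10e9 / 8                # 10 Gbit/s Ethernet ≈ 1.25 GB/s
datacenter_link = 900e9                 # NVLink-class interconnect, ~900 GB/s (approximate)

print(f"Model size: {model_bytes / 1e9:.0f} GB")
print(f"One full transfer over 10 GbE:  {model_bytes / consumer_link / 60:.1f} minutes")
print(f"One full transfer over NVLink:  {model_bytes / datacenter_link:.1f} seconds")
```

And that is only a one-off transfer; training would require exchanging gradients of comparable size at every step.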

Moreover, the synchronisation required for distributed gradient descent, a common optimisation algorithm in training neural networks, adds another layer of complexity. 

Traditional training methods rely on tight synchronisation between nodes, which is difficult to achieve in a decentralised setting. 

A review paper on synchronous stochastic gradient descent (Sync-SGD) highlights the impact of stragglers and high latency on the efficiency of distributed training. 
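The straggler problem is easy to see in a toy simulation: in synchronous SGD, every global step waits for the slowest worker before averaging gradients. The sketch below simulates worker delays with sleeps; it illustrates the synchronisation barrier, not a real training loop.

```python
# Minimal sketch of synchronous SGD: every step blocks until the slowest
# worker (the straggler) has delivered its gradient. Delays are simulated.
import random
import time
from concurrent.futures import ThreadPoolExecutor

def worker_gradient(worker_id: int) -> float:
    time.sleep(random.uniform(0.01, 0.5))   # simulated compute + network latency
    return random.gauss(0.0, 1.0)           # stand-in for a real gradient

def sync_sgd_step(weights: float, num_workers: int, lr: float = 0.01) -> float:
    start = time.time()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        grads = list(pool.map(worker_gradient, range(num_workers)))  # the barrier
    print(f"step took {time.time() - start:.2f}s, bounded by the slowest worker")
    return weights - lr * (sum(grads) / num_workers)

weights = 0.0
for _ in range(3):
    weights = sync_sgd_step(weights, num_workers=8)
```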

… And Solutions

Despite these challenges, efforts to make decentralised AI a reality are ongoing. Projects like Petals and Hivemind are exploring ways to enable distributed inference and training of LLMs. 

Petals, for example, aims to facilitate the distributed inference of large models by allowing users to contribute their computational resources in exchange for access to the network’s collective AI capabilities.
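For reference, Petals exposes this through a Hugging Face-style interface. The snippet below follows its documented usage; the model name is only an example, and class names can shift between releases.

```python
# Sketch of distributed inference with Petals, based on its documented
# Hugging Face-style API; model name is an example, details may vary by release.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # example public model served by the Petals swarm
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)  # layers run on remote peers

inputs = tokenizer("Decentralised inference means", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```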

Additionally, the concept of federated learning offers a more feasible approach to decentralised AI. 

In federated learning, multiple nodes train a model on their local data and periodically share their updates with a central server, which aggregates the updates to improve the global model. 

This method preserves data privacy and reduces the need for extensive data transfer between nodes. It could also be a practical solution for decentralised AI, especially in privacy-sensitive applications like medical machine learning.
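At the heart of federated learning is the aggregation step, most commonly FedAvg: the server takes a weighted average of client updates, with weights proportional to each client's data size. A minimal sketch with toy NumPy weights follows; the client counts and the noise standing in for local training are made up for illustration.

```python
# Minimal FedAvg sketch: clients train locally, the server averages their
# weights in proportion to how much data each client holds.
import numpy as np

def local_update(global_weights: np.ndarray, client_data_size: int) -> np.ndarray:
    # Stand-in for local SGD on the client's private data.
    return global_weights + np.random.normal(0, 0.01, size=global_weights.shape)

def fed_avg(updates: list[np.ndarray], data_sizes: list[int]) -> np.ndarray:
    total = sum(data_sizes)
    return sum(w * (n / total) for w, n in zip(updates, data_sizes))

global_weights = np.zeros(10)
data_sizes = [1200, 300, 800]                 # examples per client (illustrative)
for round_id in range(5):
    updates = [local_update(global_weights, n) for n in data_sizes]
    global_weights = fed_avg(updates, data_sizes)
    print(f"round {round_id}: mean weight {global_weights.mean():.4f}")
```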



Sagar Sharma

A software engineer who loves to experiment with new-gen AI. He also happens to love testing hardware, and sometimes it crashes. While reviving his crashed system, you can find him reading literature, manga, or watering plants.