There has been a lot of buzz around the newest programming language, Bend. Discussion forums have been pitting it against CUDA, the go-to choice for experienced developers. However, with CUDA’s restrictions and worthy alternatives, Bend could be worth the excitement.
Bend is a high-level, massively parallel programming language designed to simplify parallel computing. Unlike traditional low-level languages like CUDA and Metal, Bend offers a Python-like syntax that makes parallel programming more accessible and easy to developers without deep expertise in concurrent programming.
“Bend automatically parallelises code, ensuring that any code that can run in parallel will do so without requiring explicit parallel annotations. As such, while Bend empowers developers with powerful parallel constructs, it maintains a clean and expressive syntax,” Vinay Konanur, VP – emerging technologies, UNext Learning, told AIM.
Why Not Cuda, Then?
One might wonder how it measures up against low-level languages like CUDA. While CUDA is a mature, low-level language that provides precise control over hardware, Bend aims to abstract away the complexities of parallel programming.
Bend is powered by HVM2 (Higher-Order Virtual Machine 2), a successor of HVM, letting you run high-level programming on massively parallel hardware, like GPUs, with near-ideal speedup.
A user mentioned that Bend is nowhere close to the performance of manually optimised CUDA. “It isn’t about peak performance,” he added.
Bend is based on the Rust foundation, which means you can expect top-notch performance through simple Python-like syntax. Konanur also revealed that Bend’s interoperability with Rust libraries and tools provides access to a rich ecosystem.
“Developers can leverage the existing Rust code and gradually transition to Bend,” said Konanur.
Moreover, he believes that the performance of a programming language on a specific GPU can depend on several factors, including the specific GPU, the nature of the task, and how well the task can be parallelized.
“So, even if Bend were to support AMD GPUs in the future, the performance could vary depending on these factors,” Konanur added.
Scalability and Parallelisation
Bend’s official documentation suggests that as long as the code isn’t “helplessly sequential”, Bend will use thousands of threads to run it in parallel. User demos have proved the same.
A recent demo showed a 57x speedup going from 1 CPU thread to 16,000 GPU threads on an NVIDIA RTX 4090. This is a perfect example of how Bend runs on massively parallel hardware like GPUs and provides near-linear speedup based on the number of cores available.
Focusing on parallelisation, Bend is not limited to any specific domain, like array operations. It can scale any concurrent algorithm that can be expressed using recursive data types and folds, from shaders to actor models.
Max Bernstein, a software developer, argues that Bend has different scaling laws compared to the traditional languages. While Bend may be slower than other languages in single-threaded performance, it can scale linearly with the number of cores for parallelisable workloads.
How about Other Programming Languages?
A Reddit user, when asked how different Bend is from CuPy or Numba, answered, “It massively reduces the amount of work you need to do in order to make your general purpose program parallelisable, whereas CuPy and Numba (as far as I know) only parallelise programmes that deal with multidimensional arrays.”
Further, users have also observed that Bend is not focused on giving you peak performance like the manually optimised CUDA code but rather on simplifying code execution by using Python/Haskell-like code on GPUs, which wasn’t possible earlier.
When you compare Bend with Mojo, a programming language that can be executed on GPUs and provides Python-like syntax, Bend focuses more on parallelism across all computations. Mojo is geared more towards traditional AI/ML workloads involving linear algebra.
But unlike Mojo, Bend is completely open-source which means users can take and modify the code as per their convenience. Also, they can contribute to the project as it ensures more transparency.