WebGPU: Unlocking modern GPU access in the browser

Learn how WebGPU unlocks the power of the GPU for faster machine learning performance and better graphics rendering.

This article is translated from “WebGPU: Unlocking modern GPU access in the browser”.

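The new WebGPU API unlocks massive performance gains in graphics and machine learning workloads. This article explores how WebGPU improves on the current solution, WebGL, and offers a sneak peek at future developments. But first, let’s provide some context for why WebGPU was developed.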

Context on WebGPU

WebGL landed in Chrome in 2011. By allowing web applications to take advantage of GPUs, WebGL enables amazing experiences on the web, from Google Earth to interactive music videos to 3D real-estate walkthroughs and more. WebGL was based on the OpenGL family of APIs, first developed in 1992. That’s a long time ago! As you can imagine, GPU hardware has evolved significantly since then.

To keep up with this evolution, a new breed of APIs was developed to interact more efficiently with modern GPU hardware: APIs like Direct3D 12, Metal, and Vulkan. These new APIs have supported new and demanding use cases for GPU programming, such as the explosion in machine learning and advances in rendering algorithms. WebGPU is the successor to WebGL, bringing the advancements of this new class of modern APIs to the web.

WebGPU unlocks a lot of new GPU programming possibilities in the browser. It better reflects how modern GPU hardware works, while also laying a foundation for more advanced GPU capabilities in the future. The API has been baking in the W3C’s “GPU for the Web” group since 2017, and is a collaboration between many companies such as Apple, Google, Mozilla, Microsoft, and Intel. Now, after six years of work, we’re excited to announce that one of the biggest additions to the web platform is finally available!

WebGPU is available today in Chrome 113 on ChromeOS, macOS, and Windows, with other platforms coming soon. A huge thank you to the other Chromium contributors, and Intel in particular, who helped make this happen.

Now let’s take a look at some of the exciting use cases WebGPU enables.

Unlocking new GPU workloads for rendering

WebGPU features such as compute shaders enable new classes of algorithms to be ported to the GPU. For example, algorithms that add more dynamic detail to scenes, simulate physical phenomena, and more! There are even workloads that previously could only be done in JavaScript that can now be moved to the GPU.

The following video shows the marching cubes algorithm being used to triangulate the surface of these metaballs. In the first 20 seconds of the video, the algorithm runs in JavaScript and struggles to keep up: the page runs at only 8 FPS, resulting in janky animation. To keep it performant in JavaScript, we would need to lower the level of detail a lot.

It’s a night and day difference when we move the same algorithm to a compute shader, as seen in the video after the 20-second mark. Performance improves dramatically: the page now runs at a smooth 60 FPS, and there’s still a lot of performance headroom for other effects. In addition, the page’s main JavaScript loop is completely freed up for other tasks, ensuring that interactions with the page stay responsive.

WebGPU also enables complex visual effects that were not practical before. In the following example, created with the popular Babylon.js library, the ocean surface is simulated entirely on the GPU. The realistic dynamics come from many independent waves being added to each other. But simulating each wave directly would be too expensive.

That’s why the demo uses an advanced algorithm called the Fast Fourier Transform. Instead of representing all the waves as complex positional data, it uses spectral data, which is much more efficient to compute with. Then each frame uses the Fourier Transform to convert from spectral data to the positional data that represents the height of the waves.
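To make the spectral-to-positional step concrete, here is a minimal CPU sketch in JavaScript. It is not the demo's implementation, which runs on the GPU and is not shown in this article. It uses a naive inverse discrete Fourier transform that costs O(N²); a Fast Fourier Transform produces the same output in O(N log N). The function name `spectrumToHeights` and the one-dimensional layout are assumptions made for illustration.

```javascript
// Naive inverse DFT: converts N complex spectral coefficients (re[k], im[k])
// into N real wave heights. An FFT computes the same result much faster.
function spectrumToHeights(re, im) {
  const n = re.length;
  const heights = new Float64Array(n);
  for (let x = 0; x < n; x++) {
    let sum = 0;
    for (let k = 0; k < n; k++) {
      const angle = (2 * Math.PI * k * x) / n;
      // Real part of (re[k] + i * im[k]) * e^(i * angle).
      sum += re[k] * Math.cos(angle) - im[k] * Math.sin(angle);
    }
    heights[x] = sum / n;
  }
  return heights;
}
```

For a real-valued height field the spectrum must be conjugate-symmetric, which is why a single wave appears as a pair of coefficients (for example, at indices 1 and N - 1).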

Faster ML inference

WebGPU is also useful for accelerating machine learning, which has become a major use of GPUs in recent years.

For a long time, creative developers have been repurposing WebGL’s rendering API to perform non-rendering operations such as machine learning computations. However, this requires drawing the pixels of triangles as a way to initiate the computations, and carefully packing and unpacking tensor data in textures instead of using more general-purpose memory accesses.

Using WebGL in this way requires developers to awkwardly conform their code to the expectations of an API designed only for drawing. Combined with the lack of basic features like shared memory access between computations, this leads to duplicate work and suboptimal performance.

Compute shaders are WebGPU’s primary new feature and remove these pain points. Compute shaders offer a more flexible programming model that takes advantage of the GPU’s massively parallel nature while not being constrained by the strict structure of rendering operations.
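As a rough illustration of this programming model, the sketch below doubles every element of an array on the GPU. The data lives in a plain storage buffer rather than being packed into a texture, and no triangles are drawn. The helper names `workgroupCount` and `doubleOnGpu` and the workgroup size of 64 are assumptions made for this example; `device` is a `GPUDevice` obtained from `requestDevice()`.

```javascript
// WGSL compute shader: each invocation doubles one element of a storage buffer.
const doubleShader = `
@group(0) @binding(0) var<storage, read_write> data : array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id : vec3u) {
  // Skip the padding invocations in the last workgroup.
  if (id.x < arrayLength(&data)) {
    data[id.x] = data[id.x] * 2.0;
  }
}`;

// Number of workgroups needed to cover `count` elements.
function workgroupCount(count, workgroupSize = 64) {
  return Math.ceil(count / workgroupSize);
}

// Hypothetical helper: doubles `input` (a Float32Array) on the GPU.
async function doubleOnGpu(device, input) {
  // Storage buffer the shader reads and writes.
  const storage = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(storage, 0, input);

  // Separate buffer that can be mapped for reading on the CPU.
  const readback = device.createBuffer({
    size: input.byteLength,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: {
      module: device.createShaderModule({ code: doubleShader }),
      entryPoint: "main",
    },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(workgroupCount(input.length));
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
  device.queue.submit([encoder.finish()]);

  // Wait for the GPU, then copy the result out before unmapping.
  await readback.mapAsync(GPUMapMode.READ);
  const result = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return result;
}
```

In a browser with WebGPU, `await doubleOnGpu(device, new Float32Array([1, 2, 3]))` would be expected to resolve to a `Float32Array` holding the doubled values.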

Compute shaders give more opportunity for sharing data and computation results within groups of shader work for better efficiency. This can lead to significant gains over previous attempts to use WebGL for the same purpose.
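One way such sharing can look in practice is a per-workgroup sum. In the WGSL sketch below, the 64 invocations of a workgroup load their values into `var<workgroup>` memory and combine them in a tree reduction, synchronizing with `workgroupBarrier()`. The shader, its binding layout, and the JavaScript function `partialSumsCpu` are illustrative assumptions; the CPU function only mirrors what each workgroup computes so the idea can be checked without a GPU.

```javascript
// WGSL: each workgroup of 64 invocations sums 64 inputs via shared memory.
const reduceShader = `
@group(0) @binding(0) var<storage, read> input : array<f32>;
@group(0) @binding(1) var<storage, read_write> partialSums : array<f32>;

// Memory shared by all invocations of one workgroup.
var<workgroup> scratch : array<f32, 64>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) gid : vec3u,
        @builtin(local_invocation_id) lid : vec3u,
        @builtin(workgroup_id) wid : vec3u) {
  // Load one value per invocation, padding with zero past the end.
  if (gid.x < arrayLength(&input)) {
    scratch[lid.x] = input[gid.x];
  } else {
    scratch[lid.x] = 0.0;
  }
  workgroupBarrier();

  // Tree reduction: halve the number of active invocations each step.
  for (var stride = 32u; stride > 0u; stride = stride / 2u) {
    if (lid.x < stride) {
      scratch[lid.x] = scratch[lid.x] + scratch[lid.x + stride];
    }
    workgroupBarrier();
  }

  // The first invocation writes the workgroup's total.
  if (lid.x == 0u) {
    partialSums[wid.x] = scratch[0];
  }
}`;

// CPU reference that mirrors the shader: one partial sum per workgroup.
function partialSumsCpu(input, workgroupSize = 64) {
  const sums = [];
  for (let start = 0; start < input.length; start += workgroupSize) {
    // "Shared memory" for this workgroup, zero-padded.
    const scratch = new Float32Array(workgroupSize);
    scratch.set(input.slice(start, start + workgroupSize));
    for (let stride = workgroupSize >> 1; stride > 0; stride >>= 1) {
      for (let i = 0; i < stride; i++) {
        scratch[i] += scratch[i + stride];
      }
    }
    sums.push(scratch[0]);
  }
  return sums;
}
```

The design point is that intermediate results stay in fast workgroup memory instead of being written to and re-read from a texture between passes.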

As an example of the efficiency gains this can bring, an initial port of an image diffusion model in TensorFlow.js shows a 3x performance gain on a variety of hardware when moved from WebGL to WebGPU. On some of the hardware tested, the image was rendered in under 10 seconds. And because this was an early port, we believe there are even more improvements possible in both WebGPU and TensorFlow.js! Check out the Google I/O session “What’s new with Web ML in 2023?”.

But WebGPU is not only about bringing GPU features to the web.

Designed for JavaScript first

The features that enable these use cases have been available to platform-specific desktop and mobile developers for a while, and it’s been our challenge to expose them in a way that feels like a natural part of the web platform.

WebGPU was developed with the benefit of hindsight from over a decade of developers doing amazing work with WebGL. We were able to take the problems they encountered, the bottlenecks they hit, and the issues they raised, and funnel all of that feedback into this new API.

We saw that WebGL’s global state model made creating robust, composable libraries and applications difficult and fragile. So WebGPU dramatically reduces the amount of state that developers need to keep track of while sending commands to the GPU.

We heard that debugging WebGL applications was a pain, so WebGPU includes more flexible error handling mechanisms that don’t tank your performance. And we’ve gone out of our way to ensure that every message you get back from the API is easy to understand and actionable.
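One such mechanism in the WebGPU API is the error scope: `pushErrorScope()` starts capturing errors of a given type, and `popErrorScope()` returns a promise for the first captured error, or `null` if there was none. The sketch below wraps that pair in a helper; the name `withValidationScope` and the shape of its return value are assumptions made for illustration.

```javascript
// Runs `createResources(device)` inside a validation error scope.
// Resolves to { result, error }; `error` is null when nothing was invalid.
async function withValidationScope(device, createResources) {
  device.pushErrorScope("validation");
  const result = createResources(device);
  // Resolves asynchronously, so the calls above are not blocked on the GPU.
  const error = await device.popErrorScope();
  return { result, error };
}
```

A caller might wrap a `device.createRenderPipeline(...)` call this way and log `error.message` when `error` is not `null`.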

We also saw that the overhead of making too many JavaScript calls was frequently a bottleneck for complex WebGL applications. As a result, the WebGPU API is less chatty, so you can accomplish more with fewer function calls. We focus on performing heavyweight validation up front, keeping the critical draw loop as lean as possible. And we offer new APIs like Render Bundles, which allow you to record large numbers of drawing commands in advance and replay them with a single call.
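A minimal sketch of the record-once, replay-many pattern follows. It assumes a `device`, a canvas `context`, a render `pipeline`, and a canvas `format` set up as usual; the helper names `recordBundle` and `drawFrame` are hypothetical.

```javascript
// Records draw commands once into a reusable GPURenderBundle.
function recordBundle(device, pipeline, format, vertexCount) {
  // The bundle must declare the attachment formats it will be used with.
  const encoder = device.createRenderBundleEncoder({ colorFormats: [format] });
  encoder.setPipeline(pipeline);
  encoder.draw(vertexCount);
  return encoder.finish();
}

// Replays the recorded commands inside a render pass with a single call.
function drawFrame(device, context, bundle) {
  const commandEncoder = device.createCommandEncoder();
  const pass = commandEncoder.beginRenderPass({
    colorAttachments: [
      {
        view: context.getCurrentTexture().createView(),
        loadOp: "clear",
        storeOp: "store",
      },
    ],
  });
  pass.executeBundles([bundle]);
  pass.end();
  device.queue.submit([commandEncoder.finish()]);
}
```

The bundle is recorded once at setup time, and `drawFrame` can then be called every frame. The benefit grows with the number of commands recorded, since validation and JavaScript call overhead are paid at record time rather than on every frame.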

To demonstrate what a dramatic difference a feature like render bundles can make, here’s another demo from Babylon.js. Their WebGL 2 renderer can execute all the JavaScript calls to render this art gallery scene about 500 times a second. Which is pretty good!

Their WebGPU renderer, however, enables a feature they call Snapshot Rendering. Built on top of WebGPU’s render bundles, this feature allows the same scene to be submitted more than 10x faster. This significantly reduced overhead allows WebGPU to render more complex scenes, while also allowing applications to do more with JavaScript in parallel.

Modern graphics APIs have a reputation for complexity, trading simplicity for extreme optimization opportunities. WebGPU, on the other hand, is focused on cross-platform compatibility, handling traditionally difficult topics like resource synchronization automatically in most cases.

This has the happy side effect that WebGPU is easy to learn and use. It relies on existing features of the web platform for things like image and video loading, and leans into well-known JavaScript patterns like Promises for asynchronous operations. This helps keep the amount of boilerplate code needed to a minimum. You can get your first triangle on-screen in under 50 lines of code.

<canvas id="canvas" width="512" height="512"></canvas>
<script type="module">
  // Request a GPU adapter and a logical device (both calls return Promises).
  const adapter = await navigator.gpu.requestAdapter();
  const device = await adapter.requestDevice();

  // Configure the canvas so WebGPU can render into it.
  const context = canvas.getContext("webgpu");
  const format = navigator.gpu.getPreferredCanvasFormat();
  context.configure({ device, format });

  // WGSL shaders: three hard-coded vertices and a solid red fragment color.
  const code = `
    @vertex fn vertexMain(@builtin(vertex_index) i : u32) ->
      @builtin(position) vec4f {
        const pos = array(vec2f(0, 1), vec2f(-1, -1), vec2f(1, -1));
        return vec4f(pos[i], 0, 1);
    }
    @fragment fn fragmentMain() -> @location(0) vec4f {
      return vec4f(1, 0, 0, 1);
    }`;
  const shaderModule = device.createShaderModule({ code });
  const pipeline = device.createRenderPipeline({
    layout: "auto",
    vertex: {
      module: shaderModule,
      entryPoint: "vertexMain",
    },
    fragment: {
      module: shaderModule,
      entryPoint: "fragmentMain",
      targets: [{ format }],
    },
  });

  // Record one render pass that draws the triangle, then submit it.
  const commandEncoder = device.createCommandEncoder();
  const colorAttachments = [
    {
      view: context.getCurrentTexture().createView(),
      loadOp: "clear",
      storeOp: "store",
    },
  ];
  const passEncoder = commandEncoder.beginRenderPass({ colorAttachments });
  passEncoder.setPipeline(pipeline);
  passEncoder.draw(3);
  passEncoder.end();
  device.queue.submit([commandEncoder.finish()]);
</script>
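The listing above assumes that WebGPU is available. As a minimal sketch of a more defensive setup, the helper below checks for `navigator.gpu`, which is undefined in browsers without WebGPU, and for a `null` adapter, which `requestAdapter()` can resolve to when no suitable GPU is found. The helper name `initWebGPU` and its error messages are hypothetical.

```javascript
// Resolves to { adapter, device }, or rejects with a descriptive error.
// `gpu` defaults to navigator.gpu and can be overridden, e.g. for testing.
async function initWebGPU(gpu = globalThis.navigator?.gpu) {
  if (!gpu) {
    throw new Error("WebGPU is not supported in this browser.");
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    throw new Error("No suitable GPU adapter was found.");
  }
  const device = await adapter.requestDevice();
  return { adapter, device };
}
```

In a module script this could replace the first two lines of the listing with `const { device } = await initWebGPU();`, wrapped in `try`/`catch` to show a fallback message.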

Summary

It’s exciting to see all the new possibilities that WebGPU brings to the web platform, and we’re looking forward to seeing all the cool new use cases that you will find for WebGPU!

A vibrant ecosystem of libraries and frameworks has been built around WebGL, and that same ecosystem is eager to embrace WebGPU. Support for WebGPU is in progress or already complete in many popular JavaScript WebGL libraries, and in some cases taking advantage of the benefits of WebGPU might be as simple as changing a single flag!

And this first release in Chrome 113 is just a start. While our initial release is for Windows, ChromeOS, and macOS, we plan to bring WebGPU to the remaining platforms like Android and Linux in the near future.

And it’s not just the Chrome team that’s been working on launching WebGPU. Implementations are also in progress in Firefox and WebKit.

Additionally, new features are already being designed at the W3C that can be exposed when available in hardware. For example, in Chrome we plan to soon enable support for 16-bit floating point numbers in shaders and the DP4 class of instructions, for even more machine learning performance improvements.
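As a rough sketch of how an application might opt in to such a capability, WebGPU exposes optional features through `adapter.features`, and they must be requested explicitly when creating the device. The example assumes the feature name `shader-f16` from the WebGPU specification; the helper name is hypothetical, and whether the feature is present depends on the browser version and hardware.

```javascript
// Requests a device with 16-bit float shader support when the adapter offers it.
async function requestDeviceWithOptionalF16(adapter) {
  const hasF16 = adapter.features.has("shader-f16");
  const device = await adapter.requestDevice({
    // Optional features are only usable if listed here.
    requiredFeatures: hasF16 ? ["shader-f16"] : [],
  });
  return { device, hasF16 };
}
```

When `hasF16` is true, a WGSL shader can start with `enable f16;` and use `f16` types; otherwise the application would fall back to an `f32` version of the shader.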

WebGPU is an extensive API that unlocks amazing performance if you invest in it. Today we could only cover its benefits at a high level, but if you’d like to get a hands-on start with WebGPU, please check out our introductory Codelab, “Your first WebGPU app”, where you’ll build a GPU version of the classic Conway’s Game of Life. This codelab will walk you through the process step by step, so you can try it out even if it’s your first time doing GPU development.

The WebGPU samples are also a good place to get a feel for the API. They range from the traditional “hello triangle” to more complete rendering and compute pipelines, demonstrating a variety of techniques. Finally, check out our other resources.

Author: 1uciuszzz

Published: 2023-05-11

Updated: 2023-05-11
