For folks curious about WebGL in a Vulkan world, there's https://github.com/Khro...

kllrnohj · on Aug 25, 2017

Not to be overly harsh about their work but that just screams "me too!" instead of solving an actual problem.

Vulkan is a low-level, highly verbose API designed to extract maximum performance and leverage multiple threads to deal with slower aspects of rendering. Then you chuck that into a single-threaded, slow (relatively) runtime where method calls to the actual implementation are particularly expensive (JavaScript -> native transitions ain't cheap), and that's supposed to be a good idea? Why? Who is supposed to use this for anything useful?

Instead of just porting native features to the web for the sake of porting them, make it easier to make use of the hardware's capabilities. Let me throw a GPU shader into a CSS3 transition animation or something like that, for example. That'd be cool and potentially useful, instead of a system where you can port some game to the web for the sake of porting it to the web, where nobody will ever use it because it sucks compared to the vastly superior native version of the game.

pcwalton · on Aug 26, 2017

> Then you chuck that into a single-threaded, slow (relatively) runtime

Slow relative to C++, sure. But JS is very fast relative to pretty much any other widely used dynamic language. If we should expose Vulkan bindings to dynamic languages (and why not?) JS is the obvious first choice for a target, given its speed and popularity.

> JavaScript -> native transitions ain't cheap

They're actually very cheap nowadays, because so many benchmarks stress the DOM and C++-implemented builtins. Driver calls, even with Vulkan, far exceed the cost of JS-to-native transitions. glDrawElements() is probably at least 100x slower than a JS-to-native call, the latter of which has latencies measured in nanoseconds.

> Let me throw a GPU shader into a CSS3 transition animation or something like that, for example.

Please, let's not. This will neither be good for designers (modern GPU programming has an enormous learning curve compared to CSS) nor browser developers (shaders would break batching and would expose too many engine-specific internal details).

kllrnohj · on Aug 26, 2017

> If we should expose Vulkan bindings to dynamic languages (and why not?) JS is the obvious first choice for a target, given its speed and popularity.

Exposing it to JS and exposing it to web pages in a browser are two completely different things.

> They're actually very cheap nowadays, because so many benchmarks stress the DOM and C++-implemented builtins. Driver calls, even with Vulkan, far exceed the cost of JS-to-native transitions. glDrawElements() is probably at least 100x slower than a JS-to-native call, the latter of which has latencies measured in nanoseconds.

glDrawElements in Vulkan is a couple hundred lines of API calls. I don't think you're fully grokking the orders of magnitude of verbosity that Vulkan brings.

As for the transition cost there's varying levels of cheap, but at the end of the day it's not great. It's raw per-call overhead on an API designed around making an obscene amount of calls, and the overhead is significant. Jumping between these worlds is just not something you want to do very frequently if your goal is performance.

C++ builtins are a completely different class of problem, as intrinsics get to play by their own compiler rules, unlike regular JS -> native bindings.

> Please, let's not. This will neither be good for designers (modern GPU programming has an enormous learning curve compared to CSS) nor browser developers (shaders would break batching and would expose too many engine-specific internal details).

So simple pixel shaders are too complex, but Vulkan is not?

If a pixel/fragment shader is too much, then so is the entirety of WebGL, and webvulkan would just be pure insanity from that perspective.

Also it doesn't break batching at all, I have no idea what you're talking about there. A pixel shader is just a function with a set of inputs and a color output. It's really quite simple, easily emulated for non-GPU fallbacks, and easily manipulated by the browser.
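To make the "function with a set of inputs and a color output" framing concrete, here is a minimal sketch of a CPU fallback that runs such a function once per pixel. The `FragmentShader` signature, the `FadeUniforms` type, and `renderOnCpu` are hypothetical names invented for illustration; no browser exposes this API, and a real shader language covers far more than this.

```typescript
// RGBA color with each channel in [0, 1].
type Color = [number, number, number, number];

// Hypothetical shader signature: normalized pixel coordinates plus
// caller-supplied uniforms in, one color out.
type FragmentShader<U> = (u: number, v: number, uniforms: U) => Color;

// CPU fallback: run the shader once per pixel and pack the result as 8-bit RGBA.
function renderOnCpu<U>(
  width: number,
  height: number,
  shader: FragmentShader<U>,
  uniforms: U,
): Uint8ClampedArray {
  const pixels = new Uint8ClampedArray(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Sample at the pixel centre.
      const [r, g, b, a] = shader((x + 0.5) / width, (y + 0.5) / height, uniforms);
      const base = (y * width + x) * 4;
      pixels[base] = Math.round(r * 255);
      pixels[base + 1] = Math.round(g * 255);
      pixels[base + 2] = Math.round(b * 255);
      pixels[base + 3] = Math.round(a * 255);
    }
  }
  return pixels;
}

// Hypothetical uniforms for the example shader below.
interface FadeUniforms {
  tint: Color;
}

// Example shader: horizontal fade from black to a tint color.
const fade: FragmentShader<FadeUniforms> = (u, _v, uniforms) => [
  uniforms.tint[0] * u,
  uniforms.tint[1] * u,
  uniforms.tint[2] * u,
  1,
];

const fadeUniforms: FadeUniforms = { tint: [1, 0, 0, 1] };
const image = renderOnCpu(4, 1, fade, fadeUniforms);
console.log(Array.from(image));
```

The sketch only covers the "emulated for non-GPU fallbacks" part of the claim; it says nothing about how a browser would batch or schedule such shaders.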

pcwalton · on Aug 26, 2017

> Exposing it to JS and exposing it to web pages in a browser are two completely different things.

It would make little sense to expose Vulkan to JS and not to put those bindings in a Web browser.

> glDrawElements in vulkan is a couple hundred lines of API calls. I don't think you're fully grokking the orders of magnitude of verbosity that Vulkan brings.

> As for the transition cost there's varying levels of cheap, but at the end of the day it's not great. It's raw per-call overhead on an API designed around making an obscene amount of calls, and the overhead is significant. Jumping between these worlds is just not something you want to do very frequently if your goal is performance.

It doesn't matter whether it's a couple hundred calls or not. The overhead, which again is measured in nanoseconds, really does not matter. Vulkan's performance demands on API boundary transitions are no worse than that of the DOM, which has been optimized for decades.
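The disagreement here is largely about arithmetic, so a back-of-the-envelope sketch may help. Every number below (50 ns per transition, 60 frames per second, the call counts) is an illustrative assumption, not a measurement from any engine or driver.

```typescript
// All figures are illustrative assumptions, not measurements.
const NS_PER_MS = 1e6;

// Fraction of one frame's time budget spent purely on crossing the
// JS-to-native boundary, ignoring the work done on the native side.
function boundaryOverheadFraction(
  callsPerFrame: number,
  overheadPerCallNs: number,
  framesPerSecond: number,
): number {
  const frameBudgetNs = (1000 / framesPerSecond) * NS_PER_MS;
  return (callsPerFrame * overheadPerCallNs) / frameBudgetNs;
}

// Assumed: 50 ns per transition at 60 frames per second.
const light = boundaryOverheadFraction(200, 50, 60);
const heavy = boundaryOverheadFraction(100000, 50, 60);
console.log(`200 calls/frame: ${(light * 100).toFixed(3)}% of the frame`);
console.log(`100000 calls/frame: ${(heavy * 100).toFixed(1)}% of the frame`);
```

Under these assumed figures, a couple hundred calls cost about 0.06% of a frame, while a hundred thousand calls cost about 30%. Which regime a real workload falls into depends on the actual per-call cost and call volume, which is the point in dispute.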

> C++-builtins are a completely different class of problems as intrinsics get to play by their own compiler rules than regular JS -> native bindings.

No, they don't. They are one and the same in many JS engines. (I can't speak for all engines, but I'm familiar with SpiderMonkey, where a JSNative is a JSNative, whether a builtin or a DOM method.) SpiderMonkey nowadays even knows about things like purity of various DOM methods and will optimize accordingly. (bz used this to make Dromaeo really fast.)

> So simple pixel shaders are too complex, but vulkan is not?

Fragment shaders are too complex for CSS. They aren't too complex for programmatic manipulation in JS. The reason is simple: CSS is a high-level declarative language intended to be accessible to designers, while JS is an imperative language mainly used by programmers.

> Also it doesn't break batching at all, I have no idea what you're talking about there.

As you know, switching shader programs can only be done in between draw calls.
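A small sketch of why that matters for batching, assuming a toy model where a page is an ordered list of display items and each run of consecutive items sharing a shader becomes one draw call. `DisplayItem`, `DrawCall`, and `toDrawCalls` are hypothetical names; real engines are considerably more involved.

```typescript
// Hypothetical display item: something the engine paints with a given shader.
interface DisplayItem {
  id: string;
  shader: string;
}

// One draw call covers a run of consecutive items that share a shader.
interface DrawCall {
  shader: string;
  items: DisplayItem[];
}

// Paint order is preserved, so items are never reordered to merge runs.
function toDrawCalls(items: DisplayItem[]): DrawCall[] {
  const calls: DrawCall[] = [];
  for (const item of items) {
    const last = calls[calls.length - 1];
    if (last !== undefined && last.shader === item.shader) {
      last.items.push(item);
    } else {
      calls.push({ shader: item.shader, items: [item] });
    }
  }
  return calls;
}

// Every item uses the engine's built-in shader: one batch.
const uniformPage = toDrawCalls([
  { id: "a", shader: "builtin" },
  { id: "b", shader: "builtin" },
  { id: "c", shader: "builtin" },
  { id: "d", shader: "builtin" },
]);

// One author-supplied shader in the middle splits the batch in three.
const mixedPage = toDrawCalls([
  { id: "a", shader: "builtin" },
  { id: "b", shader: "custom" },
  { id: "c", shader: "builtin" },
  { id: "d", shader: "builtin" },
]);

console.log(uniformPage.length, mixedPage.length);
```

In this toy model a single custom shader turns one draw call into three; whether that cost is acceptable in practice is a separate question.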

> A pixel shader is just a function with a set of inputs and a color output.

With hundreds of pages of specification describing how all the different operations that that function can perform must behave.

Impossible · on Aug 26, 2017

You just stumbled into an argument the game industry has been dealing with for years: whether programmers or artists should own shader code. This boils down to whether or not you implement a visual node-based shader editor in your game engine. While I agree that expecting designers to know "modern GPU programming" is extreme, exposing fragment shader style functionality isn't a bad idea, if properly implemented. I also don't think it'd be super difficult for designers to grok.

pcwalton · on Aug 26, 2017

> exposing fragment shader style functionality isn't a bad idea, if properly implemented.

I agree that functionality like fragment shaders is useful, as long as it's declarative and fits in with the rest of CSS. In fact, we already have it: the CSS filter property.

pandaman · on Aug 26, 2017

I don't think glDrawElements() goes into the driver though (at least there is no sane reason for it I can imagine, you only need to go into the driver to kick a command buffer at the end of frame).

Nevertheless, even if some implementations actually do call driver, the GP's point is that the whole point of Vulkan is getting rid of global state to allow parallel command buffer creation. JS is single threaded so, no matter how cheap or expensive the calls are, you won't be able to take advantage of Vulkan since you are running just one thread.

nhaehnle · on Aug 26, 2017

Of course glDrawElements() has to go into the driver, because it needs to do hardware-specific work. This is obviously true in OpenGL(ES), but even if you were to implement it via a translation layer to Vulkan, you still have to call the various Vulkan-equivalent commands, at the very least vkCmdDrawIndexed. Here's an implementation to make it painfully obvious why that has to involve the driver: https://github.com/mesa3d/mesa/blob/d819b1fcec02be5e0cfc87b6...

pandaman · on Aug 26, 2017

No, you only need to go into the driver when you need to do work that cannot be done in userland. What you are talking about is just a shared library in the address space of your app, and the code you are showing is literally just writing bytes into a buffer.

Nobody forbids you from calling it "driver", of course, but then the whole point of "going into the driver" does not make sense, since there is no syscall and it's just a regular function.
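A toy model of the distinction being drawn, assuming a made-up command format: recording a draw is an ordinary function call that writes words into memory the process already owns, and only submission hands the buffer to something standing in for the kernel. `CommandRecorder`, `OP_DRAW_INDEXED`, and the `kernelSubmit` callback are all hypothetical; real command streams are vendor-specific.

```typescript
// Hypothetical opcode value; real hardware command formats are vendor-specific.
const OP_DRAW_INDEXED = 1;

class CommandRecorder {
  private readonly words: Uint32Array;
  private used = 0;
  submissions = 0;

  constructor(capacityWords: number) {
    this.words = new Uint32Array(capacityWords);
  }

  // Recording is a plain function call that writes into memory we already own.
  drawIndexed(indexCount: number, firstIndex: number): void {
    if (this.used + 3 > this.words.length) {
      throw new RangeError("command buffer full");
    }
    this.words[this.used++] = OP_DRAW_INDEXED;
    this.words[this.used++] = indexCount;
    this.words[this.used++] = firstIndex;
  }

  // Submission stands in for the one point that would cross into the kernel.
  submit(kernelSubmit: (commands: Uint32Array) => void): void {
    kernelSubmit(this.words.subarray(0, this.used));
    this.submissions++;
    this.used = 0;
  }
}

const recorder = new CommandRecorder(4096);
let wordsSeenByKernel = 0;
for (let i = 0; i < 1000; i++) {
  recorder.drawIndexed(6, i * 6);
}
recorder.submit((commands) => {
  wordsSeenByKernel = commands.length;
});
console.log(recorder.submissions, wordsSeenByKernel);
```

The model shows a thousand recorded draws reaching the stand-in kernel boundary once. It deliberately omits the state validation and translation work a real user-mode driver does per call, which is the cost the other side of the argument is pointing at.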

nhaehnle · on Aug 26, 2017

You're thinking of kernel-mode drivers.

What I've linked to is a driver. Everybody calls it that.

When you go download a driver for graphics cards, whether on Linux or Windows, that driver actually consists of multiple components, some of them running in kernel-mode and some of them running in user-mode. It's basically the exo-kernel principle, but without feeling the need of giving it a fancy name :)

There's a broader history of user-mode drivers not just for graphics, and not just in the obvious case of micro-kernels. User-mode USB drivers used to be a thing, for example (and I guess they still are for some more obscure hardware).

pandaman · on Aug 27, 2017

As I said, you can call it whatever you want, "driver", "kernel" or "linux" even. The point of "going into the driver" being expensive only makes sense if it's a syscall, which it is not as we both seem to agree.

crzwdjk · on Aug 27, 2017

Not really. OpenGL in particular has to do a surprising amount of work on every call: making sure the state hasn't changed and updating all sorts of things if it has, managing the various buffers and making sure they're mapped in the right place, and then going down through all the abstraction layers until you end up in the code that actually writes stuff into the command buffer. It's not going to be a syscall level of overhead, though it may end up being that if stuff needs to get mapped into the GPU address space, but it's definitely going to be more than the dozen-instruction overhead of going from JITed to native C++ code.

pandaman · on Aug 27, 2017

"Not really" what? There is syscall? Then you say yourself there is not... I only argue that there is no syscall in glDraw* as well as the vast majority of the APIs. Sure, driver/opengl do whatever and some calls are more expensive than others but adding more overhead is not going to make it any better and it's already pretty bad without overhead. That's why they developed Vulkan in the first place.

nhaehnle · on Aug 27, 2017

You know, you're talking to somebody who writes graphics drivers for a living :)

If you don't believe me or crzwdjk, just go ahead and actually profile a system running an OpenGL application. The syscall overhead -- as in, the overhead of transitioning between user and kernel mode -- is laughably negligible compared to everything else. Also, the vast majority of driver CPU time is spent in user space building up command submissions. The final command submission itself isn't free of course, but clearly more time is spent processing precisely those glDraw*() calls that you seem to think don't matter.

pandaman · on Aug 27, 2017

> You know, you're talking to somebody who writes graphics drivers for a living :)

That's great. Why do you think you are the only one? And what should I believe exactly here? That there is a syscall in every OpenGL API? If you are writing drivers you know it's not true yourself. The syscall overhead is not laughable, it's tens of thousands of clocks.

>Also, the vast majority of driver CPU time is spent in user space building up command submissions.

Exactly. OpenGL system (if you want to call it "driver" be my guest, DirectX does not do that for example, neither do other APIs) works mostly in the user space.

> The final command submission itself isn't free of course, but clearly more time is spent processing precisely those glDraw*() calls that you seem to think don't matter.

??? I don't even know what you are arguing here. Let's rewind. Someone said that "driver calls" are expensive. And it's true for people who understand drivers as a part of the OS, not "user mode drivers", which are just shared libs. I corrected that, saying that there is no actual driver call in the sense that people understand, i.e. there is no OS call or "syscall", since the OpenGL "driver" is mostly a shared library in user space. You seem to agree with me. Now, I am well aware that some calls are expensive. I even know why. Some are not, though. On some, the overhead of moving data from a managed language to the GPU will be much greater than the call itself, e.g. setting an index buffer.

It still does not make it true that there are syscalls in the OpenGL calls anyway.

kllrnohj · on Aug 28, 2017

The userspace part of the driver is still called the driver.

A driver does not mean a kernel module. It's often that, but it does not exclusively mean that. Userspace drivers are still drivers.

The library that gets loaded into the process is part of the driver. It's provided by the GPU vendor and it's specific to the hardware you're running. It maps API calls to hardware-specific commands. Aka, it's a driver. It just happens to be implemented as a userspace library for most of the work.

pshc · on Aug 25, 2017

I imagine the next step is to provide a WebAssembly binding.

edit: "The API has to execute efficiently on WebAssembly and in multi-threaded environment. That means no GC allocations during the rendering loop in order to avoid the garbage collection pauses."

kllrnohj · on Aug 25, 2017

Well, WebAssembly is another thing I'd question the usefulness of. It's going to result in threading finally coming to JS, which is nice, but the rest of it is more like showing off technical infrastructure for the sake of it instead of helping apps with problems they have.

pshc · on Aug 25, 2017

I would argue that having to write apps in JS, or transpiling to the JS runtime, is a problem for many people.

kllrnohj · on Aug 25, 2017

Sure but you could imagine something more like a .NET or JVM bytecode instead, which would be a more practical target for transpiled web apps instead of webasm.

Instead we ended up with a stack machine & sbrk.

pcwalton · on Aug 26, 2017

.NET and the JVM are in no way more suitable for compiling C/C++ and the like than wasm is.

The JVM doesn't even have unsigned integers!

kllrnohj · on Aug 26, 2017

Well of course, but I think running C/C++ on the web is a complete nonsensical waste of time. That's not a useful market to target and critically it completely ignores the needs and problems of the current market.

johncolanduoni · on Aug 26, 2017

.NET and JVM are both stack machines. The .NET VM is the only one that has features that cater to C/C++-ish languages (i.e. linear memory access via instructions). WebAssembly's memory model (including sbrk-style allocation) is likely what a sandboxed, linear-memory-focused .NET VM would have wanted to go with anyway in the interest of minimizing the performance impact of address range checking.
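For readers unfamiliar with the terms, here is a rough sketch of linear memory with sbrk-style allocation. It simulates the idea with an `ArrayBuffer` and is not the WebAssembly API; `LinearMemory` and its methods are hypothetical names. The point it illustrates is that with one contiguous range, every access can be validated with a single comparison against the current break.

```typescript
// Toy model of linear memory with sbrk-style allocation, simulated with an
// ArrayBuffer. This is an illustration, not the WebAssembly API.
class LinearMemory {
  private buffer: ArrayBuffer;
  private view: DataView;
  private brk = 0; // current "program break": first unallocated byte

  constructor(initialBytes: number) {
    this.buffer = new ArrayBuffer(initialBytes);
    this.view = new DataView(this.buffer);
  }

  // Move the break up by `size` bytes and return the old break, growing the
  // backing store (and copying existing contents) when needed.
  sbrk(size: number): number {
    if (size < 0) throw new RangeError("negative size");
    const old = this.brk;
    const needed = old + size;
    if (needed > this.buffer.byteLength) {
      const grown = new ArrayBuffer(Math.max(needed, this.buffer.byteLength * 2));
      new Uint8Array(grown).set(new Uint8Array(this.buffer));
      this.buffer = grown;
      this.view = new DataView(grown);
    }
    this.brk = needed;
    return old;
  }

  // One contiguous range means one bounds check per access.
  store8(address: number, value: number): void {
    if (address < 0 || address >= this.brk) throw new RangeError("out of bounds");
    this.view.setUint8(address, value);
  }

  load8(address: number): number {
    if (address < 0 || address >= this.brk) throw new RangeError("out of bounds");
    return this.view.getUint8(address);
  }
}

const memory = new LinearMemory(16);
const first = memory.sbrk(8);
memory.store8(first, 7);
const second = memory.sbrk(16); // exceeds the initial 16 bytes, so the store grows
memory.store8(second, 42);
console.log(first, second, memory.load8(first), memory.load8(second));
```

Allocation here only ever moves the break upward; anything like `free` would have to be layered on top by the program itself, as a C allocator does over `sbrk`.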

pshc · on Aug 26, 2017

But in what sense would an extra VM layer be more practical? Maybe this is where we disagree. I'm not fond of VMs, especially those two.

C++ or Rust -> wasm bytecode is great. Soon we'll be writing directly to command buffers, no fuss.

kllrnohj · on Aug 26, 2017

If you want to do C++/Rust direct to command buffers why on earth would you bother with the pile of overhead that is a modern web browser?

But ~nobody wants to build UIs like that anyway, so what's your target audience?

pshc · on Aug 26, 2017

Games and other 3D applications? Getting people in-game (say a lazily loaded demo with slightly less perf) with one click is huge.

kllrnohj · on Aug 26, 2017

No, it isn't. Games are already served by consoles first (which obviously won't run webasm), and Steam second. There's no market there, and it's already one click to launch Steam to the game in question. From there the game downloads through a medium suited to downloading and updating a game's assets (which, even for a demo, are in the gigabyte range; you aren't lazy loading this).

As for 3D applications what 3D applications? Do you really think Maya is going to be ported to a browser? Why would they bother? Why would they restrict themselves like that?

The web has no advantages in this space, and the needs of those markets are already being served with superior technology and infrastructure.

pshc · on Aug 26, 2017

Have you ever played a flash game? Slither.io? There is a huge market there. Steam is not one click away from a tweet or a Facebook post.

Please have a nice weekend.

kllrnohj · on Aug 26, 2017

Flash games are dead and even Facebook games are largely a thing of the past, as Facebook is now primarily used on mobile. The casual audience is on their phones in app stores and not on the web anymore.

I assure you those casual game companies are not going to want to go anywhere near Vulkan or similar, though, and they are generally fine with the performance scripting languages already give them (hence why they were in Flash instead of Java applets).

They want strong 2D graphics capabilities primarily, which is largely an ignored category. <canvas> has a 2D context, but it's pretty crappy.

ZenoArrow · on Aug 26, 2017

Most web-based games will probably remain 2D (for multiple reasons, including development cost and accessibility), but 3D games on the web are certainly performant enough.

To give one example, bananabread is a tech demo showing off what can be done with asm.js and WebGL (performance is likely to get even better with WASM):

http://kripken.github.io/misc-js-benchmarks/banana/index.htm...

As for whether a VM has an overhead compared to native, of course it does, but you're not going to convince anyone of your viewpoint by stating something that's already obvious to us all. The performance of web-based games doesn't have to be the best, it just has to be 'good enough'. I personally don't think we'll see fast adoption of WebVR, but I'm glad it exists, as it helps to have it as a goal to improve 3D performance, almost certainly leading to a lower performance overhead in browsers (as low latency is a key component of a good VR experience).

johncolanduoni · on Aug 26, 2017

WebAssembly doesn't appear to be a way for threading to come to JS, unless you count adding a JS API that lets you invoke multithreaded code in a totally different format and memory/execution model as "threading coming to JS". I don't think anyone is itching to move V8/SpiderMonkey/etc. to concurrent garbage collection, so concurrent memory access in multithreaded JS code will likely be limited to SharedArrayBuffer/WebAssembly for the foreseeable future.