This echoes my understanding as well. Virtual Maker uses BabylonJS under the hood for rendering. That said, I've been interested for some time in finding/developing a standard format for describing 3D content which can include some form of input handling (glTF is just rendering, AFAIK). Think <button>, but for 3D/VR. Maybe then we could port some simple experiences easily between engines.
I think part of the problem is that the full understanding of a scene with interactions requires a graph structure, but most of our tools are really only built for editing tree structures.
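To make the tree-vs-graph point concrete, here is a rough sketch of what I mean. Everything in it is hypothetical (the SceneNode, Interaction, and SceneDescription names, the event and action strings); none of it comes from glTF, BabylonJS, or any existing format. The hierarchy is a plain tree, but interactions are stored as edges between node ids, so a tree editor can happily break them without noticing.

```typescript
// Hypothetical scene format: these types are illustrative assumptions only.
interface SceneNode {
  id: string;
  kind: "group" | "mesh" | "button";
  children?: SceneNode[];
}

// An interaction is an edge between two nodes that may sit anywhere in the tree.
interface Interaction {
  source: string; // id of the node that receives input
  event: "select" | "hover";
  target: string; // id of the node that reacts
  action: "toggleVisible" | "playAnimation";
}

interface SceneDescription {
  root: SceneNode;             // the part tree editors handle well
  interactions: Interaction[]; // the part that turns the tree into a graph
}

// Walk the tree once and collect every node id.
function collectIds(node: SceneNode, ids: Set<string> = new Set()): Set<string> {
  ids.add(node.id);
  for (const child of node.children ?? []) {
    collectIds(child, ids);
  }
  return ids;
}

// Return the interactions whose source or target no longer exists in the tree.
function danglingInteractions(scene: SceneDescription): Interaction[] {
  const ids = collectIds(scene.root);
  return scene.interactions.filter(
    (i) => !ids.has(i.source) || !ids.has(i.target)
  );
}

const scene: SceneDescription = {
  root: {
    id: "room",
    kind: "group",
    children: [
      { id: "panel", kind: "group", children: [{ id: "openButton", kind: "button" }] },
      { id: "door", kind: "mesh" },
    ],
  },
  interactions: [
    // Crosses branches of the tree: the button lives under "panel", the door does not.
    { source: "openButton", event: "select", target: "door", action: "playAnimation" },
    // Points at a node that was deleted or renamed in the tree.
    { source: "openButton", event: "hover", target: "tooltip", action: "toggleVisible" },
  ],
};

const dangling = danglingInteractions(scene);
```

With this data, dangling holds only the "tooltip" interaction. A real format would need far more than this (event payloads, state, scripting hooks), but even the toy version shows why editing the tree alone is not enough.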
I see this time and again with just regular ol' 2D DOM and JavaScript. You can define your layout in HTML and then play whack-a-mole with bugs if elements you got via document.querySelector move or get renamed, or you can define your layouts in JS and pay a relatively huge cost in rendering and complexity.
Plus there is the added wrinkle in 3D of having to be very persnickety about memory and shaders and order of operations. How do you make a declarative, semantic scene description that can figure out that this geometry over here can be batched, this over here can be instanced, and never the twain shall meet?
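For what it's worth, the naive version of that decision can be sketched in a few lines. This is only a toy under stated assumptions: MeshDesc, DrawPlan, and planDraws are made-up names, the rule "same geometry and material means instancing, static meshes sharing only a material means batching" is a simplification I picked for illustration, and real engines also have to weigh transparency, draw order, shader variants, and memory.

```typescript
// Hypothetical mesh description; the fields are assumptions for illustration.
interface MeshDesc {
  id: string;
  geometry: string;  // key identifying shared geometry data
  material: string;  // key identifying the material/shader
  isStatic: boolean; // true if the mesh never moves after load
}

interface DrawPlan {
  instanced: MeshDesc[][]; // same geometry and material: one instanced draw
  batched: MeshDesc[][];   // static, same material only: merge into one buffer
  individual: MeshDesc[];  // everything else gets its own draw call
}

// Append a value to the list stored under key, creating the list if needed.
function pushTo<T>(map: Map<string, T[]>, key: string, value: T): void {
  const list = map.get(key) ?? [];
  list.push(value);
  map.set(key, list);
}

function planDraws(meshes: MeshDesc[]): DrawPlan {
  const plan: DrawPlan = { instanced: [], batched: [], individual: [] };

  // Pass 1: meshes sharing geometry AND material are instancing candidates.
  const byGeometryAndMaterial = new Map<string, MeshDesc[]>();
  for (const m of meshes) {
    pushTo(byGeometryAndMaterial, m.geometry + "|" + m.material, m);
  }

  const leftovers: MeshDesc[] = [];
  for (const group of byGeometryAndMaterial.values()) {
    if (group.length > 1) plan.instanced.push(group);
    else leftovers.push(group[0]);
  }

  // Pass 2: remaining static meshes that share a material can be batched.
  const byMaterial = new Map<string, MeshDesc[]>();
  for (const m of leftovers) {
    if (m.isStatic) pushTo(byMaterial, m.material, m);
    else plan.individual.push(m);
  }
  for (const group of byMaterial.values()) {
    if (group.length > 1) plan.batched.push(group);
    else plan.individual.push(group[0]);
  }

  // Each mesh lands in exactly one bucket, so the two strategies never overlap.
  return plan;
}

const plan = planDraws([
  { id: "tree1", geometry: "tree", material: "bark", isStatic: true },
  { id: "tree2", geometry: "tree", material: "bark", isStatic: true },
  { id: "tree3", geometry: "tree", material: "bark", isStatic: true },
  { id: "rock", geometry: "rock", material: "stone", isStatic: true },
  { id: "wall", geometry: "wall", material: "stone", isStatic: true },
  { id: "player", geometry: "player", material: "skin", isStatic: false },
]);
```

Here the three trees end up instanced, the rock and wall end up batched, and the player is drawn on its own. The hard part is that a declarative format would have to carry enough information (like the isStatic flag I assumed) for an engine to make this call by itself.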
Ah, did you modify the BabylonJS editor, then? I've been thinking about this approach, but I'm worried because people have been submitting issues about glitches and oddities that make it seem unstable.
I've been trying to write a 3D editor from scratch with Three.js and Vue 3. It's quite a daunting task, and I'm more appreciative of the work the Babylon guys are doing. I just wish there were more funding so they could knock some of the older issues out.
After spending the last decade designing and building 3D/AR/VR products, I've realized there are many hurdles one has to overcome to create a complete experience. Between learning the principles and tools, building the experience, testing, iterating, distributing, and ensuring everything works correctly on different devices, there's quite a bit of boilerplate work required to bring your imagination to life.
I designed Virtual Maker to simplify the whole process, letting you focus directly on designing your experience. Let me know what you think!
As someone who has been down this road several times over the last 7 years, all I can say is... good luck.
And watch out for Mozilla. With A-Frame and Hubs, they like to play up "we're all one, big, happy, WebXR family", then quietly copy all your features and claim they were first.
Y'all set up a "recruiting" call with me before releasing A-Frame and asked me tons of questions about my WebXR framework, which I had had in the wild for almost a year and which Josh admitted you had been using as part of your testing of Firefox Reality. Then you announced A-Frame and called it "The world's first WebXR framework". I had growth before the announcement, then nothing after, and people started accusing me of copying A-Frame. How was I supposed to feel?
EDIT: Excuse me, "WebVR support in Firefox". WebVR hadn't been renamed to WebXR, and Firefox Reality didn't exist, yet.
Cool! I run Spatial Ape (the VR industry trade show) and it's great to see new authoring tools. Where do you see this fitting in relative to existing solutions like Unity and Unreal? Thanks for sharing!
I've worked with both Unity and Unreal on past projects. While I love both, they have steep learning curves and long build times, and deploying cross-platform is much more difficult than on the web.
Virtual Maker is a simplification of the whole process. We have built-in actions, navigation modes, and assets that you can use to get going quickly. We take care of hosting for you, ensuring your scenes work across platforms, with instant deploy times.
Of course, that means Virtual Maker won't give you all the options that Unity and Unreal do to build anything you can imagine. It's a trade-off, and it depends on what kind of project you're making.
A great FOSS alternative to Unity and Unreal Engine is the Godot engine. I've found it much easier to learn and work with and extremely Blender3D-friendly, and its cross-platform and web export support seems to be excellent, as far as I've tried it. It does still require a fair bit of learning, as any complex tool does, but there are many great tutorials on the web and YouTube to help with that.