Why glTF Is the JPEG for the Metaverse and Digital Twins

The JPEG file format played an important role in moving the web from a text-based world to a visual experience by providing an open and efficient container for sharing images. Today, the Graphics Language Transmission Format (glTF) promises to do the same for 3D objects in the metaverse and digital twins.

JPEG uses various compression tricks to shrink image files compared with other formats such as GIF. The latest version of glTF similarly takes advantage of compression techniques for both the geometry of 3D objects and their textures. glTF is already playing an important role in e-commerce, as evidenced by Adobe’s push into the metaverse.

VentureBeat spoke with Neil Trevett, president of the Khronos Group, which oversees the glTF standard, to learn more about what glTF means for businesses. He is also vice president of developer ecosystems at Nvidia, where his job is to make it easier for developers to use GPUs. He explains how glTF complements other digital twin and metaverse formats such as USD, how to use it, and where it is headed.

VentureBeat: What is glTF, and how does it fit into the ecosystem of file formats related to the metaverse and digital twins?

Neil Trevett: At Khronos, we put a lot of effort into 3D APIs like OpenGL, WebGL and Vulkan. We have found that every app that uses 3D needs to import assets at some point. The glTF file format is widely adopted and very complementary to USD, which is becoming the standard for creation and authoring on platforms such as Omniverse. USD is the place to be if you want to combine multiple tools into sophisticated pipelines and create very high-end content, including movies. This is why Nvidia has invested heavily in USD for the Omniverse ecosystem.

On the other hand, glTF focuses on being efficient and easy to use as a delivery format. It is a lightweight, streamlined and easy-to-process format that any platform or device can use, down to and including web browsers on mobile phones. The slogan we use as an analogy is “glTF is the JPEG of 3D.”

It also complements the file formats used by authoring tools. For example, Adobe Photoshop uses PSD files to edit images. No professional photographer would edit JPEG files, because a lot of information has been lost. PSD files are more sophisticated than JPEGs and support multiple layers. However, you would not send a PSD file to my mom’s cellphone. You need JPEG to get it out to a billion devices as efficiently and quickly as possible. USD and glTF complement each other in the same way.

VentureBeat: How do you move from one to the other?

Trevett: It is important to have a seamless distillation process from USD assets to glTF assets. Nvidia has invested in a glTF connector for Omniverse so we can seamlessly import and export glTF assets to and from Omniverse. In the Khronos glTF working group, we are pleased that USD is meeting the industry’s needs for an authoring format, because that is a huge amount of work. The goal is for glTF to be the ideal distillation target for USD to support widespread deployment.

An authoring format and a delivery format have different design requirements. USD’s design is all about flexibility. It helps you compose things to create a movie or a VR environment. If you want to import another asset and merge it with the existing scene, you need to retain all the design information. And you want everything at full resolution and quality.

The design of a transmission format is different. For example, in glTF, the vertex information is not very flexible for re-authoring. But it is delivered in exactly the form the GPU needs to process the geometry as efficiently as possible through a 3D API such as WebGL or Vulkan. glTF also puts a lot of design effort into compression to reduce download times. For example, Google contributed its Draco 3D mesh compression technology and Binomial contributed its Basis Universal texture compression technology. We have also started to put a lot of effort into level of detail (LOD) management, so that you can download models very efficiently.
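To make the point about GPU-ready vertex data concrete, here is a minimal sketch in Python that builds a one-triangle glTF 2.0 asset by hand. The positions are stored as tightly packed little-endian 32-bit floats, a layout a 3D API can upload without conversion. The helper name `build_triangle_gltf` is invented for illustration; the JSON field names and numeric codes follow the glTF 2.0 specification.

```python
import base64
import json
import struct

FLOAT = 5126          # glTF componentType code for 32-bit float
ARRAY_BUFFER = 34962  # glTF bufferView target for vertex attributes


def build_triangle_gltf():
    """Return a dict describing a one-triangle glTF 2.0 asset (illustrative helper)."""
    positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # Pack the vertices as tightly packed little-endian float32 triples.
    blob = b"".join(struct.pack("<3f", *p) for p in positions)
    uri = "data:application/octet-stream;base64," + base64.b64encode(blob).decode("ascii")
    return {
        "asset": {"version": "2.0"},
        "scene": 0,
        "scenes": [{"nodes": [0]}],
        "nodes": [{"mesh": 0}],
        "meshes": [{"primitives": [{"attributes": {"POSITION": 0}}]}],
        "buffers": [{"uri": uri, "byteLength": len(blob)}],
        "bufferViews": [
            {"buffer": 0, "byteOffset": 0, "byteLength": len(blob), "target": ARRAY_BUFFER}
        ],
        "accessors": [{
            "bufferView": 0,
            "componentType": FLOAT,
            "type": "VEC3",
            "count": len(positions),
            "min": [0.0, 0.0, 0.0],
            "max": [1.0, 1.0, 0.0],
        }],
    }


gltf = build_triangle_gltf()
gltf_text = json.dumps(gltf, indent=2)  # contents of a .gltf file
```

Writing `gltf_text` to a file with a `.gltf` extension should give a viewer a single triangle. Real assets usually keep the binary data in a separate `.bin` file or a `.glb` container rather than a base64 data URI, which is used here only to keep the sketch self-contained.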

Distillation lets you go from one file format to the other. A large part of it is stripping out the design and authoring information you no longer need. But you do not want to reduce visual quality unless you have to. With glTF you can retain visual fidelity, but you also have the option to compress things when you are targeting a low-bandwidth deployment.

VentureBeat: How much smaller can you make it without losing too much fidelity?

Trevett: It is like JPEG, where you have a dial to increase compression with an acceptable loss of image quality, only glTF has the same thing for both geometry and texture compression. If it is a geometry-intensive CAD model, the geometry will be the bulk of the data. But if it is more of a consumer-oriented model, the texture data can be much larger than the geometry.

With Draco, shrinking the data by 5 to 10 times is reasonable without any significant loss of quality. There is something similar for textures too.
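The “dial” can be illustrated with coordinate quantization, one of the lossy steps that geometry compressors such as Draco apply. The sketch below is not Draco and leaves out its connectivity and entropy coding; under that simplification it only shows how fewer bits per coordinate trade size for a bounded error. The function names are hypothetical.

```python
def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] over the values' own range."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from integer codes."""
    return [lo + c * scale for c in codes]


coords = [0.0, 0.1234, 0.5, 0.987, 1.0]
for bits in (8, 11, 14):
    codes, lo, scale = quantize(coords, bits)
    restored = dequantize(codes, lo, scale)
    max_err = max(abs(a - b) for a, b in zip(coords, restored))
    # The worst-case error is about half of one quantization step.
    print(f"{bits} bits per coordinate (vs. 32 for float32): max error {max_err:.6f}")
```

Turning the bit count down makes each coordinate cheaper to store, at the cost of a larger maximum error. That is the same kind of trade-off the JPEG quality setting exposes for images.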

Another factor is the amount of memory required, which is a precious resource on mobile phones. Before Binomial’s compression came to glTF, people sent JPEGs, which was good because they are so small. But unpacking them into full-size textures can take hundreds of megabytes even for a simple model, which hurts a mobile phone’s power consumption and performance. glTF textures let you take a JPEG-sized, super-compressed texture and unpack it directly into a GPU-native texture, so it never grows to full size. As a result, you reduce both data transmission and memory requirements by 5 to 10 times. That helps if you are downloading assets into a browser on a mobile phone.
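A rough calculation shows where the memory saving comes from. The sketch below assumes a single 4096 x 4096 texture, a size chosen only for illustration, and compares uncompressed RGBA8 storage with the standard per-pixel rates of common GPU block formats. Which GPU format a Basis Universal texture is actually transcoded to depends on the device, so the format names in the comments are examples rather than a statement about any particular pipeline.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of one mip level at a given storage cost per pixel."""
    return width * height * bits_per_pixel // 8


MIB = 1024 * 1024
w = h = 4096  # assumed texture size, for illustration only

rgba8 = texture_bytes(w, h, 32)  # a JPEG or PNG decoded to uncompressed RGBA8
bpp8 = texture_bytes(w, h, 8)    # 8 bits per pixel, e.g. BC7 or ASTC 4x4 blocks
bpp4 = texture_bytes(w, h, 4)    # 4 bits per pixel, e.g. BC1 or ETC1 blocks

print(rgba8 // MIB, bpp8 // MIB, bpp4 // MIB)  # 64 16 8
```

Under these assumptions one decoded texture occupies 64 MiB, so a model with five such maps would need 320 MiB uncompressed, compared with 40 to 80 MiB if the textures stay in a GPU block format.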

VentureBeat: How do people effectively represent textures in 3D objects?

Trevett: Well, there are two basic classes of texture. One of the most common is image-based textures, such as mapping a logo image onto a t-shirt. The other is procedural textures, where you generate a pattern, such as marble, wood or stone, simply by running an algorithm.

There are many algorithms you can use. For example, Allegorithmic, which Adobe recently acquired, pioneered an interesting technique for generating textures that is now used in Adobe Substance Designer. You often bake these textures into an image because that is easier to process on client devices.
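As a minimal sketch of that idea, the Python below generates a marble-like pattern from a formula and bakes it into an in-memory grayscale image. The formula is a generic sine-based stand-in chosen for illustration, not the technique used in Substance Designer, and the function names are hypothetical.

```python
import math


def marble(x, y, width, height):
    """Return a 0-255 gray value for a marble-like vein pattern (illustrative formula)."""
    u, v = x / width, y / height
    # Cheap stand-in for noise: a few sine octaves that bend the stripes.
    turbulence = sum(
        math.sin((2 ** k) * 6.0 * (u + v * 0.7)) / (2 ** k) for k in range(1, 4)
    )
    stripe = math.sin(20.0 * u + 3.0 * turbulence)
    return int((stripe + 1.0) * 0.5 * 255)


def bake_pgm(width, height):
    """Bake the procedural pattern into a binary PGM image held in memory."""
    pixels = bytes(
        marble(x, y, width, height) for y in range(height) for x in range(width)
    )
    header = f"P5 {width} {height} 255\n".encode("ascii")
    return header + pixels


image = bake_pgm(64, 64)
```

Once baked, the pattern is an ordinary image, so a client device only has to decode and sample it rather than run the generating algorithm.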

Once you have a texture, you can do more with it than just glue it onto the model like a piece of wrapping paper. You can use texture images to drive a more sophisticated material appearance. For example, physically based rendering (PBR) materials are where you try to go as far as you can in emulating the characteristics of real-world materials. Is it metallic, which makes it look shiny? Is it translucent? Does it refract light? Some of the more sophisticated PBR algorithms can use up to five or six different texture maps that feed in parameters describing how shiny or translucent the material is.
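For reference, this is roughly what a PBR material looks like inside a glTF 2.0 file, written here as a Python dictionary. The property names come from the core glTF 2.0 material model; the material name, texture indices and factor values are placeholders. Effects such as translucency and refraction are handled by separate material extensions and are not shown.

```python
import json

material = {
    "name": "example_painted_metal",  # hypothetical name
    "pbrMetallicRoughness": {
        "baseColorTexture": {"index": 0},
        "metallicRoughnessTexture": {"index": 1},
        "metallicFactor": 1.0,
        "roughnessFactor": 0.4,
    },
    "normalTexture": {"index": 2},
    "occlusionTexture": {"index": 3},
    "emissiveTexture": {"index": 4},
    "emissiveFactor": [1.0, 1.0, 1.0],
}


def texture_slots(node, path=""):
    """Recursively list the keys that reference a texture (keys ending in 'Texture')."""
    found = []
    for key, value in node.items():
        if key.endswith("Texture"):
            found.append(path + key)
        elif isinstance(value, dict):
            found.extend(texture_slots(value, path + key + "."))
    return found


slots = texture_slots(material)
print(len(slots))  # 5
print(json.dumps(material, indent=2))
```

The core material alone references five texture maps: base color, metallic-roughness, normal, occlusion and emissive.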

VentureBeat: How has glTF advanced on the scene graph side to represent relationships within objects, such as how car wheels rotate, or how multiple objects connect?

Trevett: This is an area where USD has a leg up on glTF. Currently, most glTF use cases are satisfied by a single asset in a single asset file. 3D commerce is a major use case, where you want to pick a chair and drop it into your living room, as Ikea does. That is a single glTF asset, and many use cases have been satisfied with that. As we move into the metaverse and VR and AR, people want to create scenes with multiple assets for deployment. One active area of discussion in the working group is how best to implement multi-glTF scenes and assets and how to link them. It will not be as sophisticated as USD, because the emphasis is on transmission and delivery rather than authoring. But glTF will have something to enable multi-asset composition and linking in the next 12 to 18 months.
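Within a single asset, glTF 2.0 does already describe a node hierarchy, which is enough for something like a car whose wheels spin. The sketch below builds such a hierarchy as Python data. The `children`, `mesh`, `translation` and `rotation` properties come from the glTF 2.0 node definition, while the node names, positions and the helper function are made up for illustration. Linking several separate glTF files together, the open question described above, is not covered.

```python
import math


def axis_angle_to_quat(axis, angle_rad):
    """Unit quaternion in glTF order (x, y, z, w) for a rotation about a unit axis."""
    s = math.sin(angle_rad / 2.0)
    return [axis[0] * s, axis[1] * s, axis[2] * s, math.cos(angle_rad / 2.0)]


nodes = [
    {"name": "car", "children": [1, 2]},
    {"name": "wheel_front_left", "mesh": 0, "translation": [-0.8, 0.3, 1.2]},
    {"name": "wheel_front_right", "mesh": 0, "translation": [0.8, 0.3, 1.2]},
]

# Spin both wheels a quarter turn about their local X axis (the axle).
for index in nodes[0]["children"]:
    nodes[index]["rotation"] = axis_angle_to_quat((1.0, 0.0, 0.0), math.pi / 2)
```

Because the wheels are children of the car node, moving the car moves the wheels with it, while each wheel keeps its own local rotation.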

VentureBeat: How will glTF evolve to support more metaverse and digital twin use cases?

Trevett: We need to start bringing in things beyond physical appearance. Today we have geometry, textures and animation in glTF 2.0. The current glTF says nothing about physical properties, sound or interaction. I think many of the next generation of glTF extensions will add these kinds of behaviors and properties.

The industry is now deciding that it will be USD and glTF going forward. Older formats like OBJ are still around, but they are starting to show their age. There are popular formats like FBX that are proprietary. USD is an open source project and glTF is an open standard. People can participate in both ecosystems and help evolve them to meet the needs of their customers and markets. I think the two formats will evolve side by side. The goal now is to keep them aligned and maintain an efficient distillation process between them.
