Creating a 3D World from Flat Images
If you’ve ever used Google Street View or played with a VR headset, then you’ve seen how powerful 3D visuals can be. But capturing the real world and turning it into accurate, editable 3D models is not easy. Most of today’s tools either give you beautiful pictures without accurate shapes, or precise shapes that don’t look realistic.
This is where a family of methods called Neural Radiance Fields (NeRFs) comes in. NeRFs can take a bunch of photos of a scene and generate new, photo-realistic views from any angle. The problem? They are not very good at giving you a clean 3D mesh, the format you need if you want to use objects in a game engine, a simulation, or an XR application.
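To see why a crisp mesh is hard to get out of a vanilla NeRF, consider the common baseline workaround: sample the learned density on a regular grid and run marching cubes over it. The sketch below is illustrative only and is not part of HAME-NeRF; `density_fn` is a hypothetical stand-in for a trained NeRF's density network, and the threshold is an arbitrary guess, which is exactly the problem.

```python
import numpy as np
from skimage import measure  # scikit-image's marching cubes

def naive_nerf_to_mesh(density_fn, resolution=256, threshold=25.0, bound=1.0):
    """Sample a trained NeRF's density on a grid and run marching cubes.

    `density_fn` is a hypothetical stand-in for the trained model:
    it maps an (N, 3) array of points to N density values.
    """
    # Build a regular 3D grid of query points covering the scene.
    xs = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)
    density = density_fn(grid.reshape(-1, 3)).reshape((resolution,) * 3)
    # The density field is "cloudy": there is no single correct threshold,
    # so the extracted surface comes out noisy and misses thin structures.
    verts, faces, normals, _ = measure.marching_cubes(density, level=threshold)
    return verts, faces, normals
```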
What We Set Out to Solve
Our team at CERTH worked on a new method dubbed High Accuracy Mesh Extraction NeRF, or HAME-NeRF. The goals are simple:
- Keep the photo-realism that NeRFs are famous for
- Create accurate, lightweight 3D meshes that can be used for real-time graphics, robotics, and XR
Think of it like baking: NeRFs give you a fluffy, soft cake (beautiful but fragile), while our approach makes sure the cake has a solid structure inside, so that you can slice it, decorate it, and actually serve it.

How it Works
Here is the pipeline in plain words:
- We start with a set of regular photos of a scene.
- Instead of just creating a “cloudy” 3D volume, our method carves out a crisp surface that closely follows object boundaries.
- Then we refine the shape by comparing how it looks when rendered back into images, adjusting the 3D model until the match is sharp.
The end result is a clean, detailed mesh that’s ready to be used in interactive applications.
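As a rough mental model of that refine-by-rendering step, here is a minimal analysis-by-synthesis loop in PyTorch. This is a sketch under stated assumptions, not the HAME-NeRF code: `render` stands in for a hypothetical differentiable renderer, and `surface_params` for whatever learnable representation defines the surface (for instance, the weights of a signed-distance network).

```python
import torch
import torch.nn.functional as F

def refine_surface(surface_params, render, images, poses, steps=2000, lr=1e-3):
    """Adjust the surface until its renderings match the input photos.

    `render(surface_params, pose)` is a hypothetical differentiable renderer;
    `surface_params` is a list of learnable tensors defining the surface.
    """
    optimizer = torch.optim.Adam(surface_params, lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.tensor(0.0)
        for image, pose in zip(images, poses):
            rendered = render(surface_params, pose)
            loss = loss + F.mse_loss(rendered, image)  # photometric error
        loss.backward()   # gradients flow back through the renderer
        optimizer.step()  # nudge the surface toward a sharper match
    return surface_params
```

In practice such a loop would also carry regularizers (e.g. smoothness terms), but the core idea is just this render, compare, adjust cycle.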
For both the indoor and outdoor scenes, the images on the left are generated using HAME-NeRF, while the ones on the right represent the ground truth (i.e. the gold standard of accuracy). As we can see, there are only very minor differences between the two, demonstrating the effectiveness and realism achieved by HAME-NeRF.

What We Found
We tested HAME-NeRF on two standard 3D datasets. Compared to earlier methods, our approach:
- Captured finer details (like chair legs or plant leaves) without leaving gaps
- Produced meshes that looked smoother and more complete
- Delivered better overall rendering quality in most cases
In some harder real-world scenes, our method traded a tiny bit of photo accuracy for much cleaner geometry, which is often a good deal when the goal is to build usable digital assets.
Why This Matters for DIDYMOS-XR
Projects like DIDYMOS-XR rely on creating digital twins of real environments. To be useful, these twins need both:
- Realistic appearance (so they look convincing), and
- Accurate geometry (so robots, physics engines, and simulations behave correctly).
HAME-NeRF helps bridge this gap. Whether it’s training robots, reconstructing heritage sites, or building XR experiences, this method makes it easier to go from “a folder of images” to “a ready-to-use 3D model.”
Looking Forward
Our current method still has some limits: for example, lighting is “baked in,” which makes it tricky to relight scenes afterwards. But this is just the start. Future work will aim for even more flexibility, such as changing lights or materials on the fly.
For now, HAME-NeRF is a step closer to making digital worlds as rich and precise as the real one.
Author: Panagiotis Frasiolas, CERTH