Text2Gaussian

Text to 3D Gaussian Splats in 17 Seconds | November 2025

Overview

Text2Gaussian is an open-source pipeline that converts text descriptions into 3D Gaussian Splat models. Type a prompt like "a vintage brass teapot" and get a fully formed 3D asset in approximately 17 seconds. The output is a PLY file with 200,000+ vertices.
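
One way to check the vertex count of a generated file is to read the PLY header, which declares it on an `element vertex N` line. The sketch below is illustrative only: `read_ply_vertex_count` is a hypothetical helper, not part of Text2Gaussian, and the sample header is a stand-in rather than real pipeline output.

```python
import io
from typing import BinaryIO


def read_ply_vertex_count(stream: BinaryIO) -> int:
    """Return the vertex count declared in a PLY header.

    `stream` must be a binary file-like object positioned at the start
    of the file. Only the header is read, so this works for both ASCII
    and binary PLY files.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file: missing 'ply' magic line")

    count = None
    while True:
        raw = stream.readline()
        if not raw:
            raise ValueError("PLY header ended without 'end_header'")
        line = raw.strip()
        if line == b"end_header":
            break
        parts = line.split()
        # The header line of interest looks like: element vertex 200000
        if len(parts) == 3 and parts[0] == b"element" and parts[1] == b"vertex":
            count = int(parts[2])

    if count is None:
        raise ValueError("PLY header declares no vertex element")
    return count


# Stand-in header for demonstration; a real file would be opened with open(path, "rb").
sample_header = (
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 200000\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"end_header\n"
)
print(read_ply_vertex_count(io.BytesIO(sample_header)))
```

Reading only the header keeps the check fast even for large files, since the vertex data itself is never loaded.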

Pipeline

[Figure: Text2Gaussian Pipeline]

How It Works

The pipeline operates through three sequential stages:

  • Text to Image — Google's Gemini AI generates images from text descriptions
  • Image to 3D — Meta's SAM 3D Objects model processes the generated images
  • Output — Results are saved as PLY files compatible with standard 3D viewers
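
The three stages above can be sketched as a simple composition of functions. This is a minimal sketch under stated assumptions, not the project's actual code: `run_pipeline` and the two stand-in stages are hypothetical names, and the real stages would call Gemini and the SAM 3D Objects model rather than return placeholder bytes.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_pipeline(
    prompt: str,
    text_to_image: Callable[[str], bytes],
    image_to_3d: Callable[[bytes], bytes],
    out_path: Path,
) -> Path:
    """Run the three stages in order and return the path of the saved PLY."""
    image_bytes = text_to_image(prompt)   # stage 1: text to image
    ply_bytes = image_to_3d(image_bytes)  # stage 2: image to 3D
    out_path.write_bytes(ply_bytes)       # stage 3: save the PLY output
    return out_path


# Stand-in stages (assumptions) so the sketch runs without network or GPU access.
def fake_text_to_image(prompt: str) -> bytes:
    return b"IMAGE:" + prompt.encode("utf-8")


def fake_image_to_3d(image_bytes: bytes) -> bytes:
    # An empty but well-formed ASCII PLY header stands in for real splat data.
    return b"ply\nformat ascii 1.0\nelement vertex 0\nend_header\n"


with tempfile.TemporaryDirectory() as tmp:
    saved = run_pipeline(
        "a vintage brass teapot",
        fake_text_to_image,
        fake_image_to_3d,
        Path(tmp) / "teapot.ply",
    )
    print(saved.name, saved.stat().st_size, "bytes")
```

Passing the stages in as callables keeps the orchestration independent of any particular model or API, so each stage can be swapped or tested in isolation.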

Architecture

  • stage0/ — Batch processing pipeline for generating multiple assets
  • stage1/ — Text-to-image generation via Gemini integration
  • runpod-server/ — GPU-accelerated 3D conversion API running on an A100 GPU
  • viewer/ — Web-based 3D visualization interface