---
license: mit
library_name: uni-3dar
pipeline_tag: text-to-3d
---

Uni-3DAR
========
[[Paper](https://arxiv.org/pdf/2503.16278)]

Introduction
------------

<p align="center"><img src="fig/overview.png" width=95%></p>
<p align="center"><b>Schematic illustration of the Uni-3DAR framework</b></p>

Uni-3DAR is an autoregressive model that unifies various 3D tasks. In particular, it offers the following improvements:

1. **Unified Handling of Multiple 3D Data Types.**  
   Although we currently focus on microscopic structures such as molecules, proteins, and crystals, the proposed method can be seamlessly applied to macroscopic 3D structures.

2. **Support for Diverse Tasks.**  
   Uni-3DAR naturally supports a wide range of tasks within a single model, covering both generation and understanding.

3. **High Efficiency.**  
   It uses octree compression, in combination with our proposed 2-level subtree compression, to represent the full 3D space with only hundreds of tokens, compared with the tens of thousands of cells in a full-size grid (see the sketch after this list). Our inference benchmarks also show that Uni-3DAR is much faster than diffusion-based models.

4. **High Accuracy.**  
   Building on octree compression, Uni-3DAR further tokenizes fine-grained 3D patches to maintain structural details, achieving substantially better generation quality than previous diffusion-based models.
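
To make the efficiency claim above concrete, here is a minimal, hypothetical NumPy sketch (not code from this repository; it omits the 2-level subtree compression and patch tokenization) of why an octree over a sparse occupancy grid needs far fewer tokens than the dense grid:

```
import numpy as np

def octree_tokens(occ):
    """Count tokens needed to serialize a cubic boolean grid as an octree.

    Every visited node emits one token, and empty regions stop
    subdividing, which is where the compression comes from.
    """
    n = occ.shape[0]
    if not occ.any() or n == 1:
        return 1  # one token covers this entire empty region (or one voxel)
    h = n // 2
    # one token for this internal node, plus the tokens of its 8 children
    return 1 + sum(
        octree_tokens(occ[x:x + h, y:y + h, z:z + h])
        for x in (0, h) for y in (0, h) for z in (0, h)
    )

# A sparse 32^3 occupancy grid (32,768 cells) with 20 occupied voxels,
# loosely mimicking atoms placed in a gridded 3D space.
rng = np.random.default_rng(0)
grid = np.zeros((32, 32, 32), dtype=bool)
pts = rng.integers(0, 32, size=(20, 3))
grid[pts[:, 0], pts[:, 1], pts[:, 2]] = True

print("dense grid cells:", grid.size)            # 32768
print("octree tokens:   ", octree_tokens(grid))  # a few hundred
```

On this toy input the dense grid has 32,768 cells while the octree serialization needs only a few hundred tokens; Uni-3DAR's 2-level subtree compression and fine-grained patch tokens reduce the sequence length further.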


News
----

**2025-03-21:** We have released the core model along with the QM9 training and inference pipeline.


Dependencies
------------

- [Uni-Core](https://github.com/dptech-corp/Uni-Core). For convenience, you can use our prebuilt Docker image:  
  `docker pull dptechnology/unicore:2407-pytorch2.4.0-cuda12.5-rdma`
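
Assuming a standard Docker installation with the NVIDIA Container Toolkit, one typical (illustrative) way to start a GPU-enabled container with the current directory mounted is:

```
docker run --gpus all -it --rm -v "$(pwd)":/workspace -w /workspace \
  dptechnology/unicore:2407-pytorch2.4.0-cuda12.5-rdma bash
```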


Reproducing Results on QM9
--------------------------

To reproduce results on the QM9 dataset, either with our pretrained model or by training from scratch, follow the instructions below.

### Download Pretrained Model and Dataset

Download the pretrained checkpoint (`qm9.pt`) and the dataset archive (`qm9_data.tar.gz`) from our [HF中国镜像站 repository](https://huggingface.co/dptech/Uni-3DAR/tree/main).

### Inference with Pretrained Model

To generate QM9 molecules using the pretrained model:

```
bash inference_qm9.sh qm9.pt
```

### Train from Scratch

To train the model from scratch:

1. Extract the dataset:
```
tar -xzvf qm9_data.tar.gz
```

2. Run the training script with your desired data path and experiment name:

```
base_dir=/your_folder_to_save/ bash train_qm9.sh ./qm9_data/ name_of_your_exp
```
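
For example, assuming checkpoints should be written under `/tmp/uni3dar_ckpt/` and the run is named `qm9_baseline` (both values are illustrative):

```
base_dir=/tmp/uni3dar_ckpt/ bash train_qm9.sh ./qm9_data/ qm9_baseline
```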


Citation
--------

Please cite our paper if you use the data, code, or model.

```
@article{lu2025uni3dar,
  author    = {Shuqi Lu and Haowei Lin and Lin Yao and Zhifeng Gao and Xiaohong Ji and Weinan E and Linfeng Zhang and Guolin Ke},
  title     = {Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens},
  journal   = {arXiv preprint arXiv:2503.16278},
  year      = {2025},
}
```