
[Paper Review] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

by SolaKim 2024. 12. 9.

 

Abstract

1. What is DeepSDF?

- A learned, continuous Signed Distance Function (SDF) representation
- Represents an entire class of shapes at high quality, and supports interpolation and completion from partial or noisy input data

2. Representation

- Magnitude of a point in the volumetric field: the distance to the surface boundary
- Sign: inside (-) or outside (+) the shape
- The boundary is encoded implicitly as the zero-level-set of the function

3. Difference from classical SDFs

- A classical SDF represents a single shape
- DeepSDF learns and represents an entire class of shapes

4. Results

- State-of-the-art performance on 3D shape representation and completion
- Model size reduced by roughly an order of magnitude (about 10x) compared with prior work

 

Introduction

1. Problem definition
: Earlier 3D deep learning approaches were limited in efficiency and flexibility, because space and time complexity grow quickly and because the number of vertices and the topology of a shape are not known in advance.

2. Core idea of DeepSDF
- A learned 3D generative model built on a continuous Signed Distance Function (SDF)
- Unlike classical approaches that discretize the SDF onto a regular grid for evaluation and denoising, it learns a generative model that produces a continuous field
- The SDF is viewed as a shape-conditioned classifier whose decision boundary is the surface

3. Contributions
- A formulation of shape-conditioned 3D generative modeling with a continuous implicit surface
- A learning method for 3D shapes based on a probabilistic auto-decoder
- Demonstration and application of the formulation to shape modeling and completion
- Thousands of shapes can be represented in only 7.4 MB
- Lower memory use than an uncompressed 3D bitmap

4. Results
- High-quality continuous surfaces with complex topology
- State-of-the-art performance on shape reconstruction and completion

 

Related Work

 

1. Representations for 3D Shape Learning

Representations used in data-driven 3D learning fall into three categories: point-based, mesh-based, and voxel-based.

Some applications (e.g. object classification from 3D point clouds) are well served by these representations, but the paper discusses their limitations when it comes to representing continuous surfaces with complex topology.

[point-based]

⊕ Can use raw sensor data directly

⊖ Does not describe topology and cannot produce watertight surfaces

[mesh-based]

⊕ Provides 3D correspondences and can produce high-quality shapes

⊖ Can only model a fixed topology, depends on the quality of the parameterization, and does not guarantee closed shapes

[voxel-based]

⊕ Intuitive for describing 3D space, and a natural extension of 2D convolution-based learning to 3D

⊖ High memory requirements, low-resolution shapes, and non-smooth surfaces

 

2. Representation Learning Techniques

What is representation learning? Techniques that automatically discover features which describe the data compactly.

[Generative Adversarial Networks, GANs]

- A generator and a discriminator are trained in competition with each other (adversarial training) to produce realistic data.

- Adversarial training is, however, well known to be unstable.

[Auto-encoders]

- An information bottleneck between the encoder and the decoder forces the network to reconstruct the original input from a compressed code.

- Auto-encoders have proven themselves as a feature-learning tool and are used in many 3D shape learning works.

- Variational auto-encoders (VAEs) add Gaussian noise at the bottleneck to obtain a smooth latent space.

[Optimizing Latent Vectors]

- Instead of a full auto-encoder, only a decoder is trained to produce a compact data representation, and the best latent vector for each data point is found by optimization.

- This has mainly been used for noise reduction and missing-data completion.

 

💡 What is a latent space?

A vector space, usually of much lower dimension than the raw data, whose coordinates represent compressed or abstract features of the data.
It turns the complex structure of the original data (e.g. images, audio) into a compact mathematical representation that keeps only the important patterns.

 

3. Shape Completion

Shape completion is the task of predicting the unseen parts of a shape from sparse or incomplete 3D data.

Earlier methods:
- Approximate the implicit surface function with radial basis functions (RBFs)
- Cast reconstruction from oriented point clouds as a Poisson problem
- These methods, however, only handle a single shape at a time

Recent approaches: a move to data-driven methods
- Use an encoder-decoder architecture
- Map the partial data into a latent space, then generate the full shape using the learned prior
- Support various input types: point clouds, depth maps, RGB images, etc.

➡️ Data-driven methods go beyond the limits of the classical ones: they are more flexible and apply across a whole dataset

 

 

Modeling SDFs with Neural Networks

 

A Signed Distance Function (SDF) is a continuous function that, for a given point, outputs the distance to the closest surface; its sign indicates whether the point is inside (negative) or outside (positive) the surface: SDF(x) = s, with x ∈ R^3 and s ∈ R.

The underlying surface is represented implicitly as the iso-surface SDF(·) = 0.
This implicit surface can be rendered by raycasting, or by rasterizing a mesh extracted with Marching Cubes.
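To make the sign convention and the raycasting remark concrete, here is a minimal sketch in plain Python. It uses an analytic unit sphere as the shape, not a learned network, and the helper names `sphere_sdf` and `sphere_trace` are made up for this note. The ray marcher steps forward by the SDF value, which is a safe step size because no surface can be closer than that distance.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-6):
    """March along a unit-length ray, advancing by the SDF value at each step.

    Returns the distance travelled when the surface is reached, or None on a miss.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if abs(dist) < eps:
            return t
        t += dist
    return None

inside = sphere_sdf((0.0, 0.0, 0.0))      # centre of the sphere
on_surface = sphere_sdf((1.0, 0.0, 0.0))  # a point on the zero-level-set
outside = sphere_sdf((2.0, 0.0, 0.0))     # one unit away from the surface
hit = sphere_trace((-3.0, 0.0, 0.0), (1.0, 0.0, 0.0), sphere_sdf)
print(inside, on_surface, outside, hit)
```

In DeepSDF the analytic `sphere_sdf` would be replaced by the trained network; the marching loop itself does not change.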

 

The DeepSDF approach

Core idea: regress (i.e. learn directly) the continuous SDF from point samples with a deep neural network.

- The network is trained to predict the SDF value at a given query position (see Fig 2.)
- The surface (iso-surface) SDF(x) = 0 can then be extracted from it
- A deep feed-forward network is a universal function approximator, so in theory it can approximate a continuous shape function to arbitrary precision.
- In practice, precision is limited by the finite number of point samples and by the finite capacity of the network.

<Fig 2.> The DeepSDF representation applied to the Stanford Bunny
(a) The underlying implicit surface SDF = 0, learned from points sampled inside (SDF < 0) and outside (SDF > 0) the surface
(b) A 2D cross-section of the signed distance field
(c) A rendering of the 3D surface (iso-surface) recovered from SDF = 0
Both (b) and (c) are recovered via DeepSDF

 

Training procedure

As shown in Fig 3. a, a single deep network is trained for a given target shape.

1. Data preparation: prepare a set X of pairs, each a 3D point sample and its SDF value: X := {(x, s) : SDF(x) = s}

2. Objective
: Train the parameters θ of a multi-layer fully-connected network fθ(x) on the training set S so that it is a good approximation of the SDF over the target domain Ω: fθ(x) ≈ SDF(x), ∀x ∈ Ω

3. Loss function: minimize an L1 loss

Training minimizes the sum, over the points in X, of the loss between the predicted and the true SDF values.

The loss is a clamped L1 loss, L(fθ(x), s) = | clamp(fθ(x), δ) - clamp(s, δ) |, where clamp(x, δ) := min(δ, max(-δ, x)). The parameter δ controls the distance from the surface over which a metric SDF is expected to be maintained.

A larger δ allows fast ray tracing, because each sample then gives information about a safe step size; a smaller δ concentrates the network capacity on details near the surface.
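The clamped L1 loss is short enough to write out directly. The sketch below is plain Python on scalar values rather than the batched tensor version used for real training, and the default `delta=0.1` is only an illustrative choice.

```python
def clamp(value, delta):
    """clamp(x, delta) := min(delta, max(-delta, x))"""
    return min(delta, max(-delta, value))

def clamped_l1(pred, target, delta=0.1):
    """L(pred, target) = |clamp(pred, delta) - clamp(target, delta)|"""
    return abs(clamp(pred, delta) - clamp(target, delta))

near_surface = clamped_l1(0.03, 0.05)  # both inside the band: ordinary L1 error
far_away = clamped_l1(0.5, 0.9)        # both clamped to delta: no penalty at all
wrong_side = clamped_l1(-0.3, 0.02)    # prediction clamped to -delta, target kept as is
print(near_surface, far_away, wrong_side)
```

The second case shows the effect of δ: once both values are farther than δ from the surface, their exact difference no longer matters, so the network spends its capacity near the surface.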

 

4. Network design and training

- Architecture: 8 fully-connected layers, each 512-dimensional, with ReLU activations

- Output activation: tanh

- Regularization and optimization: dropout on the fully-connected layers, the Adam optimizer, and weight normalization in place of batch normalization, which was found to make training unstable

- Back-propagating through the SDF network gives the spatial derivative (gradient) of the output with respect to the input position, from which accurate surface normals can be computed

Spatial derivative: ∂fθ(x) / ∂x
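The normal at a surface point is the normalized spatial gradient of the SDF. The paper obtains this gradient by back-propagation through the network; the sketch below swaps in central finite differences on an analytic sphere SDF so that it runs without any deep learning library. The function names and the step size `h` are assumptions made for this note.

```python
import math

def sphere_sdf(p):
    """Analytic SDF of the unit sphere, standing in for a trained f_theta."""
    return math.sqrt(sum(c * c for c in p)) - 1.0

def sdf_normal(sdf, p, h=1e-5):
    """Approximate the unit normal at p as the normalized gradient of the SDF.

    Central differences replace the back-propagated gradient used in the paper.
    """
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    return tuple(g / norm for g in grad)

normal = sdf_normal(sphere_sdf, (0.0, 1.0, 0.0))
print(normal)  # points along +y, away from the sphere centre
```

With an autodiff framework the loop over axes disappears: a single backward pass with respect to the input position gives all three components at once.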

 

📌 In summary
DeepSDF is a method that learns an SDF with a deep network to represent 3D shapes continuously and accurately.
- An SDF gives the distance from a point to the surface, and its sign distinguishes inside from outside.
- The network is trained from point samples alone, without an encoder, to learn the shape.
- Advantages: surface normals can be computed, and the representation can be used for a variety of rendering and reconstruction tasks.

 

 

 

Learning the Latent Space of Shapes

1. Problem definition
- Training a separate network for each shape is inefficient
- What is needed is a model that learns the properties common to many shapes and embeds them in a low-dimensional latent space

2. Introducing a latent vector
- A latent vector z encodes the shape information
- It is fed to the network as a second input, so that one network can give a continuous 3D SDF for each shape (see Fig 3b.)

3. Model
- A single network fθ takes a latent vector z_i and a 3D position x and models many different SDFs: fθ(z_i, x) ≈ SDF^i(x).
- The surface is the decision boundary fθ(z, x) = 0, and it can be discretized for visualization.

 

Motivating Encoder-less Learning

1. Limitations of the existing approach
- A conventional auto-encoder learns its latent space from the bottleneck features, but the encoder is not used at test time.
- It is therefore unclear whether spending computation on an encoder during training is the best use of resources.

2. Motivation for the auto-decoder
- The shape embedding is learned with a decoder only, without an encoder. (see Fig 4.)
- At test time, the latent vector is optimized so that the decoder output matches the input observations

3. Results
- An auto-decoder trained on continuous SDFs yields a high-quality 3D generative model
- A probabilistic formulation is developed that regularizes the latent space to improve generalization
- According to the authors, this is the first introduction of auto-decoder learning to the 3D learning community.

Auto-decoder-based DeepSDF Formulation

A probabilistic perspective is adopted to derive the auto-decoder-based, shape-coded DeepSDF formulation.

Given a dataset of N shapes represented by signed distance functions SDF^i, K point samples and their signed distance values are prepared for each shape: X_i = {(x_j, s_j) : s_j = SDF^i(x_j)}.

An auto-decoder has no encoder, so each latent code z_i is paired directly with its training data X_i.
The posterior over the latent code z_i given the shape's SDF samples X_i factorizes as follows:

p_θ(z_i | X_i) = p(z_i) ∏_{(x_j, s_j) ∈ X_i} p_θ(s_j | z_i; x_j)

Here θ parameterizes the SDF likelihood.

The prior p(z_i) over latent codes is assumed to be a zero-mean multivariate Gaussian with spherical covariance σ^2 I.

This prior encourages the shape codes to be concentrated, which helps infer a compact shape manifold and converge to a good solution.

The SDF likelihood is expressed through a deep feed-forward network fθ(z_i, x_j) and takes the following form:

p_θ(s_j | z_i; x_j) = exp(-L(fθ(z_i, x_j), s_j))

- fθ(z_i, x_j): the SDF value predicted by the network
- L: a loss function that penalizes the deviation of the prediction from the true SDF value s_j

At training time, the joint log posterior over all training shapes is maximized with respect to the individual shape codes {z_i} and the network parameters θ, which is equivalent to:

argmin_{θ, {z_i}} Σ_{i=1..N} ( Σ_{j=1..K} L(fθ(z_i, x_j), s_j) + (1/σ^2) ||z_i||^2 )

At inference time, with θ fixed after training, the latent code for a shape with samples X is obtained by Maximum-a-Posteriori (MAP) estimation:

ẑ = argmin_z Σ_{(x_j, s_j) ∈ X} L(fθ(z, x_j), s_j) + (1/σ^2) ||z||^2

This formulation is valid for SDF samples X of arbitrary size and distribution, because the gradient of the loss with respect to z can be computed separately for each sample. DeepSDF can therefore handle partial observations such as depth maps, which is a major advantage over auto-encoder frameworks, whose encoder expects test inputs similar to the training data.

The latent code z is fed to the network together with the sample position, both at the input layer and again at the 4th layer. Optimization uses Adam, and z is initialized randomly from N(0, 0.01^2).
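The inference step (MAP estimation of z with the decoder held fixed) can be tried on a toy problem. Everything in the sketch below is an assumption made for illustration: the decoder is a hand-written sphere whose radius is the scalar latent code rather than a trained network, the L1 loss is left unclamped, the data term is averaged over the samples, plain subgradient descent with a decaying step replaces Adam, and `SIGMA` is an arbitrary prior width.

```python
import math
import random

SIGMA = 10.0  # assumed prior std-dev; the regularizer weight is 1 / SIGMA**2

def decoder(z, x):
    """Toy stand-in for f_theta(z, x): a sphere whose radius is the latent code z."""
    return math.sqrt(sum(c * c for c in x)) - z

def map_estimate(samples, steps=500, lr=0.05, decay=0.99, seed=0):
    """Minimize mean_j |decoder(z, x_j) - s_j| + z^2 / SIGMA^2 over z only."""
    rng = random.Random(seed)
    z = rng.gauss(0.0, 0.01)  # initialised from N(0, 0.01^2)
    for _ in range(steps):
        grad = 0.0
        for x, s in samples:
            residual = decoder(z, x) - s
            # d|residual|/dz = sign(residual) * d(decoder)/dz, and d(decoder)/dz = -1
            if residual > 0:
                grad -= 1.0
            elif residual < 0:
                grad += 1.0
        grad = grad / len(samples) + 2.0 * z / SIGMA ** 2
        z -= lr * grad
        lr *= decay  # shrinking steps damp the oscillation around the L1 kink
    return z

# Partial observation: SDF samples from only one half of a sphere of radius 0.7.
true_radius = 0.7
samples = []
for i in range(20):
    angle = math.pi * (i / 19.0 - 0.5)  # restricts the points to x >= 0
    r = 0.5 + 0.02 * i                  # distance of the sample from the origin
    x = (r * math.cos(angle), r * math.sin(angle), 0.0)
    samples.append((x, r - true_radius))

z_hat = map_estimate(samples)
print(round(z_hat, 3))
```

Only half of the shape is observed, yet the recovered code describes the whole sphere, because the decoder (here trivially) carries the shape prior. That is the mechanism behind shape completion later in the post.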

 

✨ Summary ✨

1. Auto-decoder approach
- A single network learns many SDFs from latent codes z and point samples X.
- The SDF likelihood is expressed through the network fθ(z, x), and the loss function L measures the difference between predicted and true values.

2. Training and inference
- Training: optimize the joint log posterior to learn the network and the latent codes.
- Inference: compute the latent code of a given shape by MAP estimation.

3. Advantages
- Handles data of arbitrary size and distribution, including partial observations
- Unlike a conventional auto-encoder, needs no encoder, which makes it flexible about its input

4. Comparison with VAEs
- Both place a Gaussian prior on the latent codes, but the authors report that the stochastic optimization of a VAE did not give good training results in their experiments

 

 

Data Preparation

SDF samples are generated from 3D mesh data to prepare training data for the continuous SDF model. The procedure is as follows.

- Normalize each mesh to the unit sphere and sample 500,000 spatial points
- Sample more densely near the surface to capture fine detail
- Use virtual cameras and surface normals to extract correctly oriented surface points
- Discard meshes that are not closed or that have interior structure (treated as defective)
- Compute the SDF value for each sampled point and use the pairs as training data
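The sampling strategy in the list above (most samples close to the surface, the rest spread through the unit sphere) can be sketched as follows. An analytic sphere of radius 0.5 stands in for a normalized mesh, so the mesh-specific steps (virtual cameras, discarding defective meshes) are not shown, and `near_fraction` and `noise_std` are illustrative values rather than the paper's settings.

```python
import math
import random

def shape_sdf(p, radius=0.5):
    """Analytic stand-in for the SDF of a mesh normalized to fit in the unit sphere."""
    return math.sqrt(sum(c * c for c in p)) - radius

def random_unit_vector(rng):
    """Uniform random direction, obtained by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 1e-12:
            return [c / n for c in v]

def sample_sdf_pairs(n, near_fraction=0.9, noise_std=0.05, radius=0.5, seed=0):
    """Return n (point, sdf) pairs, most of them perturbed surface points."""
    rng = random.Random(seed)
    n_near = int(n * near_fraction)
    pairs = []
    for i in range(n):
        if i < n_near:
            # Surface point plus small Gaussian noise: dense coverage near the surface.
            d = random_unit_vector(rng)
            p = tuple(radius * c + rng.gauss(0.0, noise_std) for c in d)
        else:
            # Uniform point in the unit ball (rejection sampling): sparse coverage elsewhere.
            while True:
                p = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
                if sum(c * c for c in p) <= 1.0:
                    break
        pairs.append((p, shape_sdf(p, radius)))
    return pairs

pairs = sample_sdf_pairs(1000)
near = sum(1 for _, s in pairs if abs(s) < 0.15)
print(len(pairs), near)
```

For a real mesh, `shape_sdf` would be replaced by a distance query against the mesh surface, with the sign decided by the oriented surface points.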

 

Results

Experiments were run to demonstrate the representational power of DeepSDF.

This covers both its ability to describe geometric detail and its ability to generalize by learning a desirable shape embedding space.

Four main experiments test these abilities.

Experiments:
- Representing training data (ability to reproduce geometric detail)
- Reconstructing unseen shapes (ability to reconstruct new shapes using the learned feature representation)
- Completing partial shapes (ability to complete partial shapes using shape priors)
- Sampling the embedding space (ability to learn a smooth, complete shape embedding space and sample new shapes from it)

Dataset: ShapeNet

Baselines: OGN (a recent octree-based method), AtlasNet (a mesh-based method), 3D-EPN (a volumetric SDF-based shape completion method)

 

Representing Known 3D Shapes

Goal: evaluate how well DeepSDF represents known (training) shapes with a latent code of limited size

- On CD (Chamfer Distance, lower is better), DeepSDF outperforms OGN and AtlasNet
- On EMD (Earth Mover's Distance, lower is better) the gap is smaller, which is attributed to the small number of points (500) used to compute it
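For reference, the Chamfer distance between two point sets can be computed by brute force as below. Conventions differ between papers (squared or unsquared distances, sum or mean); this sketch assumes the mean of squared nearest-neighbour distances, summed over both directions, and it is O(N·M), so it only suits small point sets.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def chamfer_distance(set_a, set_b):
    """Symmetric Chamfer distance: mean squared nearest-neighbour distance, both ways."""
    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(set_a, set_b) + one_way(set_b, set_a)

same = chamfer_distance([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 0)])
shifted = chamfer_distance([(0, 0, 0)], [(1, 0, 0)])
partial = chamfer_distance([(0, 0, 0), (1, 0, 0)], [(0, 0, 0)])
print(same, shifted, partial)
```

The `partial` case shows why the metric is computed in both directions: the second set matches part of the first perfectly, and only the first-to-second term notices the missing point.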

Fig 5. shows that DeepSDF (left) is also better than OGN (right) qualitatively.

 

Representing Test 3D shapes (auto-encoding)

Goal: evaluate how well shapes from the test set, unseen during training, can be represented

- DeepSDF outperforms AtlasNet on most shape classes and metrics.
- AtlasNet works well on shapes without holes, but its performance degrades on shapes with many holes.

(Figure: on chairs with holes, AtlasNet's quality degrades.)
The two images on the right are DeepSDF failure cases, attributed to a lack of training data and to the latent-code minimization failing to converge.

 

Shape Completion

Goal: evaluate the ability to generate a complete shape from partial shape data

Method
1. Converting depth observations
- Generate SDF samples from the depth image
    - For each depth observation, sample two points at distance η from the surface along the surface normal, one on each side.
    - Their SDF values are taken to be η and -η respectively.
- Add free-space data
    - Sample points along the free space between the observed surface and the camera.
    - Constrain the SDF values at these points to be greater than 0.
- Loss function and optimization
    - MAP estimation is performed with the clamp value of Eq. 4 set to η.
    - The free-space loss is |fθ(z, x_j)| if fθ(z, x_j) < 0, and 0 otherwise. It is added to the MAP estimation (Eq. 10), which optimizes the latent code z while the network parameters θ stay fixed.
- The shape code estimated by the MAP optimization is passed through the decoder to generate the shape.
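The two ingredients of the completion setup, SDF samples built from depth points and the one-sided free-space penalty, are easy to write down. The sketch below assumes the observed points and their outward unit normals are already available as tuples; the function names and the value of `eta` are made up for illustration.

```python
def samples_from_depth(points, normals, eta=0.01):
    """Turn each observed surface point into two SDF samples.

    The point moved by +eta along the outward normal gets SDF +eta (outside),
    the point moved by -eta gets SDF -eta (inside).
    """
    samples = []
    for p, n in zip(points, normals):
        outside = tuple(pc + eta * nc for pc, nc in zip(p, n))
        inside = tuple(pc - eta * nc for pc, nc in zip(p, n))
        samples.append((outside, eta))
        samples.append((inside, -eta))
    return samples

def free_space_loss(pred):
    """Penalize a free-space point only when the predicted SDF is negative."""
    return abs(pred) if pred < 0 else 0.0

samples = samples_from_depth([(0.0, 0.0, 1.0)], [(0.0, 0.0, 1.0)], eta=0.01)
penalized = free_space_loss(-0.2)  # predicted inside, but known to be empty space
ignored = free_space_loss(0.3)     # already predicted outside: no penalty
print(samples, penalized, ignored)
```

Both terms would then be summed into the objective that is minimized over the latent code.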

Results
- DeepSDF gives more accurate and visually better completions than the existing volumetric method
- This shows the advantage of a continuous SDF representation

(Image: quantitative results)
(Image: qualitative results)

 

Latent Space Shape Interpolation

Goal: show that the learned shape embedding is complete and continuous

Method
- Interpolate between two shapes in the latent vector space
- Pass the interpolated latent vectors to the decoder and render the resulting shapes
- The interpolation results are shown in Fig 1.

Results
- Every shape generated along the interpolation remains a plausible shape
- Features such as chair armrests are interpolated linearly in the latent space
- This indicates that DeepSDF captures common, interpretable shape features effectively
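The interpolation itself is plain linear blending of two latent codes; all of the shape knowledge sits in the decoder that later consumes them. A minimal sketch, with made-up 2-D codes in place of real learned ones:

```python
def lerp(z_a, z_b, t):
    """Linear interpolation between two latent codes, t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

def interpolate_codes(z_a, z_b, steps):
    """Return steps + 1 codes going from z_a (t = 0) to z_b (t = 1)."""
    return [lerp(z_a, z_b, i / steps) for i in range(steps + 1)]

z_chair_a = [0.0, 2.0]  # hypothetical code of one shape
z_chair_b = [1.0, 4.0]  # hypothetical code of another shape
path = interpolate_codes(z_chair_a, z_chair_b, steps=4)
midpoint = lerp(z_chair_a, z_chair_b, 0.5)
print(path)
```

Each code in `path` would be passed to the decoder, and the zero-level-set extracted to get the in-between shape.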

 

 

Conclusion


- DeepSDF outperforms the compared methods on 3D shape representation and completion, represents complex topologies and closed surfaces effectively, and provides high-quality surface normals.
- Point-wise SDF evaluation is efficient, but shape completion (auto-decoding) requires explicit optimization of the latent vector, which makes inference considerably slower.
- Advantage: complex shapes are represented with less memory and at higher quality than with previous methods.