[Paper Review] Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images

by SolaKim 2024. 11. 6.

Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images

Nanyang Wang1 ⋆, Yinda Zhang2 ⋆, Zhuwen Li3 ⋆, Yanwei Fu4, Wei Liu5, Yu-Gang Jiang1 †
1 Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University
2 Princeton University
3 Intel Labs
4 School of Data Science, Fudan University
5 Tencent AI Lab

 

Abstract.

This paper proposes an end-to-end deep learning architecture that generates a 3D shape as a triangular mesh from a single color image. Earlier methods mostly represented 3D shapes as volumes or point clouds, but converting those representations into a mesh model that is usable in practice is not easy.

The main features of the proposed method are as follows.

1. Graph-based convolutional neural network
: Unlike earlier methods, the network represents the 3D mesh with a graph-based convolutional neural network and produces correct geometry, using perceptual features extracted from the input image along the way.

2. Progressive deformation
: Starting from an initial ellipsoid, the shape is deformed progressively. A coarse-to-fine strategy keeps the deformation stable.

3. Mesh-related losses
: Several losses are defined at different levels to keep the mesh visually appealing and physically accurate.

4. High 3D shape estimation accuracy
: Extensive experiments show that the method not only produces mesh models with finer detail but also estimates 3D shape more accurately than the existing state of the art.


 

Introduction

 

Inferring 3D shape comes naturally to humans, but it is a very hard problem in computer vision.
Deep learning has recently been used to generate 3D shapes from a single color image with good results, but existing methods stop at representing the shape as a volume or a point cloud and lose surface detail.

This paper proposes a new algorithm that extracts a 3D triangular mesh from a single image. Rather than synthesizing a mesh directly, the approach progressively deforms a mean shape toward the target shape. The advantages are:

1. Predicting a residual deformation is more effective for training a neural network than predicting a structured output directly
2. The shape can be refined step by step through several deformation stages
3. Prior knowledge can be encoded in the initial mesh, which makes the approach suitable for a variety of shapes

In particular, the paper shows that deforming an ellipsoid with fixed topology (a closed mesh with no holes) handles common objects such as cars, airplanes, and tables effectively.

The main challenges the paper has to solve are:

1. How to represent the mesh model
: A mesh is inherently an irregular graph, so to integrate features extracted from a 2D image, information has to be fused across two data modalities (2D image and 3D graph). => To do this, 1) a graph-based fully convolutional network (GCN) represents each vertex of the mesh as a node and regresses 3D positions by exchanging features between neighboring nodes. 2) A VGG-16-like architecture extracts features from the 2D image, and each GCN node pools features from its corresponding 2D image location. (<- I did not really understand this part...!) => In other words, VGG-16 is a deep learning model that extracts high-level features of the input image through several layers of 2D convolution and pooling. The extracted features are combined with the image information at the location of each 3D mesh vertex and used to learn the mesh deformation. This makes it possible to generate more accurate and detailed 3D objects.

2. Updating vertex positions
: Predicting a large number of vertices directly is error-prone, so a graph unpooling layer is introduced: the network starts with few vertices and adds more gradually, adding detail as it goes. This lets the network learn with a larger receptive field. (<- I still do not fully understand this part...!) => Starting with few vertices, the network first learns the overall global structure. As finer vertices are added later, it gradually learns local detail.

3. Improving training quality
: Thanks to the graph structure, higher-order losses can be defined across neighboring nodes to regularize the 3D shape. For example, a surface smoothness loss, an edge uniformity loss, and a Laplacian loss are defined so that the network produces good-quality meshes. (<- I still do not fully understand this part...!)

 

 


Related Work

 

1. Multi-view geometry (MVG) methods

- Main approaches: Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM).
- Limitations:
    - They need multiple viewpoints, so unseen parts cannot be reconstructed, and collecting enough views takes a long time
    - They struggle with non-Lambertian surfaces (e.g. reflective or transparent objects) and with textureless objects
- Because of these limits, learning-based methods started to attract attention

 

2. Learning-based approaches

- They mostly work from a single image or a few images and learn shape priors from data
- Deep learning architectures and large-scale datasets (e.g. ShapeNet) have driven major progress

 

3. Main techniques among learning-based approaches

- Shape retrieval and deformation: Huang et al. and Su et al. retrieve shape components from a dataset and deform them to fit the image. However, this is an ill-posed problem that is hard to solve
- 3D deformable models: Kar et al. proposed category-specific 3D deformable models, but they are limited to popular categories and lack detail
- Voxel-based reconstruction: most deep learning methods use 3D voxels, but GPU memory constraints keep the resolution low. Tatarchenko et al. reach higher resolution with an octree representation
- Point clouds: Fan et al. generate point clouds, but with no connectivity between points the output is hard to use directly for 3D mesh reconstruction
- Geometry images: some work represents 3D shapes as "geometry images" so that 2D convolutional networks can be used

 

4. Recent work and limitations

- Silhouette-based methods and methods that assemble models using large model repositories have also been proposed, but they either perform poorly on complex shapes or require a lot of resources

 

5. 3D reconstruction with graph neural networks (GNN)

- This work builds on graph neural networks (GNN) and is inspired by recent research that applies GNNs to shape analysis
- Charting-based methods: these apply convolution directly on the surface manifold, which suits mesh objects, but they have rarely been used for single-image 3D reconstruction

 


Method

 

Preliminary: Graph-based Convolution

1. Structure of a 3D mesh
: A 3D mesh is a structure made of vertices, edges, and faces.
It can be represented as a graph, written M = (V, E, F).

A feature vector is attached to each vertex of the graph.

 

2. Graph-based convolution
: A convolutional layer on an irregular graph is defined as in Eq. (1):

f_p^(l+1) = w0 f_p^l + Σ_{q ∈ N(p)} w1 f_q^l    (1)

Here f_p^l and f_p^(l+1) are the feature vectors on vertex p before and after the convolution, N(p) is the set of neighbors of p, and w0 and w1 are learnable weight matrices.

The key point of this update: the feature of vertex p is updated as a linear combination of its own feature (w0 f_p^l) and the sum of its neighbors' features (w1 f_q^l).

Running this operation updates the vertex features, which has an effect similar to applying a deformation to the mesh.

 

3. Feature vector
: The feature vector f_p attached to a vertex is the concatenation of the 3D vertex coordinates, a feature encoding of the 3D shape, and features learned from the input color image (if they exist).

=> This convolution is designed to preserve the shape of the mesh and to work well even on complex structures.
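
To make the update rule concrete, here is a minimal pure-Python sketch of one graph convolution layer. The function names (`matvec`, `graph_conv`), the toy triangle graph, and the weight values are made up for illustration. The paper's layers use learned weights, so treat this only as a reading aid for the per-vertex update.

```python
def matvec(w, f):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w_ij * f_j for w_ij, f_j in zip(row, f)) for row in w]


def graph_conv(features, neighbors, w0, w1):
    """One layer: new f_p = w0 f_p + sum over neighbors q of w1 f_q."""
    out = []
    for p, f_p in enumerate(features):
        new_f = matvec(w0, f_p)  # the vertex's own contribution
        for q in neighbors[p]:
            msg = matvec(w1, features[q])  # w1 is shared by all neighbors
            new_f = [a + b for a, b in zip(new_f, msg)]
        out.append(new_f)
    return out


# Toy graph (hypothetical): a triangle, every vertex adjacent to the other two.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
identity = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]

updated = graph_conv(features, neighbors, identity, half)
print(updated)  # [[1.5, 1.0], [1.0, 1.5], [1.5, 1.5]]
```

Each layer only mixes in direct neighbors, which is why the depth of the network matters for how far information can travel across the mesh.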

 

 

System Overview

 

<Fig 2> The model consists of three mesh deformation blocks (the Mesh Deformation Network). Each block in turn raises the mesh resolution and estimates the vertex positions.
1) Mesh deformation block: each block raises the mesh resolution based on the vertex positions estimated in the previous stage.
2) Feature extraction: the estimated vertex positions are used to extract perceptual features of the image from a 2D CNN. These features serve as input to the next deformation block so that it can deform the mesh more precisely.
=> In short, each block raises the mesh resolution, and the refined vertex positions are then used to pull more accurate features from the 2D CNN as input to the next block. In this way the model progressively generates a high-resolution, detailed 3D mesh.

 

The model in this paper is an end-to-end deep learning framework that takes a single color image as input and generates a 3D mesh model.

1. Components

- The full network consists of an image feature network and a hierarchical (cascaded) mesh deformation network.
- The image feature network is a 2D CNN that extracts features from the input image. The mesh deformation network uses these features to deform the mesh into the desired 3D model.

2. Mesh deformation network

- A graph-based convolutional network (GCN) in which three deformation blocks (Mesh Deformation) alternate with two graph unpooling layers.
- A Mesh Deformation block takes the graph representing the current mesh model as input and produces new vertex positions and features.
- A graph unpooling layer increases the number of vertices for finer detail while keeping the triangular mesh structure.

3. How it works

- The model starts from a small number of vertices, then gradually deforms the mesh and adds detail.

4. Loss functions

- In addition to the Chamfer distance loss, a surface normal loss, a Laplacian regularization loss, and an edge length loss are added to encourage stable deformation and accurate mesh generation.

 

Initial ellipsoid

The model starts the deformation from an initial ellipsoid, without any prior knowledge of the 3D shape.
The initial ellipsoid has the following properties.
- An ellipsoid of average size is used in camera coordinates, placed 0.8 m in front of the camera. Its three axes have radii of 0.2 m, 0.2 m, and 0.4 m.
- Initial mesh generation: the ellipsoid mesh is generated with Meshlab's implicit surface algorithm and has 156 vertices.

The initial ellipsoid is used as the input graph of the network, and its initial features contain only the 3D coordinates of the vertices. The network then gradually deforms this ellipsoid into the desired 3D shape.

 

Mesh deformation block

<Fig 3> 
(a) Mesh Deformation Block

1. Vertex positions C_{i-1}: the vertex positions of the current mesh model. They are used to look up image features.

2. Perceptual Feature Pooling:
    - The vertex positions C_{i-1} are projected onto the 2D image plane using the camera intrinsics.
    - Features are taken from the conv3_3, conv4_3, and conv5_3 layers of the VGG-16 network and pooled with bilinear interpolation.

* What is bilinear interpolation?
  An interpolation method that estimates a value in 2D space as a weighted sum of the four nearest surrounding points. As the name suggests, the value is computed through two rounds of linear interpolation. It is used especially often in image processing when scaling pixel values up or down.

Why it is used
- The position of a vertex on the image may not fall exactly on the center of a pixel. In that case, the four surrounding pixel values are used to get the feature value at that position.
- Bilinear interpolation computes a weighted sum of those four pixel values to obtain the feature value at the exact position.

3. Vertex features F_{i-1}:
    - The 3D vertex features passed on from the previous block.
    - They are concatenated with the pooled perceptual features (the circled + symbol in the figure) and used as the input to G-ResNet.

4. G-ResNet:
    - A graph-based ResNet made of 14 graph residual convolution layers.
    - From the input features it produces the new vertex positions C_i and the new vertex features F_i.

 

(b) Perceptual Feature Pooling

- The vertices of the 3D mesh are projected onto the 2D image plane using the camera intrinsics.

- Around each projected location, features from the VGG-16 conv3_3, conv4_3, and conv5_3 feature maps are pooled from the neighboring pixels with bilinear interpolation.

- The pooled perceptual features are combined with the 3D features of the vertex and fed into G-ResNet.
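
The projection and pooling steps can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the pinhole model with intrinsics `fx, fy, cx, cy`, the sample values, and the tiny 2x2 single-channel feature map. The real network samples three VGG-16 feature maps of different sizes, so the projected coordinates would also have to be rescaled to each map's resolution.

```python
import math


def project(vertex, fx, fy, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to pixel (u, v)."""
    x, y, z = vertex
    return fx * x / z + cx, fy * y / z + cy


def bilinear_sample(fmap, u, v):
    """Sample fmap[row][col] (a list of channels) at continuous position (u, v)."""
    h, w = len(fmap), len(fmap[0])
    u = min(max(u, 0.0), w - 1.0)  # clamp to the map so lookups stay valid
    v = min(max(v, 0.0), h - 1.0)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    a, b = u - x0, v - y0  # fractional offsets inside the pixel cell
    corners = zip(fmap[y0][x0], fmap[y0][x1], fmap[y1][x0], fmap[y1][x1])
    return [
        (1 - a) * (1 - b) * f00 + a * (1 - b) * f10
        + (1 - a) * b * f01 + a * b * f11
        for f00, f10, f01, f11 in corners
    ]


# Hypothetical intrinsics; a point on the optical axis lands on the principal point.
pixel = project((0.0, 0.0, 0.8), 250.0, 250.0, 112.0, 112.0)
print(pixel)  # (112.0, 112.0)

# 2x2 map with one channel; the cell center is the mean of the four corners.
fmap = [[[0.0], [1.0]], [[2.0], [3.0]]]
print(bilinear_sample(fmap, 0.5, 0.5))  # [1.5]
```

Because the weights `a` and `b` vary smoothly with the vertex position, the sampled feature changes smoothly as the vertex moves.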

 

Now that we have gone through Fig 3, let's look at the main steps.

1. Image feature extraction and pooling:

- A VGG-16 network (up to the conv5_3 layer) is used to extract features from the input image.
- The 3D coordinates of each vertex are projected onto the 2D image plane using the camera intrinsics.
- Around the projected coordinates, features are pooled from the four surrounding pixels with bilinear interpolation.
- In this pooling step the features from the conv3_3, conv4_3, and conv5_3 layers are concatenated, giving an image feature with 1280 dimensions in total (256 + 512 + 512).

2. Combining vertex features

- The pooled image features are concatenated with the 3D features of the input mesh (128 dimensions), giving a feature vector with 1408 dimensions in total (1280 + 128).
- In the first block there are no learned 3D features yet, so only the 3D coordinates (3 dimensions) are concatenated.

3. G-ResNet (Graph-based ResNet)

- The 1408-dimensional feature vector is passed to G-ResNet. G-ResNet is a deep graph-based residual network that predicts the new position and the 3D features of each vertex.
- G-ResNet consists of 14 graph residual convolution layers, each with 128 channels.
- The network is designed to exchange information between vertices efficiently. A basic graph convolution only exchanges information between neighboring vertices, so the network is made deep and shortcut connections are added to address the receptive field issue.
- At the end, an additional graph convolution layer is applied to output the new 3D coordinates of each vertex.

* What is the receptive field?
It is the number of other vertices from which the network can gather information at a given vertex.

In a basic graph convolution, each vertex can exchange information only with its direct neighbors. Because of this, the range over which information propagates may fail to grow as the network gets deeper, or stacking too many layers may cause information loss. This is the receptive field issue: as the network gets deeper or more complex, the range of information each vertex can refer to stays limited, which makes it hard to process information from a wider range efficiently.

To solve this, the receptive field has to be enlarged with shortcut connections or by increasing the depth of the network. That way information from more distant vertices can be exchanged quickly, and the network trains better.

A question I had here: how should we read the statement that the range of information does not grow when the network gets deeper?
=> In a graph convolutional network each layer exchanges information only with adjacent vertices, so the range over which information propagates does in fact grow as the network gets deeper. The problem is that the deeper the network, the more the information can fade or get distorted. Passing through many layers, the information is gradually diluted, or the network fails to capture relationships with distant vertices.

Shortcut connections (also called skip connections) are added to solve exactly this problem. By adding paths that skip layers, they keep information from vanishing even as the depth increases. In other words, shortcut connections preserve information better and let the network grow deeper without distorting it.

 

 

Graph unpooling layer

<Fig 4>
(a) Graph Unpooling 
- Black vertices and dashed edges are the vertices and edges newly added by unpooling.
- Face-based method: a new vertex is added at the center of each triangle (face) and connected to the three vertices of that triangle. However, this can leave the vertex degrees unbalanced.
- Edge-based method: a new vertex is added at the midpoint of each edge and connected to the two endpoints of that edge. In addition, the three vertices added within the same triangle are connected to each other, forming a new triangle. This keeps the vertex degrees more uniform.

(b) Comparison between Face-based and Edge-based Unpooling
- Initial mesh: the mesh before unpooling is applied. It has relatively few vertices and triangles.
- Face-based method: vertex degrees become unbalanced, producing an irregular structure.
- Edge-based method: vertex degrees stay even, producing a more uniform and regular structure.

The unpooling layer is a way to increase the number of vertices efficiently: start with few vertices and add more as needed. The face-based approach causes unbalanced vertex degrees, whereas the edge-based approach adds vertices in a balanced way and performs better. It borrows from mesh subdivision algorithms in computer graphics: a new vertex is added at the midpoint of each edge and connected to the existing vertices, which increases the vertex count while aiming for both memory efficiency and better performance.
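
Here is a small sketch of edge-based unpooling in plain Python, written from the description above rather than from the authors' code. The names `edge_unpool` and `counts_after_unpool` are hypothetical. The face count of 308 for the initial ellipsoid is not stated in the post; it is derived here from Euler's formula V - E + F = 2 for a closed mesh without holes (156 - 462 + F = 2).

```python
def edge_unpool(vertices, faces):
    """Split every triangle into four by adding a vertex at each edge midpoint."""
    vertices = [tuple(v) for v in vertices]
    midpoint_index = {}  # undirected edge -> index of its midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_index:  # shared edges get one midpoint only
            va, vb = vertices[a], vertices[b]
            vertices.append(tuple((x + y) / 2.0 for x, y in zip(va, vb)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # three corner triangles plus the center triangle of midpoints
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces


def counts_after_unpool(v, e, f):
    """Vertex, edge, and face counts after one edge-based unpooling step."""
    return v + e, 2 * e + 3 * f, 4 * f


# Toy check on a tetrahedron: 4 vertices, 6 edges, 4 faces.
tetra_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_f = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
new_v, new_f = edge_unpool(tetra_v, tetra_f)
print(len(new_v), len(new_f))  # 10 16

# Initial ellipsoid: 156 vertices, 462 edges, 308 faces (derived, see above).
stage2 = counts_after_unpool(156, 462, 308)
stage3 = counts_after_unpool(*stage2)
print(stage2[0], stage3[0])  # 618 2466
```

Two unpooling steps take the 156-vertex ellipsoid to 618 and then 2466 vertices, which matches the 2466 vertices quoted in the runtime numbers later in the post.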

 

 

Losses

Four losses are defined to constrain the properties of the output shape and the deformation procedure during mesh deformation. => In other words, the following losses are applied to guarantee nice-looking, natural results.

 

1. Chamfer loss

l_c = Σ_p min_q ‖p − q‖² + Σ_q min_p ‖p − q‖²

- The Chamfer distance measures the distance between two point sets (the predicted mesh and the ground-truth mesh): for each point in one set, the distance to its nearest point in the other set.
- The formula has two parts.
     1. The distance from each predicted vertex p to the nearest ground-truth vertex q
     2. The distance from each ground-truth vertex q to the nearest predicted vertex p
- This loss pulls the predicted vertices close to the ground-truth vertices.

=> Problem: the Chamfer loss is useful for regressing vertex positions accurately, but it may not capture surface smoothness or higher-order properties. For example, it does not reflect fine curvature information or surface consistency, so vertices merely being close by is not enough to produce a high-quality 3D mesh.
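
The two-sided nearest-neighbor sum is easy to write down by brute force. This is a minimal sketch assuming squared Euclidean distances and tiny point sets; the function names and sample points are made up, and a real implementation would run on the GPU instead of looping in Python.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def chamfer_loss(pred, gt):
    """Sum of nearest-neighbor squared distances in both directions."""
    pred_to_gt = sum(min(sq_dist(p, q) for q in gt) for p in pred)
    gt_to_pred = sum(min(sq_dist(p, q) for p in pred) for q in gt)
    return pred_to_gt + gt_to_pred


pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_loss(pred, gt))  # 1.0
```

In this toy case every predicted point sits exactly on a ground-truth point, so the first term is zero. The whole loss comes from the ground-truth point (1, 1, 0) that no predicted point covers, which is why the second direction is needed.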

 

2. Normal loss

l_n = Σ_p Σ_{q = argmin_q ‖p − q‖²} ‖⟨p − k, n_q⟩‖², with k ∈ N(p)

q: the ground-truth vertex closest to p, found while computing the Chamfer loss.
k: a neighboring vertex of p.
n_q: the surface normal observed in the ground truth at q.

Purpose of the normal loss
- This loss optimizes with respect to the normal direction. In particular, it tries to make the edges between vertex p and its neighbors consistent with the normal direction.
- Concretely, it pushes the vector between p and k to be perpendicular to the normal given by the ground truth (n_q).
- "Perpendicular" here means that the inner product of the vector between p and k with the ground-truth surface normal is driven toward 0.
- Optimization therefore moves toward fitting the tangent plane of the surface well, and since the loss is differentiable it is useful for training.
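
As a rough sketch of the normal loss, the following plain Python follows the description above: find the nearest ground-truth point for each predicted vertex, then penalize the squared inner product between each outgoing edge and that point's normal. The data layout (`neighbors` as a dict of index lists) and the toy values are assumptions for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normal_loss(pred, neighbors, gt_points, gt_normals):
    """Penalize edges of the predicted mesh that are not perpendicular to n_q."""
    total = 0.0
    for p_idx, p in enumerate(pred):
        # q: index of the ground-truth point nearest to p
        q_idx = min(range(len(gt_points)), key=lambda i: sq_dist(p, gt_points[i]))
        n_q = gt_normals[q_idx]
        for k_idx in neighbors[p_idx]:
            edge = [a - b for a, b in zip(p, pred[k_idx])]  # p - k
            dot = sum(e * n for e, n in zip(edge, n_q))
            total += dot ** 2
    return total


gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt_normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]  # flat surface facing +z
neighbors = {0: [1], 1: [0]}

flat = normal_loss([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], neighbors, gt_points, gt_normals)
tilted = normal_loss([(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)], neighbors, gt_points, gt_normals)
print(flat, tilted)  # 0.0 0.5
```

An edge lying in the ground-truth tangent plane costs nothing, while the tilted edge is penalized from both of its endpoints.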

💡 Backpropagation
The phrase "since the loss is differentiable it is useful for training" is about backpropagation.

-> Backpropagation is the algorithm that updates the weights of a neural network by propagating the error backward. In the process, the derivative of the loss function is computed so that the weights can be optimized.
-> Being differentiable means the loss is a continuous function whose gradient can be computed. The derivative of the loss is the key piece of information that tells us how the network's weights (or parameters) should change.

-> If the loss were not differentiable, the gradient could not be computed and backpropagation could not be used. Training the model would then become difficult or impossible.

 

 

3. Regularization

Problem: local minimum

- Even with the Chamfer loss and the normal loss, the 3D mesh model can get stuck in a local minimum so that optimization does not work properly.
In particular, when the initial estimate is far from the ground truth (that is, when the network outputs a wrong result), excessive deformation can occur, which leads to vertices moving abnormally (flying vertices).

Laplacian Regularization

- Laplacian regularization prevents vertices from moving too freely. This helps avoid self-intersection of the mesh model.

- This regularizer encourages neighboring vertices to move in the same way, which helps preserve detail.

- At first (in the first deformation block) the input is a smooth ellipsoid, so the term keeps the surface smooth. In later blocks it prevents excessive deformation and helps the network add only fine detail.

- The Laplacian coordinate is defined for each vertex p as its offset from the mean position of its neighbors, and the loss measures how much this coordinate changes before and after deformation:

δ_p = p − (1/|N(p)|) Σ_{k ∈ N(p)} k

- The Laplacian regularization loss is defined as follows, where δ'_p and δ_p are the Laplacian coordinates of a vertex after and before a deformation block:

l_lap = Σ_p ‖δ'_p − δ_p‖²
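
A short plain-Python sketch of the Laplacian term, under the assumption that each vertex's Laplacian coordinate is its offset from the mean of its neighbors and that the loss sums the squared change of that offset across one deformation. Function names and the toy triangle are illustrative only.

```python
def laplacian_coords(vertices, neighbors):
    """Offset of each vertex from the mean position of its neighbors."""
    coords = []
    for p_idx, p in enumerate(vertices):
        nbrs = [vertices[k] for k in neighbors[p_idx]]
        mean = [sum(axis) / len(nbrs) for axis in zip(*nbrs)]
        coords.append([a - m for a, m in zip(p, mean)])
    return coords


def laplacian_loss(before, after, neighbors):
    """Squared change of the Laplacian coordinates across a deformation."""
    d_before = laplacian_coords(before, neighbors)
    d_after = laplacian_coords(after, neighbors)
    return sum(
        sum((a - b) ** 2 for a, b in zip(da, db))
        for da, db in zip(d_after, d_before)
    )


neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# All vertices move together: local shape is unchanged, so no penalty.
shifted = [(x + 0.5, y + 0.5, z + 0.5) for x, y, z in before]
rigid = laplacian_loss(before, shifted, neighbors)

# One vertex moves on its own: penalized.
lifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
lone = laplacian_loss(before, lifted, neighbors)
print(rigid, lone)  # 0.0 1.5
```

The comparison shows the intended behavior: neighbors moving in the same way cost nothing, while a single vertex breaking away from its neighborhood is penalized.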

 

Edge Length Regularization

- Flying vertices usually produce long edges, so edge length regularization is introduced to penalize them.

- This loss keeps the spacing between neighboring vertices from growing too large, so that the mesh model is not deformed excessively. The edge length regularization loss is defined as follows:

l_loc = Σ_p Σ_{k ∈ N(p)} ‖p − k‖²
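
The edge length term is the simplest of the four to sketch. The version below sums squared distances over every (vertex, neighbor) pair, so each undirected edge is counted twice; that double counting is an assumption of this sketch and only changes the result by a constant factor.

```python
def edge_length_loss(vertices, neighbors):
    """Sum of squared edge lengths over all (vertex, neighbor) pairs."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, vertices[k]))
        for p_idx, p in enumerate(vertices)
        for k in neighbors[p_idx]
    )


neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
compact = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
flying = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 10.0, 0.0)]  # one vertex far away

print(edge_length_loss(compact, neighbors))  # 8.0
print(edge_length_loss(flying, neighbors))   # 404.0
```

A single vertex that flies away stretches its edges and makes the loss jump, which is the behavior this regularizer is meant to discourage.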

 

Overall Loss

- The overall loss is a weighted sum of the individual losses: l_all = l_c + λ1 l_n + λ2 l_lap + λ3 l_loc. Each term has its own weight, which controls how much that term matters during training.

- Using several regularizers together in this way leads to stable, natural 3D mesh deformation.

 

 

Experiment

 

Data.

Data source: the ShapeNet-based dataset (ShapeNet is a large-scale collection of 3D CAD models) provided by the earlier paper of Choy et al.

Data composition: 50,000 3D CAD models in 13 object categories (e.g. car, chair, airplane)

Rendered images: each 3D CAD model is rendered into 2D images from various camera viewpoints

 

Evaluation Metric.

1. F-Score : 
- The harmonic mean of precision and recall
- How it is computed
    - Sample points from the result (prediction) and from the ground truth
    - For each sampled point, find its nearest neighbor in the other set (ground truth or prediction)
    - If the other set has a point within a threshold tau, the point counts as matched
- Precision: the fraction of predicted points that are matched to a ground-truth point
- Recall: the fraction of ground-truth points that are matched to a predicted point
- Higher is better
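
The F-score procedure can be sketched directly from these steps. The sketch assumes plain Euclidean distance compared against `tau` and uses made-up points; the exact distance convention and threshold values used in the paper's evaluation are not given in this post, so treat those as placeholders.

```python
import math


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau."""

    def matched_fraction(src, dst):
        # fraction of src points with a dst point closer than tau
        hits = sum(1 for s in src if min(math.dist(s, d) for d in dst) < tau)
        return hits / len(src)

    precision = matched_fraction(pred, gt)
    recall = matched_fraction(gt, pred)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

score = f_score(pred, gt, tau=0.1)
print(round(score, 4))  # 0.5714
```

Here two of three predicted points are matched (precision 2/3) and two of four ground-truth points are covered (recall 1/2), giving an F-score of 4/7.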

2. Chamfer Distance (CD):
- The average point distance between two point sets (prediction and ground truth)
- Measures the overall similarity of the two point sets
- Lower is better

3. Earth Mover's Distance (EMD):
- Measures the overall similarity of the two point sets
- Lower is better

Limitation of the existing metrics: they focus on point-to-point distance or occupancy, and do not reflect higher-order surface properties (e.g. continuity, smoothness, high-order details).

 

Baselines (models compared against).

1. 3D-R2N2 (Choy et al., 2016):
- Generates a 3D volume from the input image
- The output is voxel-based; it represents the 3D shape, but the low resolution limits how much detail it can express
- To compare with the proposed method, the voxels have to be converted to a mesh with the Marching Cubes algorithm

2. PSG (Fan et al., 2017):
- Generates a point cloud from the input image
- A point cloud represents the object as points in 3D space; it can express detail but may lack surface structure
- The evaluation metrics are defined on point clouds, so PSG's output can be evaluated directly

3. Neural 3D Mesh Renderer (N3MR, 2018):
- A deep-learning-based model that generates a mesh from the input image
- The only mesh generation model with public code, so a direct comparison is possible

All models are trained on the same dataset with the same train/test split.
All models are trained for the same amount of time, which keeps the comparison fair.

1. Point-cloud-based evaluation:
- The outputs of all models are converted to point clouds and evaluated on the same footing

2. Why evaluate on point clouds?
- Metrics such as Chamfer Distance (CD) and Earth Mover's Distance (EMD) are defined on point clouds.

 

Training and Runtime.

1. Training setup:
- Input image: 224 x 224
- Initial mesh: an initial ellipsoid with 156 vertices and 462 edges
- Implementation: TensorFlow
- Optimizer: Adam, with weight decay of 1 × 10^-5 to prevent overfitting
- Batch size: 1
- Schedule: 50 training epochs, initial learning rate 3 × 10^-5, lowered to 1 × 10^-5 after epoch 40 for stable optimization
- Total training time: 72 hours on an NVIDIA Titan X GPU

2. Runtime:
- Mesh generation at test time: 15.58 ms on average to generate a mesh with 2466 vertices

 

 

 

Comparison to state of the art

 F-score (Tab. 1)

Performance of the proposed model:
- In most categories, Ours records a higher F-score than all other methods
- In particular, at the small threshold tau it shows an F-score at least 10% higher than the other models.
- Doing well at small tau means fine detail is reconstructed precisely.

Exception:
- It falls behind another model in the Watercraft category.

Performance of N3MR:
- About 50% lower performance.
- Cause: N3MR learns only from the silhouette signal of the image and does not handle the 3D mesh explicitly.

 

CD and EMD (Tab. 2)

Performance of the proposed model:
- It records the lowest CD and EMD in most categories and the best mean score (0.591)

Comparison with PSG:
- PSG is point-cloud-based and has many degrees of freedom, so in some cases it gets lower CD and EMD values
- However, without proper regularization this freedom hurts the quality of the mesh model

 

 

Qualitative results (Fig. 8)

Limitations of the other models

1. 3D-R2N2:
- Lacks detail because of its low resolution
- Example: details such as chair legs are not reconstructed
- Octree-based attempt: resolution was increased, but recovering surface detail was still difficult

2. PSG:
- Generates a sparse point cloud
- The Chamfer loss acts like a regression loss, leaving too many degrees of freedom, which makes it hard to recover a mesh

3. N3MR:
- Generates very rough shapes: this may be enough for simple rendering tasks, but it is not suitable for reconstructing complex objects such as chairs and tables

 

Strengths of the proposed model

1. Mesh representation:
- Using a mesh means it is not limited by resolution
- It overcomes the memory limit and captures both smooth surfaces and local details

2. Perceptual feature integration:
- It makes effective use of the perceptual features of the input image to recover detail

3. Carefully defined losses for training:
- Well-designed losses such as the Chamfer loss, normal loss, and Laplacian regularization lead to stable training

 

Ablation Study (์„ฑ๋ถ„ ๋ถ„์„ ์‹คํ—˜)

Ablation Study ๋Š” ๋”ฅ๋Ÿฌ๋‹, ๋จธ์‹ ๋Ÿฌ๋‹, ๋˜๋Š” ์‹œ์Šคํ…œ ์„ค๊ณ„์—์„œ ํŠน์ • ์š”์†Œ๋ฅผ ์ œ๊ฑฐํ•˜๊ฑฐ๋‚˜ ๋ณ€๊ฒฝํ•œ ํ›„, ๊ฒฐ๊ณผ๋ฅผ ๋น„๊ตํ•˜์—ฌ ๊ฐ ์š”์†Œ์˜ ์ค‘์š”์„ฑ์„ ๋ถ„์„ํ•˜๋Š” ์—ฐ๊ตฌ ๋ฐฉ๋ฒ•์ด๋‹ค.

<Limits of the quantitative evaluation>
- According to Tab. 3, the model without edge length regularization scores best on F-score, CD, and EMD
- However, Fig. 5 shows that it produces the visually worst meshes
- Example: the mesh is abnormally distorted or suffers from flying vertices

<Importance of the qualitative evaluation>
- The quantitative metrics measure only point-to-point distances or occupancy and do not reflect the visual quality of a mesh (e.g. smooth surfaces, preserved details)
- The visual results in Fig. 5 show how each component contributes to the quality of the 3D mesh
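To make the "point distances only" limitation concrete, here is a rough sketch of an F-score computed on sampled points. The threshold `tau`, its application to squared distances, and the toy points are assumptions for illustration, not the paper's exact evaluation protocol.

```python
import numpy as np

def f_score(pred: np.ndarray, gt: np.ndarray, tau: float) -> float:
    """Harmonic mean of precision and recall at distance threshold tau."""
    # Pairwise squared distances between predicted and ground-truth points.
    d2 = ((pred[:, None, :] - gt[None, :, :]) ** 2).sum(axis=-1)
    # Precision: fraction of predicted points that have a close ground-truth point.
    precision = (d2.min(axis=1) < tau).mean()
    # Recall: fraction of ground-truth points that have a close predicted point.
    recall = (d2.min(axis=0) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))

# Toy example (hypothetical points): one of two points matches in each direction.
pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
gt = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
print(f_score(pred, gt, tau=0.01))  # 0.5
```

The function never sees edges or faces, so two meshes with the same sampled points but very different surface quality receive the same score.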

<Removed components>

1. Graph Unpooling

- The graph unpooling layers are removed so that the number of vertices stays the same in every block

- That is, the architecture no longer follows the original coarse-to-fine scheme of adding vertices progressively.

- Result:
    - Errors become more likely while deforming the mesh in the early stage:
        - Errors made at the beginning cannot be corrected by later blocks.
    - As a result, noticeable artifacts appear in some parts of the object.

- Significance: graph unpooling adds vertices gradually so the deformation can be refined step by step, which makes it essential for stable training
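A minimal sketch of edge-based unpooling on vertex positions, assuming a triangle mesh stored as NumPy arrays: one new vertex is placed at the midpoint of every edge, and each triangle is split into four. The function name and the tetrahedron example are illustrative; in the actual network the same averaging would also apply to vertex features.

```python
import numpy as np

def edge_unpool(vertices: np.ndarray, faces: np.ndarray):
    """Add one vertex per edge midpoint and split every triangle into four."""
    new_vertices = [v for v in vertices]
    midpoint_index = {}  # undirected edge (i, j) -> index of its midpoint vertex

    def midpoint(i: int, j: int) -> int:
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            midpoint_index[key] = len(new_vertices)
            new_vertices.append((vertices[i] + vertices[j]) / 2.0)
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces.tolist():
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central triangle of midpoints.
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(new_vertices), np.array(new_faces)

# Tetrahedron: 4 vertices, 6 edges, 4 faces.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
faces = np.array([[0, 1, 2], [0, 3, 1], [0, 2, 3], [1, 3, 2]])
v2, f2 = edge_unpool(verts, faces)
print(v2.shape, f2.shape)  # (10, 3) (16, 3)
```

Removing this step, as in the ablation, means the network must deform the final number of vertices right from the first block.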

 

2. G-ResNet (Shortcut Connections)

- The shortcut (residual) connections in G-ResNet are removed, turning it into a plain Graph Convolutional Network (GCN)

- Result:
    - Performance drops sharply on every metric in Tab. 3: the network fails to optimize the Chamfer distance
    - Cause:
        - The degradation problem, also observed in 2D CNNs:
            - A network that is too deep becomes hard to train, so the training error rises, and the testing error rises with it
            - The proposed model has 42 graph convolutional layers, so the effect is especially pronounced

- Significance:
    - Shortcut connections counter this degradation and ease gradient flow, giving stable training even in a deep network
    - The shortcut connections in G-ResNet are a core component of the 3D mesh reconstruction pipeline
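The effect of the shortcut can be sketched with a toy residual graph-convolution block. This is an assumption-laden illustration, not the paper's layer: the names `graph_conv` and `residual_block`, the two-layer block, the ReLU, and the plain `x + h` shortcut are all choices made here.

```python
import numpy as np

def graph_conv(x, adj, w_self, w_neigh):
    """Each vertex mixes its own feature with the sum of its neighbours' features."""
    return x @ w_self + adj @ x @ w_neigh

def residual_block(x, adj, params, use_shortcut=True):
    """Two graph convolutions with ReLU; optionally add the input back (shortcut)."""
    h = np.maximum(graph_conv(x, adj, *params[0]), 0.0)
    h = np.maximum(graph_conv(h, adj, *params[1]), 0.0)
    return x + h if use_shortcut else h

# Triangle graph with 3 vertices and 4-dimensional features (toy values).
adj = np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
x = np.arange(12, dtype=float).reshape(3, 4)
zeros = np.zeros((4, 4))
zero_params = [(zeros, zeros), (zeros, zeros)]

# With all-zero weights the shortcut block is the identity,
# while the plain block wipes out the signal entirely.
print(np.allclose(residual_block(x, adj, zero_params), x))            # True
print(np.allclose(residual_block(x, adj, zero_params, False), 0.0))   # True
```

The point of the demo is that a residual block only has to learn a correction on top of the identity, which is one common explanation for why very deep stacks remain trainable.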

 

3. Loss Terms

A. Removing the normal loss
- Result:
    - Surface smoothness and local details are severely degraded
    - Example: the details of the chair's seat back disappear
- Significance:
    - The normal loss keeps the surface normals consistent, which plays an important role in recovering smooth surfaces and fine details
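A rough NumPy sketch of a normal loss of this kind, assuming the usual formulation: for each predicted vertex, take the normal of its nearest ground-truth point and penalize the squared inner product between that normal and every edge leaving the vertex. The function signature, the neighbour-list format, and the toy triangle are assumptions for illustration.

```python
import numpy as np

def normal_loss(pred, neighbors, gt_points, gt_normals):
    """Penalize edges that are not perpendicular to the nearest ground-truth normal."""
    # Nearest ground-truth point for every predicted vertex.
    d2 = ((pred[:, None, :] - gt_points[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)
    loss = 0.0
    for p, ks in enumerate(neighbors):
        n_q = gt_normals[nearest[p]]
        for k in ks:
            # Edge from neighbour k to vertex p, projected onto the normal.
            loss += float(np.dot(pred[p] - pred[k], n_q) ** 2)
    return loss

# Ground truth: a flat triangle in the z = 0 plane, all normals pointing up.
gt_points = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
gt_normals = np.array([[0., 0., 1.]] * 3)
neighbors = [[1, 2], [0, 2], [0, 1]]

flat = gt_points.copy()
bent = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 1.]])  # one vertex lifted
print(normal_loss(flat, neighbors, gt_points, gt_normals))  # 0.0
print(normal_loss(bent, neighbors, gt_points, gt_normals))  # 4.0
```

Edges lying in the tangent plane cost nothing, while an edge that tilts out of it is penalized, which is why dropping this term lets the surface become bumpy.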

B. Removing the Laplacian term
- Result:
    - The geometry self-intersects
    - Example: the chair's armrest ("hand held") intersects itself or takes a wrong shape
- Significance:
    - The Laplacian term preserves the position of each vertex relative to its neighbors and so enables stable deformation
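A minimal sketch of a Laplacian regularizer, assuming the common definition where the Laplacian coordinate of a vertex is its offset from the mean of its neighbours and the loss compares these coordinates before and after a deformation. Function names and the toy triangle are illustrative.

```python
import numpy as np

def laplacian_coords(verts, neighbors):
    """Offset of every vertex from the centroid of its neighbours."""
    return np.stack([verts[p] - verts[ks].mean(axis=0)
                     for p, ks in enumerate(neighbors)])

def laplacian_loss(before, after, neighbors):
    """Sum of squared changes in Laplacian coordinates caused by a deformation."""
    diff = laplacian_coords(after, neighbors) - laplacian_coords(before, neighbors)
    return float((diff ** 2).sum())

before = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
neighbors = [[1, 2], [0, 2], [0, 1]]

shifted = before + 0.5          # whole mesh moves together
lifted = before.copy()
lifted[2, 2] = 1.0              # a single vertex moves on its own
print(laplacian_loss(before, shifted, neighbors))  # 0.0
print(laplacian_loss(before, lifted, neighbors))   # 1.5
```

Moving the whole neighbourhood together is free, while a vertex that moves independently of its neighbours is penalized, which discourages the self-intersections described above.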

C. Removing the edge length term
- Result:
    - Flying vertices and abnormally long edges appear and ruin the surface
    - Example: the mesh surface as a whole fails to form properly
- Significance:
    - The edge length term regulates the spacing between connected vertices, which keeps the surface intact and prevents distortion
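The edge length term can be sketched as the sum of squared lengths of all edges. In this illustrative version the neighbour lists are symmetric, so every undirected edge is counted twice; that convention, the function name, and the toy triangle are assumptions.

```python
import numpy as np

def edge_length_loss(verts, neighbors):
    """Sum of squared distances between every vertex and each of its neighbours."""
    return float(sum(((verts[p] - verts[k]) ** 2).sum()
                     for p, ks in enumerate(neighbors) for k in ks))

neighbors = [[1, 2], [0, 2], [0, 1]]
compact = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
flying = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 10.]])  # one vertex far away

print(edge_length_loss(compact, neighbors))  # 8.0
print(edge_length_loss(flying, neighbors))   # 408.0
```

A single vertex that drifts away stretches all of its edges, so the penalty grows quickly and pulls the outlier back toward the surface.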

 

Number of Deformation Blocks

What is a deformation block?
- A deformation block is one stage in the progressive deformation of the mesh, which starts from the initial mesh (an ellipsoid).
- Each block increases the number of vertices and edges and adds new detail, raising the resolution of the 3D mesh.

The left plot above shows that performance improves as the number of blocks increases: the F-score rises and the Chamfer distance falls.
However, the gain is nearly saturated when the number of blocks goes from 3 to 4.

More blocks mean more vertices and edges, so the mesh can become finer. But this comes at the cost of speed and efficiency, since computation time and complexity grow considerably, so the paper uses 3 blocks.
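To get a feel for how fast the mesh grows, here is a small sketch that assumes a closed, genus-0 triangle mesh and edge-based unpooling (one new vertex per edge). Under those assumptions Euler's formula gives E = 3V - 6, so each unpooling step maps V to 4V - 6. The starting size of 12 vertices (an icosahedron) is a hypothetical example, not the paper's initial ellipsoid.

```python
def counts_after_unpooling(num_vertices: int, num_unpoolings: int) -> list[int]:
    """Vertex counts after repeated edge-based unpooling of a closed genus-0 mesh."""
    counts = [num_vertices]
    for _ in range(num_unpoolings):
        v = counts[-1]
        edges = 3 * v - 6          # Euler's formula for a closed triangle mesh
        counts.append(v + edges)   # one new vertex per edge
    return counts

print(counts_after_unpooling(12, 3))  # [12, 42, 162, 642]
```

The vertex count roughly quadruples with every extra step, which matches the intuition that a fourth block adds a lot of computation for a small gain.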

 

 

Reconstructing Real-World images

- The proposed model was trained on ShapeNet, a synthetic dataset.
- When tested on real-world images, the model was used as trained, without any additional fine-tuning.
- Result: the model also performed strongly on real data.

 

 

Conclusion

- The proposed approach successfully generates high-quality 3D triangular meshes from a single image.
<Network design>
- Network architecture:
    - A very deep, cascaded graph convolutional neural network
    - Shortcut connections (residual connections) improve training stability and performance
- Mesh refinement:
    - The network is trained end-to-end with the Chamfer loss and the normal loss
    - The mesh is refined progressively, stage by stage.

- By exploiting the advantages of the mesh representation, the method outperforms earlier state-of-the-art approaches based on 3D volumes and 3D point clouds.