Three-dimensional shape space learning for visual concept construction: challenges and research progress

Xin TONG

doi:10.1631/FITEE.2200318

Special Column on Visual Knowledge | Updated：2022-10-14

- Frontiers of Information Technology & Electronic Engineering Vol. 23, Issue 9, Pages: 1290-1297(2022)
- Affiliation: Microsoft Research Asia, Beijing 100080, China
- DOI: 10.1631/FITEE.2200318
- Received: 2022-07-26; Accepted: 2022-08-26; Published: 2022-09
Xin Tong. Three-dimensional shape space learning for visual concept construction: challenges and research progress[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(9): 1290-1297. DOI: 10.1631/FITEE.2200318.

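Abstract

Humans can skillfully classify real-world objects by shape or function, and form in their minds a visual concept of each object category together with visual knowledge of the surrounding real world (Pan, 2019). Pan (2021) pointed out that building computational representations of these visual concepts and this visual knowledge is a key step in developing next-generation artificial intelligence. Learning the three-dimensional (3D) shape space of all objects that share a visual concept is, in turn, a key step toward a computational representation of that concept. This paper presents the key technical challenges in 3D shape space learning, reviews research progress in the field around these challenges, and finally discusses research trends and future directions for 3D shape space learning.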

Keywords

Visual concept; Visual knowledge; 3D geometry learning; 3D shape space; 3D structure

References

Bai S, Bai X, Zhou ZC, et al., 2016. GIFT: a real-time and scalable 3D shape search engine. IEEE Conf on Computer Vision and Pattern Recognition, p.5023-5032. doi: 10.1109/CVPR.2016.543

Cao C, Weng YL, Zhou S, et al., 2014. FaceWareHouse: a 3D facial expression database for visual computing. IEEE Trans Visual Comput Graph, 20(3):413-425. doi: 10.1109/TVCG.2013.249

Chan ER, Monteiro M, Kellnhofer P, et al., 2021. pi-GAN: periodic implicit generative adversarial networks for 3D-aware image synthesis. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5799-5809. doi: 10.1109/CVPR46437.2021.00574

Chen ZQ, Zhang H, 2019. Learning implicit fields for generative shape modeling. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5932-5941. doi: 10.1109/CVPR.2019.00609

Deng Y, Yang JL, Tong X, 2021. Deformed implicit field: modeling 3D shapes with learned dense correspondence. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10286-10296. doi: 10.1109/CVPR46437.2021.01015

Deng Y, Yang J, Xiang J, et al., 2022. GRAM: generative radiance manifolds for 3D-aware image generation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10673-10683.

Egger B, Smith WA, Tewari A, 2020. 3D morphable face models past, present, and future. ACM Trans Graph, 39(5):157. doi: 10.1145/3395208

Gadelha M, Maji S, Wang R, 2017. 3D shape induction from 2D views of multiple objects. Int Conf on 3D Vision, p.402-411. doi: 10.1109/3DV.2017.00053

Groueix T, Fisher M, Kim VG, et al., 2018. A Papier-Mache approach to learning 3D surface generation. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.216-224. doi: 10.1109/CVPR.2018.00030

Hughes JF, van Dam A, McGuire M, et al., 2013. Computer Graphics: Principles and Practice (3rd Ed.). Addison-Wesley, Upper Saddle River, USA.

Jiang C, Huang J, Tagliasacchi A, et al., 2020. ShapeFlow: learnable deformation flows among 3D shapes. Advances in Neural Information Processing Systems 33, p.9745-9757.

Jin YW, Jiang DQ, Cai M, 2020. 3D reconstruction using deep learning: a survey. Commun Inform Syst, 20(4):389-413. doi: 10.4310/CIS.2020.v20.n4.a1

Li X, Dong Y, Peers P, et al., 2019. Synthesizing 3D shapes from silhouette image collections using multi-projection generative adversarial networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5530-5539. doi: 10.1109/CVPR.2019.00568

Liu F, Liu XM, 2020. Learning implicit functions for topology-varying dense 3D shape correspondence. Proc 34th Int Conf on Neural Information Processing Systems, p.4823-4834.

Loper M, Mahmood N, Romero J, et al., 2015. SMPL: a skinned multi-person linear model. ACM Trans Graph, 34(6):248. doi: 10.1145/2816795.2818013

Lun ZL, Gadelha M, Kalogerakis E, et al., 2017. 3D shape reconstruction from sketches via multi-view convolutional networks. Proc Int Conf on 3D Vision, p.67-77. http://arxiv.org/abs/1707.06375

Masci J, Boscaini D, Bronstein MM, et al., 2015. Geodesic convolutional neural networks on Riemannian manifolds. Proc IEEE Int Conf on Computer Vision Workshop, p.832-840. doi: 10.1109/ICCVW.2015.112

Měch R, Prusinkiewicz P, 1996. Visual models of plants interacting with their environment. Proc 23rd Annual Conf on Computer Graphics and Interactive Techniques, p.397-410. doi: 10.1145/237170.237279

Mescheder L, Oechsle M, Niemeyer M, et al., 2019. Occupancy networks: learning 3D reconstruction in function space. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4455-4465. doi: 10.1109/CVPR.2019.00459

Mo KC, Zhu SL, Chang AX, et al., 2019. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.909-918. doi: 10.1109/CVPR.2019.00100

Müller P, Wonka P, Haegler S, et al., 2006. Procedural modeling of buildings. ACM SIGGRAPH Papers, p.614-623. doi: 10.1145/1141911.1141931

Niu CJ, Li J, Xu K, 2018. Im2Struct: recovering 3D shape structure from a single RGB image. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4521-4529. doi: 10.1109/CVPR.2018.00475

Pan YH, 2019. On visual knowledge. Front Inform Technol Electron Eng, 20(8):1021-1025. doi: 10.1631/FITEE.1910001

Pan YH, 2021a. Miniaturized five fundamental issues about visual knowledge. Front Inform Technol Electron Eng, 22(5):615-618. doi: 10.1631/FITEE.2040000

Pan YH, 2021b. On visual understanding. Front Inform Technol Electron Eng, early access. doi: 10.1631/FITEE.2130000

Park JJ, Florence P, Straub J, et al., 2019. DeepSDF: learning continuous signed distance functions for shape representation. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.165-174. doi: 10.1109/CVPR.2019.00025

Paschalidou D, Katharopoulos A, Geiger A, et al., 2021. Neural parts: learning expressive 3D shape abstractions with invertible neural networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.3204-3215. doi: 10.1109/CVPR46437.2021.00322

Qi CR, Su H, Mo KC, et al., 2017. PointNet: deep learning on point sets for 3D classification and segmentation. IEEE Conf on Computer Vision and Pattern Recognition, p.77-85. doi: 10.1109/CVPR.2017.16

Riegler G, Ulusoy AO, Geiger A, 2017. OctNet: learning deep 3D representations at high resolutions. IEEE Conf on Computer Vision and Pattern Recognition, p.6620-6629. doi: 10.1109/CVPR.2017.701

Sinha A, Bai J, Ramani K, 2016. Deep learning 3D shape surfaces using geometry images. Proc 14th European Conf on Computer Vision, p.223-240. doi: 10.1007/978-3-319-46466-4_14

Su H, Maji S, Kalogerakis E, et al., 2015. Multi-view convolutional neural networks for 3D shape recognition. IEEE Int Conf on Computer Vision, p.945-953. doi: 10.1109/ICCV.2015.114

Sun CY, Zou QF, Tong X, et al., 2019. Learning adaptive hierarchical cuboid abstractions of 3D shape collections. ACM Trans Graph, 38(6):241. doi: 10.1145/3355089.3356529

Tulsiani S, Su H, Guibas LJ, et al., 2017. Learning shape abstractions by assembling volumetric primitives. IEEE Conf on Computer Vision and Pattern Recognition, p.1466-1474. doi: 10.1109/CVPR.2017.160

Wang NY, Zhang YD, Li ZW, et al., 2018. Pixel2Mesh: generating 3D mesh models from single RGB images. Proc 15th European Conf on Computer Vision, p.55-71. doi: 10.1007/978-3-030-01252-6_4

Wang PS, Liu Y, Guo YX, et al., 2017. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graph, 36(4):72. doi: 10.1145/3072959.3073608

Wang PS, Liu Y, Tong X, 2022. Dual octree graph networks for learning adaptive volumetric shape representations. ACM Trans Graph, 41(4):103. doi: 10.1145/3528223.3530087

Wen C, Zhang YD, Li ZW, et al., 2019. Pixel2Mesh++: multi-view 3D mesh generation via deformation. IEEE/CVF Int Conf on Computer Vision, p.1042-1051. doi: 10.1109/ICCV.2019.00113

Wu ZR, Song SR, Khosla A, et al., 2015. 3D ShapeNets: a deep representation for volumetric shapes. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.1912-1920. doi: 10.1109/CVPR.2015.7298801

Xiao YP, Lai YK, Zhang FL, et al., 2020. A survey on deep geometry learning: from a representation perspective. Comput Visual Med, 6(2):113-133. doi: 10.1007/s41095-020-0174-8

Yang J, Mo KC, Lai YK, et al., 2023. DSG-Net: learning disentangled structure and geometry for 3D shape generation. ACM Trans Graph, 42(1):1. doi: 10.1145/3526212

Yang KZ, Chen XJ, 2021. Unsupervised learning for cuboid shape abstraction via joint segmentation from point clouds. ACM Trans Graph, 40(4):152. doi: 10.1145/3450626.3459873

Yu FG, Liu K, Zhang Y, et al., 2019. PartNet: a recursive part decomposition network for fine-grained and hierarchical shape segmentation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.9483-9492. doi: 10.1109/CVPR.2019.00972

Yu LQ, Li XZ, Fu CW, et al., 2018. PU-Net: point cloud upsampling network. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.2790-2799. doi: 10.1109/CVPR.2018.00295

Zheng XY, Liu Y, Wang PS, et al., 2022. SDF-StyleGAN: implicit SDF-based StyleGAN for 3D shape generation. https://arxiv.org/abs/2206.12055

Zheng ZR, Yu T, Dai QH, et al., 2021. Deep implicit templates for 3D shape representation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.1429-1439. doi: 10.1109/CVPR46437.2021.00148

Zuffi S, Kanazawa A, Jacobs DW, et al., 2017. 3D Menagerie: modeling the 3D shape and pose of animals. IEEE Conf on Computer Vision and Pattern Recognition, p.5524-5532. doi: 10.1109/CVPR.2017.586
