On the principles of Parsimony and Self-consistency for the emergence of intelligence

Yi MA; Doris TSAO; Heung-Yeung SHUM

doi:10.1631/FITEE.2200297

Special Column on Visual Knowledge | Updated：2022-08-22

- Frontiers of Information Technology & Electronic Engineering, Vol. 23, Issue 9, Pages 1298-1323 (2022)
- Affiliations：
  
  1.Electrical Engineering and Computer Science Department, University of California, Berkeley, CA 94720, USA
  2.Department of Molecular & Cell Biology and Howard Hughes Medical Institute, University of California, Berkeley, CA 94720, USA
  3.International Digital Economy Academy, Shenzhen 518045, China
- Author bio：
  
  ‡Corresponding author
  dortsao@berkeley.edu;
  hshum@idea.edu.cn
- DOI：10.1631/FITEE.2200297
  CLC： TP18
- Received: 10 July 2022; Revised: 29 July 2022; Accepted: 24 July 2022; Published Online: 12 August 2022; Published: September 2022

Yi MA, Doris TSAO, Heung-Yeung SHUM. On the principles of Parsimony and Self-consistency for the emergence of intelligence[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(9): 1298-1323. DOI： 10.1631/FITEE.2200297.

Abstract

Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on understanding deep networks within a bigger picture of intelligence in general. We introduce two fundamental principles, Parsimony and Self-consistency, which address two fundamental questions regarding intelligence: what to learn and how to learn, respectively. We believe the two principles serve as the cornerstone for the emergence of intelligence, artificial or natural. While they have rich classical roots, we argue that they can be stated anew in entirely measurable and computable ways. More specifically, the two principles lead to an effective and efficient computational framework, compressive closed-loop transcription, which unifies and explains the evolution of modern deep networks and most practices of artificial intelligence. While we use mainly visual data modeling as an example, we believe the two principles will unify understanding of broad families of autonomous intelligent systems and provide a framework for understanding the brain.
