IV. References
[1] Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.
[2] Larochelle H, Bengio Y. Classification using discriminative restricted Boltzmann machines[C]//Proceedings of the 25th International Conference on Machine Learning (ICML '08). 2008: 536.
[3] Coates A, Lee H, Ng A Y. An analysis of single-layer networks in unsupervised feature learning[C]//International Conference on Artificial Intelligence and Statistics (AISTATS). 2011.
[4] Li Y. Deep reinforcement learning[J]. arXiv preprint, 2018.
[5] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25.
[6] LeCun Y. LeNet-5, convolutional neural networks[EB/OL]. History summary page.
[7] Culurciello E. Navigating the unsupervised learning landscape[EB/OL]. 2016.
[8] Klambauer G, Unterthiner T, Mayr A, et al. Self-normalizing neural networks[J]. Advances in neural information processing systems, 2017, 30.
[9] Misra D. Mish: A self regularized non-monotonic activation function[J]. arXiv preprint arXiv:1908.08681, 2019.
[10] Ramachandran P, Zoph B, Le Q V. Searching for activation functions[J]. arXiv preprint arXiv:1710.05941, 2017.
[11] Howard A, Sandler M, Chu G, et al. Searching for MobileNetV3[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 1314-1324.
[12] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The journal of machine learning research, 2014, 15(1): 1929-1958.
[13] Goodfellow I, Warde-Farley D, Mirza M, et al. Maxout networks[C]//International conference on machine learning. PMLR, 2013: 1319-1327.
[14] Hadsell R, Chopra S, LeCun Y. Dimensionality reduction by learning an invariant mapping[C]//2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06). IEEE, 2006, 2: 1735-1742.
[15] Schroff F, Kalenichenko D, Philbin J. FaceNet: A unified embedding for face recognition and clustering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 815-823.
[16] Sohn K. Improved deep metric learning with multi-class n-pair loss objective[J]. Advances in neural information processing systems, 2016, 29.
[17] Hinton G, Srivastava N, Swersky K. Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent[R]. Coursera course lecture slides, 2012.
[18] Teo Y S, Shin S, Jeong H, et al. Benchmarking quantum tomography completeness and fidelity with machine learning[J]. New Journal of Physics, 2021, 23(10): 103021.
[19] Gao M, Wang Q, Lin Z, et al. Tuning Pre-trained Model via Moment Probing[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 11803-11813.
[20] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[21] Lee D, Koo S, Jang I, et al. Comparison of deep reinforcement learning and PID controllers for automatic cold shutdown operation[J]. Energies, 2022, 15(8): 2834.
[22] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
[23] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30.