| # | Year | Paper | Authors |
| --- | --- | --- | --- |
| 8 | 1996 | Learning in the Presence of Concept Drift and Hidden Contexts | Gerhard Widmer, Miroslav Kubat |
| 9 | 1997 | Long Short-Term Memory (LSTM) | Sepp Hochreiter, Jürgen Schmidhuber |
| 10 | 1998 | Gradient-Based Learning Applied to Document Recognition | Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner |
| 11 | 1998 | Boosting: A Weak Learning Algorithm | Yoav Freund, Robert Schapire |
| 12 | 2001 | Random Forests | Leo Breiman |
| 13 | 2002 | An Improved Boosting Algorithm | Alexander Grove, Dale Schuurmans |
| 14 | 2003 | A Tutorial on Support Vector Machines | Nello Cristianini, John Shawe-Taylor |
| 15 | 2006 | A Fast Learning Algorithm for Deep Belief Nets | Geoffrey Hinton, Simon Osindero, Yee-Whye Teh |
| 16 | 2008 | Sparse Feature Learning for Deep Belief Networks | Marc'Aurelio Ranzato, et al. |
| 17 | 2010 | Variational Learning for Digits | A. Mnih, K. Kavukcuoglu |
| 18 | 2011 | Neural Networks for NLP | Richard Socher, et al. |
| 19 | 2012 | ImageNet Classification with Deep Convolutional Neural Networks (AlexNet) | Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton |
| 20 | 2013 | Efficient Estimation of Word Representations in Vector Space (Word2Vec) | Tomas Mikolov, et al. |
| 21 | 2013 | Playing Atari with Deep Reinforcement Learning | Volodymyr Mnih, et al. |
| 22 | 2014 | Generative Adversarial Nets (GAN) | Ian Goodfellow, et al. |
| 23 | 2014 | Sequence to Sequence Learning with Neural Networks | Ilya Sutskever, Oriol Vinyals, Quoc V. Le |
| 24 | 2014 | Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio |
| 25 | 2014 | Dropout: A Simple Way to Prevent Neural Networks from Overfitting | Nitish Srivastava, et al. |
| 26 | 2015 | Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift | Sergey Ioffe, Christian Szegedy |
| 27 | 2015 | Deep Residual Learning for Image Recognition (ResNet) | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun |
| 28 | 2015 | Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet) | Karen Simonyan, Andrew Zisserman |
| 29 | 2015 | Spatial Transformer Networks | Max Jaderberg, et al. |
| 30 | 2016 | Mastering the Game of Go with Deep Neural Networks and Tree Search (AlphaGo) | David Silver, Aja Huang, et al. |
| 31 | 2016 | Bag of Tricks for Efficient Text Classification (fastText) | Armand Joulin, et al. |
| 32 | 2016 | WaveNet: A Generative Model for Raw Audio | Aaron van den Oord, et al. |
| 33 | 2017 | Attention Is All You Need (Transformer) | Ashish Vaswani, Noam Shazeer, et al. |
| 34 | 2017 | Neural Machine Translation with Latent Alignment | Dzmitry Bahdanau, et al. |
| 35 | 2017 | Fast and Accurate Reading Comprehension by Combining Self-Attention and Convolution (QANet) | Adams Wei Yu, et al. |
| 36 | 2018 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Jacob Devlin, Ming-Wei Chang, et al. |
| 37 | 2018 | Improving Language Understanding by Generative Pre-Training (GPT) | Alec Radford, et al. |
| 38 | 2018 | GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, et al. |
| 39 | 2019 | XLNet: Generalized Autoregressive Pretraining for Language Understanding | Zhilin Yang, et al. |
| 40 | 2019 | Visualizing and Measuring the Geometry of BERT | Andy Coenen, Emily Reif, et al. |
| 41 | 2019 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | Yinhan Liu, et al. |
| 42 | 2023 | Adding Conditional Control to Text-to-Image Diffusion Models (ControlNet) | Lvmin Zhang, et al. |
| 43 | 2020 | Language Models are Few-Shot Learners (GPT-3) | Tom Brown, et al. |
| 44 | 2020 | Generative Pretraining from Pixels (Image GPT) | Mark Chen, et al. |
| 45 | 2020 | YOLOv4: Optimal Speed and Accuracy of Object Detection | Alexey Bochkovskiy, et al. |
| 46 | 2020 | Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (BYOL) | Jean-Bastien Grill, et al. |
| 47 | 2020 | Big Self-Supervised Models are Strong Semi-Supervised Learners (SimCLRv2) | Ting Chen, et al. |
| 48 | 2020 | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) | Alexey Dosovitskiy, et al. |
| 49 | 2020 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5) | Colin Raffel, et al. |
| 50 | 2020 | Denoising Diffusion Probabilistic Models | Jonathan Ho, Ajay Jain, Pieter Abbeel |
| 51 | 2020 | Score-Based Generative Modeling through Stochastic Differential Equations | Yang Song, et al. |