1. [Proofread] Andrej Karpathy's large-model lecture | Building makemore, Part 1: The spelled-out intro to language modeling (bilibili)
2. karpathy/makemore: An autoregressive character-level language model for making more things
Bigram (one character predicts the next one with a lookup table of counts)
MLP, following Bengio et al. 2003
CNN, following DeepMind WaveNet 2016 (in progress...)
RNN, following Mikolov et al. 2010
LSTM, following Graves et al. 2014
GRU, following Kyunghyun Cho et al. 2014
Transformer, following Vaswani et al. 2017
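
The first entry in the list above, the bigram model, can be sketched in a few lines of plain Python. This is only an illustrative sketch and not the code from the makemore repository: the helper names `build_bigram_counts` and `sample_word`, the use of `.` as a start/end marker, and the toy word list are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_bigram_counts(words):
    """Count how often each character follows another.

    '.' is used here as an assumed marker for both word start and word end.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = ["."] + list(word) + ["."]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_word(counts, rng, max_len=20):
    """Generate one word by repeatedly sampling the next character
    in proportion to its count in the lookup table."""
    out = []
    prev = "."
    while len(out) < max_len:
        followers = counts[prev]
        nxt = rng.choices(list(followers.keys()),
                          weights=list(followers.values()))[0]
        if nxt == ".":  # end marker reached
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Toy data, invented for illustration only.
words = ["emma", "olivia", "ava", "mia"]
counts = build_bigram_counts(words)
rng = random.Random(0)
samples = [sample_word(counts, rng) for _ in range(5)]
print(samples)
```

Each generated character depends only on the one before it, which is what "one character predicts the next one" means here. The later entries in the list replace the count table with a learned neural network while keeping the same autoregressive sampling loop.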