-->
Tags → #ml-ai

16 Sept 2025
Visual Geometry Grounded Transformer - CVPR2025

3 Mar 2025
A summary of my understanding of reinforcement learning, based on the book Reinforcement Learning: An Introduction by Sutton and Barto and supplemented with the YouTube series.

25 Feb 2025
A post on how FPGA lost to NVIDIA. Not written by me.

23 Feb 2025
Trying to understand how Flash Attention works on Tenstorrent and how it compares to CUDA.

7 Sept 2024
Understanding Adaptive Layer Normalization, first introduced in the DiT paper.

31 Jul 2024
A brief explanation of how the attention mechanism works, as well as the quadratic scaling of attention.

1 May 2024
Layer normalization of GPT by Andrej Karpathy.

4 Apr 2024
Exploring the diverse datasets used to pretrain GPT-3.

30 Mar 2024
How I understand the Decoder Transformer in Generative Text Models.

27 Mar 2024
A brief history of large language models, from bigrams to transformers.
© Amar Jay
2024. 👨🏽🔧😋💤
One Love ☝️️