-->
Tags → #ml-ai

3 Mar 2025: A summary of my understanding of reinforcement learning, based on the book Reinforcement Learning: An Introduction by Sutton and Barto, and supplemented with the YouTube series.
25 Feb 2025: A post on how FPGA lost to NVIDIA. Not written by me.
23 Feb 2025: Trying to understand how Flash Attention works on Tenstorrent and how it compares to CUDA.
7 Sept 2024: Understanding Adaptive Layer Normalization, first introduced in the DiT paper.
31 Jul 2024: A brief explanation of how the attention mechanism works, as well as the quadratic scaling of attention.
1 May 2024: Layer normalization of GPT by Andrej Karpathy.
4 Apr 2024: Exploring GPT-3's diverse training datasets for language model pretraining development.
30 Mar 2024: How I understand the Decoder Transformer in Generative Text Models.
27 Mar 2024: A brief history of large language models, from bigrams to transformers.
25 Mar 2024: My attempt at NER for medical reporting in technofest; I had to step down before the qualifying stage.
© Amar Jay 2024. 👨🏽‍🔧😋💤
One Love ☝️️