Posts
-
FPGAs: the ultimate flex by Jon Y from Asianometry
-
C code style by Malcolm Inglis
-
layer normalization of GPT by Andrej Karpathy
-
Exploring GPT-3's diverse training datasets for language model pretraining development.
-
How I understand the Decoder Transformer in Generative Text Models
-
A brief history of large language models, from bigrams to transformers
-
how accurately is the value of a stock/bond calculated
-
This was my attempt at NER for medical reporting in technofest -- had to step down before the qualifying stage
-
How hybrid computation became as it is. CUDA, and beyond