Understanding Flash Attention: Writing the Algorithm from Scratch in Triton
Why is Flash Attention so fast? Find out how Flash Attention works. Afterward, we'll polish our understanding by writing a GPU kernel of the algorithm in Triton.
Those are hard! In this section, I discuss algorithms that I encountered at work or in my college assignments. I strive to describe them in the simplest way I can.
5 posts
Why is Flash Attention so fast? Find out how Flash Attention works. Afterward, we'll polish our understanding by writing a GPU kernel of the algorithm in Triton.
An easy-to-understand explanation of the suffix automaton, with an implementation. Finally, generating correct Rickroll lyrics with a suffix automaton.
I'm going to show how complex SwiftUI views can be animated efficiently using the VectorArithmetic protocol together with the Accelerate library for fast computations.
Binary search trees are mostly hard. Writing a red-black tree is a nightmare. Here, I'm going to explain one of the easiest, yet efficient and powerful, balanced binary trees: the treap, or Cartesian tree.
A skip list is a nice structure that lets you perform insertions, searches, and n-th maximum queries. In this post, I focus on skip list indexing.