mcgrof

AI: post transformers

Explore the evolution of neural networks through insightful reviews of groundbreaking research papers, tracing the journey from transformers to contemporary AI advancements.


Recent Episodes

Oct 26, 2025

STAR: Sub-Entry Sharing TLB for Multi-Instance GPU Efficiency

18 mins

Oct 26, 2025

Strata: Efficient Hierarchical Context Caching for LLM Serving

16 mins

Oct 26, 2025

FlashAttention: IO-Aware Fast and Memory-Efficient Attention

14 mins

Oct 26, 2025

Introducing MTEB v2: Multimodal Embedding Evaluation

12 mins

Oct 26, 2025

Structural Understanding of LLM Overthinking

17 mins

Language: English
Country: United States