Kernel Case Study: Flash Attention
Understanding all versions of Flash Attention through a Triton implementation
Mar 24

Model Context Protocol is awesome Part I: Turbocharging AI with Formula 1 Data
Anthropic released the open-source Model Context Protocol (MCP) back in November 2024 to a rather cold response [1]. But the start of 2025…
Mar 17

Simplifying CUDA kernels with Triton: A Pythonic Approach to GPU Programming
Writing custom CUDA kernels, for whatever reason, has always looked like a daunting task. This is where OpenAI’s Triton comes in handy. With…
Mar 8

Published in Towards Dev
Building an Asynchronous Pipeline in Python using Celery, RabbitMQ and MongoDB
First of all, apologies for the jargon in the title. But let me help you break this down and tell you why you should care.
Sep 23, 2022

Published in Towards Dev
Serve MLFlow models in Kubernetes — Auto deploy your production models with ease
MLflow is quite an amazing tool, and it serves an ML practitioner on multiple levels while building an end-to-end ML lifecycle. With…
Sep 20, 2022

NVIDIA Triton Inference Server — Serve DL models like a pro
Deploying deep learning models on scalable, optimised infrastructure, be it GPU or CPU, and streamlining the whole process can be…
Sep 19, 2022

Integrating LabelStudio in React Machine Learning Applications
If you have been wondering how to label new data points within your React applications, be it text, images, or audio, LabelStudio has got…
Dec 18, 2021

Vision API on Your Local Linux Machine Using Python
If you are really keen to play around with what Google built to detect faces and the like in image annotation, this can give you a head…
Feb 25, 2018

Create a Spark enabled Jupyter Notebook on Azure
I hope you have read my piece on doing the same thing on Google Cloud Platform. Same stuff, just nuances in the platform. If you haven’t…
Nov 26, 2017