We at Positron set out to build a cost-effective alternative to NVIDIA for LLM inference, and after 12 months, our Florida-based head of sales made our first sale. He taught us the value of chasing our largest competitive advantages, across industries and around the globe. We also managed to build an FPGA-based hardware-and-software inference platform capable of serving monolithic and mixture-of-experts models at very competitive token rates. It wasn't easy, because the LLM landscape changes meaningfully every two weeks. Yet today we have customers both evaluating and in production, on our physical servers and on our hosted cloud service. We'll share a few of the hairy workarounds and engineering heroics that achieved equivalence with NVIDIA so quickly, and that tamed the complexity of building a dedicated LLM computer from FPGAs.
Barrett Woodside
In developer-oriented, marketing, and product roles, Barrett spent the past decade of his career working on AI inference, first at NVIDIA, running and profiling computer vision workloads on Jetson. After three years shoehorning models onto embedded systems powering drones, robots, and surveillance systems, he joined Google Cloud, where he experienced first-hand the incredible power of Transformer models running accurate translation workloads on third-generation TPUs. He helped launch Cloud AutoML Vision with Fei-Fei Li and announced the TPU Pod's first entry into the MLPerf benchmark. Most recently, he spent two years at Scale AI working on product strategy and go-to-market for Scale Spellbook, its first LLM inference and fine-tuning product. Today, he is Positron's co-founder and VP of Product.
Positron AI
Website: https://www.positron.ai/
Positron delivers vendor freedom and faster inference for both enterprises and research teams by allowing them to use hardware and software designed from the ground up for generative and large language models (LLMs).
Through lower power usage and drastically lower total cost of ownership (TCO), Positron enables you to run popular open-source LLMs to serve multiple users at high token rates and long context lengths. Positron is also designing its own ASIC to expand from inference and fine-tuning to also support training and other parallel compute workloads.
Edward Kmett
Edward spent most of his adult life trying to build reusable code in imperative languages. He converted to Haskell in 2006 while searching for better building materials. Edward served as the founding chair of the Haskell Core Libraries Committee and continues to collaborate with hundreds of other developers on over three hundred functional programming projects on GitHub. He previously wrote software for stock exchanges at Standard & Poor's, served as a researcher at MIRI, and ran Groq's Software Engineering team. As Positron's CTO, he now sets long-term software, architecture, and optimization strategy.