
Optimizing data storage for AI, generative AI, and machine learning: challenges, architectures, and future directions
As Artificial Intelligence (AI), Generative AI, Machine Learning (ML), and Retrieval-Augmented Generation (RAG) systems continue to evolve at a rapid pace, the efficiency of data storage has become a foundational determinant of system performance and scalability. In this talk, Ankush Gautam explores how storage architecture directly impacts the reliability and throughput of modern AI workloads. Drawing on recent scholarly research and industry case studies, he outlines best practices for managing large-scale datasets across cloud, hybrid, and on-premises environments. The session also dives into the trade-offs between storage formats and architectures, and shows how to architect low-latency, high-throughput solutions tailored to training and inference pipelines.