In the previous post, we looked at different types and design objectives of high-level IRs used by ML compilers. Today, we are going to look at different optimizations that are performed using high-level IRs.These optimizations can be performed either offline or online. In online mode, the optimizations are done before performing the inference, while in offline mode, the runtime saves the optimized graph to disk. In this post, I have attempted to further extend Li et al.’s [1] classification of high-level IR optimizations into five categories to the best of my knowledge. Figure 1: Overview of Optimizations done on High-level…