Broadly speaking, in mathematics, sparsity generally refers to the property of having a relatively small number of non-zero elements or structures within a larger space or set. This concept can be applied to various mathematical objects, such as matrices, graphs, functions, and more. Sparsity is often exploited to optimize efficiency. For instance, in linear algebra, sparse matrices (matrices with a large number of zero elements) are handled differently from dense matrices, leveraging the abundance of zeros to save memory and accelerate computations. In graph theory, sparse graphs (graphs with relatively few edges) can be processed more efficiently with different algorithms.…
ML Compilers Part 2: An Overview of Graph Optimizations
In the previous post, we looked at different types and design objectives of high-level IRs used by ML compilers. Today, we are going to look at different optimizations that are performed using high-level IRs.These optimizations can be performed either offline or online. In online mode, the optimizations are done before performing the inference, while in offline mode, the runtime saves the optimized graph to disk. In this post, I have attempted to further extend Li et al.’s [1] classification of high-level IR optimizations into five categories to the best of my knowledge. Figure 1: Overview of Optimizations done on High-level…
ML Compilers Part 1: High-Level Intermediate Representation
High-level intermediate representations (IRs) play a crucial role in machine learning (ML) compilers, enabling optimization and code generation for various hardware platforms. These IRs provide an abstract representation of the machine learning models that can be transformed and optimized before being compiled into executable code. In this blog post, I will discuss the design objectives, and the type of high-level IRs used by popular, real-world ML compilers. Workflow of ML compilers Content Design Objectives Types of High-level IR Examples of High-level IR Relay (TVM) Torch.Fx Design Objectives for High-Level IR Optimizations Optimizing the input ML model to enhance its runtime…
A survey of concurrency bug patterns
Humans tend to make the same mistake again and again. We, software developers, are no different. Hello everyone, in this blog post, I’m going to talk about common concurrency bug patterns. Bug patterns give good insight into developer’s psychology, specifically, the assumptions we make about the programming languages, compilers, 3rd party code, OS scheduler, and hardware architecture.
From Linearizability to CAP Theorem: A Brief Overview of Consistency Guarantees in Concurrent Programs
Concurrent systems, including both multiprocessor and distributed systems, often rely on object replication for improving performance and scalability. Object replication, the idea of keeping multiple local copies of shared data structures, demands another sub-system that ensures consistency between these local copies in real-time!! Network delays in distributed systems and overlapping operations on shared data structures within multiprocessor systems are some of the major challenges in ensuring consistency. To overcome these issues, without sacrificing correctness and performance, the research community has proposed various “relaxations” in the consistency requirement that I intend to cover in the next few sections. Following is the…
Protected: My Checklist for Writing Great Research Papers
There is no excerpt because this is a protected post.
Writing Your First LLVM Pass and Registering it in the Clang Toolchain
There are several detailed tutorials[1] on writing an LLVM pass so that I won’t cover it in much detail. However, as of today(May 2020), there is no detailed guide on registering an LLVM pass within the OPT and Clang toolchains. So, this post will be mainly regarding that. Content: Introduction Types of LLVM passes Writing a basic function pass in LLVM Registering a pass within the OPT toolchain Clang toolchain Introduction LLVM is an extremely modular compiler infrastructure that provides back-end tools for code optimization and transformation. It works on an intermediate representation called LLVM IR. Clang, on the other…
DDoS Defense Triad
Distributed Denial Of Service(DDoS) is one of the most disruptive cybercrimes that can cause substantial financial loss to its the victim. Recent DDoS attacks brought up to 1.3 Tbps of traffic with 126.9 million packets of data per second(Github, Feb 2018) and were used to serve a variety of motives ranging from financial gains to political protests, “Hacktivism.” In this post, I’ll cover three primary DDoS defense techniques that are low cost and can be easily deployed by small to medium scale businesses. Content: Intrusion prevention General Security Policy Ingress and Egress Filtering 3rd party DDoS protection services static signature…
Understanding Dependency Graphs for Program Analysis
Hi, In this blog post, I’ll briefly cover several types of dependency graphs used for program analysis, transformation, and representation. Following is the table of content: Prerequisites Reaching definition analysis Pointer analysis Dominator tree Call graphs Control flow graphs Control dependency graphs Data dependency graphs Program dependency graphs System dependency graphs Prerequisites Reaching definition analysis (RDA) RDA aims to determine the set of variable definitions (assignments) that may (or must) reach a given program point. Every program statement has an Entry set and Exit set associated with it, where the entry set is a set of all variable definitions reaching…
Static Program Analysis: Motivation and Techniques
Static analysis is a method to extract specific information about the program without actually running it. In this blog post, we will first have a look at different types of information that we can obtain using static analysis. Later, we will have a comprehensive overview of different techniques for static program analysis. Motivation There are broadly two categories of information, based on their usage, that we can extract: Proscriptive information and Descriptive information[1]. Proscriptive information is obtained when we statically analyze the program behavior against a set of prescribed rules. For instance, identification, localization, and explanation of all software anomalies…