This post was motivated by some R code that I came across (over a thousand lines of it) containing a bunch of if-statements whose branches were never executed. I wanted an automatic way to get a minimal reproducing example of a test from this file. While reading about how to do this, I came across Dead Code Elimination, a compiler optimization that removes unused and unreachable code and variables.
An R library exists for code optimization, but it didn’t address my issue of removing if statements by running through the code. So, to learn a bit about this, I made a naive attempt. It was more challenging and uglier than I thought it’d be.
My attempt was as follows: I break up the code into statements and check whether each statement is an if statement, which has the following syntax: if (condition) expression_true else expression_false. I then evaluate the condition and replace the statement with the appropriate expression. Figuring out how to manipulate the syntax trees was challenging at first, but I noticed that as.list(quote(...)) breaks up ... into a list of expressions representing the syntax tree (I noticed this in the pryr source!). This is so much cleaner than regex, but I still had to write a lot of special cases, e.g. handling NULLs, inline if statements, etc. I recursively split expressions, evaluate them and then recombine the lists into a call. Loops are left as they are, so any if statements inside them are not simplified. A lot of work can still be done on this, but I don’t know if it’s worth the time.
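To make the as.list(quote(...)) trick concrete, here is a minimal sketch of taking one if statement apart, picking the branch that survives, and putting a call back together. The statement, the variable x and the environment env are made up for illustration; this is not the code from the demo, just the core idea in isolation.

```r
# Quote an if statement so it is kept as an unevaluated call.
stmt <- quote(if (x > 1) print("big") else print("small"))

# as.list() exposes the parts of the call:
# the symbol `if`, the condition, the true branch and the false branch.
parts <- as.list(stmt)

# Evaluate only the condition, in an environment where x is known.
# (env and the value of x are assumptions for this example.)
env <- new.env()
assign("x", 5, envir = env)
keep <- if (eval(parts[[2]], env)) parts[[3]] else parts[[4]]

# `keep` is the surviving branch, still unevaluated.
print(keep)

# as.call() turns a list of parts back into a call.
rebuilt <- as.call(parts)
```

Evaluating the condition in a separate environment here is only a convenience for the example; the demo code evaluates everything in .GlobalEnv.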
Demo: R Code
contains_if <- function(ex_list) {
    if(length(ex_list) <= 1)
        return(FALSE)
    else if(ex_list[[1]] == quote(`if`))
        return(TRUE)
    else {
        return(any(sapply(as.list(ex_list), contains_if)))
    }
}
check_for_inline_if <- function(ex_list) {
    if(length(ex_list) == 3)
    if(ex_list[[1]] == quote(`<-`))
    if(length(as.list(ex_list[[3]])) >= 1)
    if(as.list(ex_list[[3]])[[1]] == quote(`if`))
        return(TRUE)
    return(FALSE)
}
fix_inline_if <- function(ex_list) {
    if_statement <- as.list(ex_list[[3]])
    my_list <- c(ex_list[1:2], if_statement[3:4])
    if_statement[[3]] <- as.call(my_list[c(1, 2, 3)])
    if_statement[[4]] <- as.call(my_list[c(1, 2, 4)])
    return(if_statement)
}
remove_unused_ifs <- function(expr) {
    ex_list <- as.list(expr)
    if(check_for_inline_if(ex_list)) {
        ex_list <- fix_inline_if(ex_list)
        expr <- as.call(ex_list)
    }
    if(length(expr) == 1) {
        return(expr)
    } else if(!contains_if(ex_list) ||
              ex_list[[1]] == quote(`for`) ||
              ex_list[[1]] == quote(`while`)) {
        eval(expr, .GlobalEnv)
        return(expr)
    } else if(ex_list[[1]] == quote(`if`)) {
        condition <- ex_list[[2]]
        expr_true <- ex_list[[3]]
        expr_flse <- if(length(ex_list) == 4)
                     ex_list[[4]] else quote({})
        expr <- if(eval(condition, .GlobalEnv))
                expr_true else expr_flse
        return(remove_unused_ifs(expr))
    } else {
        return(lapply(ex_list, remove_unused_ifs))
    }
}
recombine <- function(ex_list) {
    if(is.list(ex_list)) {
        if(any(sapply(ex_list, is.list))) {
            return(recombine(lapply(ex_list, recombine)))
        } else {
            return(as.call(ex_list))
        }
    } else {
        return(ex_list)
    }
}
recombine(remove_unused_ifs(body(function() {
    my_list <- list(my_bool = F)
    abc <- if(TRUE) 1 else NULL
    print(abc)
    if(my_list$my_bool) {
        print('hello_a')
    } else {
        if(!my_list$my_bool) {
            if(TRUE) print('hello_b')
        } else print('bye')
    }
})))
# Output:
# {
#     my_list <- list(my_bool = F)
#     abc <- 1
#     print(abc)
#     {
#         {
#             print("hello_b")
#         }
#     }
# }
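The inline if special case (abc <- if(TRUE) 1 else NULL in the demo) is easier to follow in isolation. Below is a rough sketch of the rewrite: the assignment is pushed into each branch so that the statement becomes an ordinary if statement. The helper name hoist_inline_if is hypothetical and is not part of the code above; it also assumes its input is an assignment whose right-hand side is an if call.

```r
# Hypothetical helper: rewrite `target <- if (cond) a else b`
# as `if (cond) target <- a else target <- b`.
hoist_inline_if <- function(expr) {
  parts  <- as.list(expr)        # `<-`, target, value
  branch <- as.list(parts[[3]])  # `if`, condition, yes, [no]

  # Build `target <- value` for a given value (value may be NULL).
  assign_to <- function(value) as.call(list(parts[[1]], parts[[2]], value))

  # An if without an else evaluates to NULL when the condition is FALSE,
  # so a missing else branch is treated as NULL.
  no_value <- if (length(branch) == 4) branch[[4]] else NULL

  branch[[3]] <- assign_to(branch[[3]])
  branch[[4]] <- assign_to(no_value)
  as.call(branch)
}

hoisted <- hoist_inline_if(quote(abc <- if (TRUE) 1 else NULL))
print(hoisted)

# Evaluating the rewritten statement still assigns to abc.
result_env <- new.env()
eval(hoisted, result_env)
```

Once the if is on the outside, the same "evaluate the condition, keep one branch" step applies, which is how the demo output ends up with abc <- 1.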
