Assignment #11: Debugging and Defensive Programming in R

 


VaShay Carpenter

Objectives

  • Learn to reproduce and interpret error messages in R.
  • Practice identifying and fixing logical versus element-wise operations.
  • Document a defensive programming workflow.

Background & Buggy Code

Below is a function intended to flag rows of a numeric matrix x that are outliers in every column according to the Tukey rule. The function contains a deliberate bug. A helper function for detecting Tukey outliers is provided first so that you can focus on debugging the logical error in the main function.

tukey.outlier <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - k * iqr) | x > (q3 + k * iqr)
}

tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])
  }
  outlier.vec <- vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  return(outlier.vec)
}

Hint: The line with && may be incorrect.
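
Before debugging the main function, it can help to see what the helper returns on its own. The sketch below is only an illustration: the vector v is made-up data (not part of the assignment), and the fences are recomputed by hand with the default k = 1.5 so they can be compared against the helper's output.

```r
# Helper copied from the assignment text above.
tukey.outlier <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - k * iqr) | x > (q3 + k * iqr)
}

# Hypothetical data: nine ordinary values and one obviously extreme value.
v <- c(1:9, 100)

# Recompute the Tukey fences by hand (default quantile type).
q <- unname(quantile(v, c(0.25, 0.75)))   # 3.25 and 7.75
iqr <- q[2] - q[1]                        # 4.5
fences <- c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
print(fences)                             # lower = -3.5, upper = 14.5

# The helper returns one TRUE/FALSE per element, not a single value.
flags <- tukey.outlier(v)
print(which(flags))                       # position 10 (the value 100)
```

The point to notice is that the helper's result has the same length as its input, which is why the operator that combines it with outliers[, j] has to work element by element.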

Tasks

  1. Reproduce the Error
    In R, create a test matrix and run the function:
    set.seed(123)
    test_mat <- matrix(rnorm(50), nrow = 10)
    tukey_multiple(test_mat)
    Capture the exact error message or warning message you see.
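
One way to capture the message programmatically, rather than copying it from the console, is to wrap the call in tryCatch(). This is a minimal sketch, not a required part of the assignment. The variable name captured is my own, and which branch fires depends on the R version: a warning in R 4.2, an error from R 4.3.0 onward, and nothing at all in older releases.

```r
# Helper and buggy function copied from the assignment text above.
tukey.outlier <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - k * iqr) | x > (q3 + k * iqr)
}

tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])  # deliberate bug
  }
  outlier.vec <- vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  return(outlier.vec)
}

set.seed(123)
test_mat <- matrix(rnorm(50), nrow = 10)

# Keep the first condition R signals; NULL means nothing was signalled.
captured <- tryCatch(
  {
    tukey_multiple(test_mat)
    NULL
  },
  warning = function(w) w,
  error = function(e) e
)

# Record the R version next to the message, since the behaviour differs by version.
cat(R.version.string, "\n")
if (is.null(captured)) {
  cat("No warning or error was signalled.\n")
} else {
  cat(class(captured)[1], ":", conditionMessage(captured), "\n")
}
```

Printing R.version.string alongside the message makes the blog write-up reproducible for readers on a different R release.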


  2. Diagnose the Bug
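    Explain why using && inside the loop is incorrect. Recall that && is designed for single (length-one) conditions: older versions of R silently used only the first element of each operand, R 4.2 added a warning, and R 4.3.0 and later raise an error. This task instead requires an element-wise comparison across all rows.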

    The && operator triggered a length warning and used only the first element of each operand, so the result for the first row was recycled down every row of the column.
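
The difference between the two operators can be seen on two short vectors. This is a small sketch with made-up vectors a and b (not taken from the assignment). The final && call on full vectors is wrapped in tryCatch() because its outcome depends on the R version.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# & compares position by position and returns a vector of the same length.
print(a & b)                # TRUE FALSE FALSE

# && is meant for single conditions, such as the test in an if () statement.
print(a[1] && b[1])         # TRUE

# What the buggy loop effectively did in older R: one value recycled down a column.
column <- rep(NA, 3)
column[] <- a[1] && b[1]
print(column)               # TRUE TRUE TRUE, which differs from a & b

# Passing full vectors to && is version-dependent, so only report what happened.
res <- tryCatch(
  a && b,
  warning = function(w) "warning",
  error = function(e) "error"
)
print(res)
```

Comparing column with a & b shows why the bug matters: the recycled value hides what happens in rows 2 and 3.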

  3. Fix the Code
    Edit the function so that the logical operation is applied element-wise. Replace the buggy line with:
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])

    # Switched to element-wise logic (&) to ensure every data point is audited.
  4. Validate Your Fix
    Re-run your corrected function on test_mat and verify that it returns a logical vector of length 10 without error:
    corrected_tukey <- function(x) {
      outliers <- array(TRUE, dim = dim(x))
      for (j in seq_len(ncol(x))) {
        outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
      }
      outlier.vec <- logical(nrow(x))
      for (i in seq_len(nrow(x))) {
        outlier.vec[i] <- all(outliers[i, ])
      }
      outlier.vec
    }
    
    corrected_tukey(test_mat)


    # I swapped the && for the & operator. This ensures the Tukey rule is applied to every element of the matrix. Re-running corrected_tukey(test_mat) now returns a logical vector of length 10 without warnings, which shows that every row is checked for outliers across all columns.
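
The same validation can be written as explicit checks instead of reading the printed output. The sketch below is one possible approach under my own assumptions: the loop-free cross-check with apply() and the small matrix known are illustrations I added, not part of the assignment.

```r
# Helper and corrected function copied from the post above.
tukey.outlier <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - k * iqr) | x > (q3 + k * iqr)
}

corrected_tukey <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- logical(nrow(x))
  for (i in seq_len(nrow(x))) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  outlier.vec
}

set.seed(123)
test_mat <- matrix(rnorm(50), nrow = 10)
result <- corrected_tukey(test_mat)

# Structural checks: logical, one value per row, no missing values.
stopifnot(is.logical(result), length(result) == nrow(test_mat), !anyNA(result))

# Cross-check against a loop-free formulation of the same rule.
vectorised <- unname(apply(apply(test_mat, 2, tukey.outlier), 1, all))
stopifnot(identical(result, vectorised))

# Known-answer check (hypothetical data): only row 10 is extreme in both columns.
known <- cbind(c(1:9, 100), c(1:9, 200))
stopifnot(identical(which(corrected_tukey(known)), 10L))

print(result)
```

If any stopifnot() call fails, R stops with a message naming the failed condition, so a silent run followed by the printed vector indicates that all three checks passed.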
  5. Defensive Enhancements (Optional)
    Add checks at the top of your function to:
    • Ensure that x is a matrix.
    • Ensure that x is numeric.
    • Provide informative errors if assumptions are not met.
    Example:
    corrected_tukey <- function(x) {
      if (!is.matrix(x)) {
        stop("x must be a matrix.")
      }
      if (!is.numeric(x)) {
        stop("x must be a numeric matrix.")
      }
    
      outliers <- array(TRUE, dim = dim(x))
      for (j in seq_len(ncol(x))) {
        outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
      }
    
      outlier.vec <- logical(nrow(x))
      for (i in seq_len(nrow(x))) {
        outlier.vec[i] <- all(outliers[i, ])
      }
    
      outlier.vec
    }
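
To confirm that the checks fire as intended, the function can be called with deliberately bad inputs. This is a minimal sketch: error_message() is a hypothetical helper I introduce only for this illustration, and the three inputs are made up.

```r
# Helper and defensive function copied from the example above.
tukey.outlier <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, na.rm = TRUE)
  q3 <- quantile(x, 0.75, na.rm = TRUE)
  iqr <- q3 - q1
  x < (q1 - k * iqr) | x > (q3 + k * iqr)
}

corrected_tukey <- function(x) {
  if (!is.matrix(x)) {
    stop("x must be a matrix.")
  }
  if (!is.numeric(x)) {
    stop("x must be a numeric matrix.")
  }
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- logical(nrow(x))
  for (i in seq_len(nrow(x))) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  outlier.vec
}

# Hypothetical helper: returns the error message, or "no error" if the call succeeds.
error_message <- function(expr) {
  tryCatch(
    {
      expr
      "no error"
    },
    error = function(e) conditionMessage(e)
  )
}

msg_df   <- error_message(corrected_tukey(data.frame(a = 1:3)))
msg_char <- error_message(corrected_tukey(matrix(letters[1:4], nrow = 2)))
msg_ok   <- error_message(corrected_tukey(matrix(c(1:9, 100), nrow = 5)))

print(msg_df)    # "x must be a matrix."
print(msg_char)  # "x must be a numeric matrix."
print(msg_ok)    # "no error"
```

A data frame fails the first check and a character matrix fails the second, so each bad input produces its own message instead of a confusing error from deeper inside the function.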
  6. Document Your Debugging Workflow
    On your blog, include:
    • The original error or warning message.
    • Your diagnosis of the bug and why the code was wrong.
    • The corrected code and its successful output.
    • Any additional defensive checks you added.

Submission

  • Push your corrected R script (module11_debug.R) to GitHub.
  • Create a blog post documenting your debugging steps and results.
  • Submit the URLs for your GitHub repository and blog post in Canvas.


GitHub Link: https://github.com/cryo-cell/r-programming-assignments/blob/main/module11_debug.Rmd

Disclaimer:

Generative AI is part of my professional workflow for drafting, structural organization, and code optimization. To avoid redundancy, this statement serves as a standing disclaimer for all entries. Generative AI has been used to help ensure technical accuracy and to meet the documentation requirements set out in the course syllabus.

