Assignment #9: Visualization in R – Base Graphics, Lattice, and ggplot2

 VaShay Carpenter


Objectives

  • Compare three visualization systems in R: base graphics, lattice, and ggplot2.
  • Apply each system to the same dataset and observe similarities and differences.
  • Develop clear, reproducible code and articulate your insights.

Dataset

Choose one dataset from the Rdatasets collection. Load it in R with:

data("DatasetName", package = "PackageName")
head(DatasetName)

Tasks

  1. Base R Graphics
    Create at least two plots using base R functions. Examples:
    # Scatter plot
    plot(DatasetName$x, DatasetName$y,
         main   = "Base: x vs. y",
         xlab   = "x",
         ylab   = "y")
    
    # Histogram
    hist(DatasetName$z,
         main   = "Base: Distribution of z",
         xlab   = "z")
  2. Lattice Graphics
    Use the lattice package to produce conditioned or multivariate plots. Examples:
    library(lattice)
    
    # Conditional scatter plot (small multiples)
    xyplot(y ~ x | factor(group),
           data = DatasetName,
           main = "Lattice: y vs. x by group")
    
    # Box-and-whisker plot
    bwplot(z ~ factor(category),
           data = DatasetName,
           main = "Lattice: z by category")
  3. ggplot2
    Use ggplot2’s grammar of graphics to create layered visuals. Examples:
    library(ggplot2)
    
    # Scatter plot with smoothing
    ggplot(DatasetName, aes(x = x, y = y, color = factor(group))) +
      geom_point() +
      geom_smooth(method = "lm") +
      labs(title = "ggplot2: y vs. x with trend by group")
    
    # Faceted histogram
    ggplot(DatasetName, aes(z)) +
      geom_histogram(binwidth = 1) +
      facet_wrap(~ category) +
      labs(title = "ggplot2: z distribution by category")

Discussion

On your blog, embed your three visualizations (one from each system) and address:

  • How do the syntax and workflow differ between base, lattice, and ggplot2?
  • Which system gave you the most control or produced the most “publication‑quality” output with minimal code?
  • Any challenges or surprises you encountered when switching between systems.

How the syntax and workflow differ:

  • Base R: Uses a pen-and-paper model. Each function call draws directly on the graphics device, and later calls can only add on top of what is already there. Adding a legend requires a separate, manual command. It is fast for a quick glance but tedious for complex layering.

  • Lattice: Uses a formula interface (y ~ x | z) and is designed for multi-panel trellis displays. It automates more than base R does (panel layout, shared axes), but I found the syntax less intuitive than ggplot2's.

  • ggplot2: Uses the "Grammar of Graphics." You build the plot in layers joined with +. Of the three, it makes it easiest to map several variables to aesthetics (color, size, shape) at the same time.
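
To make the three workflows above concrete, here is a minimal sketch that draws the same scatter plot in each system. The dataset is an assumption: the assignment leaves the choice open, so this uses the built-in mtcars data (hp, mpg, cyl). The names cyl_f, p_lattice, and p_gg are hypothetical, chosen only for illustration.

```r
library(lattice)
library(ggplot2)

# Assumed example data: mtcars from the datasets package
data("mtcars", package = "datasets")
mtcars$cyl_f <- factor(mtcars$cyl)

# Base R: plot() draws immediately; the legend is a separate, manual call
plot(mtcars$hp, mtcars$mpg,
     col = as.integer(mtcars$cyl_f), pch = 19,
     main = "Base: mpg vs. hp", xlab = "hp", ylab = "mpg")
legend("topright", title = "cyl",
       legend = levels(mtcars$cyl_f),
       col = seq_along(levels(mtcars$cyl_f)), pch = 19)

# Lattice: a formula describes the plot; "| cyl_f" creates one panel per group
p_lattice <- xyplot(mpg ~ hp | cyl_f, data = mtcars,
                    main = "Lattice: mpg vs. hp by cyl")
print(p_lattice)

# ggplot2: aesthetics are mapped once, layers are added with +,
# and the color legend is generated automatically
p_gg <- ggplot(mtcars, aes(x = hp, y = mpg, color = cyl_f)) +
  geom_point() +
  labs(title = "ggplot2: mpg vs. hp by cyl", color = "cyl")
print(p_gg)
```

Note that the lattice and ggplot2 calls return objects that are drawn when printed, while the base calls draw as a side effect and return nothing to reuse.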

Which system gave the most control?

ggplot2 provided the most 'publication-quality' output with minimal code. I could map a fourth variable (weight, as point size) and overlay regression lines within a single coherent block of code, which I could not do as easily in base R or lattice.

Challenges encountered:

One challenge was a set of warnings that the size aesthetic was being dropped during statistical transformation. I resolved this by moving the size mapping into the geom_point() layer and using the newer linewidth parameter in geom_smooth(). This keeps the regression line focused on the relationship between horsepower and MPG, without passing along the weight variable, which is meant only for the individual observations.
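
The fix described above might look like the following sketch. The post does not name its dataset, so using mtcars (with hp, mpg, and wt) is an assumption based on the mention of horsepower, MPG, and weight. The object names p_global and p_fixed are hypothetical.

```r
library(ggplot2)

# Assumed example data: mtcars from the datasets package
data("mtcars", package = "datasets")

# Version that can trigger the warnings when rendered: size is mapped
# globally, so geom_smooth() inherits it and has to drop it when fitting.
p_global <- ggplot(mtcars, aes(x = hp, y = mpg, size = wt)) +
  geom_point() +
  geom_smooth(method = "lm")

# Revised version: size is mapped only in the point layer, and the line
# thickness is set with linewidth (introduced in ggplot2 3.4.0).
p_fixed <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(aes(size = wt), alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, linewidth = 1) +
  labs(title = "ggplot2: mpg vs. hp, point size by weight",
       x = "Horsepower", y = "Miles per gallon",
       size = "Weight (1000 lbs)")

print(p_fixed)
```

Keeping the size mapping local to geom_point() means the smoothing layer only ever sees x and y, which is why the warning goes away.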

The broader challenge was the 'mental shift' between the formula-based approach of lattice and the layered approach of ggplot2. In lattice, you define the panels up front in the formula; in ggplot2, you can add facets or layers at any point in the process. Additionally, base R has no built-in way to generate legends automatically for several mapped variables, which is one reason many analysts now favor ggplot2.
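
As a rough illustration of that difference, the sketch below builds a ggplot2 object first and adds a trend line and facets afterwards, then shows the lattice equivalent, where the conditioning variable is part of the initial formula. It again assumes the built-in mtcars data; p_points, p_layered, p_faceted, and p_lat are hypothetical names.

```r
library(ggplot2)
library(lattice)

# Assumed example data: mtcars from the datasets package
data("mtcars", package = "datasets")

# ggplot2: start with a plain scatter plot stored as an object
p_points <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point()

# Add a trend layer later, without rewriting the original call
p_layered <- p_points +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Add faceting later still; the earlier objects are left unchanged
p_faceted <- p_layered + facet_wrap(~ cyl)
print(p_faceted)

# Lattice: the panels come from the "| factor(cyl)" term in the formula,
# and the per-panel regression line is requested through type = c("p", "r")
p_lat <- xyplot(mpg ~ hp | factor(cyl), data = mtcars,
                type = c("p", "r"),
                main = "Lattice: mpg vs. hp by cyl")
print(p_lat)
```

Both approaches end at a similar three-panel display; the difference is mainly in when the panel structure has to be decided.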

Submission


Disclaimer:


Generative AI is integrated into my professional workflow for drafting, structural organization, and code optimization. To avoid redundancy, this statement serves as a standing disclaimer for all entries. Generative AI has been utilized to ensure technical accuracy and to facilitate the very documentation requirements mandated by the curriculum available within the course syllabus.
