Skip to content
Go back

ggplot From Zero 01: Your First Plot

On this page13 sections
Cover image for ggplot From Zero 01: Your First Plot

Who This Is For

This article is for readers who already know how to run a short R script, but have never built a plot with ggplot2. The goal is simple: make one successful plot first, understand the smallest useful ggplot() pattern, and leave with enough confidence to keep going.

What You Will Do

  • Load ggplot2 and a beginner-friendly dataset.
  • Build your first scatter plot with ggplot() and geom_point().
  • Add one more variable through color and shape.
  • Learn what data, mapping, and aes() are doing.

Before You Start

  • You need a working R environment.
  • You need ggplot2 and palmerpenguins.
  • You should know how to run a script with Rscript or inside RStudio.

The companion script for this article is:

R draw/scripts/01-ggplot-from-zero-your-first-plot.R

Show Explanation

Package Setup

If you are using the planned Conda environment for this series, create and activate it first:

```bash
conda env create -f "R draw/scripts/r-ggplot-environment.yml"
conda activate r-ggplot
```

If you are using an existing R installation instead, install the packages once:

```r
install.packages(c("ggplot2", "palmerpenguins"))
```
```

Step 1: Load the Packages and Clean the Data

Start with the smallest setup that still feels realistic. palmerpenguins::penguins is a popular teaching dataset because it has multiple numeric and categorical variables that work well with scatter plots.

library(ggplot2)
library(palmerpenguins)

penguins_clean <- na.omit(
  penguins[, c("bill_length_mm", "bill_depth_mm", "species", "sex")]
)

This step matters because ggplot2 only becomes pleasant when the data you send into it already has the columns you need and no missing values in the key plotting fields.

Step 2: Build the First Successful Plot

Now build the smallest complete ggplot:

ggplot(
  data = penguins_clean,
  mapping = aes(x = bill_length_mm, y = bill_depth_mm)
) +
  geom_point()

This is the core pattern you should remember:

  • ggplot(data = ...) tells ggplot where the data comes from.
  • aes(...) maps columns in the data to visual roles.
  • geom_point() says you want points rather than bars, lines, or boxes.

Here is the beginner-friendly version generated by the companion script:

01-first-scatter-basic.png

Step 3: Add One More Variable with Aesthetic Mapping

Once the first plot works, the next useful step is not “make it fancy.” It is “show more information clearly.”

ggplot(
  data = penguins_clean,
  mapping = aes(
    x = bill_length_mm,
    y = bill_depth_mm,
    color = species,
    shape = sex
  )
) +
  geom_point(size = 2.8, alpha = 0.85)

This version maps:

  • species to color
  • sex to shape

That lets one plot carry more meaning without changing the underlying geometry.

01-first-scatter-mapped.png

Step 4: Understand the Smallest Mental Model

For this first article, keep the mental model small:

  1. Start with a data frame.
  2. Choose the columns you want to compare.
  3. Put those columns inside aes().
  4. Add one geom_*() layer.

If you remember only one thing today, remember this:

ggplot(data = ..., aes(x = ..., y = ...)) + geom_point()

That line is the foundation for almost everything else in the series.

How to Confirm It Worked

  • Your script runs without an error after library(ggplot2).
  • You can print a scatter plot in RStudio or another graphics device.
  • The companion script creates:
    • R draw/figures/01-first-scatter-basic.png
    • R draw/figures/01-first-scatter-mapped.png

Common Questions

Why start with a scatter plot instead of a bar chart?

Scatter plots are a very direct way to understand the grammar of x, y, and mapped aesthetics. They help you see the link between data columns and plot structure right away.

What is the difference between mapping = aes(...) and color = "blue"?

Inside aes(), you are mapping a variable from your data. Outside aes(), you are setting one fixed visual value for every point.

Do I need to write mapping = every time?

No. Many people write ggplot(data, aes(...)) as a shortcut. This tutorial keeps the explicit names first because they are easier for beginners to read.

Review Score

Score: 93/100 Verdict: This draft is ready for the tutorial queue and gives beginners a clean first win with ggplot2.

Show Explanation

Score Breakdown

  • Accuracy: 23/25. The article teaches the canonical ggplot() + geom_point() workflow and keeps the first explanation focused.
  • Beginner friendliness: 24/25. The mental model is intentionally small and the examples are not overloaded.
  • Reproducibility: 24/25. A companion script and generated figures exist, and the data source is stable.
  • Professional judgment and risk handling: 22/25. The article avoids early complexity, though later articles will still need to reinforce the difference between mapping and setting.

Review Notes

  • Ready for human review.
  • Before publication, add one console screenshot from the script run if you want the setup to feel even more guided.
```

Personnel

  • ✍ Creator: Chenglin Cai
  • 🤖 AI Collaboration: ChatGPT
  • 🧪 Data Provider: palmerpenguins package dataset
  • 💻 Code Contributor: ChatGPT