Who This Is For

This article is for readers who want to see what a complete beginner-friendly ggplot workflow looks like from start to finish. Instead of learning one isolated geom, you will move from a raw table to a final publishable figure using the same ideas from the earlier articles.

What You Will Do

Start from a small synthetic bioinformatics-style table.
Summarize and clean the data.
Build a rough first plot.
Improve it with grouping, faceting, labels, and styling.
Export a final figure and compare it to the rough version.

Before You Start

You should be comfortable with the first seven articles in this series.
You need ggplot2, dplyr, tidyr, ggrepel, and patchwork.
You should understand that plot design is a sequence of decisions, not a one-line trick.

The companion script for this article is:

R draw/scripts/08-ggplot-from-zero-full-workflow.R

Step 1: Start from a Raw Table

The script creates a synthetic table with:

sample_id
module
condition
batch
replicate
score

This dataset is small on purpose. The goal is not biological realism at full scale. The goal is to practice the workflow with a structure that feels closer to real scientific data than the earlier public teaching datasets.

raw_bioinfo <- tidyr::expand_grid(
  sample_id = paste0("Sample_", 1:24),
  module = c("Inflammation", "Cell cycle", "Interferon"),
  condition = c("Control", "Treatment"),
  batch = c("Batch_1", "Batch_2"),
  replicate = 1:3
)

Step 2: Summarize Before Plotting

Rather than plotting every raw row immediately, the script creates a summary table first:

workflow_table <- raw_bioinfo %>%
  group_by(module, condition, batch) %>%
  summarise(
    mean_score = mean(score),
    sd_score = sd(score),
    n = n(),
    se = sd_score / sqrt(n),
    .groups = "drop"
  )

This is a very common scientific plotting pattern. The figure is usually built from an analysis-ready summary table, not always from the raw table directly.

Step 3: Build a Rough First Draft

Start with the simplest plot that answers the question at all.

plot_rough <- ggplot(workflow_table, aes(x = module, y = mean_score, fill = condition)) +
  geom_col(position = "dodge") +
  labs(title = "Rough first draft")

That is enough to test whether the plot type is reasonable.

Step 4: Improve the Plot Deliberately

Then improve it step by step:

better labels
explicit colors
error bars
faceting by batch
cleaner theme
better legend placement

The companion script produces this comparison:

And here is the final version on its own:

Step 5: Highlight the Most Important Result

The series ends with one more refinement: selective labeling.

Instead of labeling everything, the script labels only the highest summary values within each batch. That keeps the figure readable while still drawing the reader’s attention.

Step 6: Adapt This Workflow to Your Own Data

If you want to reuse this pattern for your own project, replace these pieces first:

the raw input table
the grouping variables
the summary statistics
the color scale
the faceting variable

You do not need to rewrite the whole plot from zero every time. You only need to swap the parts that define your own scientific question.

Show Explanation

Files Produced by the Companion Script

R draw/figures/08-full-workflow-rough-vs-final.png
R draw/figures/08-full-workflow-final.png
R draw/figures/08-full-workflow-labeled.png
R draw/figures/08-full-workflow-raw-table.csv
R draw/figures/08-full-workflow-summary-table.csv

```

How to Confirm It Worked

The script creates all three figure files and both CSV files.
You can explain why the rough plot is weaker than the final plot.
You can point to at least three design decisions that improved the final figure.
You can identify which columns you would replace in your own dataset.

Common Questions

Why use synthetic data here instead of a downloaded real dataset?

Because the goal is a stable, self-contained tutorial that still feels close to a real scientific workflow without adding fragile external download steps.

Why make a rough version first?

Because rough drafts help you test the question and the plot type quickly. They are part of a good workflow, not a sign of failure.

What is the main lesson of the whole series?

A good ggplot figure is built by a chain of small decisions: data, mapping, geometry, structure, style, and export.

Review Score

Score: 95/100 Verdict: This draft is ready for human review and works well as the series finale because it turns the earlier isolated lessons into one complete workflow.