Making Decent Graphs

One of the double-edged swords of R is its flexibility. When it comes to making graphs, this flexibility provides almost unlimited options, but it also means that there’s no single “right” way to make a graph. Here, I provide some tips on how to make decent-looking graphs, primarily with built-in functions. The whole process seems kind of weird at first, but it becomes more natural as you get used to it.


Histograms

The function to plot a histogram in R is “hist()”. To plot a histogram, you can use the following commands (each command should be on one line, even though here they may spill over to multiple lines):

mydata <- c(10, 25, 32, 11, 5, 15, 22, 16, 12, 4, 4, 21, 17, 11, 22, 33, 6) #Creates a vector of data

dev.new(height=4) #Create a new window for the plot (dev.new() works on any platform)

hist(mydata) #Plot a default histogram


These three commands result in a default histogram, which does its job but looks mediocre at best. For instance, the lines are too thin, the labels might be less than ideal, and so forth.

You can add commands to your histogram to customize it. To increase the line width, use par(lwd=3) before the hist() command. Additional commands can be called within the hist parentheses. Here’s one potential approach that results in a nicer histogram (I’ll explain the commands below):

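mydata <- c(10, 25, 32, 11, 5, 15, 22, 16, 12, 4, 4, 21, 17, 11, 22, 33, 6) #Creates a vector of data

dev.new(height=4) #Create a new window for the plot (dev.new() works on any platform)

par(lwd=3) #Increase the line width for everything drawn in this window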


hist(mydata, main="My Histogram", xlab=list("Value", cex=1.3), ylab=list("Number of Observations", cex=1.3), lwd=3, breaks=c(0,10,20,30,40), col="maroon", cex.axis=1.3)

This histogram has thicker lines, filled bars, and custom labels. The font is also bigger, making it more readable. Notice how most of the commands are just piled inside the hist() parentheses (separated by commas with no carriage returns). This feature is common in R. The meaning of each command is:

main: Sets the title for the graph.

xlab: Sets the x-axis label. By using a list, we can pass more than one command to xlab, and here we are passing the text and a parameter setting.

cex: A text scaling factor. Play around with this value until you get a text size you like.

ylab: Like xlab but for the y-axis.

lwd: Sets the line width. This statement can have different effects depending on the type of graph.

breaks: Manually sets the divisions between categories.

col: Sets the fill color for the bars.

cex.axis: A scaling factor for the text size on the axes.
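
To see what breaks does to the bins, one option is to ask hist() for the counts without drawing anything. This is a minimal sketch using the mydata vector from above; plot=FALSE is a standard hist() argument, and the name h is just a placeholder chosen for this example.

```r
mydata <- c(10, 25, 32, 11, 5, 15, 22, 16, 12, 4, 4, 21, 17, 11, 22, 33, 6)

# Compute the histogram without drawing it
h <- hist(mydata, breaks=c(0,10,20,30,40), plot=FALSE)

h$breaks  # the bin edges that were requested
h$counts  # number of observations in each bin

# Every observation should land in exactly one bin
sum(h$counts) == length(mydata)
```

By default, a value that sits exactly on a break (such as 10) is counted in the bin to its left, so 10 falls in the 0 to 10 bin rather than the 10 to 20 bin.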

See the R help page for more ways to manipulate histograms. You can get there by typing ?hist at the R command prompt.
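
Because par(lwd=3) changes a setting for the whole graphics window, it keeps affecting later plots drawn there. A common pattern, sketched here, is to save the old settings and restore them afterwards; old_par is a name made up for this example.

```r
mydata <- c(10, 25, 32, 11, 5, 15, 22, 16, 12, 4, 4, 21, 17, 11, 22, 33, 6)

# par() hands back the previous values of whatever it changes
old_par <- par(lwd=3)

# This histogram is drawn with the thicker lines
hist(mydata, col="maroon")

# Put the line width back to what it was before
par(old_par)
par("lwd")
```

Plots made after par(old_par) go back to the original line width.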


Scatter Plots

Scatter plots are a very common way to show the association between two variables. Here is a way to generate a scatter plot in R using the built-in “plot” command.

First, here are two vectors containing the independent and dependent variables:

x_variable <- c(0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3)
y_variable <- c(0,0,0,0,0,178,74,48,174,71,280,47,427,245,452,308,364,452,255)

Now, here is a basic scatter plot, which will not be very nice looking:

#Most Basic Scatter Plot:
plot(y_variable ~ x_variable)

We can tweak the scatter plot to make it look better by adding some additional commands. Play around with some of the values to see what the commands actually do to the appearance of the plot. This code will generate a reasonably pleasant looking figure:

#Better Scatter Plot:

dev.new(width = 6, height = 5) #Create a new window for the plot

plot(my_plot <- y_variable ~ x_variable, xlab=list("Independent Variable", cex=1.3), ylab=list("Dependent Variable", cex=1.3), pch=21, ylim=c(0,500), cex.axis=1.3)

box(lwd=3)

axis(1, lwd=3, cex.axis=1.3)

axis(2, lwd=3, cex.axis=1.3)

points(my_plot, pch=21, col="black", lwd=2, bg="red", cex=1.5)
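
The assignment inside plot() can look mysterious, so here is a small sketch of what it stores. It assumes only the two vectors defined earlier and does the assignment on its own line to make it easier to see; my_plot ends up holding a formula rather than the drawn plot, which is why points() can reuse it.

```r
x_variable <- c(0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3)
y_variable <- c(0,0,0,0,0,178,74,48,174,71,280,47,427,245,452,308,364,452,255)

# The formula can be stored on its own, before any plotting happens
my_plot <- y_variable ~ x_variable
class(my_plot)

# Both calls read the formula the same way:
# y_variable goes on the y-axis, x_variable on the x-axis
plot(my_plot, ylim=c(0,500))
points(my_plot, pch=21, col="black", bg="red", cex=1.5)
```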

Here are the definitions of some of the key commands:

xlab and ylab: These allow us to access the labels for the axes, and here we’re changing the text content and size.

cex: Scales the size of the text, or the size of the points when it is used inside points().

pch: Chooses the symbol for the points on the graph. Try different numbers between 1 and 25 to see how it changes the graph’s appearance. This number should be the same in the plot and points commands or things might get a little weird.

ylim: Changes the range of values on the y-axis. A similar command, xlim, does the same thing for the x-axis.

box(lwd=3): Changes the thickness of the border around the plot.

axis(): Selects an axis (1 is the x-axis, 2 is the y-axis) and allows us to change the line width and text size of that axis.

points(): Allows us to change the appearance of the points in the plot. Note that we are using the name my_plot to access the points. The <- operator inside the plot command assigns the formula y_variable ~ x_variable to that name, so points() draws over the same data. Then we are changing the border of each point (with col) and the background of each point (with bg). The commands lwd and cex change the line width and overall size of each point.
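
To compare the symbols without redrawing the scatter plot each time, one option is to draw all 25 of them in a single chart. This is a rough sketch; the name pch_values and the layout are choices made for this example.

```r
pch_values <- 1:25  # the pch numbers suggested above

# One point per symbol, laid out in a single row, with the y-axis hidden
plot(pch_values, rep(1, 25), pch=pch_values, col="black", bg="red", cex=1.5,
     xlab="pch value", ylab="", yaxt="n")

# Label every symbol with its number, just above the point
text(pch_values, rep(1, 25), labels=pch_values, pos=3)
```

Only symbols 21 through 25 have a fill, so bg has no visible effect on the others.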