Today I will talk about the ggplot2 package in R, which is based on the grammar of graphics: the idea that you can build every graph from the same components, namely a data set, a coordinate system, and geoms (visual marks that represent data points). We previously did an article about seaborn in Python, which is used to create graphs easily, often with just a line of code. Similarly, R has a package called ggplot2, which is famous for its ease of use and professional-looking graphs. Many of the graphs and visual representations of data you see online and in print are made using ggplot2. I will also introduce a package called esquisse.
Why bother with ggplot & esquisse when Excel can do just fine?
There are times when graphs made in Excel are sufficient, but sometimes you want to up your game with professional-looking graphs. Here are two graphs for your reference: one made using Excel and the other using ggplot.
Installation of ggplot2
The ggplot2 package is part of the tidyverse. It is installed along with the tidyverse when you install that, or you can install it separately with install.packages("ggplot2").
Just like any other package, you can load it with the library command: library(ggplot2).
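As a minimal sketch of the install-and-load step, the snippet below installs ggplot2 only when it is missing and then loads it. The requireNamespace() guard and the explicit CRAN mirror are my own additions for convenience; a plain install.packages("ggplot2") works just as well if a mirror is already configured.

```r
# Install ggplot2 only when it is not already available.
# The guard and the explicit mirror are illustrative additions.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cloud.r-project.org")
}

# Load the package so ggplot(), aes(), geom_point() and the
# bundled mpg dataset become available in the session.
library(ggplot2)
```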
To plot a graph, start with the following code:
ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point()
In the example above, the command has the following components:
- ggplot() is the function that starts a new plot.
- mpg is the name of the dataset. Replace it with the name of your own dataset.
- aes() is used to define the aesthetics of the graph, that is, which columns map to which visual properties. You can specify the x-axis and y-axis here.
- displ is the column name to be used as the x-axis. Alternatively, it can be written as x = displ to make the code easier to read.
- hwy is the column name to be used as the y-axis. Alternatively, it can be written as y = hwy to make the code easier to read.
- colour = sets the colour of the points in the chart. It can refer to a column name (here, class), so that each value in that column gets its own colour. You can also create the graph without the colour part.
- geom_point() tells ggplot to create a point graph, where data is represented by points. If you need to change the graph type, this is the part to modify.
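To make the components above concrete, here is a small sketch of the same plot with the axes named explicitly, followed by a variant where only the geom changes. The object names p_points and p_box, and the choice of a boxplot with class on the x-axis, are my own illustrations.

```r
library(ggplot2)

# Same scatter plot as above, with the axes named explicitly
p_points <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point()

# Changing the graph type means changing the geom:
# here, a boxplot of highway mileage for each vehicle class
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

print(p_points)
print(p_box)
```

Storing a plot in a variable and printing it later is optional; typing the ggplot() expression directly at the console draws it straight away.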
To get the full functionality of ggplot, you can use the following template and replace the components as required:
ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPING>), stat = <STAT>, position = <POSITION>) +
<COORDINATE_FUNCTION> + <FACET_FUNCTION> + <SCALE_FUNCTION> + <THEME_FUNCTION>
Now you might be thinking: that’s a scary-looking command. Do I need to provide all of this? You haven’t used much of it yourself in the example above.
Relax, you don’t need to worry about any of it. Only the data, the aesthetic mappings, and a geom are required; ggplot2 supplies sensible defaults for everything else. You can also refer to this handy cheatsheet created by the RStudio team.
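If it helps to see the template with every slot filled in, here is one possible sketch using the mpg dataset. The particular choices (faceting by drv, the "Set2" palette, theme_minimal()) are mine and are only there to occupy each slot; the stat, position, and coordinate values shown are the defaults you would get anyway.

```r
library(ggplot2)

p <- ggplot(data = mpg) +
  geom_point(
    mapping = aes(x = displ, y = hwy, colour = class),
    stat = "identity",     # the default stat for geom_point()
    position = "identity"  # the default position for geom_point()
  ) +
  coord_cartesian() +                     # coordinate function (the default)
  facet_wrap(~ drv) +                     # facet function: one panel per drive type
  scale_colour_brewer(palette = "Set2") + # scale function: a different colour palette
  theme_minimal()                         # theme function: a lighter look

print(p)
```

Dropping everything after geom_point() still gives a valid plot, which is why the first example could be so short.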
The package also offers a command for quick plots, called qplot. It is easier to use and takes fewer inputs, yet still produces very good output.
qplot(x = cty, y = hwy, data = mpg, geom = "point")
The command above also makes a point graph, with very few elements to enter.
So why bother with ggplot when qplot can do the same?
qplot is good for quick plots, whereas ggplot gives you far more options for customizing a plot.
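For comparison, this is a sketch of how the qplot() call above can be written with ggplot(), with a title added to show the kind of extra layer the longer form accepts. Note that recent ggplot2 releases (3.4.0 onward) mark qplot() as deprecated, so the ggplot() form is the safer one to learn. The title text is my own.

```r
library(ggplot2)

# ggplot() equivalent of qplot(x = cty, y = hwy, data = mpg, geom = "point")
p <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "City vs highway mileage")  # illustrative title, not from the original

print(p)
```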
I am too lazy to learn this command. What should I do?
Ah... fine. I know you are coming from Excel or PowerPoint, where making graphs is mostly point and click and drag and drop of data fields. But you want the power and beauty of R graphs.
Let me introduce you to esquisse, a package that lets you build your graphics by drag and drop and creates the code for you so you can reproduce them.
The package is available on CRAN and can be installed like any other, with install.packages("esquisse").
You can call the data selector using the command esquisse::esquisser(<DATA>),
where <DATA> is to be replaced with the name of the dataset you have loaded.
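As a small sketch of launching esquisse, the snippet below opens the builder on the mpg dataset from ggplot2. It assumes esquisse is already installed; the interactive() and requireNamespace() checks are my own guard, since the builder is a Shiny app that only makes sense in an interactive session.

```r
library(ggplot2)  # provides the mpg example dataset

# esquisser() opens an interactive drag-and-drop builder, so only
# launch it from an interactive session with esquisse installed.
if (interactive() && requireNamespace("esquisse", quietly = TRUE)) {
  esquisse::esquisser(mpg)
}
```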
Here you can select the data frame from the list of loaded data frames.
You can choose the plot type to be created. All the data fields appear in the selector, ready to be dragged and dropped.
You can label and title your graphs, and do all the customization that you can do from the command line.
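The code esquisse produces is ordinary ggplot2 code. As a hand-written sketch of what labels, a title, and a theme look like when typed out (this is not actual esquisse output, and the label text is invented for illustration):

```r
library(ggplot2)

p <- ggplot(mpg) +
  aes(x = displ, y = hwy, colour = class) +
  geom_point() +
  labs(
    title = "Engine size vs highway mileage",  # illustrative text
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  ) +
  theme_minimal()

print(p)
```

Because the result is plain code, you can paste it into a script and rerun it whenever the data changes.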
Want to learn more? There’s a whole list of packages that build on ggplot to further enhance its capabilities. A list can be found here.
Today we learned about ggplot & esquisse. The rest I leave to you: go and explore these beautiful packages. Don’t forget to share your experience with the rest of us in the comments below.