Waterfall charts using ggplot2 in R
A Waterfall Chart is a form of data visualization which helps in determining the cumulative effect of sequentially introduced positive or negative values – Wikipedia
Waterfall charts can be used in many areas, including inventory analysis, profit-loss analysis, and sales analysis. Excel is a popular tool for creating waterfall charts, but R also gives us a quick, easy way to create them!
We are going to visualize a profit and loss (P&L) statement by creating a waterfall chart. Let’s first create a dataset in R that lists the different sources of income and cost.
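A minimal sketch of such a dataset is shown below. The line items and amounts are made-up illustration values, not figures from a real statement; only the names balance, desc, and amount follow the text.

```r
# Hypothetical P&L line items; all amounts are invented for illustration.
# Income is positive, costs are negative, and the last row is the net total.
balance <- data.frame(
  desc = c("Product Revenue", "Services Revenue", "Cost of Goods Sold",
           "Operating Expenses", "Taxes", "Net Income"),
  amount = c(5000, 2000, -3000, -1500, -500, 2000),
  stringsAsFactors = FALSE
)

print(balance)
```

The last row is chosen so that it equals the sum of the rows above it, which is what lets it act as the total bar later on.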
Now let’s proceed with some data preparation steps in R.
- We convert desc to a factor.
- We then create a new column called type, which describes the type of each cash flow (in-flow, out-flow, or total net income).
Next, we create two columns called start and end.
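These two steps might look like the sketch below. The data values are the same invented ones as before, and the labels "in", "out", and "net" are assumed spellings for the three cash-flow types.

```r
# Hypothetical P&L data (invented amounts), repeated so the block runs alone.
balance <- data.frame(
  desc = c("Product Revenue", "Services Revenue", "Cost of Goods Sold",
           "Operating Expenses", "Taxes", "Net Income"),
  amount = c(5000, 2000, -3000, -1500, -500, 2000),
  stringsAsFactors = FALSE
)

# Convert desc to a factor whose levels follow the row order, so the
# categories are not re-sorted alphabetically.
balance$desc <- factor(balance$desc, levels = balance$desc)

# Classify each row: positive amounts are in-flows, negative ones out-flows.
balance$type <- ifelse(balance$amount > 0, "in", "out")

# The total row is labelled separately (assumed label: "net").
balance$type[balance$desc == "Net Income"] <- "net"

print(balance)
```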
- end is the cumulative sum of the amount.
- start is the end variable lagged by one row, so each bar begins where the previous one finished.
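A sketch of this computation follows, again on invented data. One detail here is an assumption rather than something the text spells out: the net row is treated as a total, so its end is reset to 0 and its bar spans from zero up to the running balance.

```r
# Hypothetical P&L data (invented amounts), repeated so the block runs alone.
balance <- data.frame(
  desc = c("Product Revenue", "Services Revenue", "Cost of Goods Sold",
           "Operating Expenses", "Taxes", "Net Income"),
  amount = c(5000, 2000, -3000, -1500, -500, 2000),
  stringsAsFactors = FALSE
)
balance$type <- ifelse(balance$amount > 0, "in", "out")
balance$type[balance$desc == "Net Income"] <- "net"

# end: running total after each line item.
balance$end <- cumsum(balance$amount)

# Assumption: the total row should be drawn down to zero rather than
# stacked on top of the running total.
balance$end[balance$type == "net"] <- 0

# start: the previous row's end, with 0 for the first row.
balance$start <- c(0, head(balance$end, -1))

print(balance[, c("desc", "amount", "start", "end")])
```

With these values, the "Taxes" row starts at 2500 and ends at 2000, and the "Net Income" row starts at 2000 and ends at 0.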
We are going to use the ggplot function to plot the different line items of the P&L statement (the balance dataset we created). The chart also shows the range covered by each type of cash flow (in, out, or net).
Let’s look at the various components of a ggplot call to understand this better:
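Putting the preparation steps together, a waterfall plot could be sketched as below. The data is invented, and id is a hypothetical helper column that gives each bar a numeric position. This sketch keeps the x axis numeric via id and uses desc only for the tick labels, which is a choice made here and may differ from the original call.

```r
library(ggplot2)

# Hypothetical P&L data (invented amounts).
balance <- data.frame(
  desc = c("Product Revenue", "Services Revenue", "Cost of Goods Sold",
           "Operating Expenses", "Taxes", "Net Income"),
  amount = c(5000, 2000, -3000, -1500, -500, 2000),
  stringsAsFactors = FALSE
)
balance$desc <- factor(balance$desc, levels = balance$desc)
balance$type <- ifelse(balance$amount > 0, "in", "out")
balance$type[balance$desc == "Net Income"] <- "net"
balance$end <- cumsum(balance$amount)
balance$end[balance$type == "net"] <- 0        # assumption: total bar ends at 0
balance$start <- c(0, head(balance$end, -1))
balance$id <- seq_along(balance$amount)        # hypothetical helper: bar position

# One rectangle per line item, spanning start to end vertically and
# 0.9 units horizontally, coloured by cash-flow type.
p <- ggplot(balance, aes(fill = type)) +
  geom_rect(aes(xmin = id - 0.45, xmax = id + 0.45,
                ymin = end, ymax = start)) +
  scale_x_continuous(breaks = balance$id, labels = levels(balance$desc))

print(p)
```

Keeping x numeric avoids mixing discrete and continuous values on the same axis; the desc labels are attached afterwards through the scale.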
- data refers to the dataset that you are looking to visualize
- mapping is the aesthetic mapping that describes the relationship between the variables and the visual attributes
- geom, short for geometric object, describes the type of plot, e.g. lines, points, etc.
- stat transforms your data before plotting, e.g. binning data or computing quantiles.
- facet helps to display subsets of data in different panels.
- scales helps to control the mapping between data and aesthetics
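To see where each component sits in a call, here is a small sketch that is unrelated to the waterfall data. It uses the built-in mtcars dataset purely for illustration, and the choice of a histogram is an assumption made to show the binning stat mentioned above.

```r
library(ggplot2)

p_components <- ggplot(data = mtcars,                 # data
                       mapping = aes(x = mpg)) +      # mapping
  geom_histogram(stat = "bin", bins = 10) +           # geom, with the binning stat
  facet_wrap(~ cyl) +                                 # facet: one panel per cyl value
  scale_x_continuous(name = "Miles per gallon")       # scale

print(p_components)
```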
The beauty of plotting charts using ggplot is that we can add functions as layers, using the “+” symbol. For example, in the waterfall chart above, we first call the ggplot function with the desc variable. Next, we add another layer using geom_rect, which specifies that the chart is drawn with rectangles. We also specify the vertical extent of each rectangular bar through the ymin and ymax aesthetics (ymin = end, ymax = start).
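The layering idea can be sketched step by step as follows. The data is invented, id is a hypothetical helper column, and the colours, labels, and title are arbitrary choices for illustration. As a simplification in this sketch, x is kept numeric and desc is used for the axis labels.

```r
library(ggplot2)

# Hypothetical P&L data (invented amounts), prepared as described earlier.
balance <- data.frame(
  desc = c("Product Revenue", "Services Revenue", "Cost of Goods Sold",
           "Operating Expenses", "Taxes", "Net Income"),
  amount = c(5000, 2000, -3000, -1500, -500, 2000),
  stringsAsFactors = FALSE
)
balance$desc <- factor(balance$desc, levels = balance$desc)
balance$type <- ifelse(balance$amount > 0, "in", "out")
balance$type[balance$desc == "Net Income"] <- "net"
balance$end <- cumsum(balance$amount)
balance$end[balance$type == "net"] <- 0        # assumption: total bar ends at 0
balance$start <- c(0, head(balance$end, -1))
balance$id <- seq_along(balance$amount)        # hypothetical helper: bar position

# Step 1: the base plot holds the data and the shared fill mapping.
base <- ggplot(balance, aes(fill = type))

# Step 2: add the rectangle layer with "+".
bars <- base +
  geom_rect(aes(xmin = id - 0.45, xmax = id + 0.45,
                ymin = end, ymax = start))

# Step 3: keep adding layers for axis labels, colours, and titles.
finished <- bars +
  scale_x_continuous(breaks = balance$id, labels = levels(balance$desc)) +
  scale_fill_manual(values = c("in" = "steelblue", "out" = "firebrick",
                               "net" = "grey40")) +
  labs(x = NULL, y = "Amount", fill = "Cash flow",
       title = "Profit and loss waterfall")

print(finished)
```

Each intermediate object (base, bars, finished) is a complete plot specification, so any of them can be stored, printed, or extended with further layers.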
So here’s your waterfall chart! You can also refer to the following link to add different kinds of layers to your chart.