last hacked on Jul 22, 2017

## Abstract

We will be exploring the Craft Beers data set from [Kaggle](https://www.kaggle.com/nickhould/craft-cans) and attempting to answer two main questions:

* Is there a direct correlation between IBU and ABV in beer?
* Which states produce the most beer/have the most breweries?

## The Formula, The Process, The Alcohol

* Subtract the Final Gravity from the Original Gravity
* Multiply this number by 131.25
* The resulting number is your alcohol percent, or ABV%

To figure out the alcohol content, we need to measure the sugar content at the beginning and at the end of the brewing process. Breweries typically use a hydrometer, which gives us the specific gravity of the beer. By gravity, we mean the density of a liquid relative to water.

__Fun Fact__: Alcohol is essentially yeast pee. As the yeast consumes the sugar in the wort, it produces alcohol and carbon dioxide; the carbon dioxide floats up and out of the beer while the alcohol stays behind.

![alt-text](https://cloud.githubusercontent.com/assets/22850980/25462833/9d043aba-2aa6-11e7-8c56-c9817671778b.jpg)

## Is There a Correlation?

The first thing we did to check for a correlation between IBU and ABV was create a scatterplot comparing the two. We plotted all the beers in our data set that had IBU values and added a "line of best fit" using local polynomial regression fitting. There appears to be a slight overall trend of ABV increasing with IBU, but as you can see, there is far too much variance to call that a direct correlation between the two.

<iframe width="900" height="600" frameborder="0" scrolling="no" src="//plot.ly/~bryandaetz/14.embed"></iframe>

| Average ABV | Average IBU |
|-------------|-------------|
| 5.98%       | 42.71       |

Next, we decided to take our analysis a step further and explore IBU and ABV trends based on specific styles of beer. We wanted to see if styles of beer that had higher IBU ratings on average also had higher ABV ratings on average. However, since our data set had over 200 different styles of beer (many of which only appeared once or twice), we decided to focus our analysis on only the most common styles of beer.

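
The style filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the original analysis: the sample rows are made up, the function name `top_style_averages` is hypothetical, and it assumes ABV is stored as a fraction (for example 0.065 for 6.5%) with some IBU values missing.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample rows: (style, ABV as a fraction, IBU or None if missing).
# The real data set has far more beers and styles.
beers = [
    ("American IPA", 0.065, 65.0),
    ("American IPA", 0.070, 70.0),
    ("American IPA", 0.060, None),
    ("American Porter", 0.060, 30.0),
    ("American Porter", 0.058, 34.0),
    ("Cream Ale", 0.048, 25.0),
]

def top_style_averages(rows, n):
    """Return (style, average ABV in percent, average IBU) for the n most common styles."""
    counts = Counter(style for style, _, _ in rows)
    top_styles = [style for style, _ in counts.most_common(n)]

    abvs, ibus = defaultdict(list), defaultdict(list)
    for style, abv, ibu in rows:
        if style not in top_styles:
            continue
        if abv is not None:
            abvs[style].append(abv * 100)  # fraction -> percent
        if ibu is not None:
            ibus[style].append(ibu)  # skip beers with no IBU rating

    # Assumes every top style has at least one ABV and one IBU value.
    summary = [(s, mean(abvs[s]), mean(ibus[s])) for s in top_styles]
    # Sort by average ABV, highest first.
    return sorted(summary, key=lambda row: row[1], reverse=True)

for style, avg_abv, avg_ibu in top_style_averages(beers, n=2):
    print(f"{style}: {avg_abv:.2f}% ABV, {avg_ibu:.2f} IBU")
```

Restricting to the most frequent styles keeps rare styles, which may appear only once or twice, from producing averages based on a single beer.
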
<iframe width="900" height="600" frameborder="0" scrolling="no" src="//plot.ly/~bryandaetz/18.embed"></iframe>

We then created another scatterplot comparing IBU and ABV for only the 8 most common styles of beer. Ideally, we would've liked to see distinct groupings or clusters of points for each unique style of beer. While this was not the case overall, there seems to be a pretty clear grouping at the high IBU/high ABV end of the spectrum. American Double / Imperial IPA had the highest values for both IBU and ABV by far.

<iframe width="900" height="600" frameborder="0" scrolling="no" src="//plot.ly/~bryandaetz/16.embed"></iframe>

| Style                          | Average ABV (%) | Average IBU |
|--------------------------------|-----------------|-------------|
| American Double / Imperial IPA | 8.77            | 93.32       |
| American IPA                   | 6.48            | 67.63       |
| American Porter                | 6.03            | 31.92       |
| American Brown Ale             | 5.78            | 29.89       |
| American Amber / Red Ale       | 5.72            | 36.30       |
| American Pale Ale (APA)        | 5.50            | 44.94       |
| American Blonde Ale            | 5.01            | 20.98       |
| American Pale Wheat Ale        | 4.76            | 20.69       |

## Which States Produce the Most Beer/Have the Most Breweries?

To answer our second question, we decided the best approach would be to visualize the data geographically. We have presented our findings in an interactive [Shiny Dashboard](https://bryandaetz.shinyapps.io/CraftBeersShiny/).
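
Before mapping, the per-state numbers behind a view like this could be tallied along the following lines. This is only a sketch under stated assumptions: the rows are invented, the field layout (a brewery ID linking each beer to a brewery and its state) is assumed, and "most beer" is interpreted here as the number of distinct beers listed rather than production volume.

```python
from collections import Counter

# Hypothetical rows: (brewery_id, state).
breweries = [(0, "CO"), (1, "CO"), (2, "CA"), (3, "MI")]

# Hypothetical rows: (beer name, brewery_id of the brewery that makes it).
beer_rows = [
    ("A", 0), ("B", 0), ("C", 1),
    ("D", 2), ("E", 2), ("F", 2), ("G", 2),
    ("H", 3),
]

# Number of breweries in each state.
breweries_per_state = Counter(state for _, state in breweries)

# Look up each beer's state through its brewery, then count beers per state.
state_of = dict(breweries)
beers_per_state = Counter(state_of[brewery_id] for _, brewery_id in beer_rows)

print(breweries_per_state.most_common())
print(beers_per_state.most_common())
```

Keeping the two tallies separate matters because they can rank states differently: in this toy data one state has the most breweries while another has the most beers.
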

