PROJECT GROUP WINTER 2018

DATA SCIENCE AT UCSB

PYTHON

MEDIUM

last hacked on Jan 22, 2018

# Project Group Winter - Ravi

This quarter we will be focusing on the process of creating data science projects. More specifically, we will look at the life cycle of a data science project and at attainable goals for data science.

## Structure of Project Groups

We will treat this like a business, with each team representing a different facet of what the organization has to offer, in the form of a data science project.

## Data Science Life Cycle

Leading an organization becomes much harder when communication isn't clear, so it helps to share a common picture of how a project progresses. Example of the data science life cycle:

![](https://biguru.files.wordpress.com/2014/12/crisp-dm1.png)

### More concretely:

![](https://biguru.files.wordpress.com/2014/12/ds-lifecycle.png)

## Documentation

Documentation is a necessary evil, and whether it is written early on often decides whether a project will stand the test of time. Many projects remain unfinished and don't follow a uniform pattern, which makes it less likely that people will want to read them and possibly build on them.

### Different routes for Documentation

Of course there are many routes a team can take when documenting their project.

+ *Technical Paper* - Focused on the implementation of the project (e.g. a walkthrough of the code, the algorithm used, or the process used for data extraction). Examples include:
    + [Random Forest Walkthrough](https://www.datascience.com/resources/notebooks/random-forest-intro)
    + [Developing a Deep Learning Bag-of-Words Model](https://machinelearningmastery.com/deep-learning-bag-of-words-model-sentiment-analysis/)
+ *Anecdotal Blog* - Focused on a subject within the data set that reflects an interest or hobby within the team
    + [Rick and Morty Tidy Data Principles](https://www.r-bloggers.com/rick-and-morty-and-tidy-data-principles/)
    + [Largest Vocabulary in Hip Hop](https://pudding.cool/2017/02/vocabulary/)
+ *Real World Application* - Uses a data set provided by a 3rd party source (including competitions) to further research in the domain, with a real world application that tries to solve a problem
    + [Visualizing Thefts in Chicago](https://www.r-bloggers.com/visualising-thefts-using-heatmaps-in-ggplot2/)
    + [Breast Cancer Detection](https://arxiv.org/pdf/1711.07831.pdf)

All of these categories are facets of data science, and many projects overlap between the categories I made up. As beginners, many of us are often caught in the first two, since we are learning many of these tools and cycles on our own. When learning tools on our own, it often helps to create documentation, because something you are struggling with is something someone has struggled with before and someone else will struggle with in the future. We are trying to create a community that thrives on open contributions to all things *data science*.

## Benefits of Creating Documentation

+ Reproducibility
+ Interpretability
+ Open Collaboration

## Homework

We are going to create different forms of documentation in this project group. We will focus on the different paths mentioned earlier and produce succinct documentation for inertia7.com and for other outlets such as blogs and papers. Our voice will be heard.
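
As a small illustration of the reproducibility benefit listed earlier, here is a minimal Python sketch of a documented analysis step. The function name `sample_mean` and the idea of passing an explicit seed are assumptions made for this example, not part of any project in the group; the point is only that a docstring plus a fixed seed lets someone else rerun the step and get the same number.

```python
import random
import statistics


def sample_mean(seed: int, n: int = 100) -> float:
    """Draw n pseudo-random values in [0, 1) and return their mean.

    The seed is an explicit argument (a hypothetical convention for this
    sketch) so that anyone rerunning the step gets the same result.
    """
    rng = random.Random(seed)  # local generator, does not touch global state
    return statistics.mean(rng.random() for _ in range(n))


# Running the documented step twice with the same seed gives the same value.
first = sample_mean(seed=42)
second = sample_mean(seed=42)
print(first == second)  # True
```

Writing down the seed and the purpose of the step in the docstring is one simple way to make a write-up reproducible; real projects would typically also record package versions and data sources.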
