Welcome to the second post!
Quick Recap: This is part of my series of blog posts dedicated to showing how Python (and its various libraries) is a programming language that will increase your prowess as a data analyst.
Ready to join me on this journey? Awesome! But first, let's get you set up. If you're anything like me, learning by doing is the most efficient way to dive into a new subject. Let me guide you through setting up your environment so that you can actually follow along with the examples in this blog series. It shouldn't take more than an hour.
3 Quick Notes:
This set-up guide is geared towards newcomers to the world of programming. If you feel comfortable in a terminal or command prompt, then this guide is not for you. If you are looking for a summary of system requirements, then scroll down to the end of this post.
If you are looking to understand why one set-up is preferred over another, this guide is also not for you. This guide tells you which set-up you need on your local machine to use Python for data science work and visualizations. Just cut-and-dried instructions to get you ready to code for data analysis. If you are looking for detailed explanations for setting up environments or you just want a deeper dive, scroll down to the end of this post for a list of resources.
Most important note: one extremely helpful lesson I have learned in my professional life is that using Google and sources such as stackexchange.com to search for troubleshooting solutions is a vital skill. Knowing how to search (i.e., knowing which keywords and sources to use) to find answers as you work through your specific projects and business problems is a skill that can be honed and become part of your arsenal as a professional. If you have any hiccups during set-up or at any point throughout this blog series, Google your issue for solutions! It is a good habit for any budding programmer or data analyst. Also, feel free to reach out to me!
What exactly are we downloading here?
Since the goal is to use Python for data science, we need to make sure we have all the right packages, versions, libraries, and so forth to do just that. The closest thing we could do to waving a magic wand and having an instant spin up is to download Anaconda Navigator.
Straight from the source: "...the open source Anaconda Distribution is the easiest way to do Python data science and machine learning. It includes 250+ popular data science packages and the conda package and virtual environment manager for Windows, Linux, and MacOS."
Anaconda Navigator contains Jupyter Notebook, which we will be using throughout this whole blog series. For more information about using Navigator, see the Navigator documentation.
STEP 1: Determine Your System Details
STEP 2: Download Anaconda
STEP 3: Install Anaconda
Let's not waste great resources. The Anaconda documentation provides a well-made set of detailed instructions, which include screenshots and helpful notes for common errors during the installation process. There are even specific guides for each OS!
Click on the installation guide for your OS below, and start following from Step #3. Come back to this post when you're done:
Mac OS -- Follow steps #3 - 10: https://docs.anaconda.com/anaconda/install/mac-os#macos-graphical-install
Windows -- Follow steps #3 - 15: https://docs.anaconda.com/anaconda/install/windows
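Once the installer finishes, an optional sanity check is to ask Python which interpreter is actually running. The sketch below is my own addition, not part of the Anaconda guide: start Python (type `python` in a terminal or Anaconda Prompt) and paste it in. The function name `describe_interpreter` is made up for illustration, and looking for a `conda-meta` folder is only a heuristic for "this Python belongs to a conda install", not an official test.

```python
import sys
from pathlib import Path


def describe_interpreter():
    """Collect a few facts about the Python that is running this code."""
    prefix = Path(sys.prefix)  # root folder of the current Python installation
    return {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,
        # Heuristic (assumption): conda environments keep a 'conda-meta' folder.
        "looks_like_conda": (prefix / "conda-meta").is_dir(),
    }


for name, value in describe_interpreter().items():
    print(f"{name}: {value}")
```

If `looks_like_conda` comes back False, your terminal is probably picking up a different Python than the one Anaconda installed, which is a good thing to Google before moving on.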
STEP 4: Install Bokeh
In this blog series, we will be primarily using Bokeh to create data visualizations. If you prefer or need to use other data-viz libraries, such as seaborn and matplotlib, please do, and consider this step optional. I prefer Bokeh for several reasons, but mainly because I created a Bokeh data visualization guide for myself.
Let's take a look at the easiest way to install Bokeh, straight from the source:
Okay, let's do that!
On a Mac, press 'Command ⌘' and the space bar to open Spotlight search. Then type in 'terminal'.
How to Open Command Prompt
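After the install command has run (for Anaconda users, the Bokeh documentation gives `conda install bokeh`), you may want to confirm that Bokeh really landed in your environment. Here is a minimal sketch using only the Python standard library (Python 3.8 or newer); the helper name `installed_version` is my own invention for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


bokeh_version = installed_version("bokeh")
if bokeh_version is None:
    print("Bokeh is not installed in this environment yet.")
else:
    print(f"Bokeh {bokeh_version} is installed.")
```

The same helper works for any other package name, so you can reuse it for seaborn or matplotlib if you went that route instead.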
STEP 5: Test Your New Environment!
First, let's create a new Jupyter Notebook file (extension .ipynb). There are several ways you can go about this, but I'm just going to walk you through the most efficient route.
Enter the command "jupyter notebook" in a terminal or command prompt, as shown below:
If you want, you can navigate to a specific folder where you plan to save your .ipynb files. I navigated to my 'Documents' folder.
A new tab should open up in your browser. Welcome to your first Jupyter Notebook!
Again, I will go over how to use these things in the next blog post. For now, let's just run some code to make sure we have what we need.
Make sure that 'Code' is selected, as shown above.
If you want, click on 'Untitled' to rename your file.
from bokeh.plotting import figure, output_notebook, show
# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]
# output in notebook
output_notebook()
# create a new plot with a title and axis labels
# (recent Bokeh releases use height/width in place of plot_height/plot_width)
p = figure(height=300, width=400, title="simple line example", x_axis_label='x', y_axis_label='y')
# add a line renderer with legend and line thickness
# (legend_label replaces the older legend keyword)
p.line(x, y, legend_label="Temp.", line_width=2)
# show the results
show(p)
Success! I hope. If you got the expected output, go on to the next blog post for the real fun (or head back to the ebook if you are coming from the Python Visualization Handbook)!
If you've encountered some errors, be sure to check out the detailed resources below. Or feel free to reach out to me with questions!
SUMMARY OF SYSTEM REQUIREMENTS
License: Free use and redistribution under the terms of the Anaconda End User License Agreement.
Operating system: Windows Vista or newer, 64-bit macOS 10.10+, or Linux, including Ubuntu, RedHat, CentOS 6+, and others.
System architecture: 64-bit x86, 32-bit x86 with Windows or Linux, or Power8.
Minimum 3 GB disk space to download and install.
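If you already have some version of Python on your machine, a rough way to compare it against the list above is sketched below. This is my own illustration, not something from the Anaconda documentation: the function name `system_summary` and the constant `MIN_FREE_GB` are made up here, and the 3 GB figure is simply copied from the requirement above. If you do not have Python yet, your operating system's "About" screen gives you the same details.

```python
import platform
import shutil

MIN_FREE_GB = 3  # taken from the disk space requirement listed above


def system_summary(path="."):
    """Report OS, architecture, and free disk space for the drive holding `path`."""
    free_gb = shutil.disk_usage(path).free / 1024**3  # bytes -> gigabytes
    return {
        "os": platform.system(),            # e.g. 'Windows', 'Darwin' (macOS), 'Linux'
        "os_version": platform.release(),
        "architecture": platform.machine(),
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= MIN_FREE_GB,
    }


for name, value in system_summary().items():
    print(f"{name}: {value}")
```

By default this checks the drive of the folder you run it from; pass the folder where you plan to install Anaconda if that lives on a different drive.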
-- Detailed Installation Information --
Anaconda Documentation -- Installation: https://docs.anaconda.com/anaconda/install
Conda Documentation -- Installation:
YouTube -- How to Install Anaconda (Windows 10):
Video -- Setting Python & Conda Path (Windows):
How to Setup a Python Environment for Machine Learning and Deep Learning with Anaconda:
Bokeh Documentation -- Quickstart: