Here is the study guide for the third (final) exam.
Yes, data scientist is the hot career of the moment, but when someone asked on Quora what the downsides were, the answers were pretty telling. Here’s a look at what data scientists had to say.
Just a reminder that your final exam will be on Monday, May 8 at 5:30pm in the same room as class. Please be on time. Students will not be permitted to enter late. Please make sure that all missing assignments, quizzes and weekly questions are done before the start of the exam. Grades will be submitted after the exam.
Here are some additional links for the people analytics chat we’ll have in class. Beyond this week’s reading, it’s worth checking out how analytics will be used for HR via these links. It’s not required reading, but since all of you may be applying to an algorithm for a gig, you might as well know the game.
- LinkedIn: People Analytics Takes Off: Ten Things We’ve Learned
- McKinsey: People analytics reveals three things HR may be getting wrong
- How Walmart uses Tableau for people analytics
Here is the link for the driver download
Nathan’s office hours to help with any questions about yesterday’s online class will be from 12:15 pm – 1:45 pm this Thursday (April 13) in Alter 236G. The time can also be used by any group trying to refine or nail down its group assignment topic.
Here’s the video and WebEx walkthrough for the April 10 class. Apologies in advance for the production value, but you’ll get the gist and we can follow up on April 17. I kept it quick. Nathan’s walkthrough of the second, more complicated in-class exercise is here. You’ll see where to break for the in-class activity. Just hit pause and do it on the second one.
Other programming notes:
- The link I mentioned in the WebEx, covering Hadoop’s side projects and buzzwords to note, is here.
- Gradebook should be updated by now. At the far right, you’ll see projections for where you stand now regarding the final grade (assuming status quo performance on participation, quizzes, etc.).
- I’m sequestered at an offsite, so I will be slow to respond to email.
A note about that first reading: It’s a bit dated, and Hadoop has advanced since that article. Much of the focus in the open-source community has been on side projects tied to Hadoop. One common theme is that analytics and better user interfaces are being layered onto Hadoop. Most companies would use Hadoop through vendors like Cloudera and Hortonworks, which package Hadoop and sell services and support. To see what I mean about Hadoop and its other projects, see the primary Apache page. For our purposes, we’ll keep Hadoop high level, but in the data science department, internship interviews, etc., you may want to know about projects like Hive, Cassandra, Pig and Spark.