Accomplishments – Aug 2016

As a reminder, a one-page summary of all the courses, books & videos
I’ve reviewed in the past year can be found on my Journey Roadmap page.

It’s been a summer of incredible transition for me as I’ve made a permanent move from the relatively chilly climate of New York (old house shown to the right) to the equatorial heat misery of South Carolina. I can only hope that this investment pays off in the winter when I’m enjoying a balmy 50-degree day while the Northeast shovels out of a blizzard.

I’ve not posted an “Accomplishments” blog since May, but that certainly shouldn’t indicate that I’ve not been pursuing Data Science over the summer. Far from it! Although I didn’t complete any new courses or books in June and July, when I wasn’t busy packing up or tossing out all of my life’s possessions, I took advantage of the time to revisit a lot of the topics I’d covered in the past year. I began creating hundreds of Mnemosyne flashcards to sharpen my skillset. I retook the UoW Machine Learning: Regression course, going over all code examples in painstaking detail. I also re-read every word of “An Introduction to Statistical Learning with Applications in R”, working through all of the R labs and exercises and incorporating sample code into my Mnemosyne card set. It was an absolutely necessary activity, and I feel much stronger as a result. Consider revisiting some old courses you’ve taken – you’d be surprised how much you can still get from them on a second or third pass.

In August, however, with the move complete, a number of endeavors also came to a successful close.

Completed Items

Coursera – Machine Learning: Clustering and Retrieval

This is the fourth course in the University of Washington Machine Learning Specialization on Coursera. Grouping and association were the theme here. Diving into large datasets of Wikipedia article entries, we found commonality between groups of articles, implemented various measures of “alikeness”, assigned articles to topics based on word groupings and made predictions on new articles based on models built from large training sets.
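To give a feel for what a measure of “alikeness” between two articles can look like, here is a minimal sketch of cosine similarity on raw word counts. It is an illustration only, not the course's own code: the function names `word_counts` and `cosine_similarity` are made up for this example, and splitting on whitespace is a deliberately crude way to tokenize text.

```python
import math
from collections import Counter


def word_counts(text):
    """Count how often each lowercase, whitespace-separated word appears."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (1.0 = same direction)."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


article_1 = word_counts("the senator gave a speech on the budget")
article_2 = word_counts("the striker scored a goal in the final")
print(cosine_similarity(article_1, article_2))
```

Two articles that share many words score closer to 1, while two that share none score 0. Real systems usually weight the counts (for example with TF-IDF) so that common words such as “the” do not dominate.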



How I Take Online Courses for Data Science (Part 2 – Self-quizzing)

The Struggle

In my last blog, I shared how I take notes while engaged in a MOOC for Data Science. I proceeded happily for months in this fashion, ringing each bell as I completed course after course in various specializations.

Around February of this year, however, a sinking feeling was starting to settle in: I just wasn’t retaining a lot of the information I was learning. Sure, I was scoring 100’s on all the quizzes and completing the assignments on time without any issues, but I felt uneasy.

A month after I worked on the code for a gradient descent algorithm for a lab assignment, you think I had any clue what gradient descent was?

Two months after I learned how to create a support vector machine model in python, you think I recalled what library to import to even start?

Three months after I learned to separate data into training & test groups in R, you think I could remember a single command to do so?

NO – I found myself constantly having to go back to older notes for the most basic commands. I was spending all of my time on StackOverflow looking for solutions to the most basic questions (like how to reverse a Python list, how to come up with 10 random integers in R, etc). If I were to work seriously in the Data Science realm, I knew I needed a solid, fundamental level of proficiency with the tools and techniques I was expecting to use.
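For the record, these are the kinds of answers I mean. Here is a minimal sketch of two of those basic questions, both done in Python for consistency (the random integers question above was about R; this is the Python equivalent, not R code). The helper names `reverse_list` and `random_integers` are just for illustration.

```python
import random


def reverse_list(items):
    """Return a new list with the items in reverse order."""
    return items[::-1]


def random_integers(count, low, high, seed=None):
    """Return `count` random integers between low and high, inclusive."""
    rng = random.Random(seed)  # A seed makes the draw repeatable.
    return [rng.randint(low, high) for _ in range(count)]


print(reverse_list([1, 2, 3]))      # [3, 2, 1]
print(random_integers(10, 1, 100))  # ten integers between 1 and 100
```

Neither is hard, which is exactly the point: these are the things worth knowing cold rather than searching for every time.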

Enter Mnemosyne

Quite a while ago, I had gotten it into my mind to learn Japanese. My sole motivation: in my academic career the only thing I absolutely sucked at was foreign languages, and it wasn’t for lack of effort. In my 30’s, I wanted to wipe that blemish off my record by tackling one of the hardest languages for native English-speakers to pursue.

I ran through all 3 levels of Rosetta Stone. I listened to every minute of Pimsleur’s entire Japanese collection. I had more books & videos than I knew what to do with. The most valuable tool I used, however, was the open-source flashcard system, Mnemosyne.

I confess, I didn’t try different flashcard programs and settle upon this one as the best. But what I did want was a tool to help me identify the concepts I was struggling with and beat me over the head with them until they became second-nature.

From their website:

Mnemosyne uses a sophisticated algorithm to schedule the best time for a card to come up for review. Difficult cards that you tend to forget quickly will be scheduled more often, while Mnemosyne won’t waste your time on things you remember well.
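To make that idea concrete, here is a minimal sketch of how a spaced-repetition scheduler can behave. This is not Mnemosyne's actual algorithm: the `Card` fields, the 0 to 5 grade scale as used here, and every constant are assumptions chosen only to show the general pattern of forgotten cards coming back soon and remembered cards drifting further out.

```python
from dataclasses import dataclass


@dataclass
class Card:
    question: str
    answer: str
    interval_days: int = 1  # Days to wait before the next review.
    ease: float = 2.5       # How quickly the interval grows (assumed value).


def review(card, grade):
    """Update a card after a review graded 0 (forgot) to 5 (perfect recall)."""
    if grade < 3:
        # Forgotten: show it again tomorrow and slow down future growth.
        card.interval_days = 1
        card.ease = max(1.3, card.ease - 0.2)
    else:
        # Remembered: push the next review further out, by at least one day.
        grown = round(card.interval_days * card.ease)
        card.interval_days = max(card.interval_days + 1, grown)
        card.ease = min(3.0, card.ease + 0.1 * (grade - 3))
    return card


card = Card("Which Python slice reverses a list?", "items[::-1]")
for grade in (5, 5, 1):
    review(card, grade)
    print(grade, card.interval_days)
```

In this sketch, two good answers stretch the gap between reviews, and one bad answer resets it to a single day, which is the "beat me over the head" behavior I was looking for.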


How I Take Online Courses for Data Science (Part 1 – Note-taking)

Since June of last year, I have completed 27 online courses on various Data Science topics (see my roadmap) and am in the midst of 3 others. I thought I’d share some of the techniques I’ve refined during this time and hope you find some nuggets that will improve your own self-paced education experience.

Time planning

Once you decide to start learning about Data Science, you soon discover there are vast amounts of resources available. I learned in my first couple of months of this journey that my appetite for knowledge far exceeded the number of hours in a day (let alone the time I had available). So at the beginning of each month, I set aside some time to plan. I create a spreadsheet grid of the courses, books and other activities I intend to pursue that month and spread the work out day by day.

Sometimes a task is just to spend x number of minutes on a topic. For specific courses with deadlines, I can be more precise about which lessons to watch and the estimated time to spend, which keeps me on track with the syllabus. I also block out the days I know I will be traveling (as I commuted 800 miles between NY and SC!). When a specific task is accomplished, I color that cell green.


Yes, this may be more OCD than you’d be willing to sign up for. But in addition to keeping me from getting overloaded, there’s a certain amount of positive reinforcement in watching the green start to fill, giving me that thrill of accomplishment. And it also forces me to be realistic about what can be done that month. I’ve had to pass on courses I would normally have signed up for because I saw they just couldn’t fit into a daily schedule.