How to Deal with Venn Diagrams

This blog post is primarily for my niece Katya, who was asked this nice problem in her homework:

In a camp, there are 79 kids; 27 of them are younger than twelve, 33 are girls, and 30 are boys that are twelve or older. Fill in this chart:

                      Girls   Boys    All
Younger than twelve
Twelve or older
All

The real question was to demonstrate that this chart can be filled in only one possible way. So great, let's first enter what we know:

                      Girls   Boys    All
Younger than twelve                   27
Twelve or older               30
All                   33              79

Now, since each "All" must equal the sum of its parts, you have enough information to find two more cells, namely:

                      Girls   Boys                  All
Younger than twelve                                 27
Twelve or older               30                    I must be 79-27=52
All                   33      I must be 79-33=46    79

Now we're here:

                      Girls   Boys    All
Younger than twelve                   27
Twelve or older               30      52
All                   33      46      79

Then there are two more cells we know:

                      Girls                 Boys                  All
Younger than twelve                         I must be 46-30=16    27
Twelve or older       I must be 52-30=22    30                    52
All                   33                    46                    79
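
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same subtraction steps. The function name `fill_chart` and the dictionary layout are just my own illustration, not part of the homework. Each cell comes from exactly one subtraction, which is why the chart can be filled in only one way; the last cell (younger girls) follows the same rule.

```python
def fill_chart(total, younger, girls, older_boys):
    """Fill the 2x2 chart from the four given numbers using only subtraction."""
    older = total - younger                  # 79 - 27 = 52
    boys = total - girls                     # 79 - 33 = 46
    younger_boys = boys - older_boys         # 46 - 30 = 16
    older_girls = older - older_boys         # 52 - 30 = 22
    younger_girls = younger - younger_boys   # 27 - 16 = 11

    # Sanity check: the Girls column has to add up to its "All" as well.
    assert younger_girls + older_girls == girls

    return {
        "Younger than twelve": {"Girls": younger_girls, "Boys": younger_boys, "All": younger},
        "Twelve or older": {"Girls": older_girls, "Boys": older_boys, "All": older},
        "All": {"Girls": girls, "Boys": boys, "All": total},
    }


chart = fill_chart(total=79, younger=27, girls=33, older_boys=30)
for row, cells in chart.items():
    print(row, cells)
```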
»

Data Science Fundamentals - 07/22/2015

I'm a data scientist at Microsoft, on the ExP platform. We constantly interview and hire other data scientists, and it's tough to get quality people.

With all the hotness around data science, it's inevitable that a bunch of schools are opening up special programs around it, but the shit that most people fail to realize is that data science is built upon mathematics.

So if you don't know your fundamentals in math, then you're fucked. You can probably get away with producing some data porn for a while, but in whatever soon-to-be-failed startup you join, you will get some data and end up merely describing it, even though the real value is in drawing useful conclusions from it.

It's tough to do actual data analysis in an on-site interview (some firms give homework-like exercises; we don't, although I do like the idea), but let's go through some problems we ask and why we like them.

A bus runs every 15 minutes outside my apartment. If I come down at some random time, how long, on average, will I have to wait before I catch a bus?
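
To make the question concrete, here is a small simulation sketch in Python. It rests on two assumptions that the question leaves implicit: the buses arrive exactly every 15 minutes, and "some random time" means uniformly at random within that interval. The function name `average_wait` is my own. Under those assumptions the wait is uniform between 0 and 15 minutes, so the average is half the interval, 7.5 minutes, and the simulation should land close to that.

```python
import random


def average_wait(interval_minutes=15.0, trials=200_000, seed=0):
    """Estimate the mean wait for a bus that arrives exactly every `interval_minutes`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Minutes elapsed since the last bus when I come down (assumed uniform).
        since_last_bus = rng.uniform(0.0, interval_minutes)
        # Time left until the next bus shows up.
        total += interval_minutes - since_last_bus
    return total / trials


estimate = average_wait()
print(f"Simulated average wait: {estimate:.2f} minutes")
```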

Plenty of people I have interviewed can't even give me

»

You can't delete anymore

Cloud computing is great, and it's fantastic when I set up family or friends with stuff like OneDrive and they realize that they can completely separate the data from the machine. Separation of responsibilities (i.e. the Single Responsibility Principle) is a core concept of object-oriented programming, and now it's making its way to the real world.

But as the Ashley Madison hack reminded me, in this world nothing will ever be deleted anymore.

»

Why?

I'm a mathematician. I always have been, and always will be. Over the course of my life I have been able to do pretty awesome things simply by understanding the way the world works in a mathematical way.

So all of my thoughts and works here will be my own, and will represent me. But these will be my main points:

  • Math isn't the end-all. Specifically, to be a good mathematician in this day and age you need to be a decent computer programmer.
  • Data science is a hot thing, but it's little more than mathematical modeling and a shit-ton of marketing.

But there's going to be a lot of cool math in this blog, too, so join along and learn something new every day.

Specifically, I'm going to be working through my blog during the semesters at the same pace as a normal course, so let's kick open the ol' calculus book and learn together.

»