[wdwod week 4] tsa claims, an attempt

I didn’t finish the assignment this week, sadly. I was trying to work with this TSA Claims Data, and spent a while manipulating the data so that it would sum the total claims per airline. I got it to look like this (which involved figuring out how to change NaN to 0):

Screenshot 2016-04-21 10.07.48

But I couldn’t figure out how to then extract each element in the code. I’m sure this is a simple problem but I didn’t have time to fix it and make a visualization this week. 😦
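For what it's worth, here is a rough sketch of the two steps I was going for: summing claims per airline with NaN treated as 0, then pulling each airline out as its own element. The field names `airline` and `amount`, the helper names, and the sample rows are all made up for illustration; the real TSA columns are probably named differently, and this is not the code in the screenshot.

```javascript
// Hypothetical rows; the real TSA claims columns are probably named differently.
const claims = [
  { airline: "Delta", amount: "$120.50" },
  { airline: "United", amount: "-" }, // does not parse, so it becomes NaN
  { airline: "Delta", amount: "$30.00" },
  { airline: "United", amount: "$75.25" },
];

// Strip "$" and "," and parse; anything that fails to parse counts as 0.
function toNumber(text) {
  const value = parseFloat(String(text).replace(/[$,]/g, ""));
  return Number.isNaN(value) ? 0 : value;
}

// Sum the claim amounts per airline into a plain object keyed by airline.
function totalsByAirline(rows) {
  const totals = {};
  for (const row of rows) {
    totals[row.airline] = (totals[row.airline] || 0) + toNumber(row.amount);
  }
  return totals;
}

// Turn { Delta: ..., United: ... } into [{ airline, total }, ...],
// an array that is easy to loop over one element at a time.
function toEntries(totals) {
  return Object.entries(totals).map(([airline, total]) => ({ airline, total }));
}

const totals = totalsByAirline(claims);
const entries = toEntries(totals);
console.log(entries);
```

An array of objects like `entries` is usually the easiest shape to hand to a chart, since each bar can read its own `airline` and `total`.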



[web dev with open data week 3] cheaters by religiousness

Screenshot 2016-04-13 22.14.00

(See the visualization here.)

This week we dove a little into D3.js (which made manipulating SVG quite a bit easier). I decided to use this very old dataset about extramarital affairs. There were a number of different things in the data, but I looked specifically at how religious people reported themselves to be, and what percentage of people with the same levels of religiousness had cheated on their spouse.

Originally, I had visualized it by number of people in each category, but since there was a disparity between how many people were in each religiousness group, I thought showing the percentage would be more interesting.
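The switch from raw counts to percentages boils down to something like the sketch below. The rows are invented and the field names `religiousness` and `affairs` are assumptions, not the dataset's actual column names; it counts anyone reporting at least one affair as having cheated.

```javascript
// Hypothetical survey rows: a religiousness label plus a reported number of affairs.
const responses = [
  { religiousness: "anti religious", affairs: 1 },
  { religiousness: "anti religious", affairs: 0 },
  { religiousness: "somewhat religious", affairs: 0 },
  { religiousness: "somewhat religious", affairs: 3 },
  { religiousness: "somewhat religious", affairs: 0 },
  { religiousness: "somewhat religious", affairs: 0 },
];

// For each group, keep both the head count and the share reporting any affair.
function percentByGroup(rows) {
  const groups = new Map();
  for (const row of rows) {
    const g = groups.get(row.religiousness) || { count: 0, cheated: 0 };
    g.count += 1;
    if (row.affairs > 0) g.cheated += 1;
    groups.set(row.religiousness, g);
  }
  return Array.from(groups, ([group, g]) => ({
    group,
    count: g.count,
    percent: (100 * g.cheated) / g.count,
  }));
}

const summary = percentByGroup(responses);
console.log(summary);
```

Keeping `count` next to `percent` means the group sizes are still available even when the bar heights only show percentages.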

But I also didn’t want to completely lose the information about the sheer number of people in each category. (For example, the “somewhat religious” group was much larger than the “anti religious” group.) So I added an effect so that when you hover over a bar, it changes color and tells you how many people were in that category.

Screenshot 2016-04-13 22.15.28 Screenshot 2016-04-13 22.15.44

(The screenshots above remove the cursor for some reason but you get the idea.)
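The hover effect works roughly like the sketch below. To keep it self-contained it uses plain DOM events rather than D3, so it is not the actual code behind the page. The function names, the colors, and the `group` and `count` fields are all hypothetical.

```javascript
// Build the text shown on hover from one bar's data.
function hoverText(d) {
  return `${d.group}: ${d.count} people`;
}

// Wire one bar (for example an SVG <rect>) to a label element (for example an SVG <text>).
// The default colors are placeholders, not the ones used in the real chart.
function attachHover(bar, label, d, colors = { normal: "steelblue", active: "orange" }) {
  bar.setAttribute("fill", colors.normal);
  bar.addEventListener("mouseover", () => {
    bar.setAttribute("fill", colors.active); // highlight the hovered bar
    label.textContent = hoverText(d);        // show how many people are in the group
  });
  bar.addEventListener("mouseout", () => {
    bar.setAttribute("fill", colors.normal); // restore the original color
    label.textContent = "";
  });
}
```

In a browser, you would call `attachHover` once per bar, passing the bar's element, a shared label element, and that bar's data object.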

One note is that I realize “cheaters” might not necessarily be an accurate word for what’s going on. There’s some nuance in what exactly “extramarital affair” means, and it doesn’t always mean that someone is going behind the back of their spouse without their knowledge. I’m making some assumptions about this 1969 survey, though – that the people are referring to cheating and not ethical non-monogamy.

Anyway! You can check it out here. Code is also below.

We were also asked to take a look at I Quant NY this week. I looked specifically at this post called “Parking Immunity? Diplomats Owe NYC $16 Million in Unpaid Parking Tickets. A Closer Look at the Worst Offenders.” The writer looked at a dataset showing unpaid parking tickets in NYC, and found something interesting — license plates of diplomats seemed to rack up the highest number of unpaid tickets.

I thought this was a clever insight gleaned from the data, but I am curious to know how Ben manipulated the data to get the info he ended up with. He writes:

Whats not so cool is the fact that the City loaded many rows of the data in there twice accidentally.  That meant there were multiple rows with the same ticket number and conflicting outstanding debt amounts.  Though I understand that data errors happen, I don’t understand how the City can keep putting out data sets with no ownership and no effective way to send in fixes.  A city who cares about the usability of its Open Data can do better.

He then says he “cleaned up” the data. But I wish I knew more about what he did to change it, and how he knew for sure that the duplicates were accidents.
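I don't know what Ben's cleanup looked like, so this is only a guess at a first step: listing ticket numbers that show up more than once, and flagging the ones whose amounts disagree. The field names `ticketNumber` and `amountDue` and the sample rows are invented. Note that this only surfaces the duplicates; it can't say whether they were accidents or which amount is right.

```javascript
// Hypothetical rows; the real column names in the city's dataset may differ.
const tickets = [
  { ticketNumber: "A100", amountDue: 115 },
  { ticketNumber: "A100", amountDue: 115 }, // exact repeat
  { ticketNumber: "B200", amountDue: 65 },
  { ticketNumber: "B200", amountDue: 95 },  // same ticket, conflicting amount
  { ticketNumber: "C300", amountDue: 50 },
];

// Group amounts by ticket number, then report every ticket seen more than once.
function findDuplicates(rows) {
  const amountsByTicket = new Map();
  for (const row of rows) {
    if (!amountsByTicket.has(row.ticketNumber)) {
      amountsByTicket.set(row.ticketNumber, []);
    }
    amountsByTicket.get(row.ticketNumber).push(row.amountDue);
  }
  const report = [];
  for (const [ticketNumber, amounts] of amountsByTicket) {
    if (amounts.length > 1) {
      // "conflicting" is true when the repeated rows disagree on the amount.
      report.push({ ticketNumber, amounts, conflicting: new Set(amounts).size > 1 });
    }
  }
  return report;
}

const duplicateReport = findDuplicates(tickets);
console.log(duplicateReport);
```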


[web dev with open data week 2] internet users

Screenshot 2016-04-07 13.53.30

(See the page here.)

This week, I made a little visualization about how many people have access to the internet around the world. It’s built sort of jankily with SVG. The data is from Gapminder, showing how many people out of every 100 had access to the internet in 2011. You can see it here.