Face Harvesting: Download Your Facebook Data for External Reporting and Visualization
I'm always on the lookout for cool new data feeds that I can mess with - as you can clearly see from some of the other posts on this site.
Usually it involves some scripting, screen-scraping, system fiddling, and general madness - basically meaning that some people can get it to work, while others cannot. This bothers me, so THIS little "data harvester"... well, this time it's all automatic, baby!
Anyways, where was I? Oh yeah, that's right - we're talking about the good ol' Book of Faces again...
Facebook.
One of the great white whales of the data world.
Thankfully, instead of a harpoon, we have their nifty Open Graph API.
I previously posted about pulling data from the Facebook Open Graph - but that was just an unauthenticated search on posts in the "public stream". This is completely different...
No need to cut and paste scripts this time. Basically, I made a small web app that (once properly authenticated and given permission) will download a decent portion of your Facebook records into MySQL tables and/or CSV files for you to do whatever you want with. Not only your personal data (friends, comments, posts, etc.), but also "Page" data (insights, posts, likes, etc.) for any Pages that you might be an administrator of. For a random user with no "Fan Pages" (like me), it's just a cool way to bring up your data and check it out (maybe in Tableau, hint, hint), but for someone who runs an actual Fan Page, it can be a great deal more interesting.
Hell, you could even massage this data, crank it into something like SPSS or R, and generate something really interesting.
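For the curious, here's roughly what "pull records and dump them to CSV" can look like. This is NOT the harvester's actual code, just a minimal Python sketch under a couple of assumptions: that a Graph-style response is a JSON object with a "data" list and an optional "paging" / "next" link, and that you hand in your own `fetch_page` function to do the HTTP part. The names `iter_records`, `records_to_csv`, and `fetch_page` are made up for this example.

```python
import csv
import io
import json
from typing import Callable, Iterator, List, Optional


def iter_records(fetch_page: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every record, following "paging" -> "next" links until they run out.

    Assumption: each page looks like {"data": [...], "paging": {"next": "<url>"}}.
    fetch_page is a hypothetical callable that turns a URL into a parsed page.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        for record in page.get("data", []):
            yield record
        url = page.get("paging", {}).get("next")


def records_to_csv(records: List[dict]) -> str:
    """Write records as CSV text; columns are the union of keys in first-seen order."""
    columns: List[str] = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Nested objects (lists, dicts) are stored as JSON strings in a single cell.
        flat = {
            key: json.dumps(value) if isinstance(value, (dict, list)) else value
            for key, value in record.items()
        }
        writer.writerow(flat)
    return buffer.getvalue()
```

To try it without touching the network, feed it a dictionary of fake pages: `records_to_csv(list(iter_records(fake_pages.__getitem__, "page1")))`. Records that are missing a field just get an empty cell, which keeps tools like Tableau or R happy.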
Here's what's going to happen:
- You will click the button below
- You will be re-directed to log into Facebook
- You will have to give the 'Data Harvester' app permission to see your FB data
- You will be re-directed back to this site
- A live counter will begin to show records pulled from key FB "tables"
- A big-ass button will appear with a link to the "data download" page when it's done
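If you're wondering what the redirect dance in those steps looks like under the hood, it's a standard OAuth-style login flow. Here's a minimal Python sketch of the two ends of it: building the login link, and pulling the authorization code out of the URL you get bounced back to. Again, this is my illustration and not the harvester's real code. The dialog URL is the unversioned Facebook login dialog as I understand it, the permission names are placeholders (check Facebook's docs for the real ones), and the function names are invented for the example.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: the unversioned Facebook login dialog endpoint.
DIALOG_URL = "https://www.facebook.com/dialog/oauth"


def build_login_url(app_id: str, redirect_uri: str, scopes: list, state: str) -> str:
    """Build the link the user clicks to log in and grant permissions."""
    query = urlencode({
        "client_id": app_id,
        "redirect_uri": redirect_uri,
        "scope": ",".join(scopes),   # comma-separated permission names
        "state": state,              # random value we check again on the way back
        "response_type": "code",
    })
    return f"{DIALOG_URL}?{query}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the URL the user is redirected back to."""
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        # The user declined, or something went wrong on the provider's side.
        raise PermissionError(params["error"][0])
    if params.get("state", [None])[0] != expected_state:
        # A mismatched state means this redirect did not come from our own link.
        raise ValueError("state mismatch")
    code = params.get("code", [None])[0]
    if code is None:
        raise ValueError("no authorization code in callback URL")
    return code
```

In a real app you'd generate `state` with something like `secrets.token_urlsafe()`, stash it in the user's session, and then trade the returned code for an access token on the server side. That exchange needs your app secret, so it should never happen in the browser.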
Here's a shitty (and incomplete) diagram to give you an idea of what kind of data we will have access to.
I'm sure that there are tons of problems and bugs, so please don't be shy about letting me know how it works for you! Cheers!