2 June 2011

Cross Post: SQL Server, Business Intelligence, Data Mining, & Major League Baseball (I)

Just a recent article I wrote for the C&C Computer Solutions site / blog. Great stuff. Here's a bit of the intro, but for the MEAT of it, you'll have to go to the C&C site and read it yourself...

Let’s be honest – we’re data guys. “Geeks,” to be more exact. But we’re also sports guys – and if there is ANY sport out there that prides itself on being massively data-centric… it’s baseball. Data goes with baseball like Barry Bonds and… um, home runs (yeah, that’s it). Data collecting is an integral part of baseball culture and was going on long before the internet or relational databases were even invented.

“Business Intelligence” is a term that’s been kicked around for some time now, and basically just means “analyzing data from your past in order to make better decisions in the future (or to better steer the business in real time)”. It’s all about strategy, learning from mistakes (and successes), and being able to actively monitor / measure the health of your business. We might use this data to ask important business-related questions: “Is Suzi Salesperson performing up to par this quarter?” or “Which line of products has the biggest margin this time of year?”

However, it’s not much of a stretch to imagine someone who works for a Major League Baseball team sitting in front of a computer at the head office thinking: “What is the ROI of our pitching staff this year? Are they performing to expectations?” It’s really no different than the business and reporting scenarios many of us encounter every single day. The exciting part is being able to read not only what happened out on the ball field yesterday, but also what might happen tomorrow. Boom. “Bizball Intelligence” anyone?

In this multi-part post, I’ll be taking you through all the steps needed to get up and running with your own historical AND current Major League Baseball statistics database (an operational / transactional DB), staging it out in a more “reportable” fashion (a data warehouse DB), and then finally building some cubes, calculated measures, and choice “Player Key Performance Indicators” (KPIs) based on the data. Who knows, maybe you’ll be so good at it that you’ll get hired by the San Diego Padres front office, like this guy did, but more on him later.
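To make that pipeline a little more concrete before you click through, here is a toy sketch of the three stages (transactional rows, a staged “reportable” table, and a KPI on top). It is only an illustration under loud assumptions: it uses Python with an in-memory SQLite database rather than SQL Server, and the table names (`atbats`, `fact_batting`), the event labels, the sample rows, and the .300 target are all made up for the example, not taken from the real series or from any actual MLB data feed.

```python
import sqlite3

HIT_EVENTS = ("Single", "Double", "Triple", "Home Run")


def build_demo():
    """Load hypothetical at-bat rows, then stage them into a small fact table."""
    conn = sqlite3.connect(":memory:")
    # Stage 1 (assumed schema): transactional table, one row per plate appearance.
    conn.execute("CREATE TABLE atbats (game_id TEXT, batter TEXT, event TEXT)")
    rows = [
        ("g1", "Player A", "Single"),
        ("g1", "Player A", "Strikeout"),
        ("g1", "Player A", "Home Run"),
        ("g1", "Player A", "Groundout"),
        ("g1", "Player B", "Strikeout"),
        ("g1", "Player B", "Walk"),
        ("g1", "Player B", "Flyout"),
        ("g2", "Player B", "Double"),
        ("g2", "Player B", "Groundout"),
    ]
    conn.executemany("INSERT INTO atbats VALUES (?, ?, ?)", rows)
    # Stage 2: a "reportable" fact table, one row per batter per game.
    # Simplification: only walks are excluded from official at-bats here.
    placeholders = ", ".join("?" for _ in HIT_EVENTS)
    conn.execute(
        f"""
        CREATE TABLE fact_batting AS
        SELECT game_id,
               batter,
               SUM(event IN ({placeholders})) AS hits,
               SUM(event <> 'Walk')           AS official_at_bats
        FROM atbats
        GROUP BY game_id, batter
        """,
        HIT_EVENTS,
    )
    conn.commit()
    return conn


def batting_average_kpi(conn, target=0.300):
    """Stage 3: a toy KPI comparing each batter's average to a target value."""
    cursor = conn.execute(
        """
        SELECT batter, SUM(hits), SUM(official_at_bats)
        FROM fact_batting
        GROUP BY batter
        ORDER BY batter
        """
    )
    kpi = {}
    for batter, hits, at_bats in cursor:
        average = hits / at_bats if at_bats else 0.0
        status = "on target" if average >= target else "below target"
        kpi[batter] = (round(average, 3), status)
    return kpi


if __name__ == "__main__":
    for batter, (average, status) in batting_average_kpi(build_demo()).items():
        print(f"{batter}: {average:.3f} ({status})")
```

In a real warehouse the fact table would hang off proper dimension tables (players, games, dates), and the KPI would live in the cube rather than in application code, but the shape of the idea is the same.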

Read the rest HERE

  • Freddy Colina

    Hi Ryan,

    I know you are actively using Tableau and the MLB Gameday data.

    I am too; the Gameday DB is my source for learning and applying new things in Tableau. Just wondering how you are using the data: are you reading directly from the “transactional” and redundant tables (i.e., atbats, batters, etc.), or did you build a DW that Tableau reads from?

    Let me know if this is the best way to exchange ideas, or if you prefer email or some other medium.


    Freddy Colina