Friday, February 24, 2012

Downloadable PITCHf/x

MLB provides an amazing amount of data. The term for the system that tracks the data (and for the data itself) is PITCHf/x. To get this data, MLB has two cameras mounted in every stadium that track the speed and location of a pitched baseball from the pitcher's mound to home plate, with an accuracy of better than one mile per hour and one inch. This is the same data you see in MLB's online Gameday webcast, which shows the path and speed of each pitch as well as its location relative to the strike zone as it crosses the front of home plate.

For a primer on PITCHf/x and how to set up your own database, read http://fastballs.wordpress.com/page/2/ and http://fastballs.wordpress.com/2007/08/23/how-to-build-a-pitch-database/ first. Seriously, they are better than anything I'll write.

However, I do have something that might interest you. Here is a LINK to the database I created (click File, then Download As to get it). It isn't perfect, since it doesn't parse all the defense information, but it has every game, at bat, and pitch from 2007 through 2011. You'll need to install a working MySQL system. I like http://www.easyphp.org/ which comes with phpMyAdmin installed (a GUI for running MySQL). So between my database, EasyPHP, and the scripts that Mike Fast has available to update the database, you'll be doing the painfully complicated baseball analysis that Tom Tango hates in no time.
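Once the database is loaded, a first query might look something like the sketch below. To keep it runnable anywhere, it uses Python's built-in sqlite3 with a tiny in-memory table instead of MySQL. The table name `pitches`, the `pitcher` column, and the rows are all made up for illustration. `pitch_type` and `start_speed` are attribute names from the Gameday pitch data, but check the actual table and column names in your copy before borrowing the SQL.

```python
import sqlite3

# Hypothetical stand-in for the real MySQL database: one tiny pitches table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitches (pitcher TEXT, pitch_type TEXT, start_speed REAL)"
)
conn.executemany(
    "INSERT INTO pitches VALUES (?, ?, ?)",
    [
        ("Pitcher A", "FF", 94.0),  # made-up rows, not real pitches
        ("Pitcher A", "FF", 95.0),
        ("Pitcher A", "CU", 78.0),
        ("Pitcher B", "FF", 91.0),
    ],
)


def avg_speed_by_pitch_type(conn, pitcher):
    """Return {pitch_type: (pitch count, average start speed)} for one pitcher."""
    rows = conn.execute(
        "SELECT pitch_type, COUNT(*), AVG(start_speed) "
        "FROM pitches WHERE pitcher = ? "
        "GROUP BY pitch_type ORDER BY pitch_type",
        (pitcher,),
    ).fetchall()
    return {ptype: (count, avg) for ptype, count, avg in rows}


print(avg_speed_by_pitch_type(conn, "Pitcher A"))
```

The same SELECT should paste into phpMyAdmin more or less unchanged, apart from the `?` placeholder and whatever the real column names turn out to be.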

Thursday, February 18, 2010

" You're so money and you don't even know it!"

Apparently Amazon has chosen this blog, above all the other blogs in the world, to pair up with and sell its books through.  Given this great honor, I'm going to oblige, but don't buy the books just because I say so.  Buy them because I'll send you 1% of the 1% I make.  Then get someone to buy the book from you, collect 110%, and send me 5%; I'll divide that by Pujols's WAR and send the remainder to Amazon.  It's like an M.C. Escher pyramid scheme.  OK, here are the books I'd recommend if you want to start your own blog that no one reads. (P.S. Please don't buy "How to Value Players for Rotisserie Baseball" from Amazon.  They're overcharging a little; it's like $20 from Baseballhq.com.)


  

Thursday, December 10, 2009

"Mr. Madison, what you've just said is one of the most insanely idiotic things I have ever heard. At no point in your rambling, incoherent response were you even close to anything that could be considered a rational thought. Everyone in this room is now dumber for having listened to it. I award you no points, and may God have mercy on your soul"

That line could very well be what you think when you finish reading this and the upcoming posts. In the following posts I'm going to lay out my thoughts on drafting strategy.  This particular post gets into the construction of a proper statistic to compare hitters within a position and across positions, and then to compare hitters with pitchers.  Let's start with a few examples, and let's take out all uncertainty by focusing on actual 2009 stats instead of any projections for 2010.  As usual, I'm assuming you're playing in a standard roto league with R/HR/RBI/SB/AVG on the hitting side and W/SV/K/ERA/WHIP on the pitching side.


Question 1: Who was better, Dustin Pedroia or Brandon Phillips?




Let's throw out some options.  1) Brandon Phillips was better in HRs, SBs, and RBIs, and Pedroia was better in runs and average, but that doesn't really help since it doesn't take into account how much better.  2) One of the metrics I like (but don't use) is to calculate what percentage of a certain target each player gives you in each category. I like choosing the second-best finish in each category for my target.  For my league, that was 914/252/901/154/286 (R/HR/RBI/SB/AVG, with .286 written as 286).  This changes the stats to




   


(I divided average by 10 under the assumption that you have 10 players in your lineup on average.)  If you average those, you get 10% for DP and 11% for BP.  So, on average, either player would get you about 10% of the way towards hitting second place in each category. I think this is fairly robust, and you could rank players well by that average stat.  In fact, here are the standings for second base, ranked in that manner.
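The percent-of-target arithmetic for the Pedroia/Phillips comparison can be sketched roughly like this. The targets are the second-place totals quoted above. The two 2009 stat lines are filled in from the public season totals as I remember them, so treat them as an assumption and check them before relying on them. `LINEUP_SIZE` is just a name for the divide-by-10 adjustment on average.

```python
# Second-place category totals quoted in the post: R / HR / RBI / SB / AVG.
TARGET = {"R": 914, "HR": 252, "RBI": 901, "SB": 154, "AVG": 0.286}
LINEUP_SIZE = 10  # assumed number of hitters sharing the team batting average

# 2009 lines recalled from public season totals (assumption: verify them).
PLAYERS = {
    "Dustin Pedroia": {"R": 115, "HR": 15, "RBI": 72, "SB": 20, "AVG": 0.296},
    "Brandon Phillips": {"R": 78, "HR": 20, "RBI": 98, "SB": 25, "AVG": 0.276},
}


def pct_of_target(stats):
    """Share of each category target that one player supplies."""
    pct = {}
    for cat, target in TARGET.items():
        share = stats[cat] / target
        if cat == "AVG":
            # One hitter only moves the team average by about 1/LINEUP_SIZE.
            share /= LINEUP_SIZE
        pct[cat] = share
    return pct


def average_pct(stats):
    """Average the category shares into a single number."""
    pct = pct_of_target(stats)
    return sum(pct.values()) / len(pct)


for name, stats in PLAYERS.items():
    print(f"{name}: {average_pct(stats):.1%}")
```

With those inputs the averages round to the 10% and 11% quoted above. Swapping in your own league's second-place totals for `TARGET` is the only change needed to reuse it.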






However, it does have some weaknesses.  The first is that the target used is specific to my league and may not reflect your league, or even my league next year.  The second, and most important, is that it doesn't take into account SCARCITY.  I put it in caps because if you spend any time reading fantasy baseball forums, you'll see people take scarcity to an extreme, to the point that they pass on Albert Pujols because first base is a "deep position".  That's just crazy talk, but scarcity should still be looked at.


I see two different types of scarcity that can affect a particular player's value: 1) the scarcity of productive players at a position (e.g. after Hanley Ramirez, all the other shortstops suck) and 2) the scarcity of players that supply meaningful stats in a category (e.g. no one steals bases anymore, so I better draft Jacoby Ellsbury).  To take these two factors into account, you need to compare a player's stats not to some target, but to the pool of players you'll be forced to choose from if you pass on him (i.e. how much each player is worth versus the opportunity cost of not picking him).  To make this comparison, I developed a stat I loosely call the Fantasy Z-Score (which, after doing some research, I found the guys at RotoRebel also use, but I swear I thought of it separately; I feel like the other guy that developed calculus).

The calculation is a little complex, but I think it has a bunch of small benefits.  First, calculate the player's production in a particular category. (I like to use projections from Steamer, ZiPS, CAIRO and Clay Davenport, with playing time projections from Fangraphs. I then try to equalize ABs for injury-prone players since I don't do H2H, and I equalize for pitchers with low IP forecasts since I don't compete in GS leagues.  You do this by adding replacement-level stats to their projected stats.) Next, subtract the average production in that category for the entire pool of players likely to be drafted. Divide that by the standard deviation of the production in that category for the same pool. This is the z-score for that player in that category. For rate stats like batting average, ERA and WHIP, multiply the z-score by the player's AB (or IP) divided by the average AB (or IP) for batters (or pitchers) likely to be drafted.  You can then average those category z-scores together to get one score (or total them).

Then split the pool into positions, placing each player in his shallowest eligible position. To do this split you also need to allocate players to your utility slots (MI, CI, and Utility).  Assume that each team in your league drafts the highest available z-score for each position (based on position limits).  Once you have that, you need to determine each player's z-score total above the lowest-ranked player drafted at his position (replacement level).  This takes positional scarcity into account, and the marginal z-score is what you should rank and value players on.

For auctions, divide the total money available to purchase players (after keepers are subtracted) by the sum of all the positive marginal z-scores available to be drafted.  This is what you pay per point of marginal z-score.  Multiply that by each player's marginal z-score to get his value.  I also tend to do this separately for hitters and pitchers, based on a standard 70/30 split of money between hitters and pitchers in my leagues. If I'm doing a snake draft, I like to do the above and then rank the players by that auction value: 1) it weights down the pitchers, which is typical and perhaps reflects the risk in pitchers vs hitters, and 2) it keeps it clear what reaching or overspending really means. It might just mean a few dollars (or it might mean $15).

That is all pretty confusing, and if you duplicate the steps you'll see that some are iterative (you need to make an attempt, take the output, and re-run the analysis, as when finding replacement-level players).  If anyone is interested, I'll provide the how-to on this in a much longer, more detailed piece.  I'll also provide my rankings for a few league formats once all the projection systems are out.
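For anyone who wants to see the moving parts, here is a minimal single-pass sketch of the steps described above, for hitters only. It is not my actual spreadsheet. The six players, their stat lines, the two-slots-per-position setup, and the $100 budget are all invented, it skips the MI/CI/Utility slots and the iteration, and it uses the population standard deviation, which is just one reasonable choice.

```python
from statistics import mean, pstdev

COUNTING = ["R", "HR", "RBI", "SB"]

# Made-up hitters, purely for illustration.
POOL = {
    "Player A": {"pos": "2B", "AB": 600, "R": 100, "HR": 20, "RBI": 80, "SB": 20, "AVG": 0.300},
    "Player B": {"pos": "2B", "AB": 550, "R": 80, "HR": 25, "RBI": 95, "SB": 10, "AVG": 0.275},
    "Player C": {"pos": "2B", "AB": 500, "R": 70, "HR": 10, "RBI": 60, "SB": 5, "AVG": 0.260},
    "Player D": {"pos": "SS", "AB": 620, "R": 110, "HR": 24, "RBI": 100, "SB": 27, "AVG": 0.340},
    "Player E": {"pos": "SS", "AB": 520, "R": 65, "HR": 8, "RBI": 50, "SB": 12, "AVG": 0.265},
    "Player F": {"pos": "SS", "AB": 480, "R": 60, "HR": 6, "RBI": 45, "SB": 8, "AVG": 0.255},
}
SLOTS = {"2B": 2, "SS": 2}  # hypothetical number drafted at each position


def average_z_scores(pool):
    """Average each player's category z-scores; AVG is weighted by AB share."""
    avg_ab = mean(p["AB"] for p in pool.values())
    per_player = {name: [] for name in pool}
    for cat in COUNTING + ["AVG"]:
        values = [p[cat] for p in pool.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, p in pool.items():
            z = (p[cat] - mu) / sigma
            if cat == "AVG":
                z *= p["AB"] / avg_ab  # rate stat: scale by playing time
            per_player[name].append(z)
    return {name: mean(zs) for name, zs in per_player.items()}


def marginal_z_scores(z, pool, slots):
    """Z-score above the last player drafted at each position."""
    marginal = {}
    for pos, n_slots in slots.items():
        at_pos = [name for name in z if pool[name]["pos"] == pos]
        drafted = sorted(at_pos, key=z.get, reverse=True)[:n_slots]
        replacement = z[drafted[-1]]  # lowest drafted player = replacement level
        for name in drafted:
            marginal[name] = z[name] - replacement
    return marginal


def auction_values(marginal, budget):
    """Spread the budget across the positive marginal z-scores."""
    total = sum(m for m in marginal.values() if m > 0)
    dollars_per_point = budget / total
    return {name: max(m, 0) * dollars_per_point for name, m in marginal.items()}


z = average_z_scores(POOL)
marginal = marginal_z_scores(z, POOL, SLOTS)
values = auction_values(marginal, budget=100)
for name in sorted(values, key=values.get, reverse=True):
    print(f"{name}: z={z[name]:+.2f} marginal={marginal[name]:.2f} ${values[name]:.0f}")
```

For the 70/30 split, you would run this once on hitters with 70% of the league's money as `budget` and once on pitchers with the other 30%. A fuller version would also recompute the means and standard deviations using only the players who end up drafted, which is where the iteration comes in.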

Tuesday, December 8, 2009

"So you have no frame of reference here, Donny. You're like a child who wanders into the middle of a movie and wants to know..."

OK, perhaps that was just a gratuitous way to sneak a Big Lebowski quote into this post.  Regardless, this entry is going to focus on setting up two tools that should either a) get you up to speed quickly on fantasy baseball if you're new or b) give you important information about players and changing roles very quickly, so you can be the first to act when new information becomes available.  The two tools are Google Reader (with RSS feeds) and Twitter.  The nice thing about both tools is that there is nothing to install and both are easily accessible from mobile devices (iPhones/BlackBerries) as well as from any computer with internet access.

First, Google Reader

Step 1:  Set up an account at reader.google.com.  If you already have a Gmail account, you can just use that as the login.

Step 2:  Click Add Subscription in the top left.  Paste in these URLs:
http://razzball.com/feed/
http://feeds.feedburner.com/rotofeed
http://www.ballbug.com/index.xml
http://www.hardballtimes.com/main/content/rss_2.0/
http://www.fangraphs.com/blogs/?feed=rss2

And if you subscribe to Baseball Prospectus or Baseball HQ, you can probably search for their feeds as well.  But those should keep you up to date on the latest baseball happenings.

Step 3:  Now just click on All to get a steady stream of fantasy baseball information. When you've read your fill, click "Mark all as read" so that you can better separate new information from old information.  If you want to add other non-baseball sources (why would you ever do that?), you can make different folders in "Manage subscriptions".
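If you would rather pull headlines from feeds like these yourself, the sketch below shows roughly what a reader does under the hood, using only Python's standard library. It assumes the feed is plain RSS 2.0 (with `item`, `title` and `link` elements); Atom feeds are laid out differently and would need different tag names. The sample XML and the `fetch_headlines` helper are made up for illustration, and the network call is defined but never run so the example works offline.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_rss_items(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("title", default="").strip(),
            item.findtext("link", default="").strip(),
        )
        for item in root.iter("item")
    ]


def fetch_headlines(url, timeout=10):
    """Download one feed and parse it (hypothetical helper, not called here)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_rss_items(resp.read())


# Invented sample feed so the example runs without a network connection.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example feed</title>
<item><title>Closer change in Chicago</title><link>http://example.com/1</link></item>
<item><title>Prospect called up</title><link>http://example.com/2</link></item>
</channel></rss>"""

for title, link in parse_rss_items(SAMPLE):
    print(title, link)
```

Calling `fetch_headlines` on each of the URLs from Step 2 and merging the results would give a crude version of the "All" view.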

There are certain things in baseball that you want to know very quickly, like when closer roles change, whether an injured player is starting, or when a hot prospect is called up, not to mention the daily check on whether a player is getting the day off.  Keeping on top of that last one in particular can be quite beneficial.  Many people could get 5-10% more stats just by fully utilizing their 182-game limit for each position.  For the latest information, I like to use Twitter.

Step 1: Sign up at www.twitter.com
Step 2:  Click "Find People", then search Twitter for and follow each of these accounts (or click on the following links and follow):
FB365
baseballtips
therotofeed
SeanRoto
GreyRazzball
knoxbardeen
bigjonwilliams
robertreed
TroyPatterson
rotoinfo_com
Alyssa_Milano

Yes that last one is Alyssa Milano.  She sleeps with players, she gets insider information. 

Step 3:  Even better than these feeds is the Twitter search feature. If you want the latest info (like whether a player is starting, the weather in Chicago, or whether any new information is out about players being offered to you in a trade), I'd recommend searching Twitter. I find it the fastest way to get the latest news.


Step 4:  Try to check Twitter often.

Hopefully those tools give you an edge, allowing you to be first to the free agent pool and/or your lineup as new information comes out.

Additional Tool:
One last point: if you have a BlackBerry or iPhone, you can take advantage of this information more easily with an app from Yahoo or ESPN.