Thursday, March 28, 2013

Corrections part 3: Almost there...maybe?

In my last two posts, I talked about the idea of applying corrections as a linear operator on the pitch parameters, with the trick being how to derive said operators.  I've tried a few methods out, and one of them has shown a hint of promise.  There is still more work to be done, as will be shown in a minute, but the initial findings leave me hopeful that I am on the right track.

I wanted to make things as simple as possible for this first test, so I took a stadium that is known to deviate from the league norm strongly in at least one observable parameter.  Tampa Bay in 2012 fit that bill, as was described by Jon Roegele here.  Next, I wanted to find 2 games that fit the following criteria:
i - The same starting pitchers went against each other in each game.
ii - One game was played at Tropicana Field, and the other at some other park.
iii - Each starting pitcher went long enough to throw at least 80 pitches.
iv - The 'some other park' of condition ii was at an elevation near Tropicana Field's (15 ft).

That last requirement was made so that I would, for now, be able to mostly ignore the effects of air density.  We'll tackle that problem at a later date.  My cursory glance so far indicates that air density could be as much of a help to us as it is a headache, but whether it is depends on a few other things that I haven't completely worked out yet, so we'll just leave it alone for now.

The easiest place to go hunting for games that meet those criteria, then, is within the American League East, except maybe Toronto (~300 ft).

New York and Boston, however, both seem to have some interesting effects of their own.  Perhaps we should leave them alone for now.  So how about Baltimore?  As it turns out, there weren't a whole lot of pairs of games in 2012 between the Rays and the O's in which the same two pitchers dueled it out into the late innings.  But there was at least one pair that met my conditions.
Those two games occurred within 2 weeks of each other as well.
On July 24 Jeremy Hellickson and Wei-Yin Chen both pitched into the 7th at Camden Yards in a 3-1 Rays victory.  11 days later those two met again in Tampa in a game that saw Chen throw 7 shutout innings in a 4-0 Orioles win.  Hellickson only lasted 4 innings in that game, but also managed to throw 88 pitches in those 4 innings.

Below are plots of vertical movement vs. horizontal movement (left) and velocity vs. horizontal movement (right) for each pitcher.  Pitches thrown in Tampa are represented with blue dots, while pitches thrown in Baltimore are in red.

As you can see, movements in Tampa appear to be shifted to the right, and that shift seems to grow as you go farther to the right.  Also, relative to Baltimore, Tampa might be giving a very slight boost to fastball velocities.

Next, through a least-squares method, I derived a linear operator that would supposedly take the Tampa pitches and make them more like the Baltimore pitches.  As it's getting very late here, I will save the discussion of how I derived this (and talk about improvements that can be made) for tomorrow.  For now, I'm just going to show the plots, talk briefly about them (even though I think it's very obvious what this "correction" has done well and what it has not), and then go to bed.  So without further ado...

So as you can see, for the most part, the Tampa clusters are closer to their Baltimore counterparts.  The glaring exception is Hellickson's curveball.  It's clearly been overcorrected.  Instead of having more horizontal movement away from a righty as before, it now has less.  Also note that Chen's slider and curveball both seem to have lost a little bit more velocity than they probably should have.

So there are the results of my first stab at this.  The positives:  For the fastballs at least, I'm pretty happy with the results.  The negatives:  Well, it's not perfect.

A little later I'll talk about how I got to this point, and where I can go from here (I have not even come close to exhausting all options on this method).  But I wanted to get this up at least, because hey, this is kinda cool.
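
In the meantime, for anyone who wants something concrete to poke at, here is a minimal sketch of one way a least-squares correction like this could be set up.  To be clear, this is an illustration and not necessarily the derivation used for the plots above.  It assumes each pitch type is summarized by one row of (horizontal movement, vertical movement, velocity), that rows are matched between the two parks, and that the correction is a matrix plus a constant offset.  The function names and all of the numbers are made up for the example.

```python
import numpy as np


def fit_affine_correction(source, target):
    """Least-squares fit of target ~= source @ M + b.

    source, target: (n_rows, n_features) arrays with matched rows,
    e.g. one row per pitch-type cluster (an assumption of this sketch).
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Append a column of ones so the offset b is fitted along with M.
    design = np.hstack([source, np.ones((source.shape[0], 1))])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs[:-1], coeffs[-1]


def apply_correction(pitches, M, b):
    """Apply the fitted operator to an array of pitches."""
    return np.asarray(pitches, dtype=float) @ M + b


# Synthetic stand-in data (NOT real PITCHf/x measurements):
# columns are horizontal movement, vertical movement, velocity.
rng = np.random.default_rng(0)
true_M = np.array([[0.90, 0.00, 0.00],
                   [0.05, 1.00, 0.00],
                   [0.00, 0.00, 0.99]])
true_b = np.array([-1.0, 0.2, 0.3])
tampa = rng.normal(size=(6, 3)) * [5.0, 5.0, 4.0] + [0.0, 5.0, 85.0]
baltimore = tampa @ true_M + true_b  # noise-free by construction

M, b = fit_affine_correction(tampa, baltimore)
corrected = apply_correction(tampa, M, b)
```

Because the made-up data is noise-free, the fit recovers the operator that generated it.  Real pitches are noisy and are not paired one-to-one between parks, so how the rows get matched up (cluster averages, for instance) is a choice this sketch glosses over.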
