We’ve just launched this new tool to our Tier 60 and up Patrons. Along with the new batter comp tool, you’ll have access to our RoboScout, our Top 500 Prospect list, the mailbag and dynasty podcast, and the future Top 1000 Dynasty Rankings!
By: Dylan White
As I’ve mentioned many times, I find it very helpful to make “comps”. I know I’m not supposed to – sometimes it puts unfair expectations onto players – but with the way my brain works, any heuristic or shortcut that helps me categorize, assess, or value a player as easily as possible is welcome. If you tell me that a certain player is a Kyle Schwarber type, I immediately envision ‘power, good on-base percentage, probably a bad batting average, and a position low on the defensive spectrum’. If you say that a prospect is like Nicky Lopez, I think ‘light-hitting defensive whiz who will grab some stolen bases but won’t hit for much power’.
Or maybe someone says “sure, Niko Kavadas is doing that…but he’s a 23-year-old in Low A” and you want to know the historical context of that: do we disregard his performance as illegitimate? Or does it portend likely future success? (Or, as is probably more likely, is it somewhere in between?)
Well, to help answer these questions (and many others), with no further ado...here is the new Prospects Live Hitter Comps Tool – being rolled out to the Tier 60 Patrons as a bolt-on to RoboScout.
I’ve been playing with it for hours and I’ve (happily) gone down many fun rabbit holes.
I’ve started (and scrapped) this introductory article a few times already this week – trying to strike the right balance between explaining the mechanics of how the math works vs just showing some fun results – and where I’ve landed today is to only briefly get into the back-end details and encourage the readers to explore for themselves.
But there is some knowledge that is likely helpful before you start playing.
To access the tool once you’re a patron, navigate to the middle of this post and look for the emojis. Click on the “NEW Hitter Comp Tool” tab.
Behind the scenes, we’ve created a database of all players who accumulated 120 plate appearances at a minor league level in any season since 2007 (where the levels known as A and A- pre-“consolidation” have been combined into “A”; domestic short season ball such as the Appalachian League has been included with the contemporary Complex Leagues).
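For readers who like to see this sort of thing in code, here is a minimal sketch of that pool-building step. It is not the tool's actual code: the `SeasonLine` record, the raw level labels (`"A-"`, `"Rk"`), and the reading of "120 plate appearances" as "at least 120" are all my assumptions for illustration.

```python
from dataclasses import dataclass

MIN_PA = 120         # plate-appearance threshold (assumed to mean "at least 120")
FIRST_SEASON = 2007  # earliest season included in the database

# Hypothetical raw labels mapped onto the consolidated levels described above.
LEVEL_MAP = {
    "A-": "A",    # pre-consolidation A- is folded into A
    "Rk": "CPX",  # domestic short-season ball (e.g. Appalachian League) -> Complex
}


@dataclass(frozen=True)
class SeasonLine:
    """One player's batting line at one level in one season (hypothetical schema)."""
    player: str
    season: int
    level: str
    pa: int


def normalize_level(level: str) -> str:
    """Return the consolidated level label, leaving unmapped labels unchanged."""
    return LEVEL_MAP.get(level, level)


def build_pool(lines):
    """Keep qualifying player-seasons and relabel their levels."""
    return [
        SeasonLine(line.player, line.season, normalize_level(line.level), line.pa)
        for line in lines
        if line.season >= FIRST_SEASON and line.pa >= MIN_PA
    ]
```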
When you select a hitter from the “Search & Select a Player” field (for the season chosen in the “Season” field, which defaults to 2022), the tool finds the closest “matches” from all players at the same minor league level and age, sorted from the best match down. If you want to open up the player pool to also include players who are up to 1 year older than the player being comped, you can do that too.
Instead of going into the weeds about metric weightings, z-scores, and Mahalanobis distance, the best way (in my opinion) to think about how the tool finds the “match” is to imagine that, for each minor league player in the database, we’ve created a set of percentile sliders similar to the Statcast metric sliders on Baseball Savant (e.g. Alec Bohm’s Savant sliders are depicted below):
If you selected “Alec Bohm” from the comp tool (in this example), imagine the tool then superimposes each pool player’s set of ‘sliders’ over Alec Bohm’s and measures the total “difference” in each metric’s percentile ranking. The player with the smallest total difference is the best comp. We also provide a “Comp Score” (showing how closely the two players match), which you can think of as “how far off the player was as a percentage of how far off the most ‘dissimilar’ player was”, and we've color-coded how close these matches are.
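The pool selection described above can be sketched roughly like this. It is only an illustration under assumptions of mine: player-seasons are plain dicts with hypothetical keys (`player`, `season`, `level`, `age`), and age is a whole-number seasonal age.

```python
def candidate_pool(target, pool, include_one_year_older=False):
    """Return the player-seasons eligible to be comped against `target`.

    Candidates must be at the same level and the same age as the target;
    optionally, players up to one year older are allowed in as well.
    """
    max_age = target["age"] + (1 if include_one_year_older else 0)
    target_key = (target["player"], target["season"])
    return [
        p for p in pool
        if (p["player"], p["season"]) != target_key  # never comp a season to itself
        and p["level"] == target["level"]
        and target["age"] <= p["age"] <= max_age
    ]
```

Calling `candidate_pool(target, pool)` gives the default same-age pool, while `include_one_year_older=True` mimics the wider pool option.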
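If it helps, the "slider" mental model can be written down in a few lines. To be clear, this is the simplified picture only, not the tool's real math (which involves metric weightings, z-scores, and Mahalanobis distance). The metric names, the player names, and the percentile values below are made up for illustration, and I'm assuming a lower Comp Score means a closer match.

```python
def total_difference(target_pct, other_pct):
    """Sum of absolute percentile gaps across the target's metrics."""
    return sum(abs(target_pct[m] - other_pct[m]) for m in target_pct)


def rank_comps(target_pct, pool_pct):
    """Rank pool players from closest to farthest.

    Returns (name, total_difference, comp_score) tuples, where comp_score is
    the player's difference as a percentage of the most dissimilar player's.
    """
    diffs = {name: total_difference(target_pct, pct) for name, pct in pool_pct.items()}
    worst = max(diffs.values())
    rows = [
        (name, d, 100.0 * d / worst if worst else 0.0)
        for name, d in diffs.items()
    ]
    return sorted(rows, key=lambda row: row[1])


# Made-up percentile "sliders" for a target and three hypothetical pool players.
target = {"K%": 50, "BB%": 60, "SwStrk%": 70, "Spd": 40}
pool = {
    "Player A": {"K%": 55, "BB%": 60, "SwStrk%": 65, "Spd": 40},
    "Player B": {"K%": 10, "BB%": 90, "SwStrk%": 20, "Spd": 40},
    "Player C": {"K%": 50, "BB%": 60, "SwStrk%": 70, "Spd": 40},
}

for name, diff, score in rank_comps(target, pool):
    print(f"{name}: total difference {diff}, comp score {score:.1f}")
```

Summing absolute percentile gaps treats every metric equally, which is exactly where the real tool's weightings would come in.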
Oh yeah, we also highlighted the players from 2022 in yellow to help you find them immediately.
So, this is what I see:
O'Hoppe's season was similar to those of a few catchers: MJ Melendez from 2021, Abraham Toro (who you may have forgotten was catching for the Astros back then), and Kyle Schwarber in 2015. Part of why we see this is not just "luck"; it is because Speed score is one of the compared metrics. The results also show that O'Hoppe's performance was similar to that of three major leaguers, which gives me confidence that O’Hoppe will have a successful hitting career in the major leagues.
Digging into the details further, you can compare his actual performance with the other players to see if there are differences. I see that his strikeout rate, walk rate, and swinging strike rate are better than Melendez's, implying that he should have a better batting average and OBP (though perhaps less power).
Another thing we immediately notice is that essentially all of the players who are identified as the best comps for O’Hoppe are players that made it to the major leagues: Trent Grisham, Chris Carter, Brett Baty, Miguel Sano, and even Mike Carp (who had a career MLB wRC+ of 107 in his 1000 plate appearances). This provides even more confidence that O’Hoppe should make it to the major leagues and find success.
As a sidebar, prospect hunters from about a decade ago may remember Chris Mitchell’s KATOH model on FanGraphs, which essentially did this sort of analysis: it found player comps and then computed an aggregate “average” career WAR from the comps to estimate expected career WAR.
Oh, and as another sidebar, here's Niko Kavadas (BOS) at Low A (from the introductory paragraph). SPOILER: it's not as encouraging a collection of comps as Logan O'Hoppe's (though admittedly the "accuracy" of the matches isn't as good):
I naturally will use this as another set of data points to help inform my prospect rankings. I have much more confidence that Logan O'Hoppe will not only make it to the major leagues but also carve out a sustainable career than I do with Niko Kavadas.
Finally, and this is part of the fun of the tool, we can also infer (from the O'Hoppe comps) that Connor Norby (BAL) and Brett Baty (NYM) can reasonably be expected to have major league hitting success too. That’s exciting information. Of course, you can then select Connor Norby to find his comps…and down the rabbit hole we go.
I do not recommend taking the names at face value and leaving it at that. Look at the differences in performance (even in players identified as good matches) – more or less speed, defensive position, etc. – to help inform your conclusions regarding the “comp”. Also, if there seem to be no good “comps”, then essentially it is suggesting that the player’s seasonal performance was “unique”; this can be good (“unprecedented performance!”) or bad (“no one has been that bad and allowed to flounder without being demoted”). Keep all of this in mind.
OK, cool. So have some fun. And here are some recommendations I have:
1) Look at top prospects (e.g. Corbin Carroll (ARI), Elly De La Cruz (CIN)) to see who historically has had a similar profile. This may answer questions such as "has anyone had a strikeout rate at AA as high as Elly and been able to make it to the majors"?
2) Look at the 2021 top prospects (Adley Rutschman (BAL), Riley Greene (DET), (SEA)) and see if any 2022 names pop on the list. Also, see if their MLB performance this year could have been predicted: this can tell you how much credibility (or not) to give this tool, and show where maybe we should have pumped the brakes (or gone all-in) on previous well-regarded prospects.
3) Look at the older minor league performance of current all stars (e.g. Fernando Tatis in 2016, or Bryce Harper in 2011) to see if there are any current players who have similar profiles.
4) Look at some “unique” profiles (e.g. the low SwStrk% of Nikau Pouaka-Grego (PHI) for a 17-year-old, or the sky-high 20% walk rate of Emmanuel Rodriguez (MIN)) to see if there have been precedents historically.
5) Look at players who have performed well but are “old for the level” (e.g. Niko Kavadas (BOS), Damon Keith (LAD), Vaun Brown (SFG), etc.) to see what type of MLB future that portends.
6) For players that broke out (e.g. Jackson Chourio (MIL) and Michael Harris II (ATL) in 2022), look at their 2021 performance to see what they did before…and see if there were any signals that a breakout was imminent or, even more fun, whether there are current 2022 prospects who had similar seasons to Harris’s 2021 (read: Darell Hernaiz (BAL)).
7) Have fun!
OK, that’s all you need to begin using the tool in an effective way. Note that we intend to have a Pitcher Comp Tool up and running soon. Also, when RoboScout picks back up as the 2023 minor league season kicks off, we'll have 2023 performances able to be "comped" on an ongoing basis as part of RoboScout (even if fewer than 120 plate appearances have been accrued).
For those who want to learn more, here are some additional details:
- Park Factor adjustments were not made. Only the raw results are shown.
- No regression was performed on the results either (i.e. a player with a 30% K rate over 120 plate appearances will “match” with a player who has a 30% K rate over 600 plate appearances).
- For some levels and years, the swinging strike rate (SwStrk%) that FanGraphs provides does not use the same denominator as in other years (and hence is listed as approximately twice as high as "actual"). For the years where the denominator was “per swing”, it was adjusted to “per pitch seen”. To the best of my knowledge, this has been corrected (using estimates) so that the metric is consistent across all players within a level. In other words, the SwStrk% for some players as displayed in the comp tool is actually an "expected" SwStrk% and may not match what is shown on FanGraphs. Note that DSL and CPX remain listed the way FanGraphs lists them.
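As a rough illustration of the “per swing” to “per pitch seen” adjustment in the last bullet: whiffs per pitch equals whiffs per swing multiplied by swings per pitch. The exact estimates used in the tool aren't described here, so the function below and its `swing_rate` input are assumptions of mine, not the actual correction.

```python
def swstr_per_pitch(whiffs_per_swing, swing_rate):
    """Convert a whiff rate measured per swing into one measured per pitch seen.

    whiffs_per_swing: swinging strikes / swings, as a fraction (0 to 1)
    swing_rate:       swings / pitches seen, as a fraction (0 to 1); an estimate
                      would be needed wherever the actual value is unavailable
    """
    for value in (whiffs_per_swing, swing_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("rates must be fractions between 0 and 1")
    return whiffs_per_swing * swing_rate
```

With a swing rate near 50%, the per-swing figure comes out about twice the per-pitch figure, which lines up with the "approximately twice as high" note above.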
Before I go though, I can't resist...here are a few comps that caught my eye for various reasons (which, hopefully, will be self-evident to you):
Jasson Dominguez (NYY) from 2022
Kyle Tucker (HOU) from 2017