The Other Worlds Shrine

Your place for discussion about RPGs, gaming, music, movies, anime, computers, sports, and any other stuff we care to talk about... 

  • Advanced stats and basketball

  • Somehow, we still tolerate each other. Eventually this will be the only forum left.
 #164425  by Don
 Sun Dec 14, 2014 5:05 pm
I'm usually a numbers guy, but the movement of advanced statistics in basketball seems kind of stupid. I get that it works for baseball and probably most other sports where you have a lot of players and the team is basically the sum of the parts. If you have 8 steroids-era Bonds on the team batting you probably will be doing okay even if your defense totally sucks because that's going to produce a ridiculously good offense/WAR/VORP/whatever. But I don't think it works for basketball, and not necessarily just because it's a team sport but because this is a sport where you're not really in a '1on1' so to speak. For baseball when you're batting it's pretty much all you. Your teammates can't channel their inner energy to make it easier for you to hit the ball. This is obviously not true in basketball. Advanced statistics in basketball seem to fall under 2 types of stats:

1. shooting % dominated.
2. +/- dominated.

For the first, you end up with conclusions like Tyson Chandler, who I think is the career leader in TS%, should be taking 100 shots a game and score at least 120 points and easily win. Never mind that he has a high TS% because of his position and because he doesn't shoot much. I think this is closer to a video game where you can just hand the ball to LeBron James on every possession and since he's obviously got the best physical stats out of all basketball players he'll just keep on scoring at a high %. But that only works for a video game. I remember having a guy take a 3 on every possession in a basketball game on Sega Genesis and since that guy had a shooting rating of 9 that was a pretty good shot and he ended up scoring 150 points or something.

The second kind just measures how many more points the team scores with soandso on the floor. Of course since the best teams in the NBA tend to have the best overall +/-, the inevitable conclusion you get is that everyone on the best team in the NBA tends to be a leader in +/- of any kind! If you have even a cursory knowledge of basketball it should be obvious that no one on the San Antonio Spurs is that good in the individual sense but the team plays very well together. Previously garbage players always seem to suddenly become awesome on the Spurs. That should be credited to the coaching/system that allows players to shine, not to the idea that guys who were viewed as perennial losers prior to joining the Spurs are really that good and just nobody saw that coming.

I remember they interviewed Popovich in one game against the Heat where they barely guarded LeBron at the end of the 4th quarter, and sure enough he passed up an open 3 to another guy and the pass got picked off. When asked why he did not guard LeBron, who is obviously the best player in the NBA, Popovich said LeBron was 1 assist short of a triple double at that time. Yes, a triple double usually correlates strongly with winning because you have to be doing pretty well to pull this kind of numbers. But it's not like the moment you make your last assist you gain a level in your basketball power and get 5 more basketball talent points to pump into TS% or whatever. Sometimes I think LeBron James literally thinks he's playing a basketball simulation game where if he raises all his stats high enough then he will automatically win, and this goes double for most of the advanced statistics guys. You'd be hearing them say stuff like "Soandso passed the ball 15 times and statistics show passing the ball 15 times is better!" without considering whether passing the ball makes sense or not.
 #164427  by Replay
 Mon Dec 15, 2014 8:13 am
Do you play basketball, Don?
 #164428  by Zeus
 Mon Dec 15, 2014 3:01 pm
Basketball and hockey are starting to install GPS trackers on their players to better track their positioning; it's currently in the experimentation stage. Once those are in place and have been vetted, you're gonna start seeing a lot more positional-based stats come out. After that, you'll get a better picture of the true value of the player when combined with the other stats.

Baseball and football (especially the former) can be broken down to a series of one-on-one events; basketball and hockey are very much about flow and positioning. Until the tracking data is there, the stats in the latter two can only tell so much of the story and often leave out factors which usually can't be taken into consideration (i.e. opponent/teammate quality, etc.), so you end up with stats that are more resulting than driving.

Give them a few years, the advanced stats in these flow-based games are just starting.
 #164429  by Don
 Mon Dec 15, 2014 4:09 pm
The positional stuff is like soandso makes 75% of his shots while going to the right and 35% while going to the left, and it's not particularly surprising to see results like that, but the camera stuff will give a better idea of how to quantify that. I don't think there's anything controversial about that. The problem with basketball is that there isn't a very good overall way to measure how much you contribute. If you take WAR it's basically predicated on the idea that scoring runs (or saving them) is always good. The same cannot be said for scoring points because there are bad shots that just happened to go in, and yet sometimes you've got to take bad shots because that's the only kind of shot you get when the defense is good. Besides, if we used PPG to measure how good someone is then all the advanced statistics guys would be out of a job. One thing that you'll notice is people now take fewer crazy half court shots at the end of a quarter because that shot is almost certainly going to miss and it lowers your FG%. But this is ridiculous because holding the ball until time runs out has 0% chance of scoring while a crazy shot has a nonzero chance of scoring. Technology isn't going to help if your underlying metric isn't right. Right now there seem to be 2 camps like mentioned above, and I'm not saying these metrics don't have their uses but it seems like they're commonly misused in context. If you look at a stat like PER it's basically saying 1 rebound is worth 0.735 basketball power and 1 turnover is worth negative 0.275 basketball power. While the idea is sound, who comes up with these numbers? And if you tweak the parameters you can get the stats to say almost anything you want. You don't have that problem with WAR because I think it's based on total bases as an approximation for runs scored, and you can say whether you hit a double or hit a single and then stole second that's 2 total bases, and 2 total bases is better than 1 total base but not as good as 3 or 4.
In basketball there's no way to say whether 25/8/8 is better than 45/2/2 for example, though advanced stats seem to be set up to always favor the former even though there's no inherent truth to why one set of stats is better than the other.
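
To put the "tweak the parameters" point in concrete terms, here is a minimal Python sketch of a PER-style weighted sum applied to the two stat lines above. Both weight sets are made up for illustration and are not taken from PER or any published metric; the function name is hypothetical too.

```python
from typing import Dict

# The two stat lines from the post: points / rebounds / assists.
LINE_A = {"pts": 25, "reb": 8, "ast": 8}
LINE_B = {"pts": 45, "reb": 2, "ast": 2}


def linear_rating(line: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of box-score stats; the weights are the arbitrary part."""
    return sum(weights[stat] * value for stat, value in line.items())


# Hypothetical weight sets (assumptions, not real PER coefficients).
SCORER_FRIENDLY = {"pts": 1.0, "reb": 0.5, "ast": 0.5}
ALLROUND_FRIENDLY = {"pts": 1.0, "reb": 2.0, "ast": 2.0}

for name, weights in [("scorer-friendly", SCORER_FRIENDLY),
                      ("all-round-friendly", ALLROUND_FRIENDLY)]:
    a = linear_rating(LINE_A, weights)
    b = linear_rating(LINE_B, weights)
    better = "25/8/8" if a > b else "45/2/2"
    print(f"{name}: 25/8/8 = {a}, 45/2/2 = {b}, better = {better}")
```

The same two players swap places depending on which weights you pick, which is the whole complaint: the ranking is only as defensible as the coefficients.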
 #164435  by Zeus
 Wed Dec 17, 2014 12:00 am
Let's get the underlying stats down for basketball (and hockey) first, then we can start analyzing. It's too early right now to tell how useful or not useful it is because the underlying stats leading to the advanced analytics are so minimal and in their infancy. Give it a few years, then we can start tweaking.

And don't talk to me about WAR. It's a number that means nothing to me.
 #164436  by Don
 Wed Dec 17, 2014 2:58 am
Advanced stats in basketball aren't really that new. Although PER is often misused it is fundamentally a solid concept, as I'd assume all the coefficients are based on historical values of what a rebound/assist/turnover/block/steal is worth, and that's fine. It's just when you get people who think if you grab an extra rebound you'll gain a basketball level, or that you should have the guy with the highest PER take 100 shots a game, that's when things get dumb. Or you use +/- to determine everyone on the Spurs must be a star player because the Spurs have an insane +/- as a team in general. I don't think better stats will ever solve the problem of people misusing stuff in context. For example point guards are always rated extremely highly in any kind of advanced 'clutch' stats because if I pass you the ball and you score I get part of the glory for the assist, but if I pass you the ball and you miss you're the only person that sucks here, and I doubt stats will ever get so advanced as to be able to figure out what could've happened if I didn't pass the ball to you. It doesn't need to be fixed but you've just got to understand what its limitations are. Another common thing you'll find is that there are a lot of no-name guys who have surprisingly good clutch stats, and that's because they suck so much nobody ever guarded them in the first place. But that doesn't mean you start out in crunch time by giving the ball to that guy first.
 #164439  by Zeus
 Thu Dec 18, 2014 7:33 pm
That's why they invented things like Relative Corsi and Relative + / -, to see how much better your team is with you than with everyone else.

It all means nothing 'til they get the underlying stats down pat. That'll take a few years. And in basketball and hockey, stats aren't as definitive as they are in baseball and even football. They all tell a little portion of the story; it's up to the analyzers to properly piece them together. That ain't happening right now because it's all too new and flawed.
 #164440  by Don
 Thu Dec 18, 2014 9:40 pm
The relative +/- is generally okay but it usually ignores stuff like if you're a bench guy you tend to have pretty good +/- if you're halfway decent because you're generally playing against people who are pretty bad. Ginobili would be an example of this, though in the case of the Spurs his role is exactly that so the Spurs don't have any illusions about Ginobili being able to score 50 points had he played the whole game against the starters. It's also not particularly useful for a team that is just very good in general because every combination of players on that team should result in relatively good stats. I remember the Pistons started something like 30-3 one year and there was a stat about how if a random bench warmer scored at least 10 points the Pistons were 20-1. Beyond the fact that this stat is obviously misleading (that guy probably only scores a lot in games where the outcome is long decided), winning 95% of the time on a team that normally wins 90% of the time is not that meaningful.

One of the more useful stats is comparing your PER or whatever you want to that of the guy who plays your position on the other team, on average. This is at least better than just ignoring the fact that your performance depends on what the other team is doing.
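
As a rough check on the 20-1 stat above, here is a small Python sketch that asks how often a team with a 90% baseline would go 20-1 or better over any 21 games purely by chance. It assumes games are independent with a fixed win probability, which is a simplification, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(wins: int, games: int, p: float) -> float:
    """Binomial probability of winning at least `wins` of `games` at win rate p."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# Chance of 20-1 or better in 21 games for a team that normally wins 90%.
print(round(prob_at_least(20, 21, 0.90), 3))
```

Under those assumptions a 90% team goes 20-1 or better over 21 games a bit more than a third of the time, so the bench warmer's 10 points are not telling you much.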
 #164454  by Replay
 Thu Dec 25, 2014 6:32 am
Statistics are always useful, and always not as useful as actual experience. They are a tool. :)
 #164455  by Replay
 Thu Dec 25, 2014 6:36 am
You can analyze anything with statistics, and they are certainly useful. Even unadvanced basketball stats would easily tell you, for instance, that the best way to beat Shaq in his prime was the Hack-A-Shaq technique used by countless (and in my opinion, rather cowardly) teams who couldn't bust up the big man's drive in the paint; because they would tell you quite rightly that Shaq and his gigantic hands blow chunks at shooting free throws.

What you cannot do is what some statisticians believe, and always predict the future using statistics. You can't ALWAYS predict the future using ANYTHING; if life teaches us anything it is that it is dynamic, too dynamic to be predicted. Jurassic Park was a classically cheesy horror film in some ways, but in others it was pure brilliance, and that was one of them; I have found that lesson to be twice as true in real life. Especially in something like basketball, drive and heart matter as much as anything statistics will tell you, and despite countless efforts by videogames to model a "heart" statistic it's just not that simple. :)
 #164460  by Don
 Thu Dec 25, 2014 1:45 pm
The problem is a lot of the weight in statistics is still chosen arbitrarily. One of the earliest examples of advanced statistics was a play where the Cavaliers are down by 2 with about 20 seconds to go, LeBron drives to the rim, gets double teamed and passes the ball to an open guy, who misses the 3. The advanced statistics guys will say something like LeBron has a 60% chance of scoring which ties the game, and then there's a 50% chance the opponent scores in the last 20 seconds and if not there's a 65% chance they'll lose in overtime, versus having a guy take a 40% chance on a 3, and while the opponent still has the same % of scoring in the last 20 seconds, in this case you eliminate the chance of losing in overtime so it was a good play. Now I'm sure the math checks out but where do you get the numbers? What if I say I think LeBron has a 95% chance of finishing at the rim while getting double teamed and fouled and he makes the free throw at his usual % (80%?) That's certainly much better than 40%. Okay so there's some historical data you can probably get with the cameras to say that LeBron usually scores 60% of the time while 3 feet from the rim, but that's not exactly the same as being double teamed with 20 seconds left on the clock with various degrees of fatigue on LeBron and his defenders. Maybe LeBron was already tired out (fatigue was a serious issue early on in his career since his coach way overplayed him thinking he's some kind of manga superhero who can turn into a Super Saiyan at the end) and it's worse than usual, or maybe LeBron is indeed a manga superhero who can go into another gear when the game is on the line. Or maybe LeBron uses his superstar status to get a bailout call and gets 2 free throws anyway (which would be a much better outcome than shooting a 3 at 40%).
I don't think LeBron was wrong to pass, but depending on what assumptions you make about that particular situation, which certainly cannot be predicted by just history, you can get any outcome you want, and I think the advanced statistics guys purposely ignore this.
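
To make the sensitivity concrete, here is a minimal Python sketch of the arithmetic in the play above. The function names are made up for illustration and the model is deliberately crude: it assumes a missed shot loses the game (no offensive rebound, no foul), an opponent score in the last 20 seconds also loses it, and the quoted percentages are taken at face value.

```python
def win_prob_drive(p_score: float, p_opp_scores: float, p_lose_ot: float) -> float:
    """Down 2: a made two ties it, then we need a stop and an overtime win."""
    return p_score * (1 - p_opp_scores) * (1 - p_lose_ot)


def win_prob_three(p_make: float, p_opp_scores: float) -> float:
    """Down 2: a made three puts us up 1, then we only need a stop."""
    return p_make * (1 - p_opp_scores)


def win_prob_and_one(p_score_fouled: float, p_ft: float,
                     p_opp_scores: float, p_lose_ot: float) -> float:
    """Basket plus foul: a made free throw leads by 1, a miss leaves a tie."""
    up_one = p_ft * (1 - p_opp_scores)
    tied = (1 - p_ft) * (1 - p_opp_scores) * (1 - p_lose_ot)
    return p_score_fouled * (up_one + tied)


# Numbers quoted in the post.
print("drive  :", win_prob_drive(0.60, 0.50, 0.65))
print("three  :", win_prob_three(0.40, 0.50))
# The alternative assumption: 95% to finish through the foul, 80% at the line.
print("and-one:", win_prob_and_one(0.95, 0.80, 0.50, 0.65))
```

Under the quoted numbers the pass wins about 20% of the time against about 10.5% for the drive, but swap in the 95% and-one assumption and the drive jumps to roughly 41%. Same math, different inputs, opposite conclusion.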

A recent one I've seen a lot is 'everyone on the Spurs is awesome' because their team is awesome. After all, the Spurs are very strong overall even though a lot of their players are people you've never heard of who never amounted to anything prior to coming to the Spurs, and even when the Spurs sit Duncan/Parker/Ginobili they still tend to do well, so clearly it's because all the backups are awesome too. And then you also get the conclusion that the guy who only shoots corner 3s on the Spurs is more valuable than Parker because obviously that guy shoots a crazily efficient shot, ignoring that it's because you've got guys like Parker or Ginobili that this guy is even open in the first place. Last year the Mavs matched up the best against the Spurs (took them to 7) and the Mavs' strategy was simply to not help on the penetration by Parker/Ginobili because you know they're going to kick it out to the corner 3 almost all the time, and after all the 3 point shooters are still generally unathletic guys who can't get their own shots, so if you almost never help on the penetration then those guys who previously had crazy PER or any other rating you want suddenly can't even get a shot off.
 #164461  by Replay
 Thu Dec 25, 2014 3:39 pm
The only way to analyze that is to track how often the predictors blow it. I would imagine it is rather often, and such is not a perfect science. Records from past games don't mean jack when you're in the fourth quarter, tired, with sweat all over your hands, facing one of the toughest teams in your division; I promise that just from my own experience facing far less in pickup games and watching my own streaks and misses.

Besides, anything less than a certain prediction on the call leaves room to say, "Well, we said he had an xx% chance of missing it" and get very fuzzy, because quantifying how wrong you are when you say someone has a certain percentage to succeed at something and they don't is...imprecise. You'd have to use a weighting system that is not widely understood and has to be explained, no matter what it is. If there's a standard weighting for errors in that situation, I don't know it.

For example, consider the case of a 50% prediction of success on a single shot, by a commentator taking part in measures to weigh his/her accuracy.

No matter whether or not you predict success, or failure, almost by definition that guess can't affect your own rating for correctness.

Since you predicted even chances for each outcome, how can you be "wrong" or "right" either way?

You may have done detailed statistical analysis or just "bunted" to the mathematical median between the two maximum chances of success, and yet either way, the amount of effort in your prediction DOES NOT MATTER to your accuracy rating unless you weight your own prediction off the median somehow. Oh look, he made the shot. Oh wait, no, he missed. How can you be right or wrong on either one if you said it had an even chance of happening?

So, no matter what, as you said, a weighting issue comes into play, and it almost by mathematical definition has to involve how far your own predictions are off the median. I don't know how well there is a standard for that; standard deviations and other traditional measures of stat accuracy involve analyzing means of large (or at the very least non-single-data-point) data sets, not individual "how far am I off in my prediction" measures. You'd have to come up with a weighting and do some rather fancy combinatorial analysis to come up with anything meaningful - for those who have never taken a class in what it means to create a genuinely meaningful statistic, I advise taking calculus-level statistics some time; it will change your opinion if you think statistics is a science for bored coaches and sportswriters. Or just read through the full Wikipedia intro article on it, and on combinatorics, which are used heavily by serious, Wall Street or Washington DC or Silicon Valley style numerical analysts and stat geeks; the people who work weeks at a time on one problem or important issue. It's hard stuff, and I was doing differential equations before I left high school - and this problem is not unique to basketball.

Look into predictor-corrector systems if interested in the hard math theory on it. Stock traders and industrial engineers and other advanced analysts spend ages crafting these things - could be a job skill for you one day, even - and still know how imprecise the science is even given that. Crystal ball sales have always been a...speculative industry. :)

http://en.wikipedia.org/wiki/Statistics
http://en.wikipedia.org/wiki/Combinatorics
http://en.wikipedia.org/wiki/PID_controller
 #164462  by Don
 Thu Dec 25, 2014 4:09 pm
Yeah, that brings up one of the really dumb things about statistics. I'll see a guy predict like "Spurs have 75% chance of winning this game". Well no matter how the game turned out, there's technically no way to prove this wrong as long as he didn't predict 0% or 100%. Even if I do this for a large number of games, what does this even mean? If I predicted 10 games in a row that the Spurs will win at 75% and they won 7 out of 10, does that mean my analysis was spot on or was I just so far off on both ends that they balance out? After all you'd think in some games the Spurs should have a considerably better or worse than 75% chance of winning in a stretch like that. The statistics guys have a decent record when it comes to predicting these things but it's certainly nothing that'd make you want to start betting against the house, not to mention predicting who's going to win in basketball isn't terribly hard. Home court does give a rather large advantage and it's usually fairly obvious who the best teams are.
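
One standard way to grade probability forecasts over many games is the Brier score: the mean squared gap between the stated probability and what happened (1 for a win, 0 for a loss), lower being better. Here is a minimal Python sketch. The "sharp" forecaster and the order of results are invented for illustration; only the flat 75% and the 7-of-10 record come from the example above.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between win probabilities and 0/1 results."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# A lone 50% call scores the same whether the shot goes in or not.
print(brier_score([0.5], [1]), brier_score([0.5], [0]))

# Ten games, seven wins (hypothetical ordering).
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
flat = [0.75] * 10                # "75% every night"
sharp = [0.9] * 5 + [0.6] * 5     # same 75% average, but picks its spots

print("flat :", brier_score(flat, outcomes))
print("sharp:", brier_score(sharp, outcomes))
```

A single 50% call cannot be graded, which matches the complaint, but over a run of games the forecaster who says 90% on the easy nights and 60% on the hard ones scores better than the one who says 75% every time, even though both average out to 75%.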

By the way, the Hack-A-Shaq actually has some interesting analysis from advanced math. I remember it's something like you've to factor in:

1. You're putting a guy who probably isn't all that good to just foul Shaq, who still has to play on offense to continue fouling Shaq.
2. This puts you in the penalty immediately, which is a significant disadvantage.
3. You have almost no transition opportunities because all the other side's offense amounts to is free throws, and transition opportunities almost never occur off those.

I think the conclusion they get is that when you put all this together the expected value from Hack-A-Shaq is not likely better than just letting him shoot in the long run, though it's certainly viable in the short run especially if he seems to be eating you up today.

The funny thing is that teams usually stop Hack-a-Shaq because he made a few free throws in a row, even though that shouldn't be why you stop doing it. If you genuinely believe that this produces an advantage in the long run you should be fouling him even when you're up by 10 because that'd make your lead bigger. Apparently a lot of opposing teams believe Shaq, who claims that he makes free throws when it counts, even though there is no statistical evidence to back this up whatsoever.
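
A bare-bones version of that expected-value comparison can be sketched in Python. All of the inputs are assumptions for illustration: the free throw percentage, the points per possession the offense would otherwise score, and a single `hidden_cost` number standing in for the three factors listed above. It also ignores offensive rebounds off missed free throws.

```python
def expected_points_two_free_throws(ft_pct: float) -> float:
    """Average points from a two-shot trip to the line."""
    return 2 * ft_pct


def hack_is_worth_it(ft_pct: float, normal_ppp: float, hidden_cost: float = 0.0) -> bool:
    """Fouling pays only if free throws plus side costs beat a normal possession.

    hidden_cost is a made-up lump sum (points per possession) for the penalty
    situation, the weak designated fouler and the lost transition chances.
    """
    return expected_points_two_free_throws(ft_pct) + hidden_cost < normal_ppp


# Hypothetical inputs: a 50% foul shooter on an offense worth 1.05 points a trip.
print(hack_is_worth_it(0.50, 1.05))                    # ignoring side costs
print(hack_is_worth_it(0.50, 1.05, hidden_cost=0.10))  # with a small side cost
```

With these made-up numbers the foul looks good on free throws alone and stops looking good once even a small side cost is charged against it, which is roughly the shape of the conclusion described above.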
 #164463  by Replay
 Thu Dec 25, 2014 4:17 pm
Hahaha, that's hilarious that someone actually put serious thought into the Hackashaq. xD

It's less that there are dumb things about statistics - the formal science is very rigorous - it's that sportscasting is one of those "kind of xx% percent based on bullshit" anyway careers, where your own success is measured by how well you entertain a zillion bored basketball fans, more than half of whom are drunk, some quite so. You and I and our emphasis on statistical rigor is...not the majority opinion. Most people watching most sports are tired and drunk after work and would snarl at you even if their day job involved statistics of some sort; our conversation here is of the "heads only" kind of enthusiasm. I have a lot of friends who do like to do things like this, but I just happen to love math an awful lot. And that introduces what is called selection bias - the study of bias, of course, is another thing that makes statistics a genuinely brain-breaking science.

http://en.wikipedia.org/wiki/Selection_bias
http://en.wikipedia.org/wiki/Bias_%28statistics%29

Try just as far as this conversation on many drunken sports fans and you'll see why broadcast statistics is not going to be winning a lot of peer-reviewed awards of the kind university statistics research has to qualify for. Basketball is kind to stat wonks, as is baseball. There are stat wonks in every sport, but there are football and rugby cultures where fans would chase you up an empty beer case and set fire to it. :)

Still better than politics though!

 #164464  by Replay
 Thu Dec 25, 2014 4:18 pm
Like I said, formal statistics is very rigorous.

Broadcasting standards? Less rigorous.

 #164465  by Replay
 Thu Dec 25, 2014 4:21 pm
Suffice it to say I refuse to quantify the percentage of newscasting that is complete bullshit with a formal statistic. :D
 #164468  by Don
 Thu Dec 25, 2014 5:11 pm
I think people use the % stuff because they're afraid to make a prediction since any prediction can of course be wrong unless you've got a time machine to verify the result ahead of time, but then 'X% chance of whatever' is not verifiable because we can't go back and replay the game. So if you're not willing to make a concrete prediction it doesn't matter how rigorous the underlying math is.

John Hollinger, who probably pioneered this stuff, was hired by the Memphis Grizzlies to be the stat guy, and he said when he was working at ESPN it'd be like 'my stats say these guys on the Grizzlies are all overrated chuckers', but now that he has to work for a particular team he had to adjust his statistics model to say something else. Ironically his model predicted the Spurs were going to win in 2014 before he got hired (and I think he was the only one who predicted that out of a panel of the usual experts) while the Grizzlies got bounced in the first round by the Thunder. I can just imagine him trying to explain to the team something like: "Hey boss, my statistics model says the Spurs are going to win it all and we're about to get totally crushed." It's a pretty funny lesson on how your perspective changes depending on what your position is, and it taught him that the stats are not as impartial as he thought when he was just a guy at ESPN who had less of a conflict of interest.

Another funny thing John Hollinger pointed out was that if you want to sound like you know your math, make a prediction like "this team will not win the championship" at the beginning of a season. There's something like at least a 75% chance you'll be right unless that team is a prohibitive favorite like Jordan's Bulls or Shaq & Kobe era Lakers. This applies to pretty much all sports with a similar format/structure compared to the NBA. I guess it wouldn't quite work for soccer because of how the top teams are able to concentrate all the talents but it's very safe for football/hockey/baseball too.
 #164474  by Replay
 Sat Dec 27, 2014 2:33 am
I agree with him. If you want any recognition as a predictor, make a habit of predictions that you predict as certainty. Then, at least if you are wrong, you have set a clear standard for how wrong you are and why, later - for the mathematical reasons I was talking about. It improves you as a prophet, so to speak, for those interested in playing that game...which is always an unclear game, but at the same time, every bit of our bodies is in some way devoted to either predicting or responding to our external stimuli. Fortune-telling is as old as mankind, and like anything else, it may be done well or incompetently; generously or selfishly.
 #164477  by Don
 Sat Dec 27, 2014 4:26 am
I don't know if it's because people are afraid to make predictions for fear of being wrong or if people actually think the guys watching are dumb enough to fall for the 'yes and no and maybe' prediction. I mean you can be up 3-0 in a best of 7 and still ad lib on why the series isn't over yet even though winning from a 0-3 position has never been done in an NBA playoff series. I remember reading an analysis of so-called expert predictions and a common theme is that people tend to way overestimate the number of 4-3 series. It just doesn't happen very often even when two teams are evenly matched, probably because whoever is behind 2-3 does have a pretty significant mental disadvantage. On your 1st vs 8th or 2nd vs 7th seed, historical data suggests predicting 4-0 or 4-1 is very safe and 4-0 is probably the most likely outcome if they're indeed typical teams of those seeds (so this won't work well in the Western Conference because there always seems to be some really good team that finishes 7th) but you won't find very many experts predicting 4-0.
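
For a baseline to hold those expert predictions against, here is a minimal Python sketch of the textbook best-of-7 calculation. It assumes every game is an independent coin flip with the same win probability `p` for the favorite, so it ignores home court and the mental factor mentioned above; the function name is hypothetical.

```python
from math import comb
from typing import Dict, Tuple


def series_outcomes(p: float) -> Dict[Tuple[int, int], float]:
    """Probability of each final score in a best-of-7.

    Keys are (favorite wins, underdog wins); p is the favorite's
    per-game win probability, assumed constant and independent.
    """
    out = {}
    for losses in range(4):
        # The winner takes the last game; the earlier games can come in any order.
        ways = comb(3 + losses, losses)
        out[(4, losses)] = ways * p**4 * (1 - p) ** losses
        out[(losses, 4)] = ways * (1 - p) ** 4 * p**losses
    return out


even = series_outcomes(0.5)
print("evenly matched, goes 7:", even[(4, 3)] + even[(3, 4)])

mismatch = series_outcomes(0.8)
print("80% favorite, 4-0:", mismatch[(4, 0)], " 4-1:", mismatch[(4, 1)])
```

In this simple model an evenly matched series goes seven games 31.25% of the time, and the sweep only becomes the single most likely result once the favorite wins more than 75% of individual games, so how safe a 4-0 pick is depends heavily on how lopsided the matchup really is.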

Charles Barkley actually sticks to his predictions, and even though I think he's probably meant to be there for the comedic effect, I tend to pay more attention to what he says because at least he's willing to go out and say 'soandso is going to win' instead of saying like 'if they continue to make high percentage shots and not turn the ball over and defend well they'll win. Oh they didn't win? Then they obviously didn't do what I said.' A really funny example I remember is one year, one of the analysts was saying that in a best of 7 series the better team always wins, so someone asked him what happens when the underdog wins? The guy said that means that team was really the better team all along. Well, if you put it that way, there's literally no way you can be wrong. I remember the year the Pistons beat the Lakers with Shaq/Kobe/Malone/Payton and I think it was 7-1 odds if you bet on the Pistons to win, but of course every expert claimed they totally saw it coming and that the Pistons were the better team even though every one of those guys predicted the Lakers would win before the series started.
 #164478  by Replay
 Sat Dec 27, 2014 6:08 am
do people actually think the guys watching are dumb enough to fall for the 'yes and no and maybe' prediction.
You are probably used to intelligent, math-literate sports fans. But sports draws from all spectrums of literacy, from PhD to illiterate.

I have occasionally met (and sometimes not enjoyed the company of, for sure) people that I could estimate might have considered stabbing somebody who brought up mathematics and advanced statistics during a game, if you caught them on the wrong day. Remember that hooliganism exists in any body-based physical sport and most certainly basketball and football. Los Angeles fans burn shit down when people WIN. This is not always a collection of university students we're talking about.

Do not assume other sports fans all share your math training. There are people who enjoy sports who dislike READING and WRITING, Don. I am not judging them, just advocating you be aware of why broadcast stats are not always sophisticated.
 #164486  by Don
 Sat Dec 27, 2014 2:21 pm
I guess my problem is that I don't like people who are in between dumb/analytical when they're supposed to be paid to either entertain or enlighten. For example I wouldn't mind if a guy says he picks the Pistons to win because Rasheed Wallace is 10-0 when he guarantees his team is going to win. That's actually pretty funny. Or you can say historical data says soandso will probably rebound and XYZ is shooting way beyond what he should so this will regress toward the mean, and if it sounds reasonable and not purposely misleading that's cool too. Instead I see stuff like '85% of the teams that won game 1 went on to win a best of 7 series'. It's like yeah, nobody could've thought that winning the first game greatly increases the chance of winning the overall series. It's fake mathematics and it's not even funny. I'd rather read about how they ask some animal in Germany to predict the World Cup winner!
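
For a sense of how little the 'won game 1' stat says, here is a short Python sketch of what a pure coin-flip model gives. It assumes independent games with a fixed per-game win probability `p` for the team that took game 1, and the function name is made up for illustration.

```python
from math import comb


def series_win_prob_after_game_one(p: float) -> float:
    """Chance the game 1 winner takes a best-of-7.

    They need 3 more wins before the other side gets 4, which is the same
    as winning at least 3 of 6 notional remaining games.
    """
    return sum(comb(6, k) * p**k * (1 - p) ** (6 - k) for k in range(3, 7))


print(series_win_prob_after_game_one(0.5))
```

Even two dead-even teams put the game 1 winner at about 66% in this model, and the game 1 winner is more often the better team to begin with, so a high historical figure is what you would expect before doing any analysis at all.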
 #164489  by Replay
 Sat Dec 27, 2014 8:51 pm
Well, real math is hard. :) Fake math is easy and appeals to drunk people who want a false sense of control.