Blob, I see your frustrations. I hadn't really thought of considering that level of accuracy (the enemy of best being better), but yes, recording the additional points accrued in qualifying and then potentially lost again is a significant challenge. Your capacity to extract data in an automated fashion is a skill I don't have, so it's not something I had thought through. I do think a prospective calculation of the difficulty of a tournament based on the draw (or even the acceptance list) and the ranks of seeded players is doable and interesting.
So, taking the Canberra 60k last week, the degree of difficulty to score equivalent points, capped at the maximum achievable at a 25K, would generate a score of:
1.187 for last week's 60k (0.47 + 0.2 + 0.185 + 0.09 + 0.083 + 0.080 + 0.042)
0.71 for this week's 25k (0.277 + 0.156 + 0.072 + 0.077 + 0.035 + 0.034 + 0.031 + 0.028)
1.138 for this week's 25k at Santa Margherita Di Pula (0.48 + 0.242 + 0.111 + 0.106 + 0.052 + 0.051 + 0.048 + 0.048), a potentially very tough tournament with a number 1 seed ranked 104.
Indeed, winning the last match there to convert 30 to 50 points is tougher than converting 29 to 48 at the Canberra 60k. In contrast, that conversion at this week's Aussie 25k is a comparative piece of cake.
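For anyone wanting to reproduce this sort of score, here is a rough Python sketch. It assumes each term is the capped winner's points (50 at a 25K, as in the "30 to 50" conversion) divided by the seed's ranking. That is my reading of the figures rather than a stated formula, though 50 / 104 does come out at the 0.48 quoted for the Santa Margherita Di Pula top seed. The function names are made up for illustration.

```python
MAX_POINTS_25K = 50  # assumed cap: winner's points at a 25K


def seed_term(rank, cap=MAX_POINTS_25K):
    """Assumed per-seed term: capped points divided by the seed's ranking."""
    return cap / rank


def difficulty(seed_ranks, cap=MAX_POINTS_25K):
    """Sum the per-seed terms for all seeds in a draw."""
    return sum(seed_term(rank, cap) for rank in seed_ranks)


# The only seed ranking quoted in the post: the Santa Margherita top seed at 104.
top_seed = seed_term(104)

# Per-seed terms quoted for Santa Margherita Di Pula, summed as a check.
pula_terms = [0.48, 0.242, 0.111, 0.106, 0.052, 0.051, 0.048, 0.048]
pula_total = round(sum(pula_terms), 3)

print(round(top_seed, 2), pula_total)
```

Given a full list of seed rankings, `difficulty(ranks)` would give the event score under the same assumption.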
-- Edited by Oakland2002 on Sunday 25th of March 2018 02:08:56 PM
All European ratings are based on a set formula depending on the ranking of the person you beat, and some countries have negative points if you lose to someone ranked worse than you. The rankings are done in ranges so, effectively, if you are ranked 400 and you beat someone ranked 300-400 you get a times one bonus. If you beat someone ranked 200-300 you get a times two bonus, and 150-200 a times three. (And it works in reverse for the negatives.) I think a bonus system adds a nice touch while not messing up the basic structure.
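As a minimal sketch of that banding, in Python: the three bands are the ones quoted above for a player ranked 400, and everything else (the names, how a rank sitting exactly on a boundary is treated, what happens outside the quoted bands) is an assumption on my part, not how any federation actually implements it.

```python
# Bands quoted in the post for a player ranked 400:
# (best rank in band, worst rank in band, win multiplier).
BANDS_FOR_RANK_400 = [
    (150, 200, 3),
    (200, 300, 2),
    (300, 400, 1),
]


def bonus_multiplier(opponent_rank, bands=BANDS_FOR_RANK_400):
    """Return the win multiplier for beating an opponent of the given rank.

    Assumption: a rank on a shared boundary (e.g. 300) falls in the better
    band, because bands are checked from best to worst. Returns None when
    the rank is outside every quoted band, since the post does not say
    what applies there.
    """
    for best, worst, multiplier in bands:
        if best <= opponent_rank <= worst:
            return multiplier
    return None


for rank in (350, 250, 175):
    print(rank, bonus_multiplier(rank))
```

The negative side would presumably be a mirror-image table of bands below the player's own ranking, but the post gives no figures for it, so it is left out here.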
I couldn't figure out a way to chart-ify this sort of table.
It also seemed weird to have Jo last, given that a small incremental change at the top end has a massive impact. I tried to normalise this over some sort of curve function, but couldn't do it because my math only goes about as far as Sesame Street teaches you.
I also can't work out, in this instance and all others, how to deal with UNR players. I give them 9999, which works for handling them, but it extends the scales out a long way and makes proportional calculations like this very misleading. So, generally, I have just excluded them out of ease and laziness, but they should be counted somehow.
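To show how much the 9999 placeholder distorts things, here is a small Python sketch. The draw is entirely made up (four ranked players and two UNR), and the variable names are just for illustration. One possible middle ground, shown at the end, is to keep the averages on ranked players only but report the share of UNR players alongside.

```python
from statistics import mean, median

UNR_SENTINEL = 9999  # the placeholder ranking mentioned in the post

# Hypothetical draw: four ranked players and two unranked (None).
draw = [250, 310, 420, 600, None, None]

coded = [UNR_SENTINEL if rank is None else rank for rank in draw]
ranked_only = [rank for rank in draw if rank is not None]

mean_with_sentinel = mean(coded)         # dragged far out by the two 9999s
mean_ranked_only = mean(ranked_only)     # ignores UNR players entirely
median_with_sentinel = median(coded)     # far less sensitive to the sentinel
unr_share = draw.count(None) / len(draw) # keeps UNR players visible as a separate figure

print(mean_with_sentinel, mean_ranked_only, median_with_sentinel, unr_share)
```

The median moves much less than the mean when the sentinel is included, which is one argument for medians in this sort of comparison.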
I'd like to do it on a rolling 52 week basis.
__________________
Data I post, opinions I offer, 'facts' I assert, are almost certainly all stupidly wrong.
I'll look at the additional quantitative scoring system over the coming days.
I'm still trying to get the National Strength ranking factored by age of player working
One way to cope with new entries would be to set an initial ranking equal to one more than the lowest ranking in the WTA table.
But the number of ranked players varies considerably over time; there is a general long term trend towards an increasing number of ranked players. Would this not provide a false impression that beating an UNR coded as WR[last + 1] at one point in time was somehow equivalent to beating an actual ranked player of that rank 6 months down the line? If so, is that equivalency actually real? For example, at the extreme, a 14yo local WC compared to a player that demonstrably at least earned three points in separate events. I'm not sure.
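A tiny Python sketch of the "last + 1" coding and the drift problem. The table sizes are hypothetical, since the real counts at any given date aren't in this thread, and the function name is invented for the example.

```python
def code_unranked(table_size):
    """Rank assigned to an UNR player: one more than the lowest ranking in the table."""
    return table_size + 1


# Hypothetical table sizes at two dates (not real WTA figures).
size_earlier = 1250
size_later = 1300

unr_earlier = code_unranked(size_earlier)
unr_later = code_unranked(size_later)

# If the table has grown, a genuinely ranked player can later hold the
# same number that an UNR player was coded as at the earlier date.
collides = unr_earlier <= size_later

print(unr_earlier, unr_later, collides)
```

Under these made-up numbers the earlier UNR code falls inside the later ranked range, which is exactly the false equivalence in question.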
Also, I think when the new tour is introduced next year, the likelihood seems that the total number of ranked players will go down (considerably?). This seems to create a lot of players as UNR with a great deal of variation between them. It's an additional complication.
I can certainly appreciate the simplicity and elegance of this solution. As long as everyone knew that was where the line was drawn it would be fine enough. But, that is not the way tennis stats tend to work in my experience. We want to deconstruct the rankings of every player (been injured, ill, otherwise not playing, junior, college, only been playing soft events etc - always a reason why a ranking isn't really apparently a 'true' ranking).
I wish I were smarter, and less prone to over-thinking and prevarication; just make the decision and plough on.
Half-Empty.
You would need to use VBA to go through each row and apply a specific modifier to each rankings band. Have VBA do the hard work and automatically generate the graph or chart
Here is the chief reason why I think a metric of value predicated on seed rankings alone is problematic: Seeds don't win a third of tournaments. The scope for this table is every ITF event played in calendar year 2017:
And here's how the regions compare to each other per level
So, ignoring the non-seed part of the draw leaves out an important factor determining the strength of the event. This is why, in the weekly field strength comparisons, I've started plotting the median ranks of seeds v non-seeds on scatter plots to show the relative placement of each event compared to its peers at its level. The intersection of seeds and non-seeds is the useful marker.
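For anyone who prefers code to spreadsheets, the seeds v non-seeds marker could be sketched in Python along these lines. The two events and all their rankings are invented for illustration, the function name is hypothetical, and only ranked players are included (UNR players would need a decision of their own).

```python
from statistics import median


def field_marker(seed_ranks, non_seed_ranks):
    """Return (median seed rank, median non-seed rank) for one event."""
    return median(seed_ranks), median(non_seed_ranks)


# Hypothetical events at the same level; the rankings are made up.
events = {
    "Event A": ([104, 210, 450, 470], [600, 750, 900, 1100]),
    "Event B": ([180, 320, 700, 650], [500, 520, 610, 640]),
}

# One (x, y) point per event, ready to go on a scatter plot.
markers = {
    name: field_marker(seeds, others)
    for name, (seeds, others) in events.items()
}

for name, point in markers.items():
    print(name, point)
```

In this made-up case Event A has the stronger seeds but Event B has the stronger rest of the draw, which a seeds-only score would miss.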
The other method certainly shows a facet of it, along a consistent scale. But, is it the right scale?
You would need to use VBA to go through each row and apply a specific modifier to each rankings band. Have VBA do the hard work and automatically generate the graph or chart
If I knew VBA... My charts are built in such a way that they update automatically without any VBA or macros. I use dynamic ranges linked to tables. As soon as I paste in new data, the formulas update, and any Pivot Tables or charts linked to them also update. E.g. paste in the new weekly rankings, and the 'Strongest Nation' table is ready 2 seconds later, after it re-calculates to include the new data added to the table.
We're winning more than half the matches we play. We did this up to this point in 2017, but through the clay and grass seasons, and to the end of the year, we fell well behind. Can we maintain it this year?
No GB Rank updates in the second week of Miami, as the rankings haven't changed.
Raw numbers don't compare to last year well, but that's due to the extra GB event in 2017.
In terms of productivity per result/entry though, we're well up.
When there's a larger number of usable data points, I might break this down by country too, but I suspect country variation will not be as pronounced as regional variation.