Yes, I can see it's not a trivial exercise to add this extra nuance to each of the series in each of the GB rank group plots. First there is the capturing of direction of travel in the WTA ranking from week to week for each player. Then the use of that variable to modify (potentially) each data point icon in the plot series for a player, within the overall GB rank group comparison dataset.
I started tinkering a bit with python/matplotlib, and while it's possible to change icon colour within a plot series, icon types are more tricky. I saw a solution where the different icon types are plotted as separate groups, but then the player data series are lost (I think). I wonder whether a simple solution would be just to overplot both down/up series of 'icon modifier' symbols for the entire GB rank group as one, but then this might be like trying to combine two datasets into one graph ...
I can see this is going beyond the original intention of the chart in question, and hear what you say about the chart's long-term upkeep.
Is the WTA ranking dataset freely available? It'd be nice to keep pondering the above with the actual data, rather than trying to create a mock-up dataset.
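For what it's worth, here is a minimal sketch of one way the two ideas might be combined: draw each player's line as its own series (so the player series is not lost), then overplot the up/down/flat markers on top in the same colour. The player names and ranks in the example are made up, and `direction_of_travel` and `plot_rank_group` are hypothetical helper names, not anything from the original charts.

```python
MARKERS = {"up": "^", "down": "v", "flat": "o"}  # matplotlib marker codes


def direction_of_travel(ranks):
    """Label each weekly rank 'up', 'down' or 'flat' versus the previous week.

    A lower rank number is better, so a falling number counts as 'up'.
    The first week has nothing to compare against, so it is labelled 'flat'.
    """
    if not ranks:
        return []
    labels = ["flat"]
    for prev, curr in zip(ranks, ranks[1:]):
        if curr < prev:
            labels.append("up")
        elif curr > prev:
            labels.append("down")
        else:
            labels.append("flat")
    return labels


def plot_rank_group(series, outfile="rank_group.png"):
    """Plot one line per player, with direction markers overplotted."""
    import matplotlib
    matplotlib.use("Agg")  # draw to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for player, ranks in series.items():
        weeks = list(range(len(ranks)))
        # The line keeps the player series together.
        (line,) = ax.plot(weeks, ranks, linewidth=1, label=player)
        labels = direction_of_travel(ranks)
        # One scatter call per marker type, reusing the line's colour.
        for direction, marker in MARKERS.items():
            xs = [w for w, d in zip(weeks, labels) if d == direction]
            ys = [r for r, d in zip(ranks, labels) if d == direction]
            ax.scatter(xs, ys, marker=marker, color=line.get_color(), zorder=3)
    ax.invert_yaxis()  # rank 1 at the top
    ax.set_xlabel("Week")
    ax.set_ylabel("GB rank")
    ax.legend()
    fig.savefig(outfile)
    plt.close(fig)


if __name__ == "__main__":
    # Made-up ranks purely for illustration.
    plot_rank_group({"Player A": [4, 4, 3, 5, 2], "Player B": [10, 8, 8, 6, 7]})
```

The design choice here is that the marker type is decided per point but the colour is taken from the player's line, so the direction cue and the player identity stay independent of each other.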
First, here is the data I used to generate the initial viz - any other data was generated in-engine in Tableau from this raw set, which covers GB women 2017-2018 YTD: https://pastebin.com/4mYDhkZ6 That should be in a csv format directly usable in other programs. Let me know if there are any problems, and I'll upload an actual csv file somewhere rather than a paste.
At risk of having this become a charting theory thread - and so the last post of this type on the thread - here is another run at my multi-colour chart that, in light of people's critiques about being unable to clearly distinguish similar shades, ignores colour theory altogether and just picks colours individually one by one until they 'look right', or at least, distinct throughout.
Next chart, if there is ever another chart, will try to be both tennis related, new, and 'interesting', or 'valuable'...
AliBlahBlah, thanks for the 'GB women 2017-2018 YTD' csv format dataset, and for pointing me in the direction of Jeff Sackmann's GitHub data repo.
Apologies for temporarily lapsing into Martian speak earlier; I'll try to keep my tennis hat on in future.
Having said that, I do like that multi-colour chart of yours. Stacking all the GB rank groups one on top of the other like that is very informative. Just staring at the player progression across the rank groups, it's easy to imagine the hard work, the elation, the disappointment, and the all-out fight that the players are putting in over the course of their careers, as reflected in their WTA ranking. I guess if GB had a top clay court specialist, it might even be possible to discern the effect of the clay court season in the chart.
Yes, much better ABB... Now one can track the rise and fall of players since Jan 17. For instance, I hadn't realised that Tara was as high as GB #4 then. And Katie B was only GB #10 then, and now, just 20 months later, is GB #2. Gabi wasn't even in the top 10 then either. Nor was Katie S, though she first appeared in the top 10 before Gabi... So great, the colours really do work now and allow one to track players through.
But if you had red/green colour blindness you may struggle with some parts of the chart.
Yes
This is not the only article I read about it - only ten shades in the colour-blind approved palette.
Another source for colour research is that of Cynthia Brewer. The Wikipedia page on her https://en.wikipedia.org/wiki/Cynthia_Brewer is well worth reading and there are extensive links to her work and related themes.
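As a rough illustration of working within a small colour-blind-friendly palette, here is a minimal sketch that gives each player one fixed colour and refuses to hand out more colours than the palette has. It assumes the eight-colour Okabe-Ito palette, which is not necessarily the ten-shade palette mentioned above, and `assign_colours` is a hypothetical helper name.

```python
# Okabe-Ito palette (eight colours), swapped in here as an assumption;
# it is not necessarily the ten-shade palette referred to in the thread.
OKABE_ITO = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
    "#000000",  # black
]


def assign_colours(players, palette=OKABE_ITO):
    """Map each player to one palette colour, in the order given.

    Raises ValueError when there are more players than colours, rather
    than silently reusing a shade for two different players.
    """
    players = list(players)
    if len(players) > len(palette):
        raise ValueError(
            f"{len(players)} players but only {len(palette)} colours available"
        )
    return dict(zip(players, palette))
```

Keeping the mapping in one place means a player keeps the same colour in every rank group, which is what makes tracking them across the chart possible.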
My first exposure to colour theory came from maps, specifically geological maps that plot mineral deposits in section. Modern charts from massive data sets are designed very much like maps, and especially so given how much geographical and GIS data is used to plot physical and political geography.
ABB, I thought I would share my first beginner's attempt at charting some of the ranking data (Jeff Sackmann's dataset) after having loaded it into a PostgreSQL database, and using python/matplotlib to do the data extraction and plotting.
This chart shows Katie Boulter's rise up the WTA rankings ladder (though it's not completely up to date?). I'm thinking the plateaued sections are the close season/clay court season combined?
That's nice, it all looks to match up with what I get from the same data set. Crosses remind me of error bars though, so I'd use dots or a broken line. I'd also make the reference lines a bit lighter to allow the data points to stand out more.
To attempt to look at the plateaus you mentioned, I've overlaid the clay season for each year that seemed relevant, to more readily identify the movement during that period and after it. The effect seems stronger on Gabi's chart, so I've plotted her on the same chart for a comparison. The thing I love about Tableau is that once you get the mechanics down, adding data is easy. For example, though I'm only showing two players here, with a single user filter I can now enable the choice of any player in the entire data set to be plotted on the same chart in the same fashion. Want to look at Jo Durie v. Annabel Croft? Just select their names, and it overlays that data, and you can tailor it to your line of enquiry. It works for all nations, too, and you can easily add nation select as another Dimension in another filter. As I have no programmatic skill, and doing this sort of thing in Excel is a notoriously thankless task, this sort of functionality is a godsend. The bad thing is that getting the mechanics in place is often difficult, and there are no debugging aids.
Anyway, here's Gabi & Katie vs. the clay.
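For anyone wanting to try the same overlay in python/matplotlib, a minimal sketch might look like the one below. It shades a clay season band for each year behind the ranking lines, and also uses dots on a broken line with lighter reference lines, as suggested above. The clay season dates (1 April to 10 June) are an assumption for illustration only, not the actual tour calendar, and `clay_season_spans` and `plot_with_clay` are hypothetical helper names.

```python
from datetime import date


def clay_season_spans(years, start=(4, 1), end=(6, 10)):
    """Return one (start_date, end_date) pair per year.

    The default month/day boundaries are assumed, not taken from the
    real calendar; pass the true dates for each year if you have them.
    """
    return [(date(y, *start), date(y, *end)) for y in years]


def plot_with_clay(rankings, years, outfile="rank_vs_clay.png"):
    """Plot ranking histories with the clay season shaded behind them.

    `rankings` maps a player name to a list of (date, rank) pairs.
    """
    import matplotlib
    matplotlib.use("Agg")  # draw to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for player, points in rankings.items():
        dates = [d for d, _ in points]
        ranks = [r for _, r in points]
        # Dots on a broken line rather than crosses.
        ax.plot(dates, ranks, marker=".", linestyle="--", label=player)
    for i, (season_start, season_end) in enumerate(clay_season_spans(years)):
        # Label only the first band so the legend has a single entry.
        label = "Clay season (assumed dates)" if i == 0 else None
        ax.axvspan(season_start, season_end, color="peru", alpha=0.2, label=label)
    ax.invert_yaxis()       # rank 1 at the top
    ax.grid(color="0.85")   # light grey reference lines
    ax.set_ylabel("WTA rank")
    ax.legend()
    fig.savefig(outfile)
    plt.close(fig)
```

Adding another player is then just another entry in the `rankings` dictionary, which is roughly the programmatic equivalent of the Tableau user filter described above.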
What might be instructive is: for any player, plot their ranking history, and for each point, show from which surface(s) (in some cases more than one event will contribute to the week's points gain/loss) the points were earned or lost. Then try to use that to show... something. It's the sort of thing I think is interesting, until I try it and discover it's worthless, or, at least, no more useful than just showing the match win % by surface for any given player/year, in a table.
Your programmatic approach should allow you much more flexibility to do creative things. I know there are things I'd like to try, but the tools I use don't have the functions. I understand that Matplotlib can be tailored to most anything, given enough time.
Any comments welcome
Hi foobarbaz, yes this is great to see... By the way, once you've uploaded the image, you will find you have a next set of choices to insert or remove the image, if you click insert, then it will insert onto the page as ABB is doing.
As for yours ABB, Tableau looks like a wonderful programme to use, and something I am very ignorant about, but if it enables you to select and display from Excel and other data easily then yes, I would be very happy to have the ability to use it!
-- Edited by Michael D on Sunday 19th of August 2018 07:51:53 AM
AliBlahBlah, and Michael D, thanks for your very helpful replies
Michael D, I hadn't noticed the option to insert the image properly into the post, I will definitely look out for that next time. I'm a bit reluctant to post too many charts while I'm still new to this, I certainly don't want to dilute the quality products already here in this section.
AliBlahBlah, I will definitely look to make the improvements you suggest to the plot/series/marker styles, and I can see your chart has a more polished appearance (the control of axis tick labels, legend, etc). It was a rapid turnaround for you to re-make the chart, extending it with multiple players, adding the clay court season bars, and also write both suggestions for improvement and ideas for further study. I don't know how you did it; as you say, Tableau must be a very handy application that, once set up, enables a person to focus more on the data analysis and its presentation, and less on the charting mechanics. I'll carry on loading the match data, and see if I can locate the more recent ranking data too. It's fun, this data analysis stuff. Not as much fun as playing racquet sports (tennis, badminton, etc), mind you.
It's interesting that the chart shows that both players had to endure a significant dip in their progress, before emerging stronger than before. A sportsperson's career, while it can be exciting, is not without potential pitfalls.
ABB, I finally managed to put 2 and 2 together to make 4, ... and now realise that's why you put the recent WTA rankings data in the pastebin file link in your initial post. Thanks for making it available, I saw that the attribution in your chart included the WTA too.