I've been planning to write about this for quite some time. By the end of April I'll finish my second semester of mathematical statistics. I've also studied some econometrics before, and I'm starting to "get it" (although I'll admit I still need to improve).
I would like to start a project of my own doing basketball data analysis. Before starting out, I plan to gather as much of the data available online as possible.
What I am planning to do:
- Create a database from data available online, to serve as input.
- Since I don't have the hardware (or the money) to share this data, I plan to share the scripts that build the database.
- Research the methodology of current analytic models.
- Do some modelling.
The tools I plan to use:
- python3 for fetching the data and generating the database
- postgresql for storing and managing the data
- and R for analysis.
Parsing tasks:
- parsing play-by-play data
- and parsing shot charts (I promised this to vjl)
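To make the "scripts that build the database" idea a bit more concrete, here is a minimal sketch of what one such script might look like. Nothing in it is settled: the play-by-play line format, the `events` table, and the names `parse_pbp_line` and `build_database` are all made up for illustration, the sample lines are invented, and sqlite3 stands in for postgresql only so that the example runs without a database server.

```python
import re
import sqlite3

# ASSUMPTION: a hypothetical play-by-play line such as
#   "Q1 11:42 BOS J. Smith makes 2-pt shot"
# Real sources will use different layouts; this pattern is illustrative only.
PBP_PATTERN = re.compile(
    r"^Q(?P<quarter>\d+)\s+(?P<clock>\d{1,2}:\d{2})\s+(?P<team>[A-Z]{3})\s+(?P<event>.+)$"
)


def parse_pbp_line(line):
    """Return a dict for one play-by-play line, or None if it does not match."""
    match = PBP_PATTERN.match(line.strip())
    if match is None:
        return None
    minutes, seconds = match.group("clock").split(":")
    return {
        "quarter": int(match.group("quarter")),
        # Store the game clock as seconds left in the quarter.
        "seconds_left": int(minutes) * 60 + int(seconds),
        "team": match.group("team"),
        "event": match.group("event"),
    }


def build_database(conn, lines):
    """Create the (hypothetical) events table and insert every parseable line.

    Returns the number of rows inserted; unparseable lines are skipped.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "quarter INTEGER, seconds_left INTEGER, team TEXT, event TEXT)"
    )
    rows = [row for row in map(parse_pbp_line, lines) if row is not None]
    conn.executemany(
        "INSERT INTO events VALUES (:quarter, :seconds_left, :team, :event)",
        rows,
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Invented sample input; sqlite3 in memory stands in for postgresql.
    sample = [
        "Q1 11:42 BOS J. Smith makes 2-pt shot",
        "not a play-by-play line",
        "Q4 0:03 LAL Timeout",
    ]
    connection = sqlite3.connect(":memory:")
    print(build_database(connection, sample), "rows inserted")
```

Sharing a script like this instead of the data itself means anyone can rebuild the database locally. Moving to postgresql would mostly mean swapping the connection object and the placeholder style for whatever the chosen driver expects.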
Anyone who's interested can reply here, or, if this forum has PMs, drop me one and I'll send my email address.
The project is a semi-long-term one; that is, I'd like to have some results by the end of August. I hope something nice will come out of it for all of us.
PS. (Only very loosely related) If you didn't already know: www.quandl.com is something of a must for anyone interested in data.