I’m someone who keeps close track of my finances, and who likes having lots of data to dig through. Unfortunately, those two desires fall apart when it comes to my retirement. As a federal employee, I have the “Thrift Savings Plan” (TSP) for my funded retirement savings. The TSP is great in a lot of ways; for instance, its funds have expense ratios (0.027%) an order of magnitude lower than the cheapest funds elsewhere. It keeps those expense ratios low (along with the cost to the government) by limiting a lot of things you might find in other private retirement systems. For instance, we have only 5 funds to invest in, plus another 5 “lifecycle” funds that invest in those same 5 funds and adjust their allocations automatically.
One of the main things that bugs me about it, though, is that there’s no way to see the data. You get your quarterly statements to see your performance, but that’s about it. Unlike funds you can look up on Google Finance, you can’t see your TSP funds’ individual or combined performance over different periods. I can’t see how my retirement funds as a whole react to world news or economic cycles, or even compare them to my brokerage account or IRAs.
This is about all the information I can get, and that’s only from the lifecycle funds:
So, I set out to create a solution. The first thing I wanted was historical prices for the TSP funds so I could simply graph them. Luckily, the TSP publishes the “share prices” of its funds publicly on its website, and has done so since June 2003. This means I should be able to access the data pretty easily.
The TSP site only displays this data 30 days at a time, however, and I’m not about to page through nine years of it by hand. So I decided to build on the Python I’ve lately been dabbling in at work and write a script that gathers the data automatically.
Originally, I thought of using Scrapy, a Python library for web scraping (essentially what I’m doing). That would have been fine, except I ran into one problem: to access data further back than 30 days, I have to drive the page’s form controls. Those controls submit their data without using URL query strings, so there’s no URL I can simply modify to page back through history. That made Scrapy, at least on the surface, look like a bad fit. Since I’d have to work the form controls manually, I decided to use a combination of BeautifulSoup, html5lib, and urllib2.
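The basic idea can be sketched without the real site: build the form’s POST payload for one 30-day window, fetch the page, and pull (date, price) pairs out of the returned table. The field names (`startdate`, `enddate`) and table layout below are my assumptions for illustration, not the TSP site’s actual markup; the real script reads them from the page source, and uses BeautifulSoup/html5lib rather than the standard library parser shown here.

```python
# Hedged sketch of the scraping loop's two halves: building the form's
# POST data for a 30-day window, and parsing (date, price) rows out of
# the HTML that comes back. Field names and table layout are assumed.
from html.parser import HTMLParser
import urllib.parse


class SharePriceParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self._row = []
        self.rows = []  # list of (date, price, ...) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
        elif tag == "tr":
            self._row = []

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(tuple(self._row))

    def handle_data(self, data):
        if self._in_cell and data.strip():
            self._row.append(data.strip())


def form_payload(start, end):
    # Hypothetical field names -- the real form's names have to be read
    # from the page before POSTing (e.g. with urllib.request.urlopen).
    return urllib.parse.urlencode({"startdate": start, "enddate": end})


# Parse a stand-in for one 30-day page of results:
sample = """
<table>
  <tr><td>Jun 2 2003</td><td>10.00</td></tr>
  <tr><td>Jun 3 2003</td><td>10.04</td></tr>
</table>
"""
parser = SharePriceParser()
parser.feed(sample)
print(parser.rows)  # [('Jun 2 2003', '10.00'), ('Jun 3 2003', '10.04')]
```

The full script would just repeat this: step the start/end dates back 30 days at a time, POST each payload, and accumulate the parsed rows until it reaches June 2003.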
Since I didn’t find another product or service like this, I decided it would be a good idea to host it as open source. In the worst case, I develop it alone and it helps whoever stumbles upon it; in the best case, someone joins in and helps me build it into a better service. After some research, GitHub seemed like the perfect place to host the code. You can get the source code here: https://github.com/elaske/tsp-data
If anybody can think of a feature that would be a useful addition, a comment here or a comment/issue on GitHub is the perfect forum.