One of my goals this summer is to get better at Stata programming. Like most economists, I’m mostly self-taught. I’ve worked with different people who have different styles, and I’ve definitely noticed that some habits are helpful in writing code and some are not so helpful. I’ve also been picking up tricks that I should have learned years ago.
My goals are twofold. I want to be more organized and I want to be more efficient with my programming.
Two books I recommend:
1. A Gentle Introduction to Stata. This is actually a great intro-to-programming book that mostly covers the basics. (#2 likes it, and has it from the library.) But I’ve managed to pick up some good tricks from it (numlabel _all, add FTW).
2. The Workflow of Data Analysis Using Stata: this is a really great book for thinking about how to organize, comment, label, etc. etc. etc. your, um, Stata workflow. It says a lot of stuff I already knew but haven’t been acting on, and it puts it all together in a way that I hope to be able to act on.
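For anyone who hasn’t met the numlabel trick: it prefixes each value label with its numeric code, so tabulations show both. Here is a minimal sketch, assuming Stata’s built-in auto dataset as a stand-in for real data (the dataset choice is mine, not the book’s):

```stata
* Load Stata's shipped example data (a stand-in for your own file)
sysuse auto, clear

* Before: the tabulation shows only the label text
tabulate foreign

* Prefix every value label in memory with its numeric code
numlabel _all, add

* After: each category displays as its code followed by the label text
tabulate foreign
```

Running numlabel _all, remove undoes the prefixing if the numbers get in the way later.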
So what does this mean specifically? First, I’m being much better about commenting my code, particularly the part at the top that says what the purpose of the .do file is. I’m also getting better at consistent names. My previous system would produce, for example, multiple “Table 2”s every time the tables in the paper changed (#2 is shocked). Now I’m better about using names like Table_2_SOLE, which would be the version of Table 2 from when the paper was presented at SOLE, and I have better, more informative names for tables that aren’t official paper tables.
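A header along these lines is roughly what I have in mind. This is only a sketch: the file name, the version number, the regression, and the use of the built-in auto data are all placeholders I made up for illustration, and only the Table_2_SOLE naming comes from my actual setup.

```stata
* ---------------------------------------------------------------
* File:    table_2_sole.do        (hypothetical file name)
* Purpose: Produce Table 2 as presented at SOLE
* Inputs:  auto.dta               (built-in data, used as a stand-in)
* Outputs: Table_2_SOLE.log
* ---------------------------------------------------------------
version 14          // assumed version; set it to the one you actually run
clear all
set more off

* Keep the informative table name in one place so the log,
* and any later exports, all carry the same name
local table "Table_2_SOLE"

capture log close
log using "`table'.log", replace text

sysuse auto, clear
regress price mpg weight   // placeholder for the real Table 2 model

log close
```

Because the name lives in a single local, producing the next conference’s version means changing one line rather than hunting through the file.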
I’m also trying to do a better job of keeping my current files in one folder and moving out the older versions so I don’t accidentally use an older version after I’ve fixed a mistake. That exact slip recently caused me some embarrassment when a referee noticed it. I’m getting a bit better about dating files as well. #2 tends to change the name of the files to something like “data analysis project X OLD DO NOT USE” and “revised data set USE THIS ONE”. Also #2 uses dates, but I don’t find them as useful as they should be.
In terms of programming itself, I have two big goals. The first is to reach for loops automatically instead of reaching for cut/paste/replace automatically. I need more practice with them so I don’t have to look up the code each time. (I’m proud of myself for finally figuring out which of ` and ' to use where in the loops!) The second is to start using locals more, which is again something I tend to cut/paste/replace for when I really should have shorter and cleaner code that just changes the local.
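Here is the kind of thing I mean, as a minimal sketch. The variables come from Stata’s built-in auto dataset and the regressions are arbitrary; they just stand in for whatever I would otherwise have pasted twice. The quoting rule is that a local is opened with a backtick (`) and closed with a plain apostrophe (').

```stata
sysuse auto, clear

* A local holds the list of controls, so changing the
* specification means editing this one line
local controls "weight length"

* Loop over the outcomes instead of pasting the regression twice
foreach y of varlist price mpg {
    regress `y' foreign `controls'

    * Save the coefficient on foreign under a name built from the
    * loop variable: b_price, then b_mpg
    local b_`y' = _b[foreign]

    * Nested quotes: the inner `y' expands first, then the outer local
    display "Outcome `y': coefficient on foreign = " %9.3f `b_`y''
}
```

Adding a third outcome or another control is now a one-word change, which is the whole point of giving up cut/paste/replace.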
It’s a bit embarrassing that I’m just making these changes now, but as always, I remind myself of the lesson I learned in graduate school: later will be even more embarrassing, so given that the past is sunk, now is the best time to figure out something I should have figured out a long time ago. #2 adds, it’s never too late to improve your workflow and versioning. I’m trying to make mine better all the time.
Do you have any self-improvement things going on in your life?