I appreciate organization in most aspects of my life, and my digital footprint is no exception. I’m always looking for ways to optimize my workflow, so I’ll try to keep this post updated. I use git to organize my work files, including text documents, presentations, and code. Git lets me work across multiple platforms and control my versioning with decent precision. Because I work with sensitive data files, I always have git ignore data files (see gist).
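As a rough illustration of having git ignore data files, here is a minimal sketch. The file extensions and the file name patients.dta are assumptions chosen for a Stata and R workflow; they are not copied from the gist, so adjust them to your own project. It runs in a scratch directory so it does not touch a real repository.

```shell
# Work in a throwaway directory so nothing real is modified
repo=$(mktemp -d)
cd "$repo"

# Write a .gitignore listing data file patterns (assumed extensions, not the gist's)
cat > .gitignore <<'EOF'
# Data files stay out of version control
*.dta
*.csv
*.rds
*.RData
EOF

# If git is installed, confirm that a hypothetical data file is ignored
if command -v git >/dev/null 2>&1; then
  git init -q .
  touch patients.dta
  # Prints the .gitignore rule that matches the file
  git check-ignore -v patients.dta
fi
```

Ignoring by extension keeps the rule independent of folder layout, so a data file saved anywhere in the project stays out of the repository.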
I recently posted on submitting Stata jobs to a Linux computing cluster running the Oracle Grid Engine. Here’s a quick post on how to submit an R job. I usually submit a qsub job by typing qsub Scripts/NAME_OF_SCRIPT into the terminal. My R scripts use the following naming convention: R<PROJECT>_v<NUM>.sh or R_018v1.sh for an R bash file to run the v1 R script in the 018 project (I explain my project organization in another post).
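To make the submission step concrete, here is a minimal sketch of what such a bash file and its qsub call might look like. The R file name Scripts/R_018v1.R, the job name, and the choice of Grid Engine options are assumptions for illustration; the real script may differ. The sketch writes into a scratch directory and only calls qsub if it is available.

```shell
# Work in a throwaway directory that mimics a project with a Scripts folder
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Scripts

# Write the bash submission file for the v1 R script of project 018
cat > Scripts/R_018v1.sh <<'EOF'
#!/bin/bash
# Grid Engine options: run from the submission directory, name the job
#$ -cwd
#$ -N R_018v1
# Run the R script (hypothetical file name)
Rscript Scripts/R_018v1.R
EOF

# Submit on a cluster; elsewhere, just show the command that would run
if command -v qsub >/dev/null 2>&1; then
  qsub Scripts/R_018v1.sh
else
  echo "qsub not found; on the cluster this would be: qsub Scripts/R_018v1.sh"
fi
```

Lines starting with #$ are read by the Grid Engine as submission options, so the same settings apply every time the file is submitted.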
Some statistical jobs are too memory-greedy or computationally intensive to run on a local machine. At the Johns Hopkins Medical Institutions (JHMI), researchers have access to a Linux cluster running an Oracle Grid Engine (previously called the Sun Grid Engine). Jobs on the Joint HPC Exchange (JHPCE) can be run interactively with the qrsh command or through a qsub bash submission. JHPCE also has Stata-MP installed, which is another reason I use it for larger jobs.
“You might not think that programmers are artists, but programming is an extremely creative profession. It’s logic-based creativity.” - John Romero

I hope to use this space to post code snippets to make my life and the lives of other statistical programmers easier. As an epidemiologist, I use data from national registries and clinical trials to answer interesting questions in public health, surgery, and medicine. I code mostly in Stata and R, but I’m always interested in optimizing my workflow and learning new techniques.