
How (and Why) I Came to Use R for Data Analysis and Evaluation

I've been using R for data analysis and evaluation since 2014. (It's my 10 yeaR anniveRsaRy!) Back then, I was in graduate school and needed a job, so I applied for a graduate teaching assistant position for an introductory statistics course in the Psychology department. During the interview, the instructor asked if I knew how to use R, and I told him I had no idea what he was talking about. He looked at me and said, “If you want the job, you'll need to learn R. That's what I use to teach this course. There's a learning curve, but as long as you follow the code examples in the textbook, you'll be alright.” I needed the job, so I said yes, and he handed me a hard copy of the OpenIntro Statistics book.

I had no idea what I was getting myself into or how much R would change my life.

My job was to assist during the lecture portion of the course and to lead the lab instruction. The lab gave students the opportunity to practice statistical concepts using R. I was usually a week ahead of the students, sometimes working through errors in my code and getting the lab examples to work the night before I taught the lab. It was frustrating and difficult, and I often wondered, as our students did, why we were not using something easier, like SPSS or even Excel.

But I stuck with it and came to realize how much more powerful and useful R was compared to other statistical software. As the semester went on, things got easier, students got the hang of it, and by the end we were fitting linear regression models in R and interpreting their results.

My job ended at the conclusion of the semester, but I continued using R in my graduate courses and in subsequent jobs. I used it to analyze data for my graduate capstone project and market research information for various businesses opening in the RGV. I also used it to clean and prepare data extracted from a grants management system. I created reports using R Markdown. I used R to clean and prepare data to load into Tableau. And I used R to access, clean, analyze, and visualize SurveyMonkey results.

Why R and not something else?

For me, it comes down to four reasons (though there are many more):

  1. Reproducibility: R is designed with reproducibility in mind. Scripts and code can be easily shared and rerun, ensuring that analyses can be reproduced by you or someone else. I often reuse code when I'm beginning a new project that requires a similar approach to a previous analysis. All of this is crucial for collaboration and efficiency in professional settings.

  2. Efficient Processes: R offers a wide range of tools and packages that streamline data analysis workflows. This makes data manipulation, cleaning, and analysis faster and more efficient than manual processes or less specialized tools. It also reduces human error. How many times have you had to restart an analysis because new data came in? Or how many times have you had to copy and paste an updated chart or graph into a report because the data changed? With R, you rerun the script instead.

  3. Clarity and Transparency: R allows you to document your code and workflow in a way that makes it easy to understand months or even years later. The ability to annotate scripts and create detailed markdown documents means you can always trace your steps and understand the logic behind your code or the code of one of your teammates. This happens frequently. You work on an analysis, submit the report, and move on to the next project. Months later, you need to revisit that same report. With documented code, it's easy for me and others to understand how and why I did things in the past.

  4. Extensive Package Ecosystem: The Comprehensive R Archive Network (CRAN) hosts thousands of packages that extend R’s capabilities. Whether you need to clean messy data, perform complex statistical analyses, create stunning visualizations, or connect directly to data sources like SurveyMonkey, there is likely a package available to help. Think of R as your cell phone, and packages as phone applications. If you can dream it, there's probably a way to do it in R.
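To make the first two reasons concrete, here is a minimal sketch in base R of an analysis that can be rerun when new data arrives. The data, the column names (`program`, `score`), and the function name `summarize_scores` are all made up for illustration; they are not from a real project.

```r
# Hypothetical survey data; the column names are illustrative only.
responses <- data.frame(
  program = c("A", "A", "B", "B"),
  score   = c(4, 5, 3, 4)
)

# One function holds the whole analysis, so rerunning it is a single call.
summarize_scores <- function(data) {
  # Average score per program, returned as a data frame
  result <- aggregate(score ~ program, data = data, FUN = mean)
  names(result)[names(result) == "score"] <- "mean_score"
  result
}

summary_v1 <- summarize_scores(responses)

# New data arrives: append it and rerun the same code, no manual steps.
new_rows <- data.frame(program = c("B", "C"), score = c(5, 2))
summary_v2 <- summarize_scores(rbind(responses, new_rows))

print(summary_v2)
```

Because the logic lives in one documented function, anyone on the team can rerun it on updated data and get the summary the same way every time.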

R will radically change the way you work. It will save you time, money (yes, R is free!), and headaches.

Interested in Learning R?

I didn't magically learn how to do all this. The best thing about using R is the R community. Every day, hundreds of R users share their use cases and sample code; publish blogs, articles, and books; and help one another when someone has a question or gets an error in their code. There are thousands of videos on YouTube with R users sharing examples of how they use R in their own work. Major conferences across various industries usually include presenters sharing R applications within their domains.

I owe my skills and knowledge to so many who have shared their own knowledge with the world. Aside from the OpenIntro Statistics book I mentioned, I refer to the following resources regularly. If you're interested in learning R or just want to know more about what R is and what it can do, I recommend starting with these resources:

  • ModernDive: Statistical Inference via Data Science. This is such a great introduction to the data analysis cycle and gets you up and running analyzing and visualizing your own data. You'll be hooked before you know it.

  • R for the Rest of Us. If there is one resource that showed me I could use R for anything, and that I didn't need to have a quantitative / statistical background to apply R in my work, it is this one. R for the Rest of Us creates some of the most engaging content and courses that will help you get started in your R journey. From introductory courses to more advanced courses on data cleaning, mapping, and dashboarding with R, David and his team at R for the Rest of Us will have you wondering why you didn't convert to R sooner!

  • The Data Visualization Academy by Stephanie Evergreen. Ok, so if you want to create stunning charts, graphs, and reports in R, this is the place to learn. Stephanie's Data Visualization Academy walks you step by step through how to create more than 60 different chart types. Thanks to Stephanie, I've received compliments on my data visualization skills and learned how best to present data to decision makers. You'll be a data viz rockstar, using R, before you know it!

If you're looking for something a bit more advanced, take a look at these options:

  • Text Mining with R. If you work with massive amounts of qualitative text data, this is an excellent resource to get you started making sense of some of that data. I work a lot with surveys, and surveys typically include open-ended responses. The tidy approach to text mining has been super helpful when I'm looking for common themes across survey responses. You'll even learn how to conduct sentiment analysis on your text data. Will you still have to read through these responses? Absolutely! But this approach can get you started with your theme generation in a matter of minutes, which can then help you dig deeper into your text data.

  • Analyzing US Census Data. Have you ever had to pull Census data for a grant application or needs assessment? Navigating the Census website can be a nightmare. This book teaches you how to programmatically access, analyze, and map U.S. Census data. The book also covers a few more advanced topics, such as spatial data analysis and geographic data modeling. Long gone are the days when you downloaded dozens (or hundreds) of individual tables from the Census website.

  • Jesse Lecy and Andrew Heiss. Jesse Lecy is the Senior Data Science and Research Associate at the Urban Institute and the Director of the National Center for Charitable Statistics (NCCS). He is also an Associate Professor in Nonprofit Studies and Data Science at Arizona State University and the founding director of the M.S. in Program Evaluation and Data Analytics program. His website offers all sorts of resources related to data science, along with data repositories for the public and social sector. Andrew Heiss is an Assistant Professor in the Department of Public Management and Policy at the Andrew Young School of Public Policy Studies at Georgia State University. He teaches a variety of courses on program evaluation, statistics, data science, and data visualization. Both of their websites provide access to various course materials. If you want to learn how to use R for program evaluation, these are two of the best resources out there.
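To give a feel for the theme-finding idea mentioned under Text Mining with R, here is a minimal sketch that breaks open-ended responses into one word per row and counts them. The book works with the tidytext package; this sketch swaps in base R so it runs without installing anything. The responses and the tiny stop word list are made up for illustration.

```r
# Hypothetical open-ended survey responses (made up for illustration).
responses <- c(
  "The staff were helpful and the training was helpful too",
  "Training was too short",
  "Helpful staff, but the application was confusing"
)

# A tiny, hand-picked stop word list; real projects use a fuller one.
stop_words <- c("the", "and", "was", "were", "too", "but")

# Lowercase, then split on anything that is not a letter or apostrophe.
tokens <- unlist(strsplit(tolower(responses), "[^a-z']+"))
tokens <- tokens[tokens != "" & !(tokens %in% stop_words)]

# One row per word with its frequency, most common first.
counts <- as.data.frame(table(word = tokens), stringsAsFactors = FALSE)
counts <- counts[order(-counts$Freq, counts$word), ]

print(counts)
```

The most frequent words are only a starting point for themes; you still need to read the responses to understand the context behind them.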

I don't consider myself a programmer or data analyst. My educational background is in history, political science, public administration and public policy, and evaluation. I am an evaluator, specifically within the philanthropic and nonprofit world, who happens to use R as the primary tool for data management, cleaning, analysis, visualization, and reporting. R has allowed me to work smarter, faster, and more efficiently, and to focus on creating and sharing knowledge and learnings with my peers.

If you're ready to embark on this transformative journey, feel free to reach out, or explore some of the resources I shared above. I'll be your biggest champion!

Fun fact: this website and blog were created using R! So, add website creation to the list of things you can do with R.