2020 & 2021: Factbook Automation Project

#R #MS Excel #ggplot2 #automation #data visualization

§ Section 1. Project Title and Overview

Factbook Automation Project


§ Section 2. Purpose and Need

Research and develop an automated program to help reproduce the CSSEA Factbook with R and ggplot2.


§ Section 3. Business Drivers and Significance

Each year, one of CSSEA's tasks is to produce various Factbooks from the data it collects. Because a Factbook involves a large number of charts and figures, an automated program saves CSSEA a great deal of time and reduces errors from manual work.

Producing the charts and graphs with R is also more reliable and efficient than the traditional Excel-based method, and R can process larger amounts of data relatively faster than Excel.

I also wrote a user guide to help others use the program in the future.

§ Section 4. Benefits and Costs

Benefits:

Saves labor on repetitive work each year

The same program can be reused year after year

Increases efficiency and reliability

Fast turnaround

Costs:

My time spent developing the automation program

One analyst to maintain and modify the program


§ Section 5. Implementation Method

I built the program mainly in R, using ggplot2 for the graphics. I first reproduced the charts and graphs of the previous year's Factbook from the same dataset, then used the same program to produce the current year's Factbook from the latest dataset. The program works automatically on the dataset provided: it cleans the data, finds the variables needed, renames them, produces the corresponding graphs and charts, arranges the graphs and charts on each page, and assembles all the pages into one complete PDF that is ready to print.
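The steps above can be sketched in a few lines of R. This is only a minimal illustration, not the actual Factbook program: the dataset, the column names (`REGION_CD`, `FISCAL_YR`, `EMP_CNT`), and the helper functions (`prepare_data`, `make_bar_chart`, `make_line_chart`, `draw_page`, `build_factbook`) are all hypothetical, and the page layout uses the `grid` package that ships with R, which may differ from what the real program uses.

```r
library(ggplot2)

# Hypothetical raw data standing in for a CSSEA dataset (invented columns).
raw <- data.frame(
  REGION_CD = c("North", "South", "North", "South", NA),
  FISCAL_YR = c(2020, 2020, 2021, 2021, 2021),
  EMP_CNT   = c(120, 95, 130, 101, 40)
)

# Clean the data, keep only the variables needed, and rename them.
prepare_data <- function(df) {
  df <- df[complete.cases(df), c("REGION_CD", "FISCAL_YR", "EMP_CNT")]
  names(df) <- c("region", "year", "employees")
  df$year <- factor(df$year)
  df
}

# Produce the charts from the cleaned data.
make_bar_chart <- function(df) {
  ggplot(df, aes(x = region, y = employees, fill = year)) +
    geom_col(position = "dodge") +
    labs(title = "Employees by region", x = NULL, y = "Employees")
}

make_line_chart <- function(df) {
  ggplot(df, aes(x = year, y = employees, group = region, colour = region)) +
    geom_line() +
    geom_point() +
    labs(title = "Employees over time", x = "Fiscal year", y = "Employees")
}

# Arrange a list of charts side by side on one page.
draw_page <- function(plots) {
  grid::grid.newpage()
  grid::pushViewport(grid::viewport(layout = grid::grid.layout(1, length(plots))))
  for (i in seq_along(plots)) {
    print(plots[[i]], vp = grid::viewport(layout.pos.row = 1, layout.pos.col = i))
  }
  grid::popViewport()
}

# Write every page into a single multi-page PDF.
build_factbook <- function(pages, file) {
  grDevices::pdf(file, width = 11, height = 8.5)
  on.exit(grDevices::dev.off())
  for (plots in pages) draw_page(plots)
  invisible(file)
}

clean <- prepare_data(raw)
pages <- list(
  list(make_bar_chart(clean), make_line_chart(clean)),  # page 1: two charts
  list(make_bar_chart(clean))                           # page 2: one chart
)
out_file <- file.path(tempdir(), "factbook_sketch.pdf")
build_factbook(pages, out_file)
```

Keeping each step in its own function means that a new year's dataset only has to pass through `prepare_data()`; the chart and page functions can then be reused unchanged, which is the main point of automating the Factbook.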


§ Section 6. Timeline

I worked with my supervisor on the project. He helped me better understand the charts and graphs and supplied me with the right datasets. I wrote the whole automation program independently. Working from home, I finished developing and testing the program in one month.

Power in Numbers

60 pages

250 charts

3,500 lines of code
