Data Curation at the University of Michigan
What I've learned as a data curator with the Institute for Social Research
The Inter-university Consortium for Political and Social Research (ICPSR) is an international consortium of academic institutions and research organizations, maintaining the world's largest social science data archive. ICPSR is a unit of the Institute for Social Research (ISR) at the University of Michigan. 

Researchers submit studies to ICPSR for inclusion in the data archive, which makes the data accessible to other researchers in the field. Before a study is added to ICPSR's data archive, it must be curated.

I have been working as a data curator since July 2024 (ah yes, the joy of two jobs: this one, and my fellowship program!). Data curation is the process of enhancing, organizing, cleaning, and documenting data. In practice, that means meticulously reviewing data content, running scripts, poring over documentation, manipulating data, writing code, and checking for quality. ICPSR lists even more tasks involved in curation (AKA "data enhancement") here.
Highlights:
- Python scripts to edit and execute SAS syntax in bulk (looping over datasets)
- R scripts to build ph files in bulk (a way of avoiding extremely tedious processes, e.g. building ph files manually for more than 50 datasets)
- Learning more programs, e.g. SPSS and SAS, and keeping my bash scripting/Unix chops up
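
To give a flavor of the first highlight, here is a minimal sketch of editing SAS syntax in bulk with Python. It is not the actual curation code: the template, the `{{PLACEHOLDER}}` convention, the dataset names, and the function names are all hypothetical, made up for illustration. The `run_sas` helper also assumes a `sas` command is available on the PATH and accepts `-sysin` for batch runs, which may not match a given installation.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical SAS template; {{NAME}} marks a value that changes per dataset.
TEMPLATE = """\
libname study "{{LIBRARY_PATH}}";
proc contents data=study.{{DATASET}};
run;
"""


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every {{NAME}} placeholder with its value from `values`."""
    def lookup(match: re.Match) -> str:
        # Raises KeyError if the template uses a placeholder with no value.
        return values[match.group(1)]
    return re.sub(r"\{\{(\w+)\}\}", lookup, template)


def write_sas_programs(template: str, datasets: list[str],
                       library_path: str, out_dir: Path) -> list[Path]:
    """Write one .sas file per dataset name and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in datasets:
        program = out_dir / f"{name}.sas"
        values = {"LIBRARY_PATH": library_path, "DATASET": name}
        program.write_text(fill_template(template, values))
        written.append(program)
    return written


def run_sas(program: Path, sas_executable: str = "sas") -> int:
    """Run one program in batch mode and return the exit code.

    Assumption: `sas_executable` is on the PATH and supports -sysin.
    """
    result = subprocess.run([sas_executable, "-sysin", str(program)], check=False)
    return result.returncode
```

With something like this, a loop such as `for program in write_sas_programs(TEMPLATE, names, lib, Path("out")): run_sas(program)` would stand in for opening, editing, and running each file by hand. Keeping the "write" and "run" steps separate also makes it possible to review the generated syntax before anything executes.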


Pull example syntax but anonymize it
(Recreate functions with anonymized variables as an example of work)

Also mention specific data topics I've been focusing on