7 Useful resources for Data Collection & Management
7.1 R Programming
7.1.1 Essential
Hadley Wickham wrote a book on using the tidyverse and the online version is FREE. This is a phenomenal resource on using R to import, tidy, and visualize data.
Posit Cheat Sheets: help with commands for using the different tidyverse packages, RStudio shortcuts and tricks, help with R commands, and more. You definitely want the ones for Data Import, Work with Strings, Factors, Data Transformation, and Base R.
Where and How to ask for help
- Hadley Wickham’s advice on how to write a good reproducible example for getting help with R
- How to post good questions on Stack Overflow
- The UF R-users listserv is very user friendly and a great place to post requests for help.
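A reproducible example of the kind recommended in the links above might look like the following minimal sketch. It uses only base R; the data frame `survey` and its columns are invented for illustration and are not taken from any of the linked resources.

```r
# A small, self-contained example that someone else can paste and run.
# The data frame `survey` and its columns are hypothetical.
survey <- data.frame(
  id    = 1:4,
  site  = c("A", "A", "B", "B"),
  score = c(3.5, NA, 4.0, 2.5)
)

# State the problem in as few lines as possible:
# the mean is NA because one score is missing.
problem <- mean(survey$score)
print(problem)

# Show what was tried, so helpers can see where things stand.
attempt <- mean(survey$score, na.rm = TRUE)
print(attempt)

# dput() prints the object as R code that can be pasted into a question.
dput(survey)

# sessionInfo() reports the R version and the packages that are loaded.
sessionInfo()
```

Including the `dput()` output and the session information in a post lets other people recreate the data and the setup without guessing.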
7.1.2 Tutorials and Books
Software Carpentry: Using RStudio for Project Organization & Management
Kieran Healy’s Data Visualization: a practical introduction is my favorite introductory (yet super-comprehensive) book on data visualization with R. If you scroll down to the bottom of the page you can download the datasets and code used to make the figures in the book, which makes life much easier.
ROpenSci: tools for accessing, manipulating, and visualizing open data
Learning R
Swirl
8 Specific Problems in Data Cleaning and Management
Text Mining: tidytext package
Working with Qualtrics survey data with the qualtRics package
Optical Character Recognition (OCR): extract text from images: tesseract package
Extract text & metadata from pdf files: pdftools package
Image processing in R: the magick package
8.0.1 Advanced R Packages
DataCurator package: ‘a simple desktop data editor to help describe, validate and share usable open data’.
RegExr: online tool to learn, build, & test Regular Expressions (RegEx / RegExp)
janitor (cleanup of file names, etc.)
knitr overview: reproducible documents with R
qualtRics
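A pattern worked out in a tool like RegExr can then be applied in R. The sketch below uses only base R functions (`trimws()`, `tolower()`, `grepl()`, `sub()`, `sprintf()`); the messy labels in `raw` are made up for the example, and the pattern is just one possible choice.

```r
# Hypothetical messy site labels, invented for this example
raw <- c(" Site-01 ", "site_02", "SITE 3", "plot-7")

# Trim surrounding whitespace and lower-case everything
clean <- tolower(trimws(raw))

# Flag entries made of "site", an optional separator (-, _ or space),
# and one or more digits
is_site <- grepl("^site[-_ ]?[0-9]+$", clean)

# Capture the digits and rebuild a consistent, zero-padded label
digits <- sub("^site[-_ ]?([0-9]+)$", "\\1", clean[is_site])
labels <- sprintf("site_%02d", as.integer(digits))
print(labels)
```

Entries that do not match the pattern (here, the plot label) are left out of `labels`, so they can be inspected separately rather than silently rewritten.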
## Discipline-specific Resources
historydata package: Sample data sets for historians learning R. They include population, institutional, religious, military, and prosopographical data suitable for mapping, quantitative analysis, and network analysis.
The Programming Historian Website: wide range of topics, from text analysis to OpenRefine
8.1 Slide Presentations in R
8.2 Data Archives
8.3 Text Extraction and Organization
8.4 Form Design
8.5 Data Security
UF Office of Information Security and Compliance
Cyber Safeguards for UF
UF IRB
UF Data Classification Policy