15  File names, formats, & organization; Data Storage & Backup: Instructor Notes

Modified

February 20, 2026

Objectives and Competencies
  • Describe and implement conventions for proper naming of files
  • Explain the difference between proprietary and open formats
  • Learn how to efficiently organize their research data files
  • Learn the preferred format for storing and archiving different types of data files
  • Become familiar with different options for cloud data storage and backup
  • Develop and implement a plan for short- and long-term data storage, back-up, and archiving
  • Learn rules and policies for data security
  • Become familiar with tools for such tasks as batch renaming of files, cloud data storage, and automated data backup.
  • Explain options for a long-term sustainable preservation strategy/policy for your data (e.g., discipline specific, institutional, departmental, individual).
  • Address the need for conversion to standard formats needed for re-use
  • Perform basic archival processes: checksum, auditing, format migration, etc.
  • Understand costs and timelines for data storage, management tools, and services

15.1 Pre-Class Preparation & Materials Needed (Instructor):

Send in an email to students:

  • Confirmation of room and zoom link
  • Remind students to bring their computers
  • Make sure you know whether everyone has R installed, and their level of R fluency
  • Snacks
  • Copies of the syllabus
  • Copy of Course Roster
  • Flip charts and markers
  • Dry-erase markers
  • Tent cards for student names

15.2 Pre-class Preparation (Students):

15.3 Class Outline

15.3.1 Quick Intro

  1. Address any questions from last week
  2. Overview of today’s activities.

15.3.2 File Format Competition

15.3.3 Snack Break

15.3.4 Breakout: Discussion of Data Security and Backup in the field

Prompt: Robin is a graduate student studying malaria in Tanzania. The research project requires visiting communities, collecting mosquitoes from different sources of standing water (for later identification at the FLMNH and statistical analyses of mosquito diversity and abundance), and conducting semi-structured interviews in the community. In the interviews, people are asked about such things as their age, their family income, their access to health care, whether they have ever had malaria, and their use of mosquito nets. Answers are recorded on data sheets, but the interviews are also recorded with a cell phone for later transcription. Robin also takes pictures of houses to document standing water sources. Throughout the day, Robin records observations in a notebook. Robin is staying in a house where there is electricity for a laptop. Once every two weeks Robin goes into town, where there is usually good internet access at the university.

What steps should Robin take in the field to safeguard the data collected in the field?

Breakout Discussion

  1. Group discussion of breakout
  2. Key message: Assume the worst-case scenario. Become paranoid. Embrace neurosis. Then relax, because the plan is in place and all possibilities have been accounted for.

Breakout 2: Backup Procedures - UF Research Group

Prompt: Please ask each other these questions and then briefly discuss and summarize the responses. Keep the answers anonymous. The goal is to document the range of answers to these questions, so you can summarize or give more detailed answers about individuals as needed.

  1. Does your advisor / lab / research group have formal policies that govern your data storage, protection, and backup?

  2. Are you currently backing up your research data?

  3. Do you have a ‘backup plan’ document?

  4. How frequently are you backing up your research data?

  5. Is it a partial (incremental) or a full backup?

  6. Where are files backed up?

  7. What metadata accompanies these backups?

  8. How do you verify that a backup has been successfully performed?

  9. Have you ever attempted to read data from older backups?

  10. Have you ever had to restore a file from a backup version?

  11. Are you working with data requiring additional protection due to privacy or security concerns?

  12. If so, what are the additional safeguards you have implemented?

  13. Where and how are your original data collection instruments or samples stored/protected?

Collect answers and submit with Assignment 2
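
Question 8 (verifying that a backup succeeded) can be made concrete with checksums. The following is a minimal sketch in Python, not a prescribed course tool: it assumes the original data and the backup are two ordinary folders on disk, and the function names `sha256_of`, `build_manifest`, and `compare_backup` are hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_backup(source: Path, backup: Path) -> dict[str, list[str]]:
    """Report source files that are missing from, or differ in, the backup."""
    src = build_manifest(source)
    bak = build_manifest(backup)
    return {
        "missing": sorted(name for name in src if name not in bak),
        "changed": sorted(
            name for name in src if name in bak and src[name] != bak[name]
        ),
    }
```

Calling `compare_backup(Path("field_data"), Path("/Volumes/backup/field_data"))` (both paths are placeholders) returns two lists; if both are empty, every file in the source has an identical copy in the backup. Saving the manifest alongside the backup would also allow older backups to be audited later.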

15.4 Messages to instill:

a. A sense of paranoia that everything that could go wrong with notebooks, datasheets, samples, and their backups will go wrong (fire, lost mail, flooding, demonic intervention), so that we plan for all scenarios. You can now photograph records, back them up, leave portable hard drives, abc samples, deposit in collections, take a portable scanner to the field... whatever. As researchers and grad students there is plenty to stress about as it is; let's take this stress away so that you can stress about other stuff that "matters".
b. 'Focused Laziness': we want to do this in a way that is as automated (and automatic) as possible. This will mean less work! And that means more time to do other things, be it research or relaxation.
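
As one illustration of what "automated" could look like, here is a minimal Python sketch that makes a full, timestamped copy of a data folder and prunes old copies. It is an assumption-laden example rather than a recommended tool: the function names `make_snapshot` and `prune_snapshots` are hypothetical, the copies are full rather than incremental, and scheduling (for example with cron or Task Scheduler) is left to the operating system.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_snapshot(source: Path, backup_root: Path,
                  now: Optional[datetime] = None) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    target = backup_root / f"{source.name}_{stamp}"
    # copytree creates missing parent folders and raises FileExistsError
    # if a snapshot with the same timestamp already exists.
    shutil.copytree(source, target)
    return target


def prune_snapshots(backup_root: Path, prefix: str, keep: int = 5) -> list:
    """Delete all but the `keep` newest snapshots; return the removed paths."""
    # The timestamp format sorts chronologically when sorted by name.
    snapshots = sorted(p for p in backup_root.glob(f"{prefix}_*") if p.is_dir())
    stale = snapshots[:-keep] if keep > 0 else snapshots
    for folder in stale:
        shutil.rmtree(folder)
    return stale
```

Run once a day by a scheduler, `make_snapshot(Path("thesis_data"), Path("/Volumes/backup"))` followed by `prune_snapshots(Path("/Volumes/backup"), "thesis_data")` would keep the five most recent copies with no manual steps (the paths are placeholders).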

15.5 Assignment for Submission

This week’s assignment is designed to (A) get you thinking critically about your own data backup and security procedures and, based on that reflection, (B) have you prepare a data backup plan (if you already have one, you can answer the questions below based on that plan). The assignment comes in four parts:

  1. Sign up for a UF Dropbox allocation: https://cloud.it.ufl.edu/collaboration-tools/dropbox/. If you are not automatically eligible to sign up, your advisor can request that you be granted an allocation: https://cloud.it.ufl.edu/media/clouditufledu/How-to-Obtain-Access-to-UF-Dropbox-for-Education.pdf.

  2. Sign up for a UF Google Suite Allocation: https://cloud.it.ufl.edu/collaboration-tools/g-suite/

  3. Briefly describe how you are adhering to the 3-2-1 backup rule for data related to your thesis. If you are not, describe how you will do so moving forward.

  4. Prepare a brief (~1 page max) Data Backup Plan for your thesis research. Include the following information; you can respond with bullet points unless more detailed answers are needed:

    1. what needs to be prepared and for how long;

    2. where backups are located;

    3. who can access backups and how they can be contacted;

    4. how often data should be backed up;

    5. what kind of backups are performed;

    6. who is responsible for performing the backups;

    7. hardware and software used for performing backups;

    8. how, and how often, to check that backups were successful;

    9. the media used to back up data;

    10. a list of any data that are not archived or backed up.
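
For part 3, the 3-2-1 rule is commonly stated as: keep at least three copies of the data, on at least two different types of storage media, with at least one copy off-site. The sketch below shows one way to check a plan against that statement. The `BackupCopy` class, the `check_321` function, and the example labels are all hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str     # where the copy lives, e.g. "laptop"
    medium: str    # type of storage, e.g. "internal disk"
    offsite: bool  # True if stored away from the primary location


def check_321(copies: list) -> dict:
    """Check a list of BackupCopy records against each part of the rule."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.medium for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# Hypothetical plan: a working copy, a local drive, and a cloud copy.
plan = [
    BackupCopy("laptop", "internal disk", offsite=False),
    BackupCopy("external drive", "external disk", offsite=False),
    BackupCopy("cloud folder", "cloud storage", offsite=True),
]
```

Here `check_321(plan)` reports all three conditions as met, while dropping the cloud copy would fail both the three-copy and the off-site conditions.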

Submission and Grading Rubric

  1. Submit the Data Backup Plan document (in either .txt or .pdf format) via Canvas.

  2. Grading Rubric

    • Assignment completed with thorough answers: 50
    • Most questions answered completely; some require instructor follow-up: 40
    • Many questions missing answers or answers are cursory: 30
    • Instructor follow-up required for homework submission: 20

15.6 After class:

  • Be sure you complete and submit the assignment by the deadline
  • Prepare for next session (assigned reading, videos, etc.).