Workshop

Presenters

Overview

Obtaining information about a large population by selecting and measuring a sample from it is common practice in epidemiological investigation, and complex sampling strategies are ubiquitous in this context. However, such strategies, for example clustering and stratification, affect how the data should be analysed and how the population estimates should be interpreted. A sound understanding of the links between the sampling strategy and its implementation, sample size, analysis technique and accuracy of the population estimates is essential for planning a study well and, equally, for secondary analysis of previously collected data.
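To make the effect of clustering on precision concrete, here is a minimal sketch in Python (used purely for illustration) of the standard design-effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the number of individuals sampled per cluster and rho is the intra-cluster correlation. The function names and the figures used (600 individuals, 20 per village, rho = 0.05) are hypothetical and are not taken from the workshop material.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    # For this sketch the intra-cluster correlation is assumed to be non-negative.
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Size of a simple random sample giving roughly the same precision."""
    return n / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical survey: 600 individuals, 20 sampled per village, rho = 0.05.
    n, m, rho = 600, 20, 0.05
    print(f"design effect: {design_effect(m, rho):.2f}")
    print(f"effective sample size: {effective_sample_size(n, m, rho):.0f}")
```

With these hypothetical figures the design effect is 1.95, so the 600 sampled individuals carry about as much information as roughly 308 individuals drawn by simple random sampling.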

During this workshop, we will conduct a series of ‘virtual surveys’ on a simulated human population of approximately 160 000 individuals inhabiting a hypothetical territory. Participants will interact individually with the virtual population (‘visiting’ regions and villages, sampling households and individuals, administering questionnaires and collecting responses) through a graphical web application, the SurveyLab, to which they will be given access during the workshop. The results of these experiments will serve as ‘case studies’ for gaining a practical understanding of the pros and cons of alternative sampling strategies and statistical methods, and of how poor choices can lead to profoundly misleading results, wasted resources, or both. The data gathered during the virtual surveys will also be used to illustrate alternative approaches to analysing this kind of information.

Target audience & prerequisites

This workshop is aimed at researchers, students and professionals who want a deeper understanding of issues related to the analysis and interpretation of data gathered from samples of large populations. Previous knowledge of, or experience with, survey data analysis is not required, but participants are assumed to be familiar with basic statistical concepts such as bias, precision and probability sampling. The workshop will open with a short recap of the statistical foundations of the techniques applied. Data analysis examples will be presented using the R environment for statistical computing (https://www.R-project.org/), and some previous exposure to this programming language will be helpful. Participants are advised to bring a laptop with a working version of R installed. Details of the packages to be pre-installed will be communicated to registered participants ahead of the session.

Learning objectives

By the end of this workshop, participants will have acquired:

  1. An understanding of the links between sampling strategy, sample size, analysis technique and accuracy of estimates in population research.
  2. An understanding of the statistical and practical implications of using complex sampling schemes, including clustering and stratification.
  3. Knowledge of alternative approaches for analysing survey data.
  4. Practical insights into collecting and analysing sample data, gained by conducting virtual surveys on a simulated human population.
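
As a small illustration of why stratification and sampling weights matter at the analysis stage, the following Python sketch (purely illustrative; the workshop's own examples use R) compares an unweighted sample mean with a stratum-weighted estimate. The two strata, their population sizes and the sampled responses are made-up values, chosen only so that the total matches a population of 160 000.

```python
from statistics import mean

# Hypothetical strata: population size and 0/1 responses from sampled individuals.
strata = {
    "urban": {"population": 120_000, "responses": [1, 0, 0, 0]},
    "rural": {"population": 40_000, "responses": [1, 1, 0, 1]},
}


def naive_mean(strata: dict) -> float:
    """Pool all responses and ignore the sampling design."""
    pooled = [y for s in strata.values() for y in s["responses"]]
    return mean(pooled)


def stratified_mean(strata: dict) -> float:
    """Weight each stratum's sample mean by its share of the population."""
    total = sum(s["population"] for s in strata.values())
    return sum(
        s["population"] / total * mean(s["responses"]) for s in strata.values()
    )


if __name__ == "__main__":
    print(f"naive estimate:      {naive_mean(strata):.3f}")
    print(f"stratified estimate: {stratified_mean(strata):.3f}")
```

Here the pooled mean is 0.5, whereas the stratified estimate is 0.375: the rural stratum supplies half of the sample but only a quarter of the population, so ignoring the design overstates its contribution.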

Outline

| # | Session | Content |
|---|---------|---------|
| Day 1 | | |
| 1 | Introduction | Learning objectives, methods, housekeeping rules. |
| 2 | Statistical foundations | A recap of the statistical foundations of survey data analysis. |
| 3 | The research question | Define a research question for a study to be conducted by the participants in the virtual environment. |
| 4 | Sampling strategies | Approaches to sampling. |
| 5 | The SurveyLab | Presentation of the SurveyLab virtual environment. Guided examples. |
| 6 | Conducting a survey | Participants will use the SurveyLab individually to plan and conduct a virtual survey to respond to the research question. |
| 7 | Discussion and recap | Group discussion on the results of the surveys, highlighting the cost/benefit of alternative sampling choices. |
| Day 2 | | |
| 1 | Analysing survey data | Dealing with clusters, strata and sampling weights. |
| 2 | Conducting data analysis | Participants will apply the technique presented to analyse the data collected during the virtual surveys. |
| 3 | Comparing results and understanding differences | An interactive section where participants present, discuss and compare the results of their analyses. |
| 4 | From questions, to sampling, to estimates: a recap | A theoretical/practical discussion of the process of conducting a survey. |
| 5 | Conclusions | Conclusion, feedback from participants, Q & A. |