Insights from the 2nd South African Comparative Risk Assessment Study
Estimating health risk factors distributions from sparse and heterogeneous data sources

Annibale Cois
Division of Health Systems and Public Health, Stellenbosch University

ILD Public Lecture
Thu 07/21/22, 5:30 PM
QM369, Queen Mary Court, Greenwich Campus
- The comparative risk assessment study
- Sparse and heterogeneous data sources
- A “principled” meta-regression approach
- Examples
- Notes, questions, comments

Comparative Risk Assessment

The CRA method is a standardised and systematic approach to estimate the contribution of individual risk factors to the observed burden of disease. It compares the observed burden of disease due to an exposure with a hypothetical distribution in a population, making use of the level of exposure in the population and the epidemiological relationship between a risk factor and health outcomes.

Murray CJ, Ezzati M et al. Popul. Health Metr. 2003;1(1):1

[Diagram: RR, DEATHS, DALYs, ATTRIBUTABLE DALYs, ATTRIBUTABLE DEATHS]

SACRA2

The 2nd Comparative Risk Assessment for South Africa aimed to estimate the temporal trend of the burden attributable to a series of 18 risk factors between 2000 and 2012.

NCD cluster:
- High systolic blood pressure
- High body mass index
- High fasting plasma glucose
- High LDL cholesterol
- Low fruit intake
- Low vegetable intake
- High sodium intake
- Low physical activity

Addictive substance use:
- Tobacco smoking
- Alcohol consumption

Undernutrition cluster:
- Childhood undernutrition
- Iron deficiency

Social behaviour cluster:
- Unsafe sex
- Interpersonal violence

Environmental cluster:
- Ambient air pollution - PM2.5
- Ambient air pollution - ozone
- Household air pollution
- Unsafe water, sanitation and hygiene

http://www.samj.org.za/index.php/samj
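The comparison described in the CRA definition is usually summarised by a population attributable fraction (PAF). The block below is a sketch in standard CRA notation rather than a formula taken from the slides: the symbols $RR(x)$ (relative risk at exposure level $x$), $P(x)$ (observed exposure distribution), $P'(x)$ (hypothetical, counterfactual exposure distribution) and $B$ (observed deaths or DALYs for the outcome) are introduced here for illustration.

```latex
\[
\mathrm{PAF} \;=\;
\frac{\displaystyle\int RR(x)\,P(x)\,dx \;-\; \int RR(x)\,P'(x)\,dx}
     {\displaystyle\int RR(x)\,P(x)\,dx},
\qquad
\text{attributable burden} \;=\; \mathrm{PAF} \times B .
\]
```

Read this way, the exposure distribution $P(x)$ is the quantity the rest of the talk is concerned with: it has to be estimated for each risk factor and year before attributable deaths or DALYs can be computed.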
Sparse and heterogeneous data sources

[Figure: timeline from 1998 to 2016, with the years 2000 and 2012 marked]

Dutton DJ, McLaren L. BMC Public Health 14, 430 (2014)

[Figure: small sample, large sample, biased sample]

Data sources in general differ in terms of:
- Target population
- Sample realisation
- Sampling weights, calibration
- Diagnostic criteria
- Devices/measurement protocol

These differences affect sampling error, representativeness and measurement.

Olsen SJ, Azziz-Baumgartner E et al. MMWR Morb Mortal Wkly Rep. 2020 Sep 18;69(37):1305-1309. doi: 10.15585/mmwr.mm6937a6. PMID: 32941415; PMCID: PMC7498167.

A “principled” meta-regression approach

For each data source (1, 2, 3, …):
- Recoding, uniform cleaning, …
- Source-specific estimation (taking into account the sampling strategy), giving the estimates E1, E2, E3, …

The source-specific estimates enter a weighted estimation, together with:
- Expert opinions
- “Shape” of temporal trends
- “Shape” of age trends
- Relationships with other variables/risk factors
- …

The estimated model is then used for prediction, which gives the final estimates.

Principled:
- Explicit
- Quantify uncertainty beyond random error
- As much as possible!

- Administrative data
- Relationships with other variables/risk factors
- Experiences/estimates from other populations
- Relationships between epidemiological measures
- Smooth (slow) transition
- Expert opinions

Examples

Example 1: Prevalence of IDA among women of reproductive age

2000 data VS
Cross-walk/1

Resources:
- Cross-walk equation
- Estimates of % of pregnant women from DHS

Main (untestable) assumptions:
- What is valid in the “global” GBD population is also valid in SA

Stevens GA, Finucane MM et al. Lancet Glob Health. 2013 Jul;1(1):e16-25.
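One plausible way for the DHS share of pregnant women to enter Example 1 is as a mixing weight between pregnancy-specific prevalences. The sketch below assumes exactly that; the slides do not spell out the calculation or the cross-walk equation, so the function name `prevalence_wra` and all input values are hypothetical.

```python
def prevalence_wra(prev_pregnant, prev_non_pregnant, share_pregnant):
    """Prevalence among all women of reproductive age as a weighted average.

    Assumption (not from the slides): the overall prevalence is the mixture of
    the prevalence in pregnant and non-pregnant women, weighted by the share
    of women who are pregnant (for example, as estimated from DHS).
    """
    for name, value in (("prev_pregnant", prev_pregnant),
                        ("prev_non_pregnant", prev_non_pregnant),
                        ("share_pregnant", share_pregnant)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return share_pregnant * prev_pregnant + (1.0 - share_pregnant) * prev_non_pregnant


# Hypothetical inputs, for illustration only.
overall = prevalence_wra(prev_pregnant=0.30, prev_non_pregnant=0.20, share_pregnant=0.05)
print(f"Prevalence among women of reproductive age: {overall:.3f}")
```

Because the pregnancy share is small, the result stays close to the prevalence in non-pregnant women, so errors in that component matter more than errors in the share itself.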
Example 2: Daily consumption of fruit & vegetables in the adult population

2003 data:
- “Quantity questions” alone (as compared to “frequency-quantity” questions)
- Does not differentiate between fruit & fruit juice

Resources:
- 2012 survey includes both frequency and quantity questions

[Figure: correction factor (vegetables) by age group (18-24 to 75+), males and females]
[Figure: correction factor (fruit) by age group (18-24 to 75+), males and females]

Main (untestable) assumptions:
- No change in the fruit/fruit juice ratio between 2003 and 2012
- No difference in the effect of question wording between 2012 and 2003

Cross-walk/2

Example 3: Distribution of FPG in the adult population

Data:
- Multiple surveys (various populations)
- Different diagnostic criteria
- Data generally not available for FPG

Resources:
- Known relationships between different diagnostic criteria
- Shape of the FPG distribution
- Cross-walk equation between mean FPG and diabetes prevalence

Main (untestable) assumptions:
- What is valid in the “global” GBD population is also valid in SA
- Diagnostic criteria are actually equivalent
- The distribution of FPG in the population is log-normal

Steps:
1. Model diabetes prevalence for small-scale studies (age * population group fixed effects)
2. Cross-walk equation between prevalence of diabetes and mean FPG; equivalence of diagnostic criteria
3. Evidence of the distribution shape

Example 4: Distribution of systolic blood pressure in the adult population

Data:
- Different number of measurements across surveys

Resources:
- Relationship between readings from “complete” surveys

Another form of “internal cross-walk”, but the cross-walk is estimated concurrently with the rest of the model.

[Diagram: structural equation model linking SBP1 and SBP2 to their readings (Reading 1, Reading 2, Reading 3); missing data; ML estimator]

Main (untestable) assumptions:
- The relationship between the “true” BP and the sequence of readings is constant across surveys
- Assumptions regarding the joint distribution of variables are correct

Cois A. Understanding Blood Pressure Dynamics in the South African Population: A Latent Variables Approach to the Analysis and Comparison of Data from Multiple Surveys. 2017. http://hdl.handle.net/11427/25196

Example 5: Average consumption of alcohol among drinkers

Data:
- Multiple surveys with self-report data
- Consumption recorded as “intervals” of number of drinks

Resources:
- Administrative data on alcohol sales and import/export
- Estimates of self-produced alcohol
- Evidence of the shape of the distribution of alcohol consumption across populations

Main (untestable) assumptions:
- Level of underreporting of alcohol consumption is approximately constant across ages and sexes
- Smoothness of the consumption trend across time and age
- Long-term consumption above 150 g/day is very unlikely

Probst C, Shuper PA, Rehm J. (2017). Addiction, 112: 705-710

Bayesian model:
- Distribution shape
- Interval data
- Total consumption matched to admin data + self-production
- Consumption above 150 g/day is unlikely

[Diagram labels: relative bias across age/sex groups; unrecorded consumption; average daily consumption (individual)]

Recalibration of sampling weights
Quality effects weights: quality score, variance

Quantifying uncertainty:
- Analytical solution (rare)
- Resampling, Monte Carlo

Conclusions?
- Look beyond your dataset
- Make your evidence (and assumptions) explicit
- Formalise uncertainty
- Sampling error is not everything

Cois A, Matzopoulos R, Pillay-van Wyk V et al. Popul Health Metrics 19, 43 (2021)
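The weighting and uncertainty steps named in the talk (quality effects weights built from a quality score and a variance, and Monte Carlo quantification of uncertainty) can be illustrated with a small sketch. It rests on loudly stated assumptions: the weight rule `quality / variance` is a deliberate simplification chosen for illustration, not the quality effects estimator used in SACRA2, and the function names and all numbers are hypothetical.

```python
import random


def pooled_estimate(estimates, variances, quality):
    """Weighted mean of source-specific estimates.

    Illustrative rule (an assumption): each source is weighted by its quality
    score divided by its sampling variance, so precise and well-rated sources
    count more.
    """
    weights = [q / v for q, v in zip(quality, variances)]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


def monte_carlo_interval(estimates, variances, quality, n_draws=5000, seed=1):
    """Approximate 95% interval for the pooled estimate by simulation.

    Each draw perturbs every source estimate with normal noise matching its
    sampling variance, then recomputes the pooled value.
    """
    rng = random.Random(seed)
    pooled_draws = []
    for _ in range(n_draws):
        perturbed = [rng.gauss(e, v ** 0.5) for e, v in zip(estimates, variances)]
        pooled_draws.append(pooled_estimate(perturbed, variances, quality))
    pooled_draws.sort()
    lower = pooled_draws[int(0.025 * n_draws)]
    upper = pooled_draws[int(0.975 * n_draws) - 1]
    return lower, upper


# Hypothetical source-specific prevalence estimates from three surveys.
estimates = [0.30, 0.36, 0.27]
variances = [0.0004, 0.0016, 0.0001]
quality = [0.9, 0.5, 0.3]  # hypothetical quality scores between 0 and 1

point = pooled_estimate(estimates, variances, quality)
low, high = monte_carlo_interval(estimates, variances, quality)
print(f"pooled estimate {point:.3f}, simulated 95% interval ({low:.3f}, {high:.3f})")
```

This simulation only propagates sampling error. In the spirit of "sampling error is not everything", the same loop could also draw the quality scores or cross-walk parameters from distributions that express how uncertain they are.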
Credits

The SACRA2 study was funded by the South African Medical Research Council’s Flagships Awards Project (SAMRC-RFA-IFSP-01-2013/SA CRA 2).

D Bradshaw, JD Joubert, V Pillay-van Wyk, R Pacella, R Matzopoulos and all members of the SACRA2 collaborative group

Thank you!