The I.Family statistics panel: addressing the challenges of complex data
I.Family researchers face a complex data structure that must be accounted for with appropriate statistical methods. This is of utmost importance: our results will have a high impact on families and their children, so people must be able to trust them.
I.Family is based on the IDEFICS cohort of 16,228 children aged 2 to 9 years at baseline, so repeated observations over time are available for the same individuals (waves 1-3: IDEFICS; waves 4 and 5: I.Family). Measurement times are not equidistant, and both the number of repeated measurements and the ages at measurement vary between participants. Moreover, the I.Family survey was extended to the siblings and parents of the IDEFICS children.
This specific data structure calls for innovative and well-thought-out methods of statistical modeling. Values measured in the same individual over time are likely to be correlated, and individuals within a family are genetically and environmentally more similar to each other than unrelated individuals. For both reasons, our observations are not independent, which means that standard statistical models that assume independent observations, e.g. analysis of variance (ANOVA) or ordinary linear or logistic regression, are not applicable. Furthermore, our data cover a wide age span (2- to 15-year-old children and adolescents, plus parents), so another challenge arises from the age dependence of exposure and outcome measures. In particular, growth-related changes throughout childhood (e.g. changes in energy intake) make it difficult to distinguish unfavorable behavior from “normal” age-specific behavior, and for many parameters reference values in children are lacking.
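To illustrate why the lack of independence matters, here is a minimal sketch on simulated data (not I.Family data). It assumes a simple setting in which every family has the same number of members and each family shares one random effect; the function names and all parameter values are hypothetical and chosen only for illustration. It compares a naive standard error that treats all observations as independent with one that is computed from family means, and estimates the intraclass correlation with the usual one-way ANOVA formula.

```python
import numpy as np


def simulate_families(n_families=500, members=4, sd_family=1.0,
                      sd_resid=1.0, seed=0):
    """Simulate one measurement per person; rows are families (assumed balanced)."""
    rng = np.random.default_rng(seed)
    # Effect shared by all members of a family (genes, shared environment).
    family_effect = rng.normal(0.0, sd_family, size=(n_families, 1))
    # Individual deviation from the family level.
    noise = rng.normal(0.0, sd_resid, size=(n_families, members))
    return family_effect + noise


def naive_se(y):
    """Standard error of the overall mean, wrongly assuming independence."""
    return y.std(ddof=1) / np.sqrt(y.size)


def cluster_se(y):
    """Standard error of the overall mean based on independent family means."""
    family_means = y.mean(axis=1)
    return family_means.std(ddof=1) / np.sqrt(family_means.size)


def icc_oneway(y):
    """One-way ANOVA estimate of the intraclass correlation (balanced design)."""
    n, k = y.shape
    family_means = y.mean(axis=1)
    ms_between = k * family_means.var(ddof=1)
    ms_within = ((y - family_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


y = simulate_families()
print("naive SE:  ", naive_se(y))
print("cluster SE:", cluster_se(y))
print("ICC:       ", icc_oneway(y))
```

In this simulated setting the naive standard error comes out smaller than the family-based one, meaning that a model ignoring the family structure would overstate the precision of its estimates. Real analyses of such data would typically rely on models built for correlated observations, such as mixed-effects models, and would also have to handle unequal family sizes and repeated measurements over time.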
Selecting statistical methods that are appropriate for our data requires an in-depth understanding of the assumptions inherent in the various standard methods. For specific research purposes, standard methods must also be adapted, or new statistical approaches developed, to fit the situation at hand. To cope with these complex data issues, a statistics panel consisting of statisticians and mathematicians from the I.Family consortium has been set up. The panel meets regularly to encourage discussion, raise awareness, solve statistical problems as they arise, and advance the statistical methodology needed to analyse the I.Family data adequately.
By Prof Dr Iris Pigeot, I.Family Study Deputy Co-ordinator, & Claudia Börnhorst, BIPS