mdd-rmUKBB

Psychiatric Genomics Consortium (PGC) Major Depressive Disorder (MDD) genome-wide association study meta-analysis removing individual overlap with UK Biobank.

Project overview

Many uses of genome-wide summary statistics require that there be no sample overlap between the discovery and testing datasets. UK Biobank (UKBB) is an open health dataset that was included in previous PGC Major Depressive Disorder GWAS (Wray et al. 2018, Howard et al. 2019). Because UKBB is used by many researchers, we have conducted and released GWAS summary statistics in which overlap with UKBB has been minimised.

The datasets used are individual-level data from the MDD Wave2 cohorts and summary statistics from the additional MDD cohorts (deCODE, GenScot, GERA, iPsych, 23andMe).

The analysis excludes 335 participants from 12 PGC MDD cohorts and 622 participants from the Generation Scotland cohort. It retains two individuals from one cohort (shp0) who overlap with UK Biobank and whom we are currently unable to exclude.

Data for this project are held on LISA in the directories listed in the README.mddw2sum and README.mdd00001 files in your LISA home directory. Pre-imputation QC and imputation were performed previously using the RICOPILI modules.

Project index

Project outline

Step 1: Genotype checksums

Checksums were used to identify potentially identical individuals between the UKBB and PGC MDD samples. See Section 2 of GWAS.
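
As a rough illustration of the idea only, the sketch below hashes each individual's genotype calls at a shared SNP panel and reports pairs with identical hashes. It is not the PGC checksum procedure: the function names, the 0/1/2 allele-count encoding, the use of SHA-256, and the sample IDs are all assumptions made for this example, and an exact-match hash like this one would miss duplicates that differ by even a single genotype call.

```python
import hashlib


def genotype_checksum(genotypes):
    """Hash one individual's genotype calls at a fixed, ordered SNP panel.

    `genotypes` is a sequence of allele counts (0, 1 or 2); this encoding
    is an assumption of the sketch, not the format used by the project.
    """
    encoded = ",".join(str(g) for g in genotypes).encode("ascii")
    return hashlib.sha256(encoded).hexdigest()


def find_overlap(cohort_a, cohort_b):
    """Return (id_a, id_b) pairs whose genotype checksums are identical."""
    # Index cohort_b by checksum so each cohort_a lookup is constant time.
    index = {}
    for id_b, genotypes in cohort_b.items():
        index.setdefault(genotype_checksum(genotypes), []).append(id_b)

    matches = []
    for id_a, genotypes in cohort_a.items():
        for id_b in index.get(genotype_checksum(genotypes), []):
            matches.append((id_a, id_b))
    return matches


# Hypothetical toy data: two individuals per dataset, five SNPs each.
pgc = {"pgc_1": [0, 1, 2, 1, 0], "pgc_2": [2, 2, 0, 1, 1]}
ukbb = {"ukb_9": [2, 2, 0, 1, 1], "ukb_10": [1, 1, 1, 0, 0]}
overlap = find_overlap(pgc, ukbb)
```

Sharing only hashes means neither dataset has to expose raw genotypes to the other, which is the usual motivation for a checksum-based comparison.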

Step 2: Prepare phenotypes

Phenotypes were prepared by copying case/control status from each PGC MDD cohort's .fam file and setting the phenotype of individuals overlapping with UKBB to -9 (missing). See Section 3 of GWAS.
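
A minimal sketch of this masking step, assuming the standard six-column PLINK .fam layout (family ID, individual ID, father, mother, sex, phenotype) and a set of overlapping (family ID, individual ID) pairs. The function name `mask_overlap` and the sample IDs are hypothetical; the project's own scripts may organise this differently.

```python
MISSING_PHENOTYPE = "-9"


def mask_overlap(fam_lines, overlap_ids):
    """Return .fam lines with overlapping individuals' phenotype set to -9.

    `fam_lines` are whitespace-delimited PLINK .fam records;
    `overlap_ids` is a set of (family_id, individual_id) tuples.
    """
    masked = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns in .fam record, got: {line!r}")
        if (fields[0], fields[1]) in overlap_ids:
            fields[5] = MISSING_PHENOTYPE  # column 6 holds the phenotype
        masked.append(" ".join(fields))
    return masked


# Hypothetical example: the second individual overlaps with UKBB.
fam = [
    "fam1 ind1 0 0 1 2",
    "fam2 ind2 0 0 2 1",
    "fam3 ind3 0 0 1 2",
]
masked = mask_overlap(fam, {("fam2", "ind2")})
```

Setting the phenotype to missing, rather than deleting the record, leaves the genotype files untouched while dropping the individual from the association test.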

Step 3: Conduct GWAS removing UKBB overlap (rmUKBB)

GWAS was performed on the updated phenotype files using the RICOPILI postimp_navi command. See Section 4 of GWAS.

Step 4: Conduct meta-analytic GWAS

Meta-analysis was first conducted on the 29 PGC MDD cohorts using the rmUKBB summary statistics. These meta-analytic results were then meta-analyzed with the additional cohorts (deCODE, GenScot, GERA, iPsych, 23andMe). See Section 5 of GWAS.
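
To illustrate the two-stage structure, here is a sketch of a fixed-effects, inverse-variance-weighted meta-analysis for a single variant. The weighting scheme is an assumption made for this example; the README does not state which scheme the pipeline uses, and the function name `ivw_meta` and the effect sizes are hypothetical. Under this scheme, pooling some cohorts first and then combining the pooled estimate with the rest gives the same result as pooling all cohorts at once.

```python
import math


def ivw_meta(estimates):
    """Combine (beta, standard_error) pairs by inverse-variance weighting.

    Returns the pooled (beta, standard_error) for one variant.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    return beta, math.sqrt(1.0 / total_weight)


# Hypothetical per-cohort effects (beta, SE) for one variant.
cohort_a = (0.10, 0.05)
cohort_b = (0.20, 0.10)
additional = (0.15, 0.08)

# Stage 1: pool the first group of cohorts.
stage_one = ivw_meta([cohort_a, cohort_b])
# Stage 2: combine the pooled estimate with the additional cohort.
two_stage = ivw_meta([stage_one, additional])
# Reference: pool everything in a single step.
one_stage = ivw_meta([cohort_a, cohort_b, additional])
```

This equivalence is what allows cohorts that provide only summary statistics to be added after the individual-level cohorts have been meta-analyzed.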

Data Availability

Meta-analyzed summary statistics excluding 23andMe will be available for download from the PGC as “PGC MDD No UKB / No 23andMe”. Results including 23andMe will be available by contacting the PGC Data Access Committee.

Code availability

Code listed for this project can be cloned from https://github.com/psychiatric-genomics-consortium/mdd-rmUKBB

Requirements

Analysts

Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium

License

This project is licensed under the MIT License; see the LICENSE.md file for details.

Acknowledgments

Contact

References