Variants associated with Body Mass Index (BMI)

Start by loading {gwasrapidd}, {dplyr} and {tidyr}:

library(gwasrapidd)
library(dplyr)
library(tidyr)

Let’s say you want to retrieve all variants associated with the phenotype Body Mass Index (BMI) and sort them by their risk allele (minor allele), effect size (beta coefficient) and p-value.

First, we find the Experimental Factor Ontology (EFO) identifier(s) corresponding to BMI. To do this, we download all traits in the GWAS Catalog.

all_traits <- get_traits()

Then look for ‘BMI’ in the trait description column.

dplyr::filter(all_traits@traits, grepl('BMI', trait, ignore.case = TRUE))
#> # A tibble: 10 × 3
#>    efo_id      trait                                          uri                                 
#>    <chr>       <chr>                                          <chr>                               
#>  1 EFO_0005937 longitudinal BMI measurement         
#>  2 EFO_0007737 BMI-adjusted adiponectin measurement 
#>  3 EFO_0007788 BMI-adjusted waist-hip ratio         
#>  4 EFO_0007789 BMI-adjusted waist circumference     
#>  5 EFO_0007793 BMI-adjusted leptin measurement      
#>  6 EFO_0008036 BMI-adjusted fasting blood glucose measurement
#>  7 EFO_0008037 BMI-adjusted fasting blood insulin measurement
#>  8 EFO_0008038 BMI-adjusted hip bone size           
#>  9 EFO_0008039 BMI-adjusted hip circumference       
#> 10 EFO_0011044 BMI-adjusted neck circumference      

So there are several phenotypes whose description includes the keyword ‘BMI’. However, only the EFO trait 'EFO_0005937' (‘longitudinal BMI measurement’) really corresponds to BMI as a phenotypic trait. All other traits are adjusted for BMI but are not BMI traits per se (you can further confirm this by looking at each trait description, just by opening your web browser with each respective URI).
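The keyword search and the manual narrowing described above can be tried offline on a small stand-in table. The sketch below is only illustrative: `toy_traits` is a hypothetical data frame holding three of the trait names from the output above, and the second filter on the word 'adjusted' is an assumption about how one might automate the exclusion, not something the package does for you.

```r
library(dplyr)

# Hypothetical stand-in for all_traits@traits, using three rows from the output above.
toy_traits <- data.frame(
  efo_id = c("EFO_0005937", "EFO_0007788", "EFO_0007789"),
  trait  = c("longitudinal BMI measurement",
             "BMI-adjusted waist-hip ratio",
             "BMI-adjusted waist circumference"),
  stringsAsFactors = FALSE
)

# Case-insensitive match: the lower-case pattern still finds 'BMI'.
bmi_like <- dplyr::filter(toy_traits, grepl("bmi", trait, ignore.case = TRUE))

# Assumed extra step: drop traits that are merely adjusted for BMI.
bmi_only <- dplyr::filter(bmi_like, !grepl("adjusted", trait, fixed = TRUE))

bmi_only$efo_id
```

A text filter like this is a convenience only; checking each trait's URI, as suggested above, remains the more reliable confirmation.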

To get statistical association data for the trait ‘longitudinal BMI measurement’ ('EFO_0005937'), as well as associated variants and effect sizes, we use the gwasrapidd get_associations() function:

bmi_associations <- get_associations(efo_id = 'EFO_0005937')

The S4 object bmi_associations contains several tables, namely 'associations', 'loci', 'risk_alleles', 'genes', 'ensembl_ids' and 'entrez_ids':

slotNames(bmi_associations)
#> [1] "associations" "loci"         "risk_alleles" "genes"        "ensembl_ids"  "entrez_ids"
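For readers less familiar with S4 objects, the following minimal sketch shows how slots are listed and accessed. `ToyAssociations` is a hypothetical class invented for this illustration (it is not the class gwasrapidd defines), and its contents are made up.

```r
library(methods)

# Hypothetical toy class with two table-like slots; NOT the gwasrapidd class.
setClass(
  "ToyAssociations",
  representation(associations = "data.frame", risk_alleles = "data.frame")
)

toy <- new(
  "ToyAssociations",
  associations = data.frame(association_id = "a1", pvalue = 1e-6,
                            stringsAsFactors = FALSE),
  risk_alleles = data.frame(association_id = "a1", risk_allele = "A",
                            stringsAsFactors = FALSE)
)

# List the tables held by the object.
toy_slots <- slotNames(toy)

# The '@' operator extracts a single table.
toy_alleles <- toy@risk_alleles
```

The same `@` accessor is what the code in this section uses on `bmi_associations` to reach its tables.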

From table 'associations' we can extract the variables 'pvalue', 'beta_number', 'beta_unit' and 'beta_direction', whereas from table 'risk_alleles' we can obtain 'variant_id' and 'risk_allele'.

We extract all these variables and combine them into one single dataframe (bmi_variants), using 'association_id' as the matching key:

tbl01 <- dplyr::select(bmi_associations@risk_alleles, association_id, variant_id, risk_allele)
tbl02 <- dplyr::select(bmi_associations@associations, association_id, pvalue, beta_number, beta_unit, beta_direction)

bmi_variants <- dplyr::left_join(tbl01, tbl02, by = 'association_id') %>%
  tidyr::drop_na() %>%
  dplyr::arrange(variant_id, risk_allele)
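To see what each step of this pipeline does, here is a small self-contained sketch on made-up data. The tables `toy_alleles` and `toy_assoc` and all their values are hypothetical; only the column names mirror those used above.

```r
library(dplyr)
library(tidyr)

# Made-up risk allele table; one risk allele is missing.
toy_alleles <- data.frame(
  association_id = c("a1", "a2", "a3", "a4"),
  variant_id     = c("rs2", "rs1", "rs3", "rs1"),
  risk_allele    = c("G", "A", NA, "C"),
  stringsAsFactors = FALSE
)

# Made-up association table; one effect size is missing.
toy_assoc <- data.frame(
  association_id = c("a1", "a2", "a3", "a4"),
  pvalue         = c(1e-6, 2e-8, 5e-7, 3e-7),
  beta_number    = c(0.05, NA, 0.10, 0.20),
  stringsAsFactors = FALSE
)

toy_variants <- dplyr::left_join(toy_alleles, toy_assoc, by = "association_id") %>%
  tidyr::drop_na() %>%                      # drops a2 (no beta) and a3 (no risk allele)
  dplyr::arrange(variant_id, risk_allele)   # rs1 sorts before rs2

toy_variants
```

Note that `tidyr::drop_na()` with no arguments removes a row if any column is missing, so associations lacking a reported beta coefficient are excluded from the result.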

The final result contains 42 associations. Note that some variant/allele combinations might be repeated, as the same combination might have been assessed in more than one GWAS.

bmi_variants
#> # A tibble: 42 × 7
#>    association_id variant_id  risk_allele     pvalue beta_number beta_unit beta_direction
#>    <chr>          <chr>       <chr>            <dbl>       <dbl> <chr>     <chr>         
#>  1 10066323       rs10041997  A           0.0000004        0.138 unit      increase      
#>  2 61729722       rs10070777  A           0.000004         0.879 unit      increase      
#>  3 61729714       rs10278819  C           0.000002         0.873 unit      increase      
#>  4 61729787       rs10426669  G           0.000006         0.765 unit      increase      
#>  5 61729731       rs1048163   G           0.000005         0.450 unit      increase      
#>  6 61729771       rs1048164   A           0.000009         0.436 unit      increase      
#>  7 55309232       rs10515235  A           0.000002         0.05  unit      increase      
#>  8 55309249       rs10938397  G           0.00000003       0.06  unit      increase      
#>  9 61729763       rs112045010 C           0.000005         0.906 unit      increase      
#> 10 61729759       rs11979775  C           0.000005         0.501 unit      increase      
#> # … with 32 more rows
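Two follow-up steps may be useful on a table of this shape: spotting repeated variant/allele combinations, and re-sorting by effect size and p-value. The sketch below uses made-up rows in a hypothetical `toy_variants` data frame; the choice to sort by descending beta and then ascending p-value is an assumption about what ordering is wanted.

```r
library(dplyr)

# Made-up rows mimicking some columns of bmi_variants.
toy_variants <- data.frame(
  association_id = c("a1", "a2", "a3", "a4"),
  variant_id     = c("rs1", "rs1", "rs2", "rs3"),
  risk_allele    = c("A", "A", "G", "C"),
  pvalue         = c(1e-6, 3e-8, 5e-7, 2e-6),
  beta_number    = c(0.10, 0.12, 0.50, 0.05),
  stringsAsFactors = FALSE
)

# Variant/allele pairs that appear in more than one association.
repeated <- toy_variants %>%
  dplyr::count(variant_id, risk_allele, name = "n_associations") %>%
  dplyr::filter(n_associations > 1)

# Largest effect first; ties broken by the smallest p-value.
by_effect <- toy_variants %>%
  dplyr::arrange(dplyr::desc(beta_number), pvalue)
```

Applied to the real table, the same calls would take `bmi_variants` in place of `toy_variants`.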