4.4. Finding the subset of SNPs within DHSes¶

In this example, we would like to identify the set of SNPs that are within a DHS, printing out both the SNP element and the DHS it is contained within.

4.4.1. BEDOPS tools in use¶

We use bedmap to answer this question, as it traverses a reference BED file (in this example, SNPs), and identifies overlapping elements from the mapping BED file (in this example, DHSs).

4.4.2. Script¶

SNPs are in a BED-formatted file called SNPs.bed sorted lexicographically with sort-bed. The DNase-hypersensitive sites are stored in a sorted BED-formatted file called DHSs.bed. These two files are available in the Downloads section.

bedmap --skip-unmapped --echo --echo-map SNPs.bed DHSs.bed \
  > subsetOfSNPsWithinAssociatedDHS.bed

4.4.3. Discussion¶

The output of this bedmap statement might look something like this:

chr1    10799576    10799577    rs12.4.378  Systolic_blood_pressure Cardiovascular|chr1 10799460    10799610    MCV-1   9.18063

The output is delimited by pipe symbols (|), showing the reference element (SNP) and the mapped element (DHS).

If multiple elements are mapped onto a single reference element, the mapped elements are further separated by semicolons, by default.

4.4.4. Downloads¶

SNP elements
DNase-hypersensitive elements

The bedmap tool can operate directly on Starch-formatted archives. Alternatively, use the unstarch tool to decompress Starch data files to sorted BED format.

BEDOPS v2.4.37

4.4. Finding the subset of SNPs within DHSes¶

4.4.1. BEDOPS tools in use¶

4.4.2. Script¶

4.4.3. Discussion¶

4.4.4. Downloads¶

Table of Contents