7. Summary¶

These tables summarize BEDOPS utilities by option, file inputs and BED column requirements.

7.1. Set operation and statistical utilities¶

7.1.1. `bedextract`¶

Efficiently extracts features from BED input.
BEDOPS bedextract documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
`--list-chr`	Print every chromosome found in `input.bed`	1	1	3
`<chromosome>`	Retrieve all rows for specified chromosome, e.g. `bedextract chr8 input.bed`	1	1	3
`<query> <reference>`	Grab elements of `query` that overlap elements in reference. Same as `bedops -e -1 query reference`, except that this option fails when `query` contains fully-nested BED elements. May use `-` to indicate `stdin` for `reference` only.	2	2	3

7.1.2. `bedmap`¶

Maps source signals from map-file onto qualified target regions from ref-file. Calculates an output for every ref-file element.
BEDOPS bedmap documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
`--bases`	Reports the total number of bases from `map-file` that overlap the `ref-file` ‘s element.	1	2	3
`--bases-uniq`	Reports the number of distinct bases from `ref-file` ‘s element overlapped by elements in `map-file`.	1	2	3
`--bases-uniq-f`	Reports the fraction of distinct bases from `ref-file` ‘s element elements in `map-file`.	1	2	3
`--bp-ovr <int>`	Require `<int>` bases of overlap between elements of input files.	1	2	3
`--chrom <chromosome>`	Process data for given `<chromosome>` only.	1	2	3
`--count`	Reports the number of overlapping elements in `map-file`.	1	2	3
`--cv`	Reports the Coefficient of Variation: the result of `--stdev` divided by the result of `--mean`.	1	2	5
`--ec`	Error-check all input files (slower).	1	2	3
`--echo`	Echo each line from `ref-file`.	1	2	3
`--echo-map`	Reports the overlapping elements found in `map-file`.	1	2	3
`--echo-map-id`	Reports the IDs (4th column) from overlapping `map-file` elements.	1	2	4
`--echo-map-id-uniq`	List unique IDs from overlapping `map-file` elements.	1	2	4
`--echo-map-range`	Reports the genomic range of overlapping elements from `map-file`.	1	2	3
`--echo-map-score`	Reports the scores (5th column) from overlapping `map-file` elements.	1	2	5
`--echo-map-size`	Calculates difference between start and stop coordinates (or size) of each mapped element.	1	2	3
`--echo-overlap-size`	Calculates size of overlap between each mapped element and its reference element.	1	2	3
`--echo-ref-name`	Reports the first 3 fields of `ref-file` element in chrom:start-end format.	1	2	3
`--echo-ref-size`	Reports the length of the `ref-file` element.	1	2	3
`--faster`	(Advanced) Strong input assumptions are made. Review documents before use. Compatible with `--bp-ovr` and `--range` overlap options only.	1	2	5
`--fraction-ref <val>`	The fraction of the element’s size from `ref-file` that must overlap the element in `map-file`. Expects `0 < val <= 1`.	1	2	5
`--fraction-map <val>`	The fraction of the element’s size from `map-file` that must overlap the element in `ref-file`. Expects `0 < val <= 1`.	1	2	5
`--fraction-both <val>`	Both `--fraction-ref <val>` and `--fraction-map <val>` must be true to qualify as overlapping. Expects `0 < val <= 1`.	1	2	5
`--fraction-either <val>`	Both `--fraction-ref <val>` and `--fraction-map <val>` must be true to qualify as overlapping. Expects `0 < val <= 1`.	1	2	5
`--exact`	Shorthand for `--fraction-both 1`. First three fields from `map-file` must be identical to `ref-file` element.	1	2	5
`--indicator`	Reports the presence of one or more overlapping elements in `map-file` as a binary value (`0` or `1`).	1	2	3
`--kth <val>`	Reports the value at the k th fraction. A generalized median-like calculation, where `--kth 0.5` is the median. (`0 < val <= 1`)	1	2	5
`--mad <mult=1>`	Reports the ‘median absolute deviation’ of overlapping elements in `map-file`, multiplied by `<mult>`.	1	2	5
`--max`	Reports the highest score from overlapping elements in `map-file`.	1	2	5
`--max-element`	The lexicographically “smallest” element with the highest score from overlapping elements in `map-file`. If no overlapping element exists, `NAN` is reported (unless `--skip-unmapped` is used).	1	2	5
`--max-element-rand`	A randomly-chosed element with the highest score from overlapping elements in `map-file`. If no overlapping element exists, `NAN` is reported (unless `--skip-unmapped` is used).	1	2	5
`--mean`	Reports the average score from overlapping elements in `map-file`.	1	2	5
`--median`	Reports the median score from overlapping elements in `map-file`.	1	2	5
`--min`	Reports the lowest score from overlapping elements in `map-file`.	1	2	5
`--min-element`	The lexicographically “smallest” element with the lowest score from overlapping elements in `map-file`. If no overlapping element exists, `NAN` is reported (unless `--skip-unmapped` is used).	1	2	5
`--min-element-rand`	A randomly-chosed element with the lowest score from overlapping elements in `map-file`. If no overlapping element exists, `NAN` is reported (unless `--skip-unmapped` is used).	1	2	5
`--skip-unmapped`	Omits printing reference elements which do not associate with any mapped elements.	1	2	3
`--stdev`	Reports the square root of the result of `--variance`.	1	2	5
`--sum`	Reports the accumulated value from scores of overlapping elements in `map-file`.	1	2	5
`--sweep-all`	Reads through entire `map-file` dataset to avoid early termination that may cause SIGPIPE or other I/O errors.	1	2	3
`--tmean <low> <hi>`	Reports the mean score from overlapping elements in `map-file`, after ignoring the bottom `<low>` and top `<hi>` fractions of those scores. (`0 <= low <= 1`, `0 <= hi <= 1`, `low + hi <= 1`).	1	2	5
`--variance`	Reports the variance of scores from overlapping elements in `map-file`.	1	2	5

7.1.3. `bedops`¶

Offers set and multiset operations for files in BED format.
BEDOPS bedops documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
`--chrom <chromosome>`	Process data for given `chromosome` only.	1	No imposed limit	3
`--complement`, `-c`	Reports the intervening intervals between the input coordinate segments.	1	No imposed limit	3
`--chop`, `-w`	Breaks up merged regions into fixed-size chunks, optionally anchored on start coordinates a fixed distance apart.	1	No imposed limit	3
`--difference`, `-d`	Reports the intervals found in the first file that are not present in any other input file.	2	No imposed limit	3
`--ec`	Error-check input files (slower).	1	No imposed limit	3
`--element-of`, `-e`	Reports rows from the first file that overlap, by a specified percentage or number of base pairs, the merged segments from all other input files.	2	No imposed limit	3
`--header`	Accept headers (VCF, GFF, SAM, BED, WIG) in any input file.	1	No imposed limit	3
`--intersect`, `-i`	Reports the intervals common to all input files.	2	No imposed limit	3
`--merge`, `-m`	Reports intervals from all input files, after merging overlapping and adjoining segments.	1	No imposed limit	3
`--not-element-of`, `-n`	Reports exactly everything that `--element-of` does not, given the same overlap criterion.	2	No imposed limit	3
`--partition`, `-p`	Reports all disjoint intervals from all input files. Overlapping segments are cut up into pieces at all segment boundaries.	1	No imposed limit	3
`--range L:R`	Add `L` bases to all start coordinates and `R` base to end coordinates. Either value may be positive or negative to grow or shrink regions, respectively. With the `-e` or `-n` operation, the first (reference) file is not padded, unlike all other files.	1	No imposed limit	3
`--range S`	Pad input file(s) coordinates symmetrically by `S` bases. This is shorthand for `--range -S:S`.	1	No imposed limit	3
`--symmdiff`, `-s`	Reports the intervals found in exactly one input file.	2	No imposed limit	3
`--everything`, `-u`	Reports the intervals from all input files in sorted order. Duplicates are retained in the output.	1	No imposed limit	3

7.1.4. `closest-features`¶

For every element in input-file, find those elements in query-file nearest to its left and right edges.
BEDOPS closest-features documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	2	2	3
`--chrom <chromosome>`	Process data for given `<chromosome>` only.	2	2	3
`--dist`	Output includes the signed distances between the `input-file` element and the closest elements in `query-file`.	2	2	3
`--ec`	Error-check all input files (slower).	2	2	3
`--no-overlaps`	Do not consider elements that overlap. Overlapping elements, otherwise, have highest precedence.	2	2	3
`--no-ref`	Do not echo elements from `input-file`.	2	2	3
`--closest`	Choose the nearest element from `query-file` only. Ties go to the leftmost closest element.	2	2	3

7.2. Sorting¶

7.2.1. `sort-bed`¶

Sorts input BED file(s) into the order required by other utilities. Loads all input data into memory.
BEDOPS sort-bed documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	1	1000	3
`--max-mem <val>`	`<val>` specifies the maximum memory usage for the sort-bed process, which is useful for very large BED inputs. For example, `--max-mem` may be `8G`, `8000M`, or `8000000000` to specify 8 GB of memory.	1	1000	3
`--unique`	Report unique elements (those which only occur once) in output.	1	1000	3
`--duplicates`	Report duplicate elements (those which occur 2+ times) in output.	1	1000	3

7.3. Compression and extraction¶

7.3.1. `starch`¶

Lossless compression of any BED file.
BEDOPS starch documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	1	1	3
`--bzip2` or `--gzip`	The internal compression method. The default `--bzip2` method favors storage efficiency, while `--gzip` favors compression and extraction time performance.	1	1	3
`--note="foo bar..."`	Append note to output archive metadata (optional).	1	1	3
`--report-progress=N`	Write progress to standard error stream for every N input elements.	1	1	3

7.3.2. `unstarch`¶

Extraction of a starch archive or attributes.
BEDOPS unstarch documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	1	1	NA
`--archive-type`	Show archive’s compression type (either `bzip2` or `gzip`).	1	1	NA
`--archive-version`	Show archive version (at this time, either 1.x or 2.x).	1	1	NA
`--archive-timestamp`	Show archive creation timestamp (ISO 8601 format).	1	1	NA
`--bases <chromosome>`	Show total, non-unique base counts for optional `<chromosome>` (omitting `<chromosome>` shows total non-unique base count).	1	1	NA
`--bases-uniq <chromosome>`	Show unique base counts for optional `<chromosome>` (omitting `<chromosome>` shows total, unique base count).	1	1	NA
`<chromosome>`	Decompress information for a single `<chromosome>` only.	1	1	NA
`--duplicatesExist` or `--duplicatesExistAsString` with `<chromosome>`	Report if optional `<chromosome>` or chromosomes contain duplicate elements as 0/1 numbers or false/true strings	1	1	NA
`--elements <chromosome>`	Show element count for optional `<chromosome>` (omitting `<chromosome>` shows total element count).	1	1	NA
`--elements-max-string-length`	Show element maximum string length for optional `<chromosome>` (omitting `<chromosome>` shows maximum string length over all chromosomes).	1	1	NA
`--is-starch`	Test if the <starch-file> is a valid starch archive, returning 0/1 for a false/true result	1	1	NA
`--list` or `--list-json`	Print the metadata for a `starch` file, either in tabular form or with JSON formatting.	1	1	NA
`--list-chr` or `--list-chromosomes`	List all chromosomes in `starch` archive (similar to `bedextract --list-chr`).	1	1	NA
`--nestedsExist` or `--nestedsExistAsString` with `<chromosome>`	Report if optional `<chromosome>` or chromosomes contain nested elements as 0/1 numbers or false/true strings	1	1	NA
`--note`	Show descriptive note (if originally added to archive).	1	1	NA
`--signature` with `<chromosome>`	Show SHA-1 signature of specified chromosome (Base64-encoded) or all signatures if chromosome is not specified.	1	1	NA
`--verify-signature` with `<chromosome>`	Compare SHA-1 signature of specified chromosome with signature that is stored in the archive metadata, reporting error is mismatched.	1	1	NA

7.3.3. `starchcat`¶

Merge multiple starch archive inputs into one starch archive output.
BEDOPS starchcat documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	1	No imposed limit	NA
`--bzip2` or `--gzip`	The internal compression method. The default `--bzip2` method favors storage efficiency, while `--gzip` favors compression and extraction time performance.	1	No imposed limit	NA
`--note="foo bar..."`	Append note to output archive metadata (optional).	1	No imposed limit	NA
`--report-progress=N`	Write progress to standard error stream for every N input elements.	1	No imposed limit	NA

7.3.4. `starchstrip`¶

Extract or filter a starch archive by one or more specified chromosome names.
BEDOPS starchstrip documentation.

option	description	min. file inputs	max. file inputs	min. BED columns
(no option)	NA	1	No imposed limit	NA
`--include` or `--exclude` with <chromosomes>	Writes output with inclusion or exclusion of specified chromosome name records (comma-delimited string).	NA	No imposed limit	NA

BEDOPS v2.4.41

7. Summary¶

7.1. Set operation and statistical utilities¶

7.1.1. `bedextract`¶

7.1.2. `bedmap`¶

7.1.3. `bedops`¶

7.1.4. `closest-features`¶

7.2. Sorting¶

7.2.1. `sort-bed`¶

7.3. Compression and extraction¶

7.3.1. `starch`¶

7.3.2. `unstarch`¶

7.3.3. `starchcat`¶

7.3.4. `starchstrip`¶

Table of Contents

7. Summary¶

7.1. Set operation and statistical utilities¶

7.1.1. bedextract¶

7.1.2. bedmap¶

7.1.3. bedops¶

7.1.4. closest-features¶

7.2. Sorting¶

7.2.1. sort-bed¶

7.3. Compression and extraction¶

7.3.1. starch¶

7.3.2. unstarch¶

7.3.3. starchcat¶

7.3.4. starchstrip¶

7.1.1. `bedextract`¶

7.1.2. `bedmap`¶

7.1.3. `bedops`¶

7.1.4. `closest-features`¶

7.2.1. `sort-bed`¶

7.3.1. `starch`¶

7.3.2. `unstarch`¶

7.3.3. `starchcat`¶

7.3.4. `starchstrip`¶