⚠️ This is a short blog post, more of a general overview of bcftools. For those unfamiliar with it, bcftools is a suite of tools for working with files in the Variant Call Format (VCF) and the Binary Variant Call Format (BCF), the binary counterpart of VCF.
ℹ️ VCF and BCF files are used to store genetic variation data. As one might expect, bcftools is widely used in genomics and bioinformatics for tasks such as calling, filtering, and annotating variants.
ℹ️ bcftools is a command-line tool that runs on Linux and macOS. On Windows, it is typically run through a Unix-like layer such as the Windows Subsystem for Linux (WSL).
👨🏼🏫 My BCFTOOLS Tutorials
Here is the list of bcftools tutorials that I have created so far:
bcftools query: How do you extract specific fields from a VCF file into a text file?
How to Add/Remove/Annotate VCF Columns and Corresponding Field bcftools annotate: Concrete Examples
How To Combine Multiple VCF/BCF Files Into a Single VCF File Using the Bcftools Concat Command?
🧬👩🏽💻 A Guide to Merging VCF and BCF Files: Your bcftools merge Tutorial
What’s the Difference Between Bcftools Concat and Bcftools Merge?
🤖 What Does bcftools Do?
bcftools provides a range of capabilities for manipulating and analyzing VCF and BCF files, including, among other things:
1️⃣ Converting between VCF and BCF formats
2️⃣ Viewing and filtering variant data stored in VCF and BCF files
3️⃣ Performing data manipulation operations like merging and intersecting variant sets
4️⃣ Annotating variant data with additional information, such as gene and functional impact
5️⃣ Performing population-level analyses, such as calculating allele frequencies and Hardy-Weinberg equilibrium
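To make the first two capabilities a bit more concrete, here is a minimal sketch rather than a definitive recipe. The file names (demo.vcf, demo.bcf, demo.tsv) and the three toy records are made up for illustration, and the bcftools steps are skipped if bcftools is not on your PATH.

```shell
# Build a tiny, made-up VCF (2 samples, 3 records) so the example is self-contained.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr1\t200\t.\tC\tT\t10\tPASS\t.\tGT\t0/0\t0/1\n'
  printf 'chr1\t300\t.\tG\tGA\t80\tPASS\t.\tGT\t0/1\t0/0\n'
} > demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # 1) Convert VCF to compressed BCF (-Ob = BCF output).
  bcftools view -Ob -o demo.bcf demo.vcf

  # 2) View and filter: keep only SNPs with QUAL >= 20, without the header (-H).
  bcftools view -H -v snps -i 'QUAL>=20' demo.bcf

  # 3) Extract selected fields plus per-sample genotypes into a tab-separated file.
  bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' demo.bcf > demo.tsv
else
  echo "bcftools not found on PATH; skipping the bcftools steps" >&2
fi
```

With these toy records, only the SNP at position 100 should survive the filter in step 2: the record at position 200 has a low QUAL, and the one at position 300 is an insertion.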
🖥️ BCFTOOLS Commands
As I mentioned above, bcftools is a suite of tools, or utilities, as the bcftools documentation calls them. At the time of writing, there are 22 bcftools commands, and they fall into three main groups:
1️⃣ indexing command
2️⃣ VCF/BCF manipulation commands, and
3️⃣ VCF/BCF analysis commands.
The indexing group contains only one command, bcftools index. To learn more about this command, check out my tutorial about bcftools index here.
The group of VCF/BCF manipulation commands contains 11 commands, all listed below:
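As a quick illustration of why indexing matters, here is a small sketch, assuming a made-up file called demo.vcf with three toy records. An index lets bcftools jump straight to a region instead of reading the whole file. The bcftools steps are skipped if bcftools is not installed.

```shell
# Build a tiny, made-up VCF so the example is self-contained.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr1\t200\t.\tC\tT\t10\tPASS\t.\tGT\t0/0\t0/1\n'
  printf 'chr1\t300\t.\tG\tGA\t80\tPASS\t.\tGT\t0/1\t0/0\n'
} > demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # Indexing needs a bgzip-compressed VCF (-Oz) or a BCF, not a plain-text VCF.
  bcftools view -Oz -o demo.vcf.gz demo.vcf

  # Create the index (CSI format by default); -f overwrites an existing one.
  bcftools index -f demo.vcf.gz

  # With the index in place, region queries (-r) work.
  bcftools view -H -r chr1:150-350 demo.vcf.gz
else
  echo "bcftools not found on PATH; skipping the bcftools steps" >&2
fi
```

The region query should return only the records at positions 200 and 300.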
bcftools annotate: annotate and edit VCF/BCF files
bcftools concat: concatenate VCF/BCF files from the same set of samples
bcftools convert: convert VCF/BCF files to different formats and back
bcftools isec: intersections of VCF/BCF files
bcftools merge: merge VCF/BCF files from non-overlapping sample sets
bcftools norm: left-align and normalize indels
bcftools plugin: user-defined plugins
bcftools query: transform VCF/BCF into user-defined formats
bcftools reheader: modify VCF/BCF header, change sample names
bcftools sort: sort VCF/BCF file
bcftools view: VCF/BCF conversion, view, subset, and filter VCF/BCF files
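Since concat and merge are the two commands people mix up most often, here is a rough sketch of the difference. The helper make_vcf, the sample names (S1, S2), and all file names are hypothetical and exist only for this example. The bcftools steps are skipped if bcftools is not installed.

```shell
# Hypothetical helper: write a one-record VCF for one sample.
# Usage: make_vcf SAMPLE POS GENOTYPE OUTFILE
make_vcf() {
  {
    printf '##fileformat=VCFv4.2\n'
    printf '##contig=<ID=chr1,length=1000>\n'
    printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t%s\n' "$1"
    printf 'chr1\t%s\t.\tA\tG\t50\tPASS\t.\tGT\t%s\n' "$2" "$3"
  } > "$4"
}

make_vcf S1 100 0/1 part1.vcf   # sample S1, first position
make_vcf S1 200 1/1 part2.vcf   # same sample S1, later position
make_vcf S2 100 0/0 other.vcf   # different sample S2, same site as part1

if command -v bcftools >/dev/null 2>&1; then
  # concat: same samples, records stacked one after another.
  bcftools concat -o concat.vcf part1.vcf part2.vcf

  # merge: different samples joined side by side; inputs must be
  # bgzip-compressed and indexed.
  for f in part1 other; do
    bcftools view -Oz -o "$f.vcf.gz" "$f.vcf"
    bcftools index -f "$f.vcf.gz"
  done
  bcftools merge -o merged.vcf part1.vcf.gz other.vcf.gz

  # List the samples in the merged file.
  bcftools query -l merged.vcf
else
  echo "bcftools not found on PATH; skipping the bcftools steps" >&2
fi
```

In this sketch, concat.vcf should hold two records for the single sample S1, while merged.vcf should hold one record with genotype columns for both S1 and S2.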
The group of VCF/BCF analysis commands contains 10 commands, all listed below:
bcftools call: SNP/indel calling
bcftools consensus: create a consensus sequence by applying VCF variants
bcftools cnv: HMM CNV calling
bcftools csq: call variation consequences
bcftools filter: filter VCF/BCF files using fixed thresholds
bcftools gtcheck: check sample concordance, detect sample swaps and contamination
bcftools mpileup: multi-way pileup producing genotype likelihoods
bcftools polysomy: detect number of chromosomal copies
bcftools roh: identify runs of autozygosity (HMM)
bcftools stats: produce VCF/BCF stats
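Two of the analysis commands are easy to try on a toy file, so here is a minimal sketch of bcftools filter and bcftools stats. The file names (demo.vcf, soft.vcf), the filter label LowQual, and the QUAL threshold of 20 are assumptions picked for illustration, not recommended values. The bcftools steps are skipped if bcftools is not installed.

```shell
# Build a tiny, made-up VCF so the example is self-contained.
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
  printf 'chr1\t200\t.\tC\tT\t10\tPASS\t.\tGT\t0/0\t0/1\n'
  printf 'chr1\t300\t.\tG\tGA\t80\tPASS\t.\tGT\t0/1\t0/0\n'
} > demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # Soft-filter: records with QUAL < 20 stay in the file but get
  # the label LowQual in the FILTER column.
  bcftools filter -s LowQual -e 'QUAL<20' -o soft.vcf demo.vcf

  # Show only the records that were tagged LowQual.
  bcftools view -H -f LowQual soft.vcf

  # Print the summary-number (SN) lines of the stats report.
  bcftools stats demo.vcf | grep '^SN'
else
  echo "bcftools not found on PATH; skipping the bcftools steps" >&2
fi
```

Here only the record at position 200 (QUAL 10) should end up tagged as LowQual.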
This is the first blog post in the series, and I wanted it to serve as a short introduction. In each following post, I will present one of the above commands with practical examples. As we go, I will probably also expand this article with additional information and explanations.