Channel: Frequently Asked Questions — GATK-Forum
Browsing all 97 articles

About "Frequently Asked Questions"

This section lists (and answers!) frequently asked questions. These documentation articles cover specific points of clarification about the following: details of how the GATK tools work and how they...

How can I turn on or customize forum notifications?

By default, the forum does not send notification messages about new comments or discussions. If you want to turn on notifications or customize the type of notifications you want, you need to do the...

Why didn't the Unified Genotyper call my SNP? I can see it right there in IGV!

Just because something looks like a SNP in IGV doesn't mean that it is of high quality. We are extremely confident in the genotype likelihood calculations in the Unified Genotyper (especially for...

What is a GATKReport?

A GATKReport is simply a text document that contains a well-formatted, easy-to-read representation of some tabular data. Many GATK tools output their results as GATKReports, so it's important to...

What is "Phone Home" and how does it affect me?

1. What it is and how it helps us improve the GATK Since September, 2010, the GATK has had a "phone-home" feature that sends us information about each GATK run via the Broad filesystem (within the...

How can I submit a patch to the GATK codebase?

The GATK is an open source project that has greatly benefited from the contributions of outside users. The GATK team welcomes contributions from anyone who produces useful functionality in line with...

Collected FAQs about VCF files

1. What file formats do you support for variant callsets? We support the Variant Call Format (VCF) for variant callsets. No other file formats are supported. 2. How can I know if my VCF file is valid?...

Collected FAQs about interval lists

1. What file formats do you support for interval lists? We support three types of interval lists, as mentioned here. Interval lists should preferentially be formatted as Picard-style interval lists,...

How does the GATK handle these huge NGS datasets?

Imagine a simple question like, "What's the depth of coverage at position A of the genome?" First, you are given billions of reads that are aligned to the genome but not ordered in any particular way...

Where can I get more information about next-generation sequencing concepts...

The following links should be helpful as a review or an introduction to concepts and terminology related to next-generation sequencing: DNA sequencing (Wikipedia) A basic review of the sequencing process....

Collected FAQs about BAM files

1. What file formats do you support for sequencer output? The GATK supports the BAM format for reads, quality scores, alignments, and metadata (e.g. the lane of sequencing, center of origin, sample...

Why are some of the annotation values different with VariantAnnotator...

As featured in this forum question. Two main things account for these kinds of differences, both linked to default behaviors of the tools: 1. The tools downsample to different depths of coverage 2. The...

What is Map/Reduce and why are GATK tools called "walkers"?

Overview One of the key challenges of working with next-gen sequence data is that input files are usually very large. We can’t just make the program open the files, load all the data into memory and...

What are the prerequisites for running GATK?

1. Operating system The GATK runs natively on most if not all flavors of UNIX, which includes MacOSX, Linux and BSD. It is possible to get it running on Windows using Cygwin, but we don't provide any...

How do I submit a detailed bug report?

Note: only do this if you have been explicitly asked to do so. Scenario: You posted a question about a problem you had with GATK tools, we answered that we think it's a bug, and we asked you to submit...

What is GATK-Lite and how does it relate to "full" GATK 2.x?

You probably know by now that GATK-Lite is a free-for-everyone and completely open-source version of the GATK (licensed under the original MIT license). But what's in the box? What can GATK-Lite do --...

Which datasets should I use for reviewing or benchmarking purposes?

New WGS and WEx CEU trio BAM files We have sequenced at the Broad Institute and released to the 1000 Genomes Project the following datasets for the three members of the CEU trio (NA12878, NA12891 and...

What should I use as known variants/sites for running tool X?

1. Notes on known sites Why are they important? Each tool uses known sites differently, but what is common to all is that they use them to help distinguish true variants from false positives, which is...

How can I use parallelism to make GATK tools run faster?

This document provides technical details and recommendations on how the parallelism options offered by the GATK can be used to yield optimal performance results. Overview As explained in the primer on...

How can I prepare a FASTA file to use as reference?

The GATK uses two files to access the reference and to safety-check that access: a .dict dictionary of the contig names and sizes, and a .fai FASTA index file to allow efficient random access to the...
