Thursday, October 18, 2012

UCSC Genome Browser: A few useful tips

Most of my research revolves around analyzing next-generation sequencing data such as whole-exome sequencing data.  As a statistician, I always appreciate finding bioinformatics tips that make me more efficient.  Here are three examples of using the UCSC Genome Browser that I've found helpful.

Tip #1:  How do you find a list of chromosome positions given a list of dbSNP identifiers? (Taken from the Guide to the UCSC Genome Browser FAQ by Nature)
Use the 'Variation and Repeats' group in the Table Browser and the SNPs track of your choice.  Specify the genome (e.g. Human) and assembly (e.g. hg19).  For 'region', there are two options: to get the chromosomal positions of all SNPs in a specific region, select 'position' and enter the region; to look up a list of dbSNP identifiers, select 'genome' and paste or upload the list of rs IDs under 'identifiers (names/accessions)'.  Finally, choose your output format (e.g. GTF, BED) and click 'get output'.
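
Once the output is saved, it can be read into R.  The lines below are only a rough sketch: the file name and the two rows are made up for illustration, and I am assuming a six-column BED file saved without a custom track header line.  Adjust the column names if your output looks different.

```r
# Made-up stand-in for a BED file saved from the Table Browser.
bed.file <- file.path(tempdir(), "snp_positions.bed")
writeLines(c("chr1\t999\t1000\trs0000001\t0\t+",
             "chr2\t4999\t5000\trs0000002\t0\t-"), bed.file)

# Read the tab-delimited BED file; BED has no header row.
snp.pos <- read.table(bed.file, sep = "\t", header = FALSE,
                      stringsAsFactors = FALSE,
                      col.names = c("chrom", "start", "end",
                                    "rsid", "score", "strand"))

# BED starts are 0-based and ends are exclusive, so for a single-base
# SNP the usual 1-based position is simply the end coordinate.
snp.pos$pos <- snp.pos$end
snp.pos[, c("rsid", "chrom", "pos")]
```

Note that this shortcut only holds for single-base SNPs; insertions and longer variants need both coordinates.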

Tip #2:  How do you lift over a set of genomic coordinates from hg18 to hg19? 
Use the Batch Coordinate Conversion (liftOver) tool in Utilities.  Select the original (hg18) and new (hg19) assemblies, upload the original genomic coordinates (in BED format), and submit.
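
If your positions live in R as ordinary 1-based coordinates, they need to be written out as BED before uploading.  Here is a minimal sketch in base R; the data frame, SNP names, and file name are hypothetical placeholders, not real data.

```r
# Hypothetical 1-based hg18 positions; replace with your own data.
snps <- data.frame(
  chrom = c("chr1", "chr2", "chrX"),
  pos   = c(1000000, 2500000, 300000),
  name  = c("snpA", "snpB", "snpC"),
  stringsAsFactors = FALSE
)

# BED intervals are 0-based and half-open, so a single base at 1-based
# position p becomes start = p - 1, end = p.  as.integer() keeps
# write.table() from printing values like 1e+06.
bed <- data.frame(
  chrom = snps$chrom,
  start = as.integer(snps$pos - 1),
  end   = as.integer(snps$pos),
  name  = snps$name,
  stringsAsFactors = FALSE
)

# Write a tab-delimited file with no header, quotes, or row names.
bed.file <- file.path(tempdir(), "hg18_positions.bed")
write.table(bed, file = bed.file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Keeping a name column makes it easy to match the converted hg19 coordinates back to the original records, since positions that fail to convert are dropped from the output.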

Tip #3: How can you extract data from the UCSC browser and use it in R? 
For this, we need to install the Bioconductor package rtracklayer.  Here is an example of how to extract recombination rates:

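library(rtracklayer)
# Open a session with the UCSC Genome Browser (needs an internet connection)
my.session <- browserSession()
# Select the hg19 assembly
genome(my.session) <- "hg19"
# Query the recombination rate track and download its table
recomb.rates <- getTable(ucscTableQuery(my.session, "recombRate"))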
