Genome Informatics Pre-Meeting (19th September 2011)

09.15 – 09.45
Managing Next-Gen Data and Process in Cancer Research
Accurate, complete human genome sequences provide a powerful method for identifying variants, both somatic and germline, involved in clinically relevant diseases and phenotypes. We have developed a technology platform and sequencing center capable of cost-effectively sequencing more than 600 human genomes per month at high depth (minimum 40x, typically much higher). To help discriminate causal variant(s) from the millions of other variants in any genome sequenced, we recently generated high-depth complete human genome data on 69 ethnically diverse non-tumor cell lines from the NIGMS and NHGRI (CEPH/Utah, HapMap and 1000 Genomes Project) collections.
This gender-balanced diversity panel includes individuals of European, Asian, and African descent, as well as admixed individuals. Bioinformatic analysis was performed using our local de novo assembly-based pipeline, which detects SNVs, indels, and substitutions in addition to copy number and structural variants. These data have all been released on ftp2.completegenomics.com, and we expect to make further updates and additions available. Comparisons against HapMap and 1000 Genomes Project data, in addition to validation by traditional methods, show high genotype concordance. The frequency and/or novelty of variations observed in a set of genomes of biomedical interest, for example from families or populations segregating for a particular phenotype, can be estimated by comparison with this public repository and other similar reference data sets.

17.00 – 17.30
Presenter: Matthew Keyser, DNASTAR
GWAS on a Desktop: Using DNASTAR Software to Assemble and Analyze Next-Gen Sequencing Data on Your Desktop Computer
Using DNASTAR Lasergene, we will present a desktop computer workflow that begins with next-generation DNA sequence assembly for one or many multiplexed samples on any next-gen platform; supports probabilistic identification of SNPs, small indels and genotype calls, with known variants correlated to their dbSNP IDs; includes identification of structural variations; allows review of SNPs from multiple samples within a single project; and, for large multi-sample projects with hundreds of individual data sets, highlights tools for SNP quantitation, filtering, set comparison and clustering as needed for Genome-Wide Association Studies (GWAS).

Day 1 – Cancer (20th September 2011)

14.30 – 15.00
Presenter: Boris Umylny, PhD, Head of Bioinformatics at GENEWIZ
Managing Next-Gen Data and Process in Cancer Research
With the increasing quantity and complexity of molecular data, bioinformatics has gained significant prominence and importance. Substantial resources have been invested in research and development of new algorithms, with a corresponding increase in our knowledge and understanding. Continual improvements in the efficiency of high-throughput sequencers have also begun to stress the basic infrastructure required to handle the massive quantities of data produced by modern devices. In this presentation we review these infrastructure issues and offer a scalable approach to developing the bioinformatic infrastructure required to handle modern high-throughput data.

17.00 – 17.30
“High-performing, simple targeted resequencing protocols are required for both high-throughput and benchtop sequencing instruments. HaloPlex next-generation PCR technology enables up to 100,000 PCRs to run in parallel in a single reaction tube. HaloPlex PCR directly incorporates sequencing primers and sample barcodes, and allows 48 samples per day to be processed manually, ready for pooling and sequencing. The technology eliminates the need for auxiliary instrumentation such as DNA shearing instruments, and requires only a standard PCR instrument.
In addition to the simple and scalable protocol, HaloPlex PCR provides superior specificity and high coverage. Data from an ongoing study involving amplification and deep sequencing of 181 genes in a cohort of 180 CLL patients will be presented. The second project to be presented involves a cohort of patients with hypertension, in which three regions spanning over 200 kbp have been identified in a GWAS and subsequently targeted with high coverage using HaloPlex PCR.”

Day 2 – Exome (21st September 2011)

09.15 – 09.45
Automated Liquid Handling for Next Generation Sequencing Sample Preparation Applications
Next-generation sequencing technologies are enabling science at unprecedented rates. As the cost of sequencing falls and throughput increases, the number of applications for which sequencing data can be used is also rapidly increasing. To support the growing numbers of samples and applications, Caliper has developed a pipeline that leverages automation and microfluidics to improve reproducibility, throughput and sample tracking. Key considerations for sample preparation applications for next-generation sequencing include minimal user intervention, elimination of cross-contamination and user error, and flexibility for future application support.