Tuesday, February 26, 2013
360 Binney Street
One Kendall Square, Building 1000
Cambridge, Massachusetts 02141
+1 (617) 444-5000
1:00 - 5:00pm
|NCBI Workshop: "Entrez and BLAST searching at NCBI"||Peter Cooper|
9:00 - 11:00am
|STN Workshop: "Searching Uncommon Sequences on STN"||Jim Brown||Endeavor Room at Marriott Cambridge|
12:00 - 2:00pm
|Minesoft Workshop: "Effective International Patent Searching on PatBase" (lunch provided)||Phil Ostanock||Endeavor Room at Marriott Cambridge|
2:00 - 4:00pm
|David Milward||Discovery Room at Marriott Cambridge|
Part 1: Using the Entrez System to find Biomolecular Data at the NCBI
Presented by Peter S. Cooper, Ph.D., The National Center for Biotechnology Information
In this workshop you will explore the NCBI Entrez integrated biomedical literature and molecular database system and learn how to use it effectively to find records of interest. You will learn the scope and content of the sequence databases (Nucleotide, GSS, EST, Protein and SRA). You will also explore aspects of the literature databases (PubMed and PubMed Central), with emphasis on the linkages between the literature and molecular data.
After learning about the types and sources of molecular sequence data (GenBank, RefSeq, SRA) and sequence-related Entrez information hubs such as Taxonomy, HomoloGene and Gene, you will see in a live demonstration how to use the Entrez interface, including filters, the advanced search page, and search strategies, to collect and download a specific set of records, to narrow the search, and to use the precomputed relationships available in the Entrez system to find related sequences, genomic regions, genomic maps, homologous genes and proteins, pathways and expression information.
In addition to the sequence databases, you will also learn how to access related information in HomoloGene, UniGene and Gene Expression Omnibus (GEO), how to use precomputed BLAST (BLink) to find homologous proteins and genes, and how to use the Graphical Sequence Viewer as a tool for exploring large and complex genomic sequences and their annotations.
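
The workshop itself demonstrates the Entrez web interface. As a rough programmatic counterpart, the sketch below builds a search URL for NCBI's E-utilities ESearch service, which queries the same Entrez databases. The helper name `build_esearch_url`, the example gene, and the choice of field tags are assumptions made for illustration and are not taken from the workshop material; the code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for an Entrez database and query (hypothetical helper)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Illustrative query (an assumption): human BRCA1 records in the Nucleotide
# database, whose E-utilities name is "nuccore". The bracketed field tags
# narrow the search in the same way as the filters on the web interface.
url = build_esearch_url("nuccore", "BRCA1[Gene Name] AND human[Organism]")
print(url)
```

Field tags play the role of the advanced search page here: each tag restricts a term to one indexed field, so the same query text can be pasted into the Entrez search box.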
Part 2: BLAST
Presented by David L. Osterbur, Ph.D., Harvard Medical School
The second part of this workshop will concentrate on the use of NCBI's BLAST for sequence searching and on some of the ancillary tools that are integrated with BLAST at NCBI but often go unnoticed. We will examine how BLAST functions and which changes to the settings "below the fold" on the NCBI BLAST pages make a difference in the results of sequence searching.
Searching for uncommon sequences can be a challenge for even the most experienced searchers. Often the only way to capture these uncommon sequences is to use the value-added features of the sequence databases on STN. This workshop will explain how to search sequences with uncommon amino acid residues, sequences that have been modified, and sequences that use variables for amino acid or nucleotide residues. This workshop is geared towards the intermediate-to-advanced sequence searcher, but any STN searcher with basic sequence searching skills will find it beneficial. This workshop is free, but registration is strongly encouraged.
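
To make the idea of variable residues concrete, here is a minimal sketch of matching a motif that contains variable positions against a protein sequence. It does not use STN query syntax and says nothing about how STN implements such searches; the convention that "X" stands for any single residue, the function names, and the example sequence are all assumptions made for illustration.

```python
import re

# Assumption for this sketch: "X" in a motif stands for any single residue.
ANY_RESIDUE = "X"


def motif_to_regex(motif):
    """Translate a motif such as 'CXXC' into a compiled regular expression."""
    parts = []
    for residue in motif.upper():
        # A variable position matches any letter; fixed residues match literally.
        parts.append("[A-Z]" if residue == ANY_RESIDUE else re.escape(residue))
    return re.compile("".join(parts))


def find_motif(motif, sequence):
    """Return (start_index, matched_text) for each non-overlapping match."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Made-up sequence, used only to exercise the function.
hits = find_motif("CXXC", "MKCPHCDKAFCAAC")
print(hits)  # [(2, 'CPHC'), (10, 'CAAC')]
```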
PatBase is a searchable patent database covering over 45 million patent families, with historical information dating back to the early 1900s. Our users rely on PatBase to offer them a robust platform from which to search, review, share and analyze business-critical patent information.
Advanced highlighting schemes, Keyword-in-Context displays and sophisticated image viewing options allow you to efficiently review large numbers of documents – a vital capability when performing broad biotechnology searches spanning several fields. An API is also available that delivers data sets from PatBase directly to you in XML format – ideal for use in text analytics tools. Cross-referencing with USGENE is possible, with direct links from the well-known genetic sequence database to corresponding PatBase Express records. In addition, PatBase can be searched by all the main classification schemes to pinpoint the concepts you are looking for. The new CPC has been implemented in PatBase, including fully searchable definitions that can be navigated using the Classification Finder tool.
Join this Workshop to discover more about the unique capabilities that PatBase offers for the professional patent searcher.
Advanced search interfaces provide a well-known set of techniques, such as truncation and Boolean operators, to filter down to a relevant set of patents. Agile text mining provides a more extensive set of filtering techniques. These include the use of terminologies to look for any document that mentions, for example, any immunosuppressant drug, any kinase or any cancer; regular expressions to look for any mention of a microRNA; chemical similarity or substructure search; and linguistic constructions that can accurately capture the many ways people might express a single concept. It also enables high-throughput searching, for example looking simultaneously for 500 specific genes to find the diseases they are known to relate to.
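
As a rough illustration of the regular-expression technique mentioned above, the following sketch finds microRNA mentions in free text. The pattern is an assumption written for this example: it covers only a few common naming forms (such as "miR-21", "hsa-miR-21-5p", "let-7a" and "microRNA-155") and is not the expression used by any particular text mining product.

```python
import re

# Illustrative pattern (an assumption), covering a few common microRNA name forms.
MIRNA_PATTERN = re.compile(
    r"\b(?:[a-z]{3}-)?"       # optional three-letter species prefix, e.g. "hsa-"
    r"(?:miR|microRNA|let)-"  # family keyword followed by a hyphen
    r"\d+[a-z]?"              # number with optional letter suffix, e.g. "7a"
    r"(?:-[35]p)?\b",         # optional arm suffix, e.g. "-5p"
    re.IGNORECASE,
)


def find_mirna_mentions(text):
    """Return every microRNA-like mention found in the text, in order."""
    return MIRNA_PATTERN.findall(text)


# Made-up sentence, used only to exercise the pattern.
sample = ("Expression of hsa-miR-21-5p and let-7a was reduced, "
          "while microRNA-155 was unchanged.")
mentions = find_mirna_mentions(sample)
print(mentions)  # ['hsa-miR-21-5p', 'let-7a', 'microRNA-155']
```

A single pattern like this catches a whole family of names that would otherwise have to be listed one by one with truncation and Boolean operators.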
We will discuss these techniques in the context of biotechnology patent search, and also show how text mining can be used to more systematically develop search strategies, including use of the data itself to suggest candidate terminology.
Workshop participants are encouraged to submit, two weeks before the PIUG meeting, queries that they find time-consuming or impossible with standard search engines.
Organizations hosting workshops must complete the PDF document for the 2013 Biotech Workshop Agreement.
Additional information is available under Sponsorships.