The Hitachi Software MiraiBio Group Blog

What’s going on at MiraiBio

Posted by aliu under SmartNote

NCBI BLAST, the Basic Local Alignment Search Tool, is a suite of programs designed to search sequence databases for similarities between a protein or DNA query and known sequences. BLAST quickly detects both close and distant sequence relationships and reports scores that let the user distinguish real matches from background hits with a high degree of statistical confidence.

Focusing on local alignments, BLAST uses a heuristic algorithm to detect relationships between sequences that may share only isolated regions of similarity. BLAST takes sequence length and the nucleotide or amino acid composition of the query into account when assigning alignment scores. For sequences shorter than about 200 residues, an effective length is used to compensate for “edge effects”. BLAST programs report the significance of each alignment as an E-value: the number of alignments with an equal or better score that would be expected to occur by chance in a search of that database. E-values are reported instead of the traditional P-value because differences between weak, low-scoring alignments are easier to read on the E-value scale; for significant matches (E < 0.01), the two values are nearly equal.
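To make the relationship between the two values concrete, here is a minimal Python sketch. It assumes the usual Poisson model for chance hits, under which the probability of seeing at least one chance alignment is P = 1 - exp(-E). The function name is ours, chosen for illustration; it is not part of any BLAST program.

```python
import math


def evalue_to_pvalue(e_value: float) -> float:
    """Convert an E-value to a P-value, assuming chance hits follow
    a Poisson distribution: P = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


# Print the two scales side by side for a range of E-values.
for e in (1e-5, 0.01, 1.0, 5.0, 10.0):
    print(f"E = {e:<8g} P = {evalue_to_pvalue(e):.6g}")
```

Running it shows why E-values are easier to read for weak hits: E-values of 5 and 10 both map to P-values very close to 1, while small E-values and their P-values are almost identical.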

For more detailed information on how BLAST scores are calculated, visit:

http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

For most first-time users of BLAST, choosing the right sub-program may be difficult. BLAST offers a variety of search tools for different types of queries. In general, the best choice of program depends upon the sequence length, the database being searched, and the information requested in the search.

Nucleotide BLAST is a collection of programs that compare a nucleotide query against nucleotide sequences in the database. BLAST accepts queries in a variety of formats, including FASTA, GenBank, and Accession/GI numbers, and compares them with the NCBI databases. MEGABLAST is a fast algorithm that concatenates queries to save database scanning time and is intended for closely related sequences longer than 28 bases. For shorter sequences, such as primers, standard nucleotide-nucleotide BLAST offers automatic parameter settings suited to these queries.

Protein BLAST is a collection of programs used to find protein sequences similar to a query. These programs accept sequences in the same formats as Nucleotide BLAST. PSI-BLAST is a position-specific, iterated algorithm that uses the sequences found in each round to build the scoring profile for the next round. It distinguishes between highly and weakly conserved positions in the sequence, which can increase sensitivity with each iteration. The companion PHI-BLAST program offers the option of including a regular-expression pattern in the search, allowing users to identify sequences that contain the pattern and are homologous to the query protein sequence. As with Nucleotide BLAST, Protein BLAST includes automatic parameter settings for shorter sequences.

Translating BLAST operates much like the nucleotide and protein searches, but translates one side of the comparison first. BLASTX translates a nucleotide query into protein sequences in all six reading frames before comparing it to a protein database. TBLASTN compares a protein query against a nucleotide database that is translated in all six reading frames during the search.
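If “six reading frames” is unfamiliar, the following Python sketch shows the idea: three frames on the given strand and three on its reverse complement. It is only an illustration using the standard genetic code and does not reproduce how BLASTX or TBLASTN are implemented. The function names and the `+1` to `-3` frame labels are our own choices.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Translate both strands at offsets 0, 1 and 2."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

Stop codons are shown as `*`. A real search compares every one of these translations against the database, which is why translated searches take longer than plain nucleotide or protein searches.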

Users can refer to the NCBI BLAST program selection guide for more information:
http://www.ncbi.nlm.nih.gov/blast/producttable.shtml
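As a quick summary of the programs described above, the choice can be reduced to two questions: what type of sequence is the query, and what type of sequence is in the database? The Python sketch below captures just that. It is a simplification for illustration (the lookup table and function name are ours) and it ignores the other factors mentioned earlier, such as sequence length, so treat the NCBI guide as the authority.

```python
# (query type, database type) -> program described in this post.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated
    ("protein", "nucleotide"): "tblastn",  # database is translated
}


def choose_program(query_type: str, database_type: str) -> str:
    """Return the basic BLAST program for a query/database pairing."""
    key = (query_type.lower(), database_type.lower())
    if key not in PROGRAMS:
        raise ValueError(f"No program known for {query_type} vs {database_type}")
    return PROGRAMS[key]


print(choose_program("nucleotide", "protein"))
```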

Users can access BLAST tools directly through the web, or through a variety of software applications, such as MiraiBio’s DNASIS SmartNote, which helps users find and organize sequences, and automatically submit them to the BLAST programs. DNASIS SmartNote has the additional ability to BLAST multiple sequences “in batch” without tediously copying/pasting each sequence and waiting for each result to come back.

To learn more about DNASIS SmartNote, visit http://smartnote.miraibio.com.

Posted by aliu under SmartNote

By default, DNASIS SmartNote will track PubMed articles related to up to 10 of your most recently imported sequences. You can view these regularly updated articles by clicking on the “Articles” tab. From here, you can also clip individual articles into your notebook, as well as print and email them.

If you want to receive regular email updates for new articles, you’ll need to explicitly mark at least one of your sequences listed in the “My Seq’s” tab. You will then start receiving regular emails with any newly published articles related to your sequences. Since you can only track 10 sequences, the sequences you have explicitly marked will get higher priority.

How does DNASIS SmartNote find related articles? If the sequence was imported as a GenBank file, DNASIS SmartNote looks for gene names in the annotations. Otherwise, it defaults to a plain text search of words in the sequence’s description.
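For the curious, the fallback logic described above could look something like the Python sketch below. This is not DNASIS SmartNote's actual code. The function name, the way gene names are combined with OR, and the word-splitting rule are all assumptions made purely for illustration.

```python
import re
from typing import List, Optional


def build_article_query(description: str,
                        gene_names: Optional[List[str]] = None) -> str:
    """Hypothetical helper: prefer annotated gene names, otherwise fall
    back to the plain words of the sequence description."""
    if gene_names:
        # Drop blanks and duplicates while keeping the original order.
        cleaned = (name.strip() for name in gene_names)
        unique = list(dict.fromkeys(name for name in cleaned if name))
        if unique:
            return " OR ".join(unique)
    # No usable gene names: plain text search on the description's words.
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", description)
    return " ".join(words)


print(build_article_query("Homo sapiens tumor protein p53 (TP53), mRNA", ["TP53"]))
print(build_article_query("Homo sapiens tumor protein p53 (TP53), mRNA"))
```

The practical takeaway is that GenBank imports with gene annotations tend to give more focused article lists than sequences that only carry a free-text description.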

We invite you to try out this new feature and let us know if we can do anything to improve it for you.

Feb-12-2008

Annotations

Posted by aliu under SmartNote

By popular demand, DNASIS SmartNote now supports sequence annotation. Annotations in GenBank files are automatically imported, and you can also add annotations manually by clicking the “Edit” link next to each sequence in your database, and specifying name, type, range and orientation.
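To show what those four fields amount to, here is a small Python sketch of an annotation record. The class and field names are hypothetical and do not reflect how DNASIS SmartNote stores annotations internally. The `location()` method writes the range in the GenBank feature style, where a reverse-strand feature is wrapped in `complement(...)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one manually entered annotation."""
    name: str
    feature_type: str               # e.g. "gene" or "CDS"
    start: int                      # 1-based, inclusive
    end: int                        # 1-based, inclusive
    orientation: str = "forward"    # "forward" or "reverse"

    def __post_init__(self):
        if not 1 <= self.start <= self.end:
            raise ValueError("range must satisfy 1 <= start <= end")
        if self.orientation not in ("forward", "reverse"):
            raise ValueError("orientation must be 'forward' or 'reverse'")

    def location(self) -> str:
        """Format the range as a GenBank-style location string."""
        span = f"{self.start}..{self.end}"
        return f"complement({span})" if self.orientation == "reverse" else span


print(Annotation("lacZ", "gene", 10, 50, "reverse").location())
```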

Posted by aliu under SmartNote

New this week:

1. By popular demand, you can now import GenBank and multi-GenBank files into SmartNote, including those exported from other programs, such as DNASIS Max and Vector NTI.

2. We have added a new option for displaying the predicted amino acid sequence aligned with a DNA sequence. Use the ExPASy Translate tool and specify “Includes Nucleotide Sequence” for the “Output Format” option. Thank you, Jessica Min, for this suggestion. This should help users who need to design primers for mutagenesis.

3. Support for IUPAC codes in sequences is also being added this week.
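For readers who have not worked with IUPAC codes before, the Python sketch below lists the standard nucleotide ambiguity codes and shows two simple checks. It illustrates the codes themselves, not how SmartNote handles them, and the function names are ours.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def invalid_positions(sequence: str) -> list:
    """Return (1-based position, character) for every non-IUPAC symbol."""
    return [
        (i, ch)
        for i, ch in enumerate(sequence, start=1)
        if ch.upper() not in IUPAC_NUCLEOTIDES
    ]


def degeneracy(sequence: str) -> int:
    """Count how many concrete DNA sequences an ambiguous one can represent."""
    count = 1
    for ch in sequence.upper():
        count *= len(IUPAC_NUCLEOTIDES[ch])
    return count


print(invalid_positions("ACGTX"))
print(degeneracy("ATRN"))
```

Degenerate primers are a common place to meet these codes: `R` stands for A or G and `N` for any base, so a primer written as `ATRN` covers eight concrete sequences.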

Please keep the feedback coming!

Posted by aliu under SmartNote

DNASIS SmartNote was recently released to a limited group of scientists and attracted some great feedback. Here’s a quick summary:

Some users said they needed more help getting started. This is not surprising, since DNASIS SmartNote’s design is innovative in two ways: it’s a meta-application (an application that uses other applications), and it’s the first bioinformatics tool we know of that combines a lab notebook, sequence analysis and social networking.

Others said they want to be able to import sequences in formats other than FASTA. I’m happy to report that GenBank support will soon be available on the live site.

Finally, we are starting to get requests to support specific workflows, such as designing primers for mutagenesis. One user suggested adding a view that shows both DNA and protein sequences. We would love to hear from other DNASIS SmartNote users so we know where to focus our efforts to best meet users’ needs. So please send us feedback or post to this shared blog. Together, we can make DNASIS SmartNote even better!