The Hitachi Software MiraiBio Group Blog

What’s going on at MiraiBio

Posted by aliu under SmartNote

NCBI BLAST, the Basic Local Alignment Search Tool, is a suite of programs designed to search the available sequence databases for similarities between a protein or DNA query and known sequences. BLAST quickly detects both close and distant sequence relationships and provides scores that let the user distinguish real matches from background hits with a high degree of statistical confidence.

Focusing on local alignments, BLAST uses a heuristic algorithm to detect relationships between sequences that may share only isolated regions of similarity. BLAST takes sequence length and the nucleotide/peptide composition of the query into account when assigning alignment scores. For sequences shorter than 200 residues, an effective length is used to compensate for “edge effects”. BLAST programs report the significance of each alignment as an E-value: the number of alignments with an equal or better score that would be expected by chance in a search of that database, so a lower E-value indicates a stronger match. E-values are reported instead of the traditional P-values to improve resolution between low-scoring alignments; for closely related sequences (P < 0.01), the two values are nearly equal.
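
To see why the two values converge for strong matches, here is a minimal Python sketch. It assumes the standard relationship given in the NCBI statistics tutorial linked below, P = 1 - e^(-E), which treats chance hits as Poisson-distributed. The function name is ours, chosen for illustration only.

```python
import math


def p_value_from_e_value(e_value: float) -> float:
    """Probability of at least one chance hit when e_value chance hits are expected.

    Assumes the Poisson relationship P = 1 - exp(-E).
    """
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


if __name__ == "__main__":
    for e in (0.001, 0.01, 1.0, 5.0, 10.0):
        print(f"E = {e:<6} P = {p_value_from_e_value(e):.6f}")
```

For small E the two numbers are almost identical, while E-values of 5 and 10 both map to P-values very close to 1, which is why E-values are easier to compare among weak hits.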

For more detailed information on how BLAST scores are calculated, visit:

http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

For most first-time users of BLAST, choosing the right sub-program may be difficult. BLAST offers a variety of search tools for different types of queries. In general, the best choice of program depends upon the sequence length, the database being searched, and the information requested in the search.

Nucleotide BLAST is a collection of programs that compare a query sequence against the nucleotide sequences in the database. BLAST accepts queries in a variety of formats, including FASTA, GenBank, and Accession/GI numbers, and compares them with the NCBI databases. MEGABLAST uses a query-concatenating algorithm to quickly align sequences longer than 28 residues. For shorter sequences, such as primers, standard nucleotide-nucleotide BLAST offers automatic parameter settings suited to these queries.

Protein BLAST is a collection of programs used to find protein sequences similar to a query. These programs accept sequences in the same formats as Nucleotide BLAST. PSI-BLAST is a position-specific, iterated algorithm that uses the sequences found in each round as the basis for scoring the sequences searched in the next round. It distinguishes between highly and weakly conserved positions in the sequence, resulting in increased sensitivity with each iteration. The related PHI-BLAST (Pattern-Hit Initiated BLAST) option lets users include a regular expression pattern in the search, so they can identify sequences that both contain the pattern and are homologous to the query protein sequence. As with Nucleotide BLAST, Protein BLAST includes automatic parameter settings for shorter sequences.

Translating BLAST operates in a similar fashion to both the nucleotide and protein search routines. BLASTX translates a nucleotide query into protein sequences in each of the 6 reading frames before comparing them to the protein databases. TBLASTN compares a protein query against a database of nucleotide sequences translated in each of the 6 reading frames.
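
For readers curious what “each of the 6 reading frames” means in practice, here is a rough Python sketch of a six-frame translation. It is not BLAST’s own code. It assumes the standard genetic code (NCBI translation table 1), and the function names are ours, for illustration only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered by T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon; '*' marks a stop, 'X' an unrecognized codon.

    A trailing partial codon (1 or 2 bases) is ignored.
    """
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Map frame number (+1, +2, +3, -1, -2, -3) to its translation."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])         # forward strand
        frames[-(offset + 1)] = translate(reverse[offset:])  # reverse strand
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translations("ATGGCCATTGTAATGGGCCGC").items()):
        print(f"{frame:+d}: {protein}")
```

A translated search compares each of these six protein strings, rather than the DNA itself, which is how it can find protein-level similarity that a nucleotide search would miss.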

Users can refer to the NCBI BLAST program selection guide for more information:
http://www.ncbi.nlm.nih.gov/blast/producttable.shtml
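
As a quick summary of the programs described above, the basic choice can be written as a small lookup on query type and database type. This is only a sketch of that one rule: `blastn` and `blastp` are the usual program names for Nucleotide and Protein BLAST, the function name is ours, and variants such as MEGABLAST and PSI-BLAST are left out because they depend on sequence length and search goals as well.

```python
# (query type, database type) -> basic BLAST program
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",   # Nucleotide BLAST
    ("protein", "protein"): "blastp",         # Protein BLAST
    ("nucleotide", "protein"): "blastx",      # translated query vs. protein database
    ("protein", "nucleotide"): "tblastn",     # protein query vs. translated database
}


def choose_blast_program(query_type: str, database_type: str) -> str:
    """Return the basic BLAST program for a query/database pairing."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"Unsupported combination: {query_type} vs. {database_type}")
    return BLAST_PROGRAMS[key]


if __name__ == "__main__":
    print(choose_blast_program("nucleotide", "protein"))
```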

Users can access BLAST tools directly through the web, or through a variety of software applications, such as MiraiBio’s DNASIS SmartNote, which helps users find and organize sequences, and automatically submit them to the BLAST programs. DNASIS SmartNote has the additional ability to BLAST multiple sequences “in batch” without tediously copying/pasting each sequence and waiting for each result to come back.

To learn more about DNASIS SmartNote, visit http://smartnote.miraibio.com.

Posted by aliu under SmartNote

Congratulations to James Galen and Jessica Min, who each won a $25 Amazon.com gift certificate in the Top Feedback contest. We value our users’ input, as it helps make the application better for everyone.

New this week:

1. You can now share sequences with colleagues via the new Friend’s Sequences tab. This will allow you to easily view or copy any sequence from anyone in your friends list.

2. Do you need to export your sequences? A new feature has been added that will export your sequences in FASTA format. If you have multiple sequences selected, they will all be in a single file in multi-FASTA format.
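
If you have not worked with multi-FASTA before, it is simply several FASTA records placed one after another in the same file, each starting with a `>` header line. The Python sketch below shows the general idea. It is not SmartNote’s export code, and the function name and the 60-character line width are our assumptions, so the exported file may be laid out differently.

```python
def to_multi_fasta(records, width=60):
    """Format (name, sequence) pairs as one multi-FASTA string.

    Each record gets a '>' header line followed by the sequence,
    wrapped at `width` characters per line (an assumed default).
    """
    lines = []
    for name, sequence in records:
        lines.append(f">{name}")
        # Split the sequence into fixed-width chunks.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example sequences.
    example = [("primer_fwd", "ATGGCCATTGTAATGG"), ("primer_rev", "GCGGCCCATTACAATG")]
    print(to_multi_fasta(example), end="")
```

Because every record carries its own header, other tools can read the whole file at once or split it back into individual sequences.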

The Top Feedback contest is still going, so please take a minute or two to tell us about your experience with SmartNote.

Posted by aliu under SmartNote

DNASIS SmartNote was recently released to a limited group of scientists and attracted some great feedback. Here’s a quick summary:

Some users said they needed more help getting started. This is not surprising, since DNASIS SmartNote’s design is innovative in two ways: it’s a meta-application (an application that uses other applications), and it’s also the first bioinformatics tool we know of that combines a lab notebook, sequence analysis, and social networking.

Others said they want to be able to import sequences in formats other than FASTA. I’m happy to report that GenBank support will soon be available on the live site.

Finally, we are starting to get requests to support specific workflows, such as designing primers for mutagenesis. One user suggested adding a view that shows both DNA and protein sequences. We would love to hear from other DNASIS SmartNote users so we know where to focus our efforts to best meet users’ needs. So please send us feedback or post to this shared blog. Together, we can make DNASIS SmartNote even better!