Documentation
User Guide

GPU-HMMER implements the hmmsearch portion of the HMMER sequence analysis suite. All other tools (hmmpfam, standard hmmsearch, etc.) remain available to the user and are unmodified. The GPU portion consists of cuda_hmmsearch and its helper utility hmmsort. In this guide we walk the user through the steps needed to successfully run the cuda_hmmsearch application.


Using GPU-HMMER

Before using cuda_hmmsearch, the user is urged to first sort the sequence database. Sorting is not strictly required, but performance will be quite poor without it. The sort only needs to be done once, and the sorted database may be reused many times thereafter.

./hmmsort <input_database> <output_database>

When the command finishes, the sorted database will have been written to <output_database>.

Sorting may take some time, depending on the size of your database. Also note that the current implementation reads the entire database into memory before sorting it, so you may see some swapping if you are low on memory or have a large database.
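
Because hmmsort loads the whole database before sorting, it can help to compare the database size with the available memory first. The following is a minimal sketch and not part of GPU-HMMER: the helper name fits_in_memory, the 2x headroom factor, and the use of Linux's /proc/meminfo are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight check before running hmmsort (not part of GPU-HMMER).

# fits_in_memory DB_BYTES AVAIL_BYTES
# Succeeds if the database, with an assumed 2x working-space headroom,
# fits in the available memory. The factor of 2 is a guess, not a measurement.
fits_in_memory() {
    [ $(( $1 * 2 )) -le "$2" ]
}

db=${1:-input_database}    # placeholder path to the unsorted database

if [ -r "$db" ] && [ -r /proc/meminfo ]; then
    db_bytes=$(wc -c < "$db")
    # MemAvailable is reported in kB on Linux.
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
    if [ -n "$avail_kb" ] && ! fits_in_memory "$db_bytes" $(( avail_kb * 1024 )); then
        echo "warning: $db may not fit in memory; hmmsort may swap" >&2
    fi
fi
```

If the warning appears, the sort should still complete, only more slowly; the check merely indicates when swapping is likely.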

Once you have sorted the database, you can use cuda_hmmsearch to search against the sorted database. The basic command is:

./cuda_hmmsearch /path/to/HMM/file /path/to/sequence/database

This will output all data in the same format as traditional hmmsearch. Note, however, that the output will not be 100% identical to that of traditional hmmsearch. We have verified that scores are correct within a small tolerance, but hardware differences may cause some discrepancies, especially in the histogram output.
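
One way to check that the two programs agree within a tolerance is sketched below. It assumes you have already extracted the per-sequence scores from each run into plain two-column text files (sequence name, score); that file layout, the function name compare_scores, and the tolerance you pass in are all assumptions for illustration, not something GPU-HMMER provides.

```shell
#!/bin/sh
# compare_scores REF_FILE NEW_FILE TOL
# Each file holds "<sequence_name> <score>" per line (an assumed,
# pre-extracted layout). For every name in NEW_FILE, prints a line if the
# name is absent from REF_FILE or if the scores differ by more than TOL.
# Names present only in REF_FILE are not reported.
# Exit status is 1 if anything was printed, 0 otherwise.
compare_scores() {
    awk -v tol="$3" '
        NR == FNR { ref[$1] = $2; next }      # first file: store reference scores
        !($1 in ref) { print $1, "missing"; bad = 1; next }
        {
            d = $2 - ref[$1]
            if (d < 0) d = -d                 # absolute difference
            if (d > tol) { print $1, ref[$1], $2; bad = 1 }
        }
        END { exit bad + 0 }
    ' "$1" "$2"
}
```

For example, compare_scores cpu_scores.txt gpu_scores.txt 0.1 (hypothetical file names) would list only the sequences whose scores differ by more than 0.1. Histogram output is deliberately left out of the comparison, since that is where differences are most expected.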


Options

GPU-HMMER 0.91 and above include several command-line options, most of which are designed to facilitate the use of multiple GPUs within a single host. Note that GPU numbering is identical to the numeric ID given by CUDA's deviceQuery:

  • --gpu <n> : instructs cuda_hmmsearch to use <n> GPUs (default 1). It is assumed that the GPUs are more or less identical, though we have had some success mixing GTX 260s with GTX 280s. In that case, however, the code must target the GTX 260 (in terms of number of threads, etc.).

  • --force-GPU <n1,n2,...> : instructs cuda_hmmsearch to use GPUs n1, n2, etc. Only the GPUs indicated in the comma-separated list are used. This is useful, for example, in systems that include multiple GPUs where one or more are underpowered. Using this option, the user may choose to ignore one or more GPUs. If --force-GPU is used, it overrides the --gpu option.

  • --verify : instructs cuda_hmmsearch to assign sequences to GPUs in a predictable fashion. Without it, runs on multiple GPUs may slightly reorder hits that have identical scores. This is a debugging feature, designed to be used together with either the --gpu or the --force-GPU option.
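
The options above might be combined as in the following sketch. The placement of options before the HMM file and database follows the usual hmmsearch convention and is an assumption here, as are the device IDs 0 and 2; the run wrapper is a hypothetical helper that prints each command rather than executing it unless RUN=1 is set.

```shell
#!/bin/sh
# Placeholder paths; substitute your own HMM file and sorted database.
HMM=/path/to/HMM/file
DB=/path/to/sequence/database

# Hypothetical helper: print the command unless RUN=1, so the lines can be
# reviewed before anything is launched.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Use two GPUs.
run ./cuda_hmmsearch --gpu 2 "$HMM" "$DB"

# Use only the devices that deviceQuery lists as 0 and 2 (example IDs).
run ./cuda_hmmsearch --force-GPU 0,2 "$HMM" "$DB"

# Two GPUs with predictable sequence assignment, for debugging.
run ./cuda_hmmsearch --gpu 2 --verify "$HMM" "$DB"
```

Running the script with RUN=1 in the environment executes the commands instead of printing them.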