How to search data?
A search tool is available at the top right of every web page. It allows users to query a single strain or multiple strains, and supports keyword searches across several field types, including sequence identifier, description and annotation. For example, a user can type "dehydrogenase" in the input box, select the species "Pasteurella multocida unmsm", and then click the Search button (Fig. 1A). The search tool can also fetch a sequence from a given location in the format “ID:start..end:strand”, for example LGRE01000001:1248..1988:-. Search results are listed in a table that can be sorted and filtered for further searches, as well as copied, printed and exported in XLS, PDF and JSON formats (Fig. 1B). Clicking a feature ID opens the feature details page (Fig. 1C).
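The location format described above can be checked before it is submitted. The following is a minimal sketch, not PamulDB's own parser: the names `Location` and `parse_location` are hypothetical, and it assumes that the strand is written as "+" or "-" (the example in the text shows only "-") and that start is not greater than end.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """A parsed "ID:start..end:strand" query (hypothetical helper type)."""
    seq_id: str
    start: int
    end: int
    strand: str


# Assumption: the sequence ID contains no colon or whitespace,
# and the strand is either "+" or "-".
_LOCATION_RE = re.compile(
    r"^(?P<seq_id>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+):(?P<strand>[+-])$"
)


def parse_location(text: str) -> Location:
    """Split a location string into its four parts, or raise ValueError."""
    match = _LOCATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in ID:start..end:strand format: {text!r}")
    start = int(match.group("start"))
    end = int(match.group("end"))
    if start > end:
        # Assumption: coordinates are given in ascending order.
        raise ValueError(f"start {start} is greater than end {end}")
    return Location(match.group("seq_id"), start, end, match.group("strand"))


if __name__ == "__main__":
    print(parse_location("LGRE01000001:1248..1988:-"))
```

A string that does not match the pattern, such as one missing the strand, raises `ValueError` instead of being passed on as a query.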
How to browse data?
Genomic features, such as genes, transposable elements, protein domains and protein subcellular localizations, can be accessed directly from the links under each species name on the front page (Fig. 2A). The items of each feature type are listed in a responsive table, which can be searched, selected, copied and exported in JSON, XLS and PDF formats (Fig. 2B). In addition, the GO browser (PamulGO) allows users to list genes and export sequences based on GO terms. Genomic features can also be viewed in JBrowse (Fig. 1F), which supports visualization of feature position, sequence, G+C content, functional annotation and relationships among features. Clicking a feature ID in any feature table or in the GO browser opens a responsive web page, where details of the feature, such as sequence length, molecular weight, genomic position, GO terms, publications, gene family, phylogeny, protein domains and subcellular localization, can be selectively viewed (Fig. 2C).
How to analyze data?
PamulDB provides two tools for analyzing homologous sequences. The first is PamulDB BLAST (Fig. 3A), which is built with NCBI BLAST+ 2.7.1 and supports searches against multiple databases (Fig. 3B). The query can be either sequences in FASTA format or identifiers in the database. Results are given in standard and tabular formats, and the output can be further searched, ordered, filtered, printed and exported in multiple formats. Both the matched targets and the matched regions can also be exported directly (Fig. 3C).
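Exported tabular results can also be filtered offline. The sketch below assumes that the export is tab-delimited and uses the default 12-column layout of the BLAST+ tabular format; the source does not state the exact columns PamulDB exports, so the column list should be checked against the actual file. The function names and the e-value cutoff are illustrative only.

```python
import csv
import io

# Default column order of the BLAST+ tabular format (assumed to match the export).
BLAST_TABULAR_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
_INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
_FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def read_tabular_hits(text):
    """Parse tab-delimited BLAST hits into a list of dictionaries."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) < len(BLAST_TABULAR_COLUMNS):
            raise ValueError(f"expected 12 columns, got {len(row)}: {row!r}")
        hit = dict(zip(BLAST_TABULAR_COLUMNS, row))
        for name in _INT_COLUMNS:
            hit[name] = int(hit[name])
        for name in _FLOAT_COLUMNS:
            hit[name] = float(hit[name])
        hits.append(hit)
    return hits


def filter_hits(hits, max_evalue=1e-5):
    """Keep hits at or below the e-value cutoff, best bit score first."""
    kept = [hit for hit in hits if hit["evalue"] <= max_evalue]
    return sorted(kept, key=lambda hit: hit["bitscore"], reverse=True)


if __name__ == "__main__":
    # Made-up rows for illustration; these are not real PamulDB results.
    example = (
        "query1\tsubjectA\t98.5\t200\t3\t0\t1\t200\t11\t210\t2e-50\t380\n"
        "query1\tsubjectB\t40.0\t50\t30\t0\t1\t50\t5\t54\t0.5\t22.1\n"
    )
    for hit in filter_hits(read_tabular_hits(example)):
        print(hit["sseqid"], hit["evalue"], hit["bitscore"])
```

The cutoff of 1e-5 is only a common starting point; an appropriate value depends on the database size and the question being asked.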
The other homology search tool is HMMER (Fig. 4A), which implements search methods based on probabilistic models called profile hidden Markov models. As with BLAST, the HMMER results are listed in a table that can be sorted, filtered and exported (Fig. 4B).
PamulDB also provides a tool for predicting the subcellular localization of proteins. SubLocPred (Fig. 5A) is built with PSORTb 3.0 (29), which supports three models: Gram-positive bacteria, Gram-negative bacteria and archaea. For Gram-negative bacteria, probability values are calculated for five localizations: cytoplasm, cytoplasmic membrane, extracellular part, outer membrane and periplasmic region (Fig. 5B).
How to download data?
Genomic sequences, including assembly, gene, CDS, protein, rRNA and tRNA sequences, can be downloaded in bulk via the “Downloads” drop-down menu at the top of every web page. Clicking “Datasets” under the “Downloads” menu opens the datasets page, where P. multocida strains are grouped by host and by serotype (Fig. 6). Each group has a table that provides download hyperlinks and basic information for each strain's dataset.