qPrimerDB - qPCR Primer Database

A resource for real-time quantitative PCR primers

Version: 1.1

Birds_and_Reptiles

Anas platyrhynchos
Gallus gallus
Meleagris gallopavo
Pelodiscus sinensis
Taeniopygia guttata

Fish

Astyanax mexicanus
Danio rerio
Gadus morhua
Gasterosteus aculeatus
Latimeria chalumnae
Lepisosteus oculatus
Oreochromis niloticus
Oryzias latipes
Petromyzon marinus
Poecilia formosa
Takifugu rubripes
Xiphophorus maculatus

Insecta

Acromyrmex echinatior
Aedes aegypti
Anopheles coluzzii
Anopheles stephensi
Apis dorsata
Apis florea
Apis mellifera
Athalia rosae
Bactrocera cucurbitae
Bactrocera dorsalis
Bombus terrestris
Bombyx mori
Camponotus floridanus
Cerapachys biroi
Ceratitis capitata
Chilo suppressalis
Danaus plexippus
Diaphorina citri
Drosophila ananassae
Drosophila erecta
Drosophila grimshawi
Drosophila melanogaster
Drosophila mojavensis
Drosophila sechellia
Drosophila simulans
Drosophila virilis
Drosophila willistoni
Drosophila yakuba
Fopius arisanus
Harpegnathos saltator
Heliconius melpomene
Linepithema humile
Macrocentrus cingulum
Megachile rotundata
Megaselia scalaris
Melitaea cinxia
Microplitis demolitor
Musca domestica
Nasonia vitripennis
Pediculus humanus
Plutella xylostella
Pogonomyrmex barbatus
Rhodnius prolixus
Solenopsis invicta
Tribolium castaneum
Vollenhovia emeryi
Wasmannia auropunctata
Zootermopsis nevadensis

Mammalia

Ailuropoda melanoleuca
Bos taurus
Callithrix jacchus
Canis familiaris
Cavia porcellus
Chlorocebus sabaeus
Choloepus hoffmanni
Dasypus novemcinctus
Dipodomys ordii
Echinops telfairi
Equus caballus
Erinaceus europaeus
Felis catus
Gorilla gorilla
Homo sapiens
Ictidomys tridecemlineatus
Loxodonta africana
Macaca mulatta
Microcebus murinus
Monodelphis domestica
Mus musculus
Myotis lucifugus
Nomascus leucogenys
Ornithorhynchus anatinus
Oryctolagus cuniculus
Otolemur garnettii
Ovis aries
Pan troglodytes
Papio anubis
Pongo abelii
Procavia capensis
Pteropus vampyrus
Rattus norvegicus
Sarcophilus harrisii
Sus scrofa
Tarsius syrichta
Tupaia belangeri
Tursiops truncatus
Vicugna pacos

Other_animals

Anolis carolinensis
Caenorhabditis briggsae
Caenorhabditis elegans
Ciona intestinalis
Ciona savignyi
Ixodes scapularis
Lottia gigantea
Mnemiopsis leidyi
Octopus bimaculoides
Sarcoptes scabiei
Strongylocentrotus purpuratus
Strongyloides ratti
Tetranychus urticae
Xenopus tropicalis

Fungi

Saccharomyces cerevisiae

Eudicotyledons

Aethionema arabicum
Amaranthus hypochondriacus
Arabidopsis halleri
Arabidopsis lyrata
Arabidopsis thaliana
Arachis duranensis
Arachis ipaensis
Beta vulgaris
Boechera stricta
Brassica napus
Brassica oleracea capitata
Brassica oleracea italica
Brassica rapa
Camellia sinensis
Capsella grandiflora
Capsella rubella
Carica papaya
Chenopodium quinoa
Cicer arietinum
Citrus clementina
Citrus sinensis
Coffea canephora
Cucumis melo
Cucumis sativus
Cucurbita moschata
Daucus carota
Eucalyptus grandis
Eutrema salsugineum
Fragaria vesca
Glycine max
Gossypium hirsutum
Gossypium raimondii
Kalanchoe fedtschenkoi
Kalanchoe laxiflora
Linum usitatissimum
Lotus japonicus
Malus domestica
Manihot esculenta
Medicago truncatula
Mimulus guttatus
Nicotiana tabacum
Phaseolus vulgaris
Populus deltoides
Populus trichocarpa
Prunus persica
Ricinus communis
Salix purpurea
Sida fallax
Solanum lycopersicum
Solanum tuberosum
Thellungiella parvula
Theobroma cacao
Trifolium pratense
Vigna angularis
Vigna radiata
Vitis vinifera

Monocotyledons

Aegilops tauschii
Ananas comosus
Brachypodium distachyon
Brachypodium stacei
Hordeum vulgare
Leersia perrieri
Musa acuminata
Oropetium thomaeum
Oryza barthii
Oryza brachyantha
Oryza glaberrima
Oryza glumaepatula
Oryza indica
Oryza longistaminata
Oryza meridionalis
Oryza punctata
Oryza rufipogon
Oryza sativa
Panicum hallii
Setaria italica
Setaria viridis
Sorghum bicolor
Spirodela polyrhiza
Triticum aestivum
Triticum urartu
Zea mays
Zostera marina

Other_plants

Amborella trichopoda
Chlamydomonas reinhardtii
Cyanidioschyzon merolae
Marchantia polymorpha
Micromonas
Micromonas pusilla
Ostreococcus lucimarinus
Physcomitrella patens
Selaginella moellendorffii
Volvox carteri

Pipeline

The workflow used for gene-specific qPCR primer design

workflow
  • Step 1 Download the coding gene, mNRA and genome sequence for target species for which complete genome sequences have been assembled, annotated and published. It is recommended to retrieve above-mentioned sequences from integrative databases, such as NCBI (https://www.ncbi.nlm.nih.gov), Ensembl (http://www.ensembl.org), USCS (http://genome.ucsc.edu), Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html) and Gramene (http://www.gramene.org). Researchers also could obtain required sequences from specific genome databases or prepared their own de novo assembled transcriptome sequence for gene-specific qPCR primer design. For each gene, only the longest coding/mRNA sequence for each gene was retained as representative coding/mNRA sequence, all duplicated coding/mNRA sequences will be removed.
  • Step 2 To design as many qPCR primers as possible for each gene, all sequences retained in Step 1 are shredded into a series of short, overlapping template fragments using BBMap (http://jgi.doe.gov/data-and-tools/bbtools) with the specified window size and step size currently set to 300 bp and 50 bp, respectively. The window and step size can be adjusted for organisms with complex genomes, but this will dramatically increase the computation time.
  • Step 3 Step 3 performs automatic primer design for all of the shredded template fragments using Primer3 with the following parameters: amplicon size 80–300 bp; amplicon GC content 40–60%, with an optimal GC content of 50%; primer length 18–28 nt, with an optimal length of 22 nt; and melting temperature (Tm) 58–64ºC, with an optimal Tm of 60ºC and maximum Tm difference per primer pair of less than 3ºC. For each gene, duplicated candidate primer pairs generated at neighboring windows are removed.

  • Step 4 The output file of Primer3 were transformed into ePCR input format using an in-house perl script. Then, ePCR was employed for the first-round of specificity check to all designed primer pairs. All primers with one mismatch or indel against any off-target sequence are filtered out. The remaining candidate primer pairs are used for the next round of specificity validation.

  • Step 5 Step 5 indexes the databases for the genome sequence and coding/mRNA sequence of the target organism based on the k-mer index algorithm using the thermodynamics-based gene-specificity checking program MFEprimer-2.0. By default, the k-mer is set to 9, which is sufficient for primer specificity checking; this value can be smaller if more stringent conditions are desired, but much more time will be needed for computation. The specificity of the qPCR primer pairs against genomic sequence and coding/mRNA sequence databases is then estimated, and related parameters such as the binding site in the sequence, the size and GC content of the amplicon, and the Tm and Gibbs free energy (ΔG) of the forward and reverse primers can be saved.

  • Step 6 The primer pairs for each gene are divided into three levels based on PPC and the binding stability of the binding site, ΔG, as follows:

           Standard for selection of level 1 primer pairs: the binding ability index, primer pair coverage (PPC) = 10. The primer pairs will be removed if their PPCs against any non-specific amplicons are higher than 1.

           Standard for selection of level 2 primer pairs: the binding stability index ΔG < −9 kcal/mol. The primer pairs will be filtered out if the ΔG of binding of either the forward or reverse primer to off-target sequences is less than −9 kcal/mo.

           Standard for selection of level 3 primer pairs: the binding stability index Gibbs free energy (ΔG) −11 kcal/mol. The primer pairs will be filtered out if the ΔG of binding of either the forward or reverse primer to off-target sequences is less than −11 kcal/mol

           In addition, all qualified primer pairs are further divided into two categories: the best primers and alternative primers. The former category includes only one best, unique primer pair with the lowest total ΔG of the pair, which is the gene-specific qPCR primer pair recommended for the detection of gene expression. The latter category includes all alternative gene-specific qPCR primer pairs.

  • Step 7 Step 7 carries out a BLAT search using each pair of primers against the genomic sequence. The number of exons spanned and the genomic position of each primer and amplicon can be determined to select primers spanning an exon-intron boundary, and this information can be used to avoid amplifying contaminating genomic DNA

  • After computation, our workflow generated a total of 51,091,785 gene-specific qPCR primer pairs, corresponding to 93.44% of the average coverage ratios per organism and an average of 15.33 primer pairs per gene. In human and mouse, more than 19 and 22 primer pairs per gene were obtained, with a coverage ratio of 78.83 and 90.32%, respectively, indicating that the results of our workflow are in accordance with those in MRPrimerW and PrimerBank. Moreover, it is more difficult to design qPCR primers for plants than for animals. Using the same workflow, the coverage ratios decreased from 93.80% in animals to 93.08% in plants, and the number of primer pairs per gene significantly dropped from 17.20 to 12.55 (permutation test P-value < 0.001). This reduction might have been due to the difficulty in designing qPCR primers for polyploid plants, such as oilseed rape, soybean (Glycine max), potato (Solanum tuberosum), and tomato (Solanum lycopersicum). Hence, our database will greatly facilitate qPCR experiments using organisms with complex genomes

Database implementation

  • qPrimerDB is implemented using PostgreSQL 9.6 (http://www.postgresql.org), PHP 5.5 (http://www.php.net), Apache Web Server 2.4 (http://www.apache.org), and BioPerl 1.6 (http://bioperl.org) on a Linux CentOS 6.8 operating system. Our database is supported by integration with the Chado database schema and Drupal, a popular Content Management System (CMS). There are four layers in the qPrimerDB architecture, as described for SilkPathDB. In the core layer, all qPrimerDB data are stored within the Chado schema and maintained using the PostgreSQL. The second layer governs communications between the core layer and the outer layers, as a configurable layer. The third layer invokes background programs to process and respond to queries, and the Drupal CMS is used for website content management. The outermost layer is a user interface layer that allows the users to directly interact with the database. To provide a friendly interface for users accessing the database from desktop and mobile devices, the Bootstrap framework (http://getboot strap.com) is employed. PHP and BioPerl scripts are used to generate primer index and search results pages by retrieving primer information for query genes and organisms provided by the user. The BLAST database is built with NCBI BLAST+ 2.6.0, with the multiple database search function. Finally, jQWidgets (http://www.jqwidgets.com) is used to organize and present the feature index, search, and BLAST results