Diaphorina citri Genome and Transcriptome



The Asian citrus psyllid, Diaphorina citri Kuwayama (Hemiptera: Psyllidae), is a vector for the causative agent of Huanglongbing (HLB), or citrus greening, which threatens citrus production worldwide. The Asian citrus psyllid originated in Asia but is now also found in parts of the Middle East, South and Central America, Mexico, and the Caribbean. In the United States, this psyllid was first detected in Florida in 1998 and is now also found in Louisiana, Georgia, Arizona, South Carolina, and Texas. The psyllid feeds on all varieties of citrus (e.g., oranges, grapefruit, lemons, and mandarins).

Credit: USDA
Diaphorina citri current resources 
The genome assembly, annotations and resources are available through the links below.
Genome sequence
Pathways
Annotation
Manual curation
De novo transcriptome
Genome browser
Expression Atlas
Blast
Genome Assembly details 

Diaci 3.0 genome build

We present an improved and highly contiguous de novo Diaci version 3.0 assembly based on proximity ligation-based (Hi-C) scaffolding of non-redundant unitigs from the previous Diaci version 2 genome assembly. The updated Diaci version 3 assembly has 13 chromosomal-length scaffolds with a genome size of 475 Mb and a scaffold N50 of 40.5 Mb. The genome and associated annotation were released on December 10, 2019 and described in a preprint publication. Slides describing the genome and annotation are available online. The assembly is available in our Blast databases.
Full-length cDNA transcripts from diseased and healthy tissue at multiple life stages were sequenced with PacBio Iso-Seq technology. Iso-Seq data, along with diverse Illumina RNA-Seq expression data, were used to predict protein-coding genes in the Official Gene Set beta version 3 (OGSv3) using the MAKER annotation pipeline. This gene set is undergoing extensive manual curation and a final version will be released shortly. We also provide functional annotation of the proteins in Official Gene Set beta version 3, performed using the Rapid Functional Annotation pipeline. Slides describing the results of the functional annotation are available online.

Diaci 2.0 genome build

A new sequencing effort led by the Mueller and Brown labs used PacBio and Dovetail technologies to create a high-quality assembly. The improved v2.0 reference assembly has 1,906 contigs and a contig N50 of 759 kb.

The genome and associated annotation were released on March 5, 2018 and described in a webinar. Slides for the webinar are available online. Please see our slides from the ESA 2017 meeting and our poster from the IRCHLB 2017 meeting for more details. We have released a new Official Gene Set (v2.0) with 20,793 genes based on Illumina RNA-Seq and PacBio Iso-Seq evidence. We have updated the DiaphorinaCyc pathway database with OGS v2 and characterized 172 pathways that include 1,477 enzymatic reactions. The ACP expression atlas (Psyllid Expression Network) can be used to identify co-expressed gene sets. Blast is available and you can download the data from our FTP site.


Diaci 1.9 interim genome build

Diaci1.9 is an interim version of the Diaphorina citri psyllid genome based on PacBio long reads. We generated 36.2 Gb of PacBio long reads from 41 SMRT cells, giving a coverage of about 80X for the 400-450 Mb psyllid genome. The Canu assembler was used to create an interim assembly with 8,352 contigs and a contig N50 of 115.8 kb, compared to 161,988 scaffolds in the Diaci1.1 genome assembly. The Diaci1.9 assembly does not contain any Ns. Please note that this is an interim assembly, so we will not be annotating it, but you can blast to it. You can download the genome from our FTP site. We will be releasing this soon as Diaci2.0.
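The coverage figure above follows from dividing the total sequenced bases by the genome size. A minimal sketch of that arithmetic, using only the numbers quoted in the text (the function name fold_coverage is our own, chosen for illustration):

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Return sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


total_bases = 36.2e9  # 36.2 Gb of PacBio reads, as stated in the text

# The text gives the genome size as a range, so report both ends.
for genome_size in (400e6, 450e6):
    depth = fold_coverage(total_bases, genome_size)
    print(f"{genome_size / 1e6:.0f} Mb genome: {depth:.1f}X")
# prints:
# 400 Mb genome: 90.5X
# 450 Mb genome: 80.4X
```

The quoted 80X therefore corresponds to the upper end of the genome size estimate.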


Diaci 1.1 genome build

Diaci1.1 is the current version. It is an assembly of the Diaphorina citri psyllid genome based on Diaci1.0, with approximately 12-fold coverage of PacBio reads incorporated into the assembly using PBJelly by Adam English and Stephen Richards at Baylor College of Medicine. The scaffold N50 for this 485,705,082 bp assembly is 109.8 kb. It has 161,988 scaffolds and 19,335,169 Ns. The number of contigs is 176,470 and the contig N50 is 34,407 bp.
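The scaffold, contig, and N counts above are related: contigs are the stretches of called bases left when each scaffold is split at its gaps. The following is a rough sketch of how such counts can be derived, not the procedure used for Diaci1.1. It assumes that any run of one or more N characters is a gap (real pipelines often apply a minimum gap length), and the function name and toy sequences are hypothetical.

```python
import re


def scaffold_stats(scaffolds):
    """Count scaffolds, contigs, and N bases in a dict of name -> sequence.

    Assumption: every run of one or more N characters separates two contigs.
    """
    n_bases = 0
    contig_lengths = []
    for seq in scaffolds.values():
        seq = seq.upper()
        n_bases += seq.count("N")
        # Split at gap runs; drop empty pieces from leading or trailing Ns.
        contig_lengths.extend(len(piece) for piece in re.split(r"N+", seq) if piece)
    return {
        "scaffolds": len(scaffolds),
        "contigs": len(contig_lengths),
        "n_bases": n_bases,
        "contig_lengths": contig_lengths,
    }


# Toy input for illustration only, not real D. citri sequence.
toy = {"s1": "ACGTNNNNACG", "s2": "GGCC", "s3": "ATNNGCNAT"}
stats = scaffold_stats(toy)
print(stats["scaffolds"], stats["contigs"], stats["n_bases"])
# prints: 3 6 7
```

On a real assembly the sequences would be read from the FASTA file, and the contig count is always at least the scaffold count, as with 176,470 contigs in 161,988 scaffolds here.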

It was submitted to NCBI and called NCBI-diaci1.1. It was annotated with the NCBI Gnomon pipeline. This is used as the reference in WebApollo for manual curation. The Maker annotations from Diaci1.0 were updated to conform to the Diaci1.1 build and are also available within WebApollo. You can find more information about our manual curation effort here. We have created a metabolic pathway database with 185 pathways and 1524 enzymatic reactions based on 12,548 proteins in the genome.


Diaci 1.0 genome build

The Diaci1.0 assembly was produced by Nan Leng in October 2011 with Velvet. Contigs were filtered for vector and Wolbachia contamination. This assembly contains 163,023 scaffolds with a total length of 486.9 Mb. The scaffold N50 is 110,293 bp (1,097 scaffolds), with an average scaffold size of 2,986.98 bp. The assembly was annotated using the Maker pipeline.
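For readers unfamiliar with the statistics quoted for these builds: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and the number of sequences in that set (1,097 scaffolds above) is often called L50. A minimal sketch of the standard calculation, with a hypothetical function name and toy lengths:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from longest to shortest until half the assembly is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy scaffold lengths for illustration only.
lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(lengths)
print(n50, l50)                               # prints: 70 2
print(round(sum(lengths) / len(lengths), 2))  # average length, prints: 42.86
```

The average length is simply the total length divided by the number of sequences, which is why a highly fragmented assembly can have a small average scaffold size alongside a much larger N50.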


Transcriptome Assembly details 

De novo transcriptome

The D. citri de novo transcriptome combines transcript evidence from Iso-Seq data as well as short-read RNA-Seq data. The transcriptome was released on December 10, 2019 and described in a preprint publication. Slides describing the de novo transcriptome are available online.
The de novo transcriptome has 60,261 transcripts with an N50 of 3.6 kbp for 40,637 genes. Each identifier was chosen according to the source raw data: DcDT for RNA-Seq transcripts and DcDi for Iso-Seq transcripts. The dataset was filtered using Pfam domains and TransDecoder to retain transcripts with coding regions. The assembly is available in our Blast databases.
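Because the identifier prefix encodes the data source, transcripts can be sorted by origin with a simple prefix check. The sketch below assumes only what the text states, namely that identifiers begin with DcDT or DcDi. The rest of each example identifier is made up for illustration and does not reflect the real naming scheme.

```python
from collections import Counter

# Prefix-to-source mapping taken from the text above.
SOURCE_BY_PREFIX = {"DcDT": "RNA-Seq", "DcDi": "Iso-Seq"}


def transcript_source(identifier):
    """Return the data source implied by a transcript identifier prefix."""
    for prefix, source in SOURCE_BY_PREFIX.items():
        # The comparison is case-sensitive: DcDT and DcDi differ only in the last letter.
        if identifier.startswith(prefix):
            return source
    return "unknown"


# Hypothetical identifiers; only the four-letter prefixes come from the text.
ids = ["DcDT_example_1", "DcDi_example_2", "DcDT_example_3", "other_example_4"]
counts = Counter(transcript_source(i) for i in ids)
print(counts["RNA-Seq"], counts["Iso-Seq"], counts["unknown"])
# prints: 2 1 1
```

In practice the identifiers would be read from the FASTA headers of the downloaded transcriptome file.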

MCOT transcriptome 1.1

The D. citri MCOT v1.0 transcriptome is a genome-independent transcriptome assembly that provides a comprehensive set of gene models. It was produced with the MCOT pipeline, which combines transcripts from the Maker, Cufflinks, Trinity, and Oases pipelines. The Jiang lab has kindly provided the MCOT annotations. The MCOT v1.1 set has 30,562 CDS, transcripts, and proteins. Combining genome-based gene models from Maker and Cufflinks with transcripts from the de novo transcriptome assemblies produced by Trinity and Oases allows the identification of genes that have transcript evidence only from RNA-Seq.
MCOT v1.1 has been annotated with AHRD and InterProScan. The description includes Pfam domains, GO terms, and a description generated using UniProt, InterProScan, and the AHRD pipeline. The mapping file describes the connection between the original NCBI annotations and the OGSv1.0, MCOT, and Drosophila orthologs. Please see the README for more information.

Diaci transcriptome 0.9

Diaci_transcriptome_0.9 is a transcriptome assembly constructed by Nan Leng of Illumina using Velvet and Oases. This de novo transcriptome assembly was produced from RNA reads from adult, egg, and nymph tissue. Separate transcriptome assemblies for adult, egg, and nymph tissues were also constructed by Nan Leng with Velvet and Oases, each using only the RNA reads from the respective tissue. This transcriptome assembly was used in the Maker annotation of the Diaci1.1 genome build.
Data sets 
The genome assembly, pseudomolecules, annotations and genome browser are available through the links below.
Genome sequence
Adult transcriptome
Annotation
Nymph transcriptome
MCOT transcriptome
Egg transcriptome
Diaci 0.9 transcriptome
Pathways