Abstract:
To strengthen the research of resources evaluation, protection and identification of endemic and endangered species of
Rhododendron longipedicellatum, and to provide a helpful reference for genetic breeding and improvement of its agronomic traits, the transcriptome was sequenced by using Illumina Hiseq 4 000, in total, 74 092 Unigenes with an average length of 938 nt, N
50 of 1616 nt, Q
20 of 98.22%, Q
30 of 95.20% and GC content of 43.24% were obtained by de novo assembly and cluster with filtered data, and there were 23 879 Unigenes with more than 1 kB. Then, the Unigenes were annotated by 7 functional databases, and finally, 39 876(NR:53.82%),38 065(NT:51.38%), 27 384(Swissprot:36.96%), 16 099(COG:21.73%), 30 401(KEGG:41.03%), 17 518(GO:23.64%), and 29 676(Interpro:40.05%) Unigenes were annotated. The Unigenes were roughly divided into three functional categories(i.e. biological processes, cellular components and molecular function) and 56 sub-categories according to GO function. Most of the genes performed biological processes. KEGG functional annotation analysis showed that Unigenes could be grouped into 6 categories, 32 metabolic pathways. 176 Unigenes relating to human diseases were detected, including endocrine and metabolic diseases(167) and antimicrobial resistance(9). The 39 418 CDS were detected with functional annotation results, and after 3 194 CDS also were predicted by ESTScan with the remaining Unigenes. The 1 488 Transcription Factor(TF) coding Unigenes were predicted and 57 927 SNP polymorphic loci were detected. The analysis of the transcriptome could lay a foundation for further study of functional gene discovery and utilization, resistance mechanism analysis, classification and evolution of genetic resources, molecular marker development and molecular assisted breeding of
R.longipedicellatum and other congeneric species.