Abstract:
Objective To analyze codon usage bias in the chloroplast genome of Cypripedium calceolus and identify the main factors influencing it, in order to provide a reference for chloroplast genomics research on Orchidaceae species.
Method The complete chloroplast genome sequence of C. calceolus was downloaded and its protein-coding sequences were screened. The EMBOSS online program was used to calculate the GC content of each gene and of each codon position, and the software CodonW was used to calculate the amino acid length (LAA), effective number of codons (ENC), relative synonymous codon usage (RSCU), frequency of optimal codons (FOP) and the base composition at the third codon position of each gene. SPSS was used to analyze the correlations among these indices, and Origin was used for plotting.
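As a rough illustration of two of the indices named above, the following minimal Python sketch computes RSCU and GC3 from a single coding sequence. It is not the CodonW or EMBOSS implementation: the function names (`count_codons`, `rscu`, `gc3`) and the toy sequence are hypothetical, the standard amino acid assignments are assumed (plastid translation table 11 shares them), and GC3 is taken here over all counted codons, which may differ from how the tools in the study define it.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumption: same amino acid
# assignments as plastid translation table 11).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds):
    """Count in-frame codons of a coding sequence, skipping ambiguous ones."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts):
    """RSCU = observed count / mean count over the codon's synonymous family."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the gene
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def gc3(counts):
    """Fraction of counted codons whose third base is G or C (simplified)."""
    total = sum(counts.values())
    gc = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc / total if total else 0.0


# Toy sequence, not taken from the C. calceolus genome.
toy = count_codons("ATGGCTGCTGCAGCCTAA")
print(rscu(toy)["GCT"], gc3(toy))
```

A codon with RSCU above 1 is used more often than expected under equal usage of its synonymous codons, which is the threshold applied in the results below.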
Result The third codon position of the C. calceolus chloroplast genome was rich in A and T, with a GC3 content of only 29%. ENC values ranged from 37.92 to 61.00, indicating relatively weak codon usage bias. ENC was highly significantly correlated with GC3. There were 34 codons with RSCU greater than 1, 29 of which ended with A or U. Neutrality plot, ENC-plot and PR2-plot analyses showed that codon preference in the C. calceolus chloroplast genome was mainly influenced by natural selection. Correspondence analysis showed a similar pattern of codon usage bias among genes encoding photosynthetic system proteins, whereas other types of genes differed considerably. Sixteen codons were finally determined to be optimal codons.
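For readers unfamiliar with the ENC-plot, the sketch below shows the standard expected curve of Wright (1990), which gives the ENC a gene would have if its codon usage were shaped by GC3 composition alone. The abstract does not state which formula or software settings were used, so this is an assumption; the helper `enc_deviation` and its name are hypothetical and simply express how far an observed ENC falls below the curve.

```python
def expected_enc(gc3):
    """Expected ENC under GC3 composition alone (Wright's curve).

    gc3 is the GC fraction at third codon positions, between 0 and 1.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def enc_deviation(observed_enc, gc3):
    """Relative gap between expected and observed ENC (hypothetical helper).

    Values near 0 mean the gene sits on the expected curve; larger positive
    values mean the observed ENC is lower than composition alone predicts.
    """
    expected = expected_enc(gc3)
    return (expected - observed_enc) / expected


# Example call with the lowest ENC and the genome-wide GC3 reported above;
# pairing these two numbers is illustrative only, not a result of the study.
print(enc_deviation(37.92, 0.29))
```

Genes lying well below the expected curve are usually interpreted as being shaped by factors other than mutational composition, which is the kind of evidence the ENC-plot contributes to the conclusion about natural selection.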
Conclusion This study confirms that natural selection is the main factor affecting codon usage bias in the C. calceolus chloroplast genome. The optimal codons of this species were identified. The results can provide a reference for studies of the phylogeny and chloroplast genome codon evolution of Orchidaceae.