Article ID: 4503220
Journal: Acta Agronomica Sinica
Published Year: 2011
Pages: 10
File Type: PDF
Abstract

Research on nuclear gene codon composition, usage patterns, and the factors that influence them in soybean [Glycine max (L.) Merr.] can provide a theoretical basis for applying genetic engineering technology to improve soybean varieties. In this paper, a total of 46,430 high-confidence predicted coding sequences obtained from the soybean genome database and 2,071 full-length transcripts obtained from cDNA libraries were used to analyze the composition and characteristics of soybean nuclear gene codons. The nucleotide composition, relative synonymous codon usage, and other parameters of the soybean genome and full-length transcripts were calculated using CodonW software. The results showed that gene expression levels were significantly and positively correlated with G+C and GC3s contents, and genes with high G+C and GC3s contents had high codon preference. UCC and GCC were identified as optimal codons in soybean. Analysis of coding sequences of different lengths showed that codon preference decreased as coding sequence (CDS) length increased, and longer CDSs tended to select codons randomly. According to the full-length transcript data, CDSs of 400 to 600 bp had the highest expression level. Codon preference and expression level were almost identical between leaf-specific and seed-specific genes. However, seed-specific genes had significantly higher G+C and GC3s contents than leaf-specific genes, and the contents of aromatic amino acids encoded by seed-specific genes were significantly lower than those encoded by leaf-specific genes.

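Abstract (translated from the Chinese): This study examined the usage pattern of nuclear gene codons in soybean and explored the factors that influence codon composition and coding characteristics, in order to provide a theoretical basis for improving soybean through genetic engineering. Using 46,430 high-confidence coding genes from the soybean genome and 2,071 soybean full-length transcript sequences as data sources, CodonW software was applied to calculate and statistically analyze the genome-wide codon composition, synonymous codon usage frequency, and codon usage parameters of the coding regions in the full-length transcriptome. Gene expression level was highly significantly and positively correlated with both the G+C and GC3s contents of the coding region, genes with higher G+C and GC3s contents showed stronger codon usage preference, and UCC and GCC were identified as the optimal codons of soybean. Analysis grouped by coding-region length showed that codon usage preference decreased as coding-region length increased, and that genes with longer coding regions tended to use codons randomly; within the range of the transcriptome data, genes with coding regions of 400 to 600 bp had the highest expression level. Genes specifically expressed in soybean leaves and in seeds were similar in codon usage preference and expression level, but seed-specific genes had significantly higher G+C and GC3s contents than leaf-specific genes, while their aromatic amino acid content was highly significantly lower than that of leaf-specific genes.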
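
The quantities named in the abstract (G+C content, GC3s, and relative synonymous codon usage) can be illustrated with a short Python sketch. This is not CodonW and not the authors' pipeline; it is a minimal approximation under stated assumptions: the standard genetic code, DNA notation (so the optimal codon UCC appears as TCC), GC3s taken as the G+C fraction at the third position of codons for amino acids with synonymous alternatives (Met, Trp, and stop codons excluded), and RSCU taken as the observed codon count divided by the count expected under equal use of all synonymous codons. The function names and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split a coding sequence into complete codons (RNA input is converted to DNA)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc_content(cds):
    """Fraction of G and C over all bases of the sequence."""
    seq = cds.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3s(cds):
    """G+C fraction at the third position of synonymous codons (assumed definition)."""
    syn = [c for c in codons(cds) if CODON_TABLE.get(c, "*") not in "MW*"]
    return sum(c[2] in "GC" for c in syn) / len(syn)


def rscu(cds):
    """Relative synonymous codon usage for each codon of every amino acid observed."""
    counts = Counter(c for c in codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for c in family:
            # observed count divided by the count expected under equal usage
            values[c] = counts[c] * len(family) / total
    return values


# Hypothetical toy sequence, used only to exercise the functions.
example_cds = "ATGGCCGCCGCTTCCTCTTGGTAA"
example_rscu = rscu(example_cds)
```

An RSCU value above 1 means a codon is used more often than expected under equal synonymous usage. Identifying optimal codons, as the abstract does for UCC and GCC, would additionally require comparing usage between high- and low-expression gene sets, which this sketch does not attempt.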
