Research Supervisor Connect

Discovering DNA sequences based on error control codes


Error control codes have been widely used in communication systems to reduce errors in transmission. The general idea is that redundant symbols are first added to the useful data symbols (encoding), to form a transmitted symbol sequence. This symbol sequence passes through a noisy channel, which induces errors. At the receiver, the redundant symbols are utilized to obtain an estimate of the original transmitted symbol sequence (decoding), in the presence of these errors.
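The encode, corrupt, decode cycle described above can be illustrated with a minimal sketch. It uses the Hamming(7,4) code, chosen here only because it is one of the simplest error control codes; the function names are illustrative and not part of the project description. The code adds three redundant bits to four data bits and corrects any single-bit error.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (encoding step)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    # Codeword layout by 1-indexed position: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(received):
    """Estimate the 4 data bits from a 7-bit word with at most one error."""
    word = list(received)
    # The syndrome is the XOR of the 1-indexed positions holding a 1.
    # It is 0 for a valid codeword, otherwise it names the flipped position.
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position
    if syndrome:
        word[syndrome - 1] ^= 1  # correct the single-bit error
    return [word[2], word[4], word[5], word[6]]


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
noisy = list(codeword)
noisy[4] ^= 1  # the noisy channel flips one symbol
recovered = hamming74_decode(noisy)
print(codeword, noisy, recovered)
```

With more than one flipped bit per codeword this particular code decodes incorrectly, which is why stronger codes such as BCH codes are used when more errors are expected.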


Supervisors

Professor Yonghui Li, Professor Branka Vucetic

Research location

Electrical and Information Engineering

Program type



Error control codes have recently been applied to understanding the structure and generation of DNA. Specifically, to construct a protein, the DNA in a nucleus is first copied (transcribed) to an mRNA sequence, which is then used for protein construction (translation and folding). Errors may occur during this process, and they may result in a protein that does not follow the original DNA design. Remarkably, however, the resulting protein often closely matches the original design. This suggests that some form of encoding/decoding process occurs during protein construction. Recent results already indicate that certain DNA sequences are generated by BCH codes. The task of the student is to discover which DNA sequences are generated by a particular error control code. Understanding the structure of DNA sequences is crucial in analyzing genetic disorders, which can cause fatal diseases such as cancer.
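One way to picture the search task is sketched below, under loudly stated assumptions. The nucleotide-to-bit mapping (purines to 0, pyrimidines to 1), the choice of the binary BCH(15,7) code, and all function names are hypothetical illustrations, not the method used in the results mentioned above. The sketch slides a 15-base window along a sequence and reports windows whose bit pattern is a codeword, that is, whose polynomial over GF(2) is divisible by the generator polynomial g(x) = x^8 + x^7 + x^6 + x^4 + 1.

```python
# Generator polynomial of the binary BCH(15, 7) code,
# g(x) = x^8 + x^7 + x^6 + x^4 + 1, stored as an integer bit mask.
BCH_15_7_GENERATOR = 0b111010001

# HYPOTHETICAL mapping, chosen only for illustration:
# purines (A, G) -> 0, pyrimidines (C, T) -> 1.
NUCLEOTIDE_TO_BIT = {"A": 0, "G": 0, "C": 1, "T": 1}


def poly_remainder(dividend, divisor):
    """Remainder of polynomial division over GF(2); polynomials are int bit masks."""
    divisor_degree = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= divisor_degree:
        shift = dividend.bit_length() - 1 - divisor_degree
        dividend ^= divisor << shift  # subtract (XOR) the aligned divisor
    return dividend


def window_to_int(window):
    """Pack a string of bases into an integer, first base as the highest bit."""
    value = 0
    for base in window:
        value = (value << 1) | NUCLEOTIDE_TO_BIT[base]  # KeyError on other symbols
    return value


def find_codeword_windows(sequence, length=15, generator=BCH_15_7_GENERATOR):
    """Return (start, window) pairs whose bit pattern is divisible by the generator."""
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        if poly_remainder(window_to_int(window), generator) == 0:
            hits.append((start, window))
    return hits


# Toy input: the first 15 bases spell the generator polynomial itself under the
# assumed mapping, so the window at position 0 is a codeword by construction.
print(find_codeword_windows("AAAAAACCCACAAACGT"))
```

Under this mapping an all-purine window maps to the all-zero codeword and always matches, and random windows match by chance about once in 256, so any real study would need a statistical baseline before reading meaning into the hits.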

Opportunity ID

The opportunity ID for this research opportunity is 1748
