Biosteganography

What is Biosteganography

DNA for Non-Biologists

DNA stands for deoxyribonucleic acid.

A eukaryote is any cell or organism that possesses a clearly defined nucleus. Humans and many other organisms have eukaryotic cells.

In eukaryotes, the DNA is inside the nucleus. DNA is made of chemical building blocks called nucleotides. A DNA molecule consists of two long polynucleotide chains. Each of these chains is known as a DNA chain, or a DNA strand.

The four types of nitrogenous bases found in nucleotides are adenine (A), thymine (T), guanine (G), and cytosine (C). Remember these names, because they will be used like the 0 and 1 in computing. As we will see below, data that would otherwise be stored as 0s and 1s will be stored as sequences of A, T, G, and C. The order, or sequence, of these bases determines what biological instructions are contained in a strand of DNA.

DNA sequencing is the process of determining the order of the four bases, thymine (T), adenine (A), cytosine (C), and guanine (G), in a molecule of DNA. Comparing healthy and mutated DNA sequences can help diagnose diseases and guide patient treatment, allowing faster and more individualized medical care.
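To make the idea of comparing sequences concrete, here is a minimal Python sketch that lists the positions where a sample differs from a reference. The function name and both sequences are made up for illustration, and the sketch assumes the two sequences are already aligned and of equal length; real analysis pipelines handle insertions, deletions, and sequencing errors, which this does not.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Invented example sequences; position 6 (counting from 0) differs.
reference = "ATGCGTACCTGA"
sample = "ATGCGTTCCTGA"
print(find_substitutions(reference, sample))  # [(6, 'A', 'T')]
```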

The complete genetic information of an organism is called the genome. The study of the genome is called genomics. Humans inherit one half of their DNA from their father and one half from their mother.

DNA Computing

In DNA computing, computations are performed using biological molecules rather than traditional silicon chips. Richard Feynman introduced the idea in 1959, but DNA computing was not formally demonstrated until 1994, when Leonard Adleman showed how molecules could be used to solve computational problems.

Three years after Leonard Adleman's demonstration, researchers from the University of Rochester developed logic gates made of DNA. This was another major step, because logic gates convert the binary code moving through a computer into signals that the computer uses to perform operations. DNA logic gates bring us closer to a computer with a structure similar to that of an electronic PC, but there are some important differences: the large supply of DNA makes it a cheap resource, and DNA computers can be far smaller than today's computers.

In DNA computing, instead of 0 and 1 (the binary alphabet used by traditional computers), we use the four-character genetic alphabet A, G, C, and T, where A is adenine, G is guanine, C is cytosine, and T is thymine. To store a binary digital file as DNA, the bits are converted from 0s and 1s into a sequence of these letters. The physical storage medium is a synthesized DNA molecule whose bases follow that sequence, in an order corresponding to the order of the bits in the digital file. To recover the data, the DNA is sequenced and the resulting string of A, C, G, and T is decoded back into the original sequence of 0s and 1s.

DNA computing harnesses the enormous parallel computing ability and high memory density of biomolecules, and it is dramatically changing what is possible in cryptography. DNA cryptography includes both encryption and steganography. We can produce encoded DNA (enDNA) by transforming a binary string into the quaternary code of DNA nucleotides A, G, C, and T, as we will see in more detail below.

DNA is an excellent medium for data storage, with an information density of petabytes of data per gram (a petabyte is 1 million gigabytes). The quantity of data that can be stored in biological media far exceeds the capacity of magnetic tapes and disks. Because there are 4 nucleotides, each nucleotide can represent 2 bits (A = 00, T = 01, C = 10, G = 11), so a set of 4 nucleotides can store 1 byte. Over 10 trillion DNA molecules can fit into a volume no larger than 1 cubic centimeter, and with this amount of DNA, a computer would be able to hold 10 terabytes of data. If we need more, we simply add more DNA molecules.
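The two-bit mapping above (A = 00, T = 01, C = 10, G = 11) can be sketched in a few lines of Python. The function names are illustrative, not taken from any library, and this is only a direct translation of the mapping; practical storage schemes typically add error correction and avoid long runs of the same base, which this sketch does not attempt.

```python
# Mapping taken from the text: A = 00, T = 01, C = 10, G = 11.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as 4 nucleotides (2 bits per nucleotide)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 nucleotides")
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = bytes_to_dna(b"Hi")
print(strand)                # TACATCCT
print(len(strand))           # 8 nucleotides for 2 bytes
print(dna_to_bytes(strand))  # b'Hi'
```

The letter "H" is 01001000 in binary, which splits into 01, 00, 10, 00 and therefore becomes TACA: four nucleotides for one byte, as stated above.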

Researchers have successfully encoded audio, images and text files into synthesized DNA molecules, and then successfully read the information from the DNA and recovered the files.

With DNA storage, we can store massive quantities of data in media with very small physical volume. A major advantage of DNA storage over optical, magnetic, and electronic media is that DNA molecules can survive for thousands of years, so a digital archive encoded in this form could still be recovered thousands of years from now.

Unlike floppy disks or CDs, DNA is not expected to become obsolete as a storage format. DNA storage also promises to improve environmental sustainability dramatically, with far lower greenhouse gas emissions, energy consumption, and water use.

The main disadvantages of DNA storage are the slow encoding speed and the high cost. Year after year, however, costs fall and speeds rise. The technology is expected to become commercially viable on a large scale in a few years, to the point where DNA storage can function effectively for general backup applications and even primary storage.

Today, we have the technology to manufacture DNA molecules with arbitrary sequences. It is worth noting that the molecules we make are synthetic DNA, not biological DNA. No life, cells, or organisms are involved in this type of digital data storage; we are simply using synthetic DNA as a medium to store information.

Biosteganography

During the Cold War, spies used microdot cameras to photograph documents and reduce them onto a single tiny piece of film. The piece of film, as small as a period (.), could be embedded in the text of a letter. Microdots were also hidden in other things.

The FBI’s March 2020 Artifact of the Month was more than just a toy—it was a tool of espionage tradecraft. A German spy used this doll to smuggle secret photographs to Nazi Germany. The photos were reduced in size so that the film they were on was as small as the period at the end of a sentence. Spies hid this film, called a “microdot,” on the doll, where it was virtually invisible to regular censors.

How did microphotography work? Spies would photograph espionage material with a camera. Then, through a special contraption of lenses, they would copy the image, reduce it in size, and imprint it on specially sensitized film. The Germans concealed microdots on letters and other materials they could carry across borders or mail to dead letter boxes in Europe. (A dead letter box was a fake address that acted as a cutout between a spy and German intelligence headquarters.)

In DNA steganography, we can also hide encrypted messages within microdots, this time DNA-encoded microdots. We start with a plain-text message, encrypt it, and then convert the letters of the encrypted message into combinations of thymine (T), adenine (A), cytosine (C), and guanine (G), creating a synthetic strand of DNA whose sequence corresponds to the order of the bits in the digital file. This tiny piece of DNA carrying the message is then placed into a normal piece of DNA, which is mixed with DNA strands of similar length. The mixture is dried on paper that can be cut into microdots, with each dot containing billions of strands of DNA. The message is very difficult to detect: only one of the billions of strands within the microdot contains it, and that strand is also encrypted.
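The steps above (encrypt, convert to bases, hide among similar strands) can be simulated in software. The following Python sketch rests on several assumptions that are not part of the text: the encryption is a toy XOR with a random key of the same length as the message, the two marker sequences are invented and stand in for the surrounding carrier DNA so the recipient can find the right strand, and the decoys are random strings rather than real DNA. All function names are hypothetical.

```python
import random
import secrets

# Two bits per base, as in the mapping A = 00, T = 01, C = 10, G = 11.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

# Hypothetical marker sequences known only to sender and recipient.
LEFT_MARKER = "TCCCTCTTCGTCGAGTAGCA"
RIGHT_MARKER = "TCTCATGTACGGCCGTGAAT"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy encryption: XOR each data byte with the matching key byte."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def bytes_to_dna(data: bytes) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def hide(message: bytes, key: bytes, decoys: int, rng: random.Random) -> list[str]:
    """Encrypt, encode, flank with markers, and mix with same-length decoys."""
    secret = LEFT_MARKER + bytes_to_dna(xor_bytes(message, key)) + RIGHT_MARKER
    pool = [
        "".join(rng.choice("ATCG") for _ in range(len(secret)))
        for _ in range(decoys)
    ]
    pool.append(secret)
    rng.shuffle(pool)  # the message strand is now somewhere among the decoys
    return pool


def recover(pool: list[str], key: bytes) -> bytes:
    """Find the strand carrying both markers, then decode and decrypt it."""
    for strand in pool:
        if strand.startswith(LEFT_MARKER) and strand.endswith(RIGHT_MARKER):
            body = strand[len(LEFT_MARKER):len(strand) - len(RIGHT_MARKER)]
            return xor_bytes(dna_to_bytes(body), key)
    raise LookupError("no strand with the expected markers was found")


message = b"MEET AT DAWN"
key = secrets.token_bytes(len(message))
microdot = hide(message, key, decoys=1000, rng=random.Random())
print(recover(microdot, key))  # b'MEET AT DAWN'
```

Without the markers, every strand in the pool has the same length and looks equally random; without the key, even the correct strand decodes to meaningless bytes. A real microdot holds billions of strands rather than the thousand simulated here.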

What about digital watermarking? It stands to benefit as well: we can place tiny DNA authentication stamps that make counterfeits or copyright infringements easy to detect.
