“Life is a process of self-assembly.”


Soft-spoken biochemist Alex Li apologizes if he’s sounding too philosophical, but it’s hard to avoid such reflections when your work deals with the fundamental principles of how living things are put together.

He’s especially fascinated by the way proteins come to have the shape they have. Proteins make up our hair and muscle, our brains and lungs, our enzymes and antibodies, and each one must attain a particular shape in order to do its work. They start out as chains of small links called amino acids and then, within milliseconds of their creation, they fold and twist and wad up into the distinctive shapes that are critical to their function. Many go on to combine with other proteins—either identical copies of themselves or different proteins—to assemble into a sort of super-structure.

Alex Li (Photo Robert Hubner)

Despite the complexity of the task, it appears that proteins assemble with little or no help from “cellular machinery,” says Li. Amazingly, improbably, most protein complexes achieve their shape by following specific codes that are built into their structure. They self-assemble.

While other scientists delve into the details of how that happens, Li is looking for ways to turn the natural folding and assembly processes to our advantage—to use them to make nano-scale machines that could do things like deliver drugs to a specific location in the body or sense the presence of a pathogen or toxin.

As Li sees it, nature’s assembly methods have been honed by eons of evolutionary selection; rather than creating nanomachines by trying to shrink our standard methods of production, why not use the processes nature provides?


With its bumps and grooves and hidden pockets, the surface of a mature protein is so distinctive that Li thinks it presents a “molecular code” that allows it to be recognized by other proteins that have a matching code, the way a key fits its matching lock.

He thinks such codes are powerful enough to explain how proteins and other molecules self-assemble, and that if we understood how the codes work, we could use the same tactic to manufacture nanomachines: molecules made to order to do specific tasks.

“It starts here,” says Li, tapping his head. “You visualize something: I think this has the perfect matching code, this is going to self-assemble.”

In one recent experiment, he showed that a very simple molecular code enables molecules to recognize each other and come together to form a larger structure. He made molecules that were flat and roughly oval in shape. Each had two small gaps, or bays, where Li could attach other small chemical groups. The bays weren’t big enough to accommodate the added groups unless the whole molecule twisted a bit to open up the bays more. By attaching groups of different sizes, Li forced the molecule to twist very little, a lot, or an in-between amount. Then he mixed molecules that had different amounts of twist to see whether any of them would recognize and associate with each other.

He picks up a sheet of paper and holds it out horizontally in front of himself.

“We’re taking a planar molecule and we’re twisting it,” he says, turning the edges of the paper in opposite directions. “As we’re twisting the plane at different degrees, we make different codes.”

He found that molecules with the same amount of twist glommed onto each other and stacked up like Pringles potato chips. Those with slightly different amounts of twist associated to some extent, like regular chips that sometimes fit together but more often don’t. Those with very different amounts of twist were like popcorn. The individual units didn’t come together at all.

“When molecules have a matching code, it’s kind of like people sharing the same personality, same common interests,” says Li. “They just get together and become friends.” Molecules with incompatible codes, on the other hand, “basically hate each other. You put them in the same flask, they don’t see each other. They never come together.”


That proteins follow precise patterns of folding to attain just the right shape to do their job has been standard fare in biochemistry classes for decades, says WSU molecular biologist Ray Reeves.

“I was trained that proteins have to have some sort of structure,” he says. “The other part was not interesting.”

Ray Reeves, WSU molecular biologist
(Photo Robert Hubner)

By “other part” he means portions of some proteins that were found to be not all that orderly. Instead of maintaining a distinctive shape, they flopped around loosely. They appeared to have no real purpose, and were thought to be evolutionary relics like the appendix in humans.

Then Reeves met a protein called HMGA.

It was in the 1980s and he was studying proteins involved in cell division. One group of proteins caught his interest because of one strange behavior: They remained dissolved in 10 percent acid. Other proteins precipitate—become solid—at such concentrations of acid.

Reeves found that one member of the group, HMGA, is strongly associated with dividing cells, suggesting it plays an important role in that process. Yet it floats around like an open chain. It has no shape, no characteristic structure, of its own.

Which looked like scientific heresy. With no inherent shape, what does HMGA do, and how does it do it?

Over the next several years, Reeves and his students found that HMGA is a transcription factor, a protein that binds to DNA and assists in turning on, or off, specific genes. In humans, it’s involved in the regulation of at least 50 genes, almost all of them involved in controlling cell division or growth. HMGA is abundant in embryonic cells, which are dividing rapidly as part of normal growth. It is present in lower amounts in adult cells that divide slowly throughout life, like those that line the gut and lungs.

It also shows up in cancer cells.

“This is one of the best biomarkers for cancer,” says Reeves. HMGA has been found in almost every cancer that has been looked at, including lymphoma, breast cancer, and prostate cancer, “and the worse the cancer, the higher the level.”

But the mystery of its shapelessness remained. Other transcription factors have definite shapes and turn on their target genes by recognizing specific sequences of DNA. How does HMGA work?

DNA is composed of four subunits, labeled A, T, C, and G (for adenine, thymine, cytosine, and guanine). A single gene has hundreds or thousands of these subunits in a specific sequence that spells out the order of amino acids needed to make a particular protein.

Most transcription factors recognize a short sequence near the start of their target gene. Reeves found that HMGA doesn’t recognize DNA sequence at all. Instead, it recognizes the structure of DNA in certain areas, and then shapes itself to fit the structure it finds.

The key to how it works is that DNA is not symmetrical. It’s twisted in such a way that one side of it is narrower than the other. In stretches of DNA with all As and Ts (and no Cs or Gs), that narrow side is especially skinny. A patch of just 6 As or Ts in a row, in any order, is enough to create that skinny groove.
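To make the "6 As or Ts in a row, in any order" rule concrete, here is a minimal Python sketch that scans a DNA sequence for such patches. The function name, the example sequence, and the choice to report each maximal run are assumptions made for illustration; this is not a tool used by Reeves or Li, and it says nothing about whether HMGA actually binds a given site.

```python
import re


def find_at_tracts(sequence, min_length=6):
    """Return (start, end, tract) for each maximal run of A/T bases.

    Only runs of at least `min_length` bases are reported. Positions are
    0-based, and `end` is exclusive, as in Python slicing.
    """
    # [AT]{6,} matches six or more consecutive A or T bases in any order.
    pattern = re.compile(r"[AT]{%d,}" % min_length)
    return [
        (match.start(), match.end(), match.group())
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical sequence, invented purely for this illustration.
example = "GCGCAATTTAGGCCATATGCGAAAAAATTC"

for start, end, tract in find_at_tracts(example):
    print(start, end, tract)
```

In this made-up sequence, the four-base run ATAT is skipped because it falls short of six bases, while the two longer A/T-only stretches are reported.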

“It’s like the Colorado River in the Grand Canyon,” says Reeves. “It’s that narrow canyon that these guys [HMGA] are looking for.”

When HMGA finds such a slot in DNA, in it goes. Then it latches on with three “AT hooks” that form in the protein to snag the edges of the DNA. “They’re like little hands sticking in there and getting a grip,” says Reeves, “but they don’t have any shape [of their own] until they recognize and bind to something.”

As HMGA binds to DNA, it changes the shape of the DNA just enough to allow other transcription factors to bind and turn on (or off) whatever gene is nearby.

Since the discovery of HMGA, biochemists have taken a closer look at the unruly, shapeless parts of other proteins, those parts that were once thought to be useless. So far they all have some way of fitting themselves to the shape of their target; many even have AT hooks. “Ours is just taking that idea to the extreme,” says Reeves.


Looking for ways to exploit the ability of coded molecules to recognize each other, Li hit upon the idea of a “smart” sensor that would signal when it recognized something of interest. In recent years, many labs have worked on biosensors that will detect a virus or an airborne toxin. Each of those biosensors was built more or less from scratch and was designed to recognize just one kind of target. Li aimed for a sensor that could be tailored to different uses by plugging in a part that would recognize the specific thing you were interested in.

He came up with a molecule made of two kinds of alternating segments, like lengths of a broomstick connected by lengths of chain. The broomstick segments provide the basic framework of the sensor. They share a code that allows them to attach to each other so the entire structure folds up like a road map. The chain segments could be bits of protein or DNA—something that will recognize and bind to the target. At each end of the sensor, Li attached a different color fluorescent dye.

At rest, the sensor is fully folded and the dyes at the ends are close to each other. When the recognition segments (the lengths of chain) bind to their target, their structure changes. That, in turn, makes the folds open up, which makes the dyes at the ends move farther apart, which changes the color you see (cf. illustration below).

Li built the framework segments, incorporating code features so they would match up and make the sensor fold. The next step was to select linker segments that would be the “sensing” part of the sensor. He needed to use something that would recognize a specific target, and would change its shape when it bound to that target. He asked if Reeves knew of any good candidates.

“I said, ‘Funny you should mention that!’” recalls Reeves. “‘I think we’ve got something for you.’”

Reeves suggested that using stretches of AT-rich DNA as the linkers should allow the sensor to detect the presence of HMGA. It would be a good way to test Li’s design, and if it worked, it could offer a new diagnostic test for cancerous cells.

Li tried it. He connected his framework segments with lengths of AT-rich DNA and then exposed the sensor to HMGA. It worked—HMGA bound to the linker segments and changed the shape of the sensor enough to change the color it emitted.

“He said, ‘your protein is amazing!’” says Reeves. “I said, ‘it’s not my protein, it’s nature’s protein, but it is amazing.’”

Li hopes he’ll eventually be able to put his biosensor into cells to see if they’re making more HMGA than they should be (and therefore might be cancerous). The sensor is small enough that it can be used with living cells, and it is much more sensitive than current methods of detecting cancer, but a few hurdles remain before it is ready for clinical use. Li still needs to find a way to get the sensor into the cells in question. He also has to figure out how to tell whether a cell that’s making HMGA is normal and harmless, or poses a threat. At early stages of cancer, the cells aren’t dividing rapidly and don’t look much different from healthy cells. That’s why early detection by current methods is so difficult.

“Can you distinguish that [normal] guy from something that’s going to go on to become a tumor?” says Reeves. “At the low end of the scale, going from normal to cancerous is the [point] that would be most important and interesting to detect. I think we’re getting closer, but it’s not there yet.”


The problems yet to be solved don’t faze Li. His smart sensor works; he’s devised a way to make molecules that are the size he needs and that will find their proper partners and assemble into the proper form; his ideas about how molecules recognize each other have been confirmed.

Best of all, he feels he’s onto something entirely new, a set of fundamental principles that apply to many aspects of chemistry.

“I think we have a cross-cutting theory that allows you to say exactly why molecules come together and why they don’t come together,” he says.

Besides, the work is just plain fun.

“I love coming in every day and thinking about molecules,” he says.


These molecules are made for each other: Alex Li showed that the shape of a molecule is a “code” that allows it to recognize molecules with similar codes.

WSU biochemist Alex Li used molecular codes to design a biosensor that detects HMGA, a protein made by cancer cells.

Shapeshifting protein: Most proteins have a distinctive shape that influences which other molecules they interact with. WSU molecular biologist Ray Reeves works with a remarkable protein that has no shape of its own.

On the Web

Foldit :: A free computer puzzle game designed by scientists to calculate protein shapes. (Read an article about Foldit, “Wasting Time for a Good Cause.”)

What is Folding and Why Does it Matter?