
A new comprehensive map links each human gene to its function

Data from the new gene function map are available for other scientists to use. “It’s a great resource in the way that the human genome is a great resource, because you can go in and do discovery-based research,” said Professor Jonathan Weissman.

The researchers used their single-cell sequencing tool, Perturb-seq, on every gene expressed in the human genome, linking each to its job in the cell.

Genetic research has progressed rapidly over the past few decades. For example, just a few months ago, scientists announced the first complete sequencing of the human genome without gaps. Now researchers have advanced again, creating the first comprehensive functional map of genes that are expressed in human cells.

The Human Genome Project was an ambitious initiative to sequence every part of human DNA. The project brought together collaborators from research institutions around the world, including the Whitehead Institute for Biomedical Research at the Massachusetts Institute of Technology, and was finally completed in 2003. Now, more than two decades later, Massachusetts Institute of Technology professor Jonathan Weissman and colleagues have gone beyond the sequence to present the first comprehensive functional map of genes that are expressed in human cells. The data from this project, published online on June 9, 2022 in the journal Cell, tie each gene to its job in the cell and are the culmination of years of collaboration on the Perturb-seq single-cell sequencing method.

The data are available for use by other scientists. “It’s a great resource in the way the human genome is a great resource, because you can go in and do discovery-based research,” said Weissman, who is also a member of the Whitehead Institute and an investigator at the Howard Hughes Medical Institute. “Instead of defining in advance what biology you are going to look at, you have this genotype-phenotype map and you can go in and search the database without having to do any experiments.”

CRISPR, short for clustered regularly interspaced short palindromic repeats, underlies a genome editing tool first demonstrated in 2012 that has made DNA editing far more accessible. It is simpler, faster, cheaper and more accurate than previous genetic editing methods.

The screen allowed the researchers to delve into a variety of biological questions. They used it to explore the cellular effects of genes with unknown functions, to investigate how mitochondria respond to stress, and to screen for genes that cause chromosomes to be lost or gained, a phenotype that has proved difficult to study in the past. “I think this set of data will allow for all sorts of analyses that we haven’t even come up with yet from people who come from other parts of biology, and suddenly they just have something to take advantage of,” said Tom Norman, a former Weissman Lab postdoc and co-author of the paper.

Pioneering Perturb-seq

The project takes advantage of the Perturb-seq approach, which makes it possible to follow the impact of turning genes on or off with unprecedented depth. This method was first published in 2016 by a group of researchers including Weissman and fellow MIT professor Aviv Regev, but it could only be used on small sets of genes and at great expense.

The massive Perturb-seq map was made possible by foundational work from Joseph Replogle, a doctoral student in Weissman’s lab and co-author of the paper. Replogle, in collaboration with Norman, who now leads a lab at Memorial Sloan Kettering Cancer Center; Britt Adamson, an assistant professor in the Department of Molecular Biology at Princeton University; and a group at 10x Genomics, set out to create a new version of Perturb-seq that could be scaled up. The researchers published a proof-of-concept paper in Nature Biotechnology in 2020.

The Perturb-seq method uses CRISPR-Cas9 genome editing to introduce genetic changes into cells and then uses single-cell RNA sequencing to capture information about RNA that is expressed as a result of a genetic change. Because RNA controls all aspects of cell behavior, this method can help decode the many cellular effects of genetic change.
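As a rough illustration of the kind of data this produces, the sketch below groups cells by the gene that was perturbed and compares each group's average expression with unperturbed control cells. It is a toy example under stated assumptions, not the authors' pipeline: the labels (`control`, `GENE_X`), the three readout values per cell, and the function names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical cell records: (perturbation label, expression of three readout genes).
cells = [
    ("control", [10.0, 20.0, 30.0]),
    ("control", [12.0, 18.0, 30.0]),
    ("GENE_X", [5.0, 20.0, 40.0]),
    ("GENE_X", [7.0, 22.0, 44.0]),
]

def mean_profiles(cells):
    """Average the expression values of all cells sharing a perturbation label."""
    grouped = defaultdict(list)
    for label, expression in cells:
        grouped[label].append(expression)
    return {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }

def effect_vs_control(cells, control_label="control"):
    """Return each perturbation's mean expression minus the control mean."""
    profiles = mean_profiles(cells)
    baseline = profiles[control_label]
    return {
        label: [value - base for value, base in zip(profile, baseline)]
        for label, profile in profiles.items()
        if label != control_label
    }

effects = effect_vs_control(cells)
print(effects)
```

Each entry in `effects` is a simple stand-in for the transcriptional phenotype of one perturbation; a real analysis would involve thousands of readout genes and proper normalization.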

After their initial proof of concept, Weissman, Regev, and others used this method of sequencing on a smaller scale. For example, researchers used Perturb-seq in 2021 to study how human and viral genes interact during infection with HCMV, a common herpesvirus.

In the new study, Replogle and colleagues, including Reuben Saunders, a graduate student in Weissman’s lab and co-author of the paper, scaled the method up to the entire genome. Using human blood cancer cell lines as well as non-cancerous cells derived from the retina, the team performed Perturb-seq on more than 2.5 million cells and used the data to build a comprehensive map linking genotypes to phenotypes.

Data insight

After completing the screen, the researchers decided to put their new dataset to use and examine several biological questions. “The advantage of Perturb-seq is that it allows you to get a large set of data in an unbiased way,” says Tom Norman. “No one knows exactly what the limits are of what you can get from this type of data set. Now the question is, what do you actually do with it?”

The first, most obvious application was to look into genes with unknown functions. Because the screen also read out the phenotypes of many known genes, the researchers could use the data to compare unknown genes with known ones and look for similar transcriptional outcomes, which may suggest that the gene products work together as part of a larger complex.
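The comparison described above can be sketched as a similarity ranking. The example below is a minimal illustration that assumes Pearson correlation as the similarity measure (the article does not specify which measure the study used) and relies entirely on invented gene names and numbers.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical expression changes (perturbed minus control) across five readout genes.
known_profiles = {
    "KNOWN_A": [2.0, -1.0, 0.5, 0.0, -2.0],
    "KNOWN_B": [1.8, -0.9, 0.6, 0.1, -1.9],
    "KNOWN_C": [-1.0, 2.0, -0.5, 1.0, 0.5],
}
unknown_profile = [1.9, -1.1, 0.4, 0.0, -2.1]

def rank_similar(query, known):
    """Sort known genes from most to least similar to the query profile."""
    scores = {gene: pearson(query, profile) for gene, profile in known.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_similar(unknown_profile, known_profiles)
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy data the unknown gene closely tracks `KNOWN_A` and `KNOWN_B` and runs opposite to `KNOWN_C`, the sort of pattern that would hint at a shared complex or pathway.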

The mutation of a gene called C7orf26 stood out. The researchers noticed that the genes whose removal led to a similar phenotype were part of a protein complex called Integrator, which plays a role in making small nuclear RNAs. The Integrator complex is made up of many smaller subunits (previous studies had suggested 14 individual proteins), and the researchers were able to confirm that C7orf26 is a 15th component of the complex.

They also found that the 15 subunits work together in smaller modules to perform specific functions within the Integrator complex. “Absent this thousand-foot-high view of the situation, it was not so clear that these different modules are so functionally distinct,” Saunders said.

Another advantage of Perturb-seq is that because the analysis focuses on individual cells, researchers can use the data to look at more complex phenotypes that become blurred when they are studied together with data from other cells. “We often take all the cells where Gene X is knocked down and average them together to see how they’ve changed,” Weissman said. “But sometimes when you knock down a gene, different cells that lose the same gene behave differently, and that behavior can be overlooked.”
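A small numerical sketch shows why averaging can hide this. The two hypothetical knockdowns below have the same average effect on a readout gene, yet in one of them individual cells split into two very different responses. The numbers and names are invented for illustration only.

```python
from statistics import mean, pvariance

# Hypothetical per-cell expression change of one readout gene after two knockdowns.
uniform_knockdown = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0]   # every cell responds alike
mixed_knockdown = [3.0, -1.0, 3.0, -1.0, 3.0, -1.0]  # cells split into two behaviors

def summarize(per_cell_changes):
    """Return the bulk-style average and the cell-to-cell variance."""
    return mean(per_cell_changes), pvariance(per_cell_changes)

for name, values in [("uniform", uniform_knockdown), ("mixed", mixed_knockdown)]:
    average, variance = summarize(values)
    print(name, round(average, 3), round(variance, 3))
```

The averages are effectively identical, so a bulk measurement could not tell the two knockdowns apart, while the cell-to-cell variance separates them clearly.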

The researchers found that a subset of genes whose removal led to different outcomes from cell to cell were responsible for chromosome segregation. Removing them causes cells to lose a chromosome or pick up an extra one, a condition known as aneuploidy. “You can’t predict the transcriptional response to the loss of this gene, because it depends on the secondary effect of what chromosome you’ve gained or lost,” Weissman said. “We realized that then we could reverse this and create this composite phenotype looking for chromosome signatures that are gained and lost. Thus, we did the first genome-wide screen for factors necessary for proper DNA segregation.”

“I think the study of aneuploidy is the most interesting application of this data so far,” says Norman. “It captures a phenotype that you can only get with a single-cell readout. You can’t go after it any other way.”
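One simple way to picture a chromosome-level signature is to sum a cell's expression over the genes on each chromosome and compare it with control cells. The sketch below does this on invented data. The gene-to-chromosome map, the expression values, and the 30 percent threshold are all assumptions made for illustration, and the approach is a simplified stand-in rather than the method used in the study.

```python
def chromosome_calls(cell, control, gene_to_chrom, threshold=0.3):
    """Label each chromosome as gain, loss, or normal for a single cell.

    Compares the cell's summed expression per chromosome with the control;
    `threshold` is an illustrative cutoff, not a value from the study.
    """
    totals = {}
    for gene, chrom in gene_to_chrom.items():
        observed, reference = totals.get(chrom, (0.0, 0.0))
        totals[chrom] = (observed + cell[gene], reference + control[gene])

    calls = {}
    for chrom, (observed, reference) in totals.items():
        ratio = observed / reference
        if ratio > 1 + threshold:
            calls[chrom] = "gain"
        elif ratio < 1 - threshold:
            calls[chrom] = "loss"
        else:
            calls[chrom] = "normal"
    return calls

# Hypothetical genes, chromosome assignments, and expression values.
gene_to_chrom = {"g1": "chr1", "g2": "chr1", "g3": "chr2",
                 "g4": "chr2", "g5": "chr3", "g6": "chr3"}
control = {gene: 10.0 for gene in gene_to_chrom}
cell = {"g1": 10.0, "g2": 11.0, "g3": 15.0, "g4": 16.0, "g5": 5.0, "g6": 4.0}

calls = chromosome_calls(cell, control, gene_to_chrom)
print(calls)
```

Because the call is made cell by cell, two cells that lost the same gene can receive different labels, which is the kind of composite phenotype a bulk average would wash out.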

The researchers also used their data set to study how mitochondria respond to stress. Mitochondria, which evolved from free-living bacteria, carry 13 protein-coding genes in their genome. Within nuclear DNA, about 1,000 genes are linked in some way to mitochondrial function. “People have long been interested in how nuclear and mitochondrial DNA are coordinated and regulated in different cellular conditions, especially when the cell is under stress,” says Replogle.

The researchers found that when they disrupted different genes associated with mitochondria, the nuclear genome reacted in a similar way to many different genetic changes. However, the responses of the mitochondrial genome were much more variable.

“There’s still an open question as to why mitochondria still have their own DNA,” Replogle said. “The big-picture takeaway from our work is that one benefit of having a separate mitochondrial genome may be localized or very specific genetic regulation in response to various stressors.”

“If you have one mitochondrion that is broken and another that is broken in a different way, those mitochondria may react differently,” Weissman said.

In the future, the researchers hope to use Perturb-seq on different cell types besides the cancer cell line in which they started. They also hope to continue exploring their gene function map and hope others will do the same. “This is really the culmination of many years of work by the authors and other contributors, and I am really pleased to see that it continues to evolve and expand,” says Norman.

Reference: “Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq” by Joseph M. Replogle, Reuben A. Saunders, Angela N. Pogson, Jeffrey A. Hussmann, Alexander Lenail, Alina Guna, Lauren Mascibroda, Eric J. Wagner, Karen Adelman, Gila Lithwick-Yanai, Nika Iremadze, Florian Oberstrass, Doron Lipson, Jessica L. Bonnar, Marco Jost, Thomas M. Norman and Jonathan S. Weissman, June 9, 2022, Cell. DOI: 10.1016/j.cell.2022.05.013