iStock

Report Finds Microsoft Excel Causes Errors in 20 Percent of Genomics Studies

Microsoft Excel, that ubiquitous tool for data crunching, has been playing an unexpected role in the scientific world. The program has been screwing with data in genomics studies. A new report in the journal Genome Biology estimates that around 20 percent of scientific papers published in leading genome-focused journals that include gene lists from Excel contain errors due to the program’s default autocorrect settings, Slate reports.

The problem is that several genes have symbols that look a lot like dates. Excel tends to convert gene symbols like SEPT2 (Septin 2) and MARCH1 (Membrane Associated Ring-CH-Type Finger) into what it thinks is proper date form, turning them into 2-Sep and 1-Mar instead. In some cases, SEPT2 became “2006/09/02.”
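As a rough illustration of which symbols are at risk, the sketch below flags gene symbols made of a month name or abbreviation followed by a day number. The month list and the `looks_like_date` helper are assumptions made for this example; they approximate the pattern described above and are not Excel's actual parsing rules.

```python
import re

# Month names and abbreviations a spreadsheet may read as the start of a date.
# Illustrative assumption only; Excel's real rules are more involved.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month-like prefix followed by a day number from 1 to 31, e.g. SEPT2 or MARCH1.
AT_RISK = re.compile(
    r"^(?:%s)[-\s]?(?:0?[1-9]|[12][0-9]|3[01])$" % "|".join(MONTH_PREFIXES),
    re.IGNORECASE,
)

def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol could plausibly be mistaken for a date."""
    return AT_RISK.match(symbol.strip()) is not None

symbols = ["SEPT2", "MARCH1", "TP53", "BRCA1"]
at_risk = [s for s in symbols if looks_like_date(s)]
print(at_risk)  # ['SEPT2', 'MARCH1']
```

Running a check like this before a gene list goes anywhere near a spreadsheet shows which entries would need to be stored as text.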

"Inadvertent gene symbol conversion is problematic because these supplementary files are an important resource in the genomics community that are frequently reused," the paper’s authors write. They reviewed the supplementary gene list Excel files from 18 journals, examining studies published between 2005 and 2015—Excel’s gene-typo issue was first reported in 2004—for date formatting within lists of genes. The analysis was performed by a program that flagged supplementary materials that seemed to be lists of genes, then searched them for date formatting. Out of more than 35,000 supplementary files, they confirmed 987 files with gene errors that were published as part of 704 studies.
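The search step can be pictured with a short sketch. This is not the authors' program; it is a minimal example that assumes a gene column has already been read in as a list of strings and that converted entries look like the forms mentioned above (day-month such as 2-Sep, or a full date such as 2006/09/02). The `find_converted` helper is a hypothetical name.

```python
import re

# Month abbreviations, including the "Sept" spelling, to be tolerant of variants.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sept|Sep|Oct|Nov|Dec"

DATE_PATTERNS = [
    re.compile(r"^\d{1,2}-(?:%s)$" % MONTHS, re.IGNORECASE),  # 2-Sep, 1-Mar
    re.compile(r"^\d{4}/\d{1,2}/\d{1,2}$"),                    # 2006/09/02
]

def find_converted(entries):
    """Return (index, value) pairs for entries that look like dates."""
    hits = []
    for i, value in enumerate(entries):
        text = value.strip()
        if any(p.match(text) for p in DATE_PATTERNS):
            hits.append((i, value))
    return hits

gene_column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2006/09/02"]
print(find_converted(gene_column))
# [(1, '2-Sep'), (3, '1-Mar'), (4, '2006/09/02')]
```

A flagged entry shows only that a conversion happened. Recovering the original symbol still takes a human, since a date like 1-Mar could stand for more than one gene.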

Overall, 19.6 percent of the papers with supplementary Excel gene lists in the 18 journals contained gene name errors caused by Excel’s autocorrect function, but some journals were worse than others. High-impact journals, typically the most respected outlets in which to publish research, actually had more affected gene lists, which the researchers speculate may be because studies published in these journals are more likely to have larger and more numerous data sets.

The highest proportion of gene lists with errors (more than 20 percent) came from the journals Nucleic Acids Research, Genome Biology, Nature Genetics, Genome Research, Genes and Development, and Nature; conversely, the journals Molecular Biology and Evolution, Bioinformatics, DNA Research, and Genome Biology and Evolution showed errors in less than 10 percent of genomics papers.

While this isn’t the worst scientific error to end up in a journal, since it’s pretty clear that 2006/09/02 isn’t a gene symbol, it’s also fairly disturbing that this many papers could make it through the editing process without anyone noticing that they contained lists of nonexistent genes.

The researchers highlight Google Sheets as a potential alternative to Excel, because it doesn’t suffer from the same symbol-date mixup, and it seems that when Sheets documents are opened in other programs like Excel, the data is protected from Excel’s default autocorrection. They suggest that journal editors and reviewers look out for these errors by pasting gene name lists into blank files and sorting them so that any dates that have been mistakenly inserted become apparent.
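The sorting check works because a converted cell is stored as a date value rather than as text, so it sorts apart from the real gene symbols. The sketch below imitates that idea under a stated assumption: the column has been read by some spreadsheet reader that returns untouched symbols as strings and converted cells as `date` objects. The sample column and the `dates_first` key are hypothetical.

```python
from datetime import date

# Hypothetical column as a spreadsheet reader might return it:
# untouched symbols stay strings, converted cells come back as dates.
column = ["TP53", date(2006, 9, 2), "BRCA1", date(2006, 3, 1), "ACTB"]

def dates_first(value):
    # Dates sort ahead of text; within each group, sort by string form.
    return (not isinstance(value, date), str(value))

sorted_column = sorted(column, key=dates_first)
suspects = [v for v in sorted_column if isinstance(v, date)]
print(suspects)  # [datetime.date(2006, 3, 1), datetime.date(2006, 9, 2)]
```

After sorting, any converted entries cluster at the top of the list, where a reviewer can spot them at a glance.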

[h/t Slate]

Gino Fornaciari, University Of Pisa
Scientists Accidentally Discover Ancient Hepatitis B in a 16th-Century Mummy

Since the 1980s, a child mummy buried in the 16th century in the Basilica of Saint Domenico Maggiore in Naples, Italy, has been known as the earliest recorded case of smallpox in the world. The problem is, the 2-year-old didn’t have smallpox, according to new research spotted by IFLScience. But, as the scientists reexamining the remains discovered, the mummy is still a landmark find for the study of disease evolution. It appears to be the earliest instance of hepatitis B that researchers have ever found in Italy, giving scientists insight into how the virus has evolved over the last several centuries.

The hepatitis B virus (HBV) attacks the liver and can result in cirrhosis and liver cancer, killing around 887,000 people per year. Though it can now be largely prevented by a vaccine, the World Health Organization estimates that 257 million people around the world live with HBV. It often affects children, spreading from mother to child during birth.

For the current study, published in PLOS Pathogens, a team of researchers from McMaster University in Canada set about studying the child mummy in the hope of continuing their past work nailing down how smallpox spread and evolved over human history. But when they used molecular analysis to study the mummy’s skin and bones, they didn’t find anything that indicated that the toddler had smallpox. Instead, they found the hepatitis B virus, which can cause a rash called Gianotti-Crosti syndrome that the original researchers studying the mummy may have mistaken for the telltale rash associated with smallpox.

The ancient HBV strain found in the mummy's tissues had a genome closely related to that of the modern virus, which, The New York Times explains, could very well mean that the mummy was contaminated when it was first studied in the 1980s. But after analyzing the genetic material further and studying other examples of older HBV strains, the researchers found that it’s plausible that the virus just hasn’t evolved extensively in the past 500 years. Though the contamination theory is still possible, it’s more likely that the mummy really does carry an ancient version of the virus. Considering that HBV has also been traced back to the 16th century in Asia, it’s likely that Europeans were suffering from it around the same time.

[h/t IFLScience]

Illustration by Eric S. Carlson in collaboration with Ben A. Potter
11,500-Year-Old Skeleton Reveals an Unknown Group of Ancient Migrants to the Americas

In 2013, deep in the forest of central Alaska's remote Tanana River Valley, archaeologists unearthed the remains of a 6-week-old baby at a Late Pleistocene archaeological site. The tiny bones yielded big surprises for researchers, who announced this week that the child's genome—the oldest complete genetic profile of a New World human—reveals the existence of a human lineage that was previously unknown to scientists. Related to yet genetically distinct from modern Native Americans, the infant offers fresh insights into how the Americas were first peopled, National Geographic reports.

Published in the journal Nature on January 3, the study analyzed the DNA of the infant, whom the local Indigenous community named Xach'itee'aanenh T'eede Gaay ("sunrise girl-child" in the local Athabascan language). Then, researchers used genetic analysis and demographic modeling to identify connections between different groups of ancient Americans. This allowed them to figure out where this newly identified population—named Ancient Beringians—fit on the timeline.

Members of the archaeology field team watch as University of Alaska Fairbanks professors Ben Potter and Josh Reuther excavate at the Upward Sun River site in central Alaska.
UAF photo courtesy of Ben Potter

The study suggests that a single founding group of Native Americans separated from East Asians some 35,000 years ago. About 15,000 years later, this group split into two distinct sub-groups: the Ancient Beringians and the ancestors of all other Native Americans. The division could have occurred either before or after humans crossed the Bering land bridge around 15,700 years ago.

After arriving in the New World, Ancient Beringians likely remained north, while the other population spread out across the continent. Eventually, the Ancient Beringians either melded with or were replaced by the Athabascan peoples of interior Alaska. 

The study provides "the first direct evidence of the initial founding Native American population, which sheds new light on how these early populations were migrating and settling throughout North America," said Ben Potter, the University of Alaska Fairbanks archaeologist who discovered the remains, in a news release. Potter was a lead author of the study, along with Eske Willerslev and other researchers at the Center for GeoGenetics at the University of Copenhagen's Natural History Museum of Denmark.

[h/t National Geographic]
