Report Finds Microsoft Excel Causes Errors in 20 Percent of Genomics Studies

Microsoft Excel, that ubiquitous tool for data crunching, has been playing an unexpected role in the scientific world. The program has been screwing with data in genomics studies. A new report in the journal Genome Biology estimates that around 20 percent of scientific papers published in leading genome-focused journals that include gene lists from Excel contain errors due to the program’s default autocorrect settings, Slate reports.

The problem is, several genes have symbols that look a lot like dates. The program has a tendency to convert gene symbols like SEPT2 (Septin 2) and MARCH1 (Membrane Associated Ring-CH-Type Finger) into what Excel thinks is proper date form, turning them into 2-Sept and 1-Mar instead. In some cases, SEPT2 became “2006/09/02.”
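
For readers who want to check their own gene lists, here is a minimal Python sketch that flags entries matching the date-like forms mentioned above (such as 2-Sept, 1-Mar, and 2006/09/02). The patterns and the function name `find_date_like_entries` are illustrative assumptions, not the method used in the Genome Biology report, and they will not catch every format Excel can produce.

```python
import re

# Hypothetical patterns covering only the date forms quoted in the article.
DATE_PATTERNS = [
    # Day-month forms such as "2-Sept" or "1-Mar"
    re.compile(
        r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sept|Sep|Oct|Nov|Dec)$",
        re.IGNORECASE,
    ),
    # Year/month/day forms such as "2006/09/02"
    re.compile(r"^\d{4}/\d{1,2}/\d{1,2}$"),
]


def find_date_like_entries(gene_column):
    """Return the entries in a column of gene symbols that look like dates."""
    return [
        value
        for value in gene_column
        if any(pattern.match(value.strip()) for pattern in DATE_PATTERNS)
    ]


# Example column: real symbols mixed with the converted forms from the article.
genes = ["TP53", "2-Sept", "BRCA1", "1-Mar", "2006/09/02"]
flagged = find_date_like_entries(genes)
print(flagged)
```

Anything the function returns is worth a manual look, since a flagged entry may once have been a symbol like SEPT2 or MARCH1.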

"Inadvertent gene symbol conversion is problematic because these supplementary files are an important resource in the genomics community that are frequently reused," the paper’s authors write. They reviewed the supplementary gene list Excel files from 18 journals, examining studies published between 2005 and 2015—Excel’s gene-typo issue was first reported in 2004—for date formatting within lists of genes. The analysis was performed by a program that flagged supplementary materials that seemed to be lists of genes, then searched them for date formatting. Out of more than 35,000 supplementary files, they confirmed 987 files with gene errors that were published as part of 704 studies.

Overall, 19.6 percent of papers in the 18 journals contained gene name errors caused by Excel’s autocorrect function, but some journals were worse than others. High-impact journals, typically the most respected outlets to publish research in, actually had more affected gene lists, which the researchers speculate may be because studies published in these journals are more likely to have larger and more numerous data sets.

The highest proportion of gene lists with errors (more than 20 percent) came from the journals Nucleic Acids Research, Genome Biology, Nature Genetics, Genome Research, Genes and Development, and Nature; conversely, the journals Molecular Biology and Evolution, Bioinformatics, DNA Research, and Genome Biology and Evolution showed errors in less than 10 percent of genomics papers.

While this isn’t the worst scientific error to end up in a journal, since it’s pretty clear that 2006/09/02 isn’t a gene symbol, it’s also fairly disturbing that this many papers could make it through the editing process without anyone noticing that they contained lists of nonexistent genes.

The researchers highlight Google Sheets as a potential alternative for Excel, because it doesn’t suffer from the same symbol-date mixup, and it seems that when you open Sheets documents in other programs like Excel, the data is protected from Excel’s default autocorrection. They suggest that journal editors and reviewers should look out for these errors, pasting gene name lists into blank files and sorting them so that any dates that have been mistakenly inserted will become apparent.

[h/t Slate]


Scientists Study the Starling Invasion Unleashed on America by a Shakespeare Fan

On a warm spring day, the lawn outside the American Museum of Natural History in Manhattan gleams with European starlings. Their iridescent feathers reflect shades of green and indigo—colors that fade to dowdy brown in both sexes after the breeding season. Over the past year, high school students from different parts of the city came to this patch of grass for inspiration. "There are two trees at the corner I always tell them to look at," Julia Zichello, senior manager at the Sackler Educational Lab at the AMNH, recalls to Mental Floss. "There are holes in the trees where the starlings live, so I was always telling them to keep an eye out."

Zichello is one of several scientists leading the museum's Science Research Mentoring Program, or SRMP. After completing a year of after-school science classes at the AMNH, New York City high school students can apply to join ongoing research projects being conducted at the institution. In a recent session, Zichello collaborated with four upperclassmen from local schools to continue her work on the genetic diversity of starlings.

Before researching birds, Zichello earned her Ph.D. in primate genetics and evolution. The two subjects are more alike than they seem: Like humans, starlings in North America can be traced back to a small parent population that exploded in a relatively short amount of time. From a starting population of just 100 birds in New York City, starlings have grown into a 200-million-strong flock found across North America.

Dr. Julia Zichello
©AMNH

The story of New York City's starlings began in March 1890. Central Park was just a few decades old, and the city was looking for ways to beautify it. Pharmaceutical manufacturer Eugene Schieffelin came up with the idea of filling the park with every bird mentioned in the works of William Shakespeare. This was long before naturalists coined the phrase "invasive species" to describe the plants and animals introduced to foreign ecosystems (usually by humans) where their presence often had disastrous consequences. Non-native species were viewed as a natural resource that could boost the aesthetic and cultural value of whatever new place they called home. There was even an entire organization called the American Acclimatization Society that was dedicated to shipping European flora and fauna to the New World. Schieffelin was an active member.

He chose the starling as the first bird to release in the city. It's easy to miss its literary appearance: The Bard referenced it exactly once in all his writings. In the first act of Henry IV: Part One, the King forbids his knight Hotspur from mentioning the name of Hotspur's imprisoned brother Mortimer to him. The knight schemes his way around this, saying, "I'll have a starling shall be taught to speak nothing but 'Mortimer,' and give it him to keep his anger still in motion."

Nearly three centuries after those words were first published, Schieffelin lugged 60 imported starlings to Central Park and freed them from their cages. The following year, he let loose a second batch of 40 birds to support the fledgling population.

It wasn't immediately clear if the species would adapt to its new environment. Not every bird transplanted from Europe did: The skylark, the song thrush, and the bullfinch had all been subjects of American integration efforts that failed to take off. The Acclimatization Society had even attempted to foster a starling population in the States 15 years prior to Schieffelin's project with no luck.

Then, shortly after the second flock was released, the first sign of hope appeared. A nesting pair was spotted, not in the park the birds were meant to occupy, but across the street in the eaves of the American Museum of Natural History.

Schieffelin never got around to introducing more of Shakespeare's birds to Central Park, but the sole species in his experiment thrived. His legacy has since spread beyond Manhattan and into every corner of the continent.

The 200 million descendants of those first 100 starlings are what Zichello and her students made the focus of their research. Over the 2016-2017 school year, the group met for two hours twice a week at the same museum where that first nest was discovered. A quick stroll around the building reveals that many of Schieffelin's birds didn't travel far. But those that ventured off the island eventually spawned populations as far north as Alaska and as far south as Mexico. By sampling genetic data from starlings collected around the United States, the researchers hoped to identify how birds from various regions differed from their parent population in New York, if they differed at all.

Four student researchers at the American Museum of Natural History
Valerie Tam, KaiXin Chen, Angela Lobel and Jade Thompson (pictured left to right)
(©AMNH/R. Mickens)

There are two main reasons that North American starlings are appealing study subjects. The first has to do with the founder effect. This occurs when a small group of individual specimens breaks off from the greater population, resulting in a loss of genetic diversity. Because the group of imported American starlings ballooned to such great numbers in a short amount of time, it would make sense for the genetic variation to remain low. That's what Zichello's team set out to investigate. "In my mind, it feels like a little accidental evolutionary experiment," she says.
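
The founder effect can be illustrated with a toy simulation. The Python sketch below is a deliberately simplified model and not the analysis Zichello's team performed: the source population size, the number of gene variants, and the single-step "growth" are all made-up assumptions. Only the figure of 100 founders comes from the story above.

```python
import random


def count_variants(population):
    """Number of distinct gene variants present in a population."""
    return len(set(population))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical source population: 10,000 birds, each carrying one of 500 variants.
source = [rng.randrange(500) for _ in range(10_000)]

# A small founding group breaks off (100 birds, as in the New York releases).
founders = rng.sample(source, 100)

# Toy expansion: every descendant inherits its variant from a random founder.
descendants = [rng.choice(founders) for _ in range(10_000)]

print("variants in source:     ", count_variants(source))
print("variants in founders:   ", count_variants(founders))
print("variants in descendants:", count_variants(descendants))
```

However large the descendant flock grows in this model, it can never carry more variants than the 100 founders did, which is the loss of diversity the paragraph above describes.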

The second reason is their impact as an invasive species. Like many animals thrown into environments where they don't belong, starlings have become a nuisance. They compete with native birds for resources, tear through farmers' crops, and spread disease through droppings. What's most concerning is the threat they pose to aircraft. In 1960, a plane flying from Boston sucked a thick flock of starlings called a murmuration into three of its four engines. The resulting crash killed 62 people and remains the deadliest bird-related plane accident to date.

Today airports cull starlings on the premises to avoid similar tragedies. Most of the birds are disposed of, but some specimens are sent to institutions like AMNH. Whenever a delivery of dead birds arrived, it was the students' responsibility to prep them for DNA analysis. "Some of them were injured, and some of their skulls were damaged," Valerie Tam, a senior at NEST+m High School in Manhattan, tells Mental Floss. "Some were shot, so we had to sew their insides back in."

Before enrolling in SRMP, most of the students' experiences with science were limited to their high school classrooms. At the museum they had the chance to see the subject's dirty side. "It's really different from what I learned from textbooks. Usually books only show you the theory and the conclusion, but this project made me experience going through the process," says Kai Chen, also a senior at NEST+m.

After analyzing data from specimens in the lab, an online database, and the research of previous SRMP students, the group's hypothesis was proven correct: Starlings in North America do lack the genetic diversity of their European cousins. With so little time to adapt to their new surroundings, the variation between two starlings living on opposite coasts could be less than that between the two birds that shared a nest at the Natural History Museum 130 years ago.

Students label samples in the lab.
Valerie Tam, Jade Thompson, KaiXin Chen and Angela Lobel (pictured left to right) label samples with Dr. Julia Zichello.
©AMNH/C. Chesek

Seeing how one species responds to bottlenecking and rapid expansion can provide important insight into species facing similar conditions. "There are other populations that are the same way, so I think this data can help [scientists]," Art and Design High School senior Jade Thompson says. But the students didn't need to think too broadly to understand why the animal was worth studying. "They do affect cities when they're searching for shelter," Academy of American Studies junior Angela Lobel says. "They can dig into buildings and damage them, so they're relevant to our actual homes as well."

The four students presented their findings at the museum's student research colloquium—an annual event where participants across SRMP are invited to share their work from the year. Following their graduation from the program, the four young women will either be returning to high school or attending college for the first time.

Zichello, meanwhile, will continue where she left off with a new batch of students in the fall. Next season she hopes to expand her scope by analyzing older specimens in the museum's collections and obtaining bird DNA samples from England, the country the New York City starlings came from. Though the direction of the research may shift, she wants the subject to remain the same. "I really want [students] to experience the whole organism, something that's living around them, not just DNA from a species in a far-away place," she says. "I want to give them the picture that evolution is happening all around us, even in urban environments that they may not expect."

Scientists Put a GIF Inside Living Bacteria

Researchers at Harvard University have figured out a way to embed moving images into the DNA of E. coli bacteria. The team described their process in the journal Nature.

It's a setup any spy would love: a code within a code. The paper authors see bacterial DNA as a form of information storage, almost like a computer's hard drive. As the science of gene editing technology advances, we're learning how to fit more—and more complex—information on the same equipment.

Enabling this advancement is a gene editing technique called CRISPR-Cas, which gives scientists access to certain immune-activating regions of bacterial DNA. Researchers have already used that access to engineer malaria-resistant mosquitoes and track down disease-causing pathogens. 

Other scientists have successfully inserted secret messages in E. coli's genetic blueprints. Some have even gotten the bacteria to hold pictures. But until now, none of those pictures have moved.

The Harvard team wanted to see how far CRISPR-Cas could get them. First, they had to select their images. And while some researchers may have taken this opportunity to immortalize a goofy cat GIF, the Harvard team wanted the content of the first-ever bacterial home movies to have significance.

Eadweard Muybridge was a 19th-century photographer whose work blurred the line between art and science. Muybridge pushed the camera technology of the time to its limits, using what was then high-speed imaging to capture incredible shots of people and other animals in motion. His photos showed us the potential of both cameras and our bodies.

And so the authors of the new paper thought it would be appropriate to make their first moving image a Muybridge—specifically, his groundbreaking image of a horse in full gallop. They converted the images to pixels, then converted those pixels to nucleotides, which are often called the building blocks of DNA. They popped those nucleotides into the bacteria's genetic code, then ran the DNA through a sequencer to see if the pixel information stayed in place. It did.
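
To give a rough feel for the pixels-to-nucleotides step, here is a minimal Python sketch. The scheme it uses, two bits per base with each 8-bit grayscale pixel becoming four bases, is an assumption chosen for simplicity; the article does not describe the Harvard team's actual encoding, and the function names are hypothetical.

```python
# Assumed mapping: each base stands for two bits (A=00, C=01, G=10, T=11).
BASES = "ACGT"


def pixels_to_nucleotides(pixels):
    """Encode 8-bit pixel values (0-255) as a string of four bases per pixel."""
    sequence = []
    for value in pixels:
        if not 0 <= value <= 255:
            raise ValueError(f"pixel value out of range: {value}")
        # Take the pixel two bits at a time, most significant bits first.
        for shift in (6, 4, 2, 0):
            sequence.append(BASES[(value >> shift) & 0b11])
    return "".join(sequence)


def nucleotides_to_pixels(sequence):
    """Decode a base string produced by pixels_to_nucleotides."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    pixels = []
    for start in range(0, len(sequence), 4):
        value = 0
        for base in sequence[start:start + 4]:
            value = (value << 2) | BASES.index(base)
        pixels.append(value)
    return pixels


frame = [0, 255, 27]  # three example pixel values
encoded = pixels_to_nucleotides(frame)
decoded = nucleotides_to_pixels(encoded)
print(encoded, decoded)
```

Decoding the sequence and getting the original pixels back is the software equivalent of the sequencing check described above: the information stayed in place.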

But lead author Seth Shipman says printing images is just the beginning. He envisions a world in which our cells work like microscopic cameras, recording the state and goings-on inside our bodies.

"What we want this system to be used for, eventually, is not to encode information that we already have, but for a way for cells to go out and gather information that we don't have access to," Shipman told Popular Science. "If we could have them collect data and then store that data in their genomes, then we might have access to completely new types of information."

If that concept sounds kind of creepy to you, we have some good news: It's still a long way off.

[h/t Popular Science]
