OmicsWeb: start-up co-founded by MIT professor launches bioinformatics copilot

“Research assistant in a box” will be available to the public, and free for academics and nonprofit organizations.

OmicsWeb Co-pilot is available on Biostate AI's website
Photo Courtesy of Orion Li, Biostate AI Scientist

The journey of turning data into a scientific story is one of the quintessential processes of discovery and research. OmicsWeb, a bioinformatics copilot launched on July 8 by a start-up headed by former MIT and Rice University faculty, might serve as the guide for many journeys with RNA sequencing data. OmicsWeb, like ChatGPT, is a conversational virtual assistant. It takes in genomic data, namely RNA sequencing data, and returns an analysis in response to prompts as simple as, “What is this data about?” In turn, the user gains information about the characteristics and significance of genes present in the dataset and obtains visual representations of the data, such as volcano plots, all without having to look at the dataset directly.
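For readers unfamiliar with volcano plots, the following is a minimal sketch of the numbers such a plot displays, not a description of how OmicsWeb itself works. A volcano plot places each gene at its log2 fold change between two conditions (x-axis) and its negative log10 p-value (y-axis). The function name `volcano_points`, the cutoffs, and the example genes and values are all hypothetical, and the p-values are assumed to come from a separate statistical test.

```python
import math


def volcano_points(rows, lfc_cutoff=1.0, p_cutoff=0.05):
    """Compute volcano-plot coordinates for each gene.

    rows: iterable of (gene, mean_control, mean_treated, p_value).
    The means must be positive and the p-values must be in (0, 1].
    The cutoffs are illustrative defaults, not a standard.
    """
    points = []
    for gene, control, treated, p_value in rows:
        # x-axis: how strongly expression changed between conditions
        log2_fc = math.log2(treated / control)
        # y-axis: smaller p-values plot higher
        neg_log10_p = -math.log10(p_value)
        if p_value < p_cutoff and log2_fc >= lfc_cutoff:
            label = "up"
        elif p_value < p_cutoff and log2_fc <= -lfc_cutoff:
            label = "down"
        else:
            label = "not significant"
        points.append((gene, log2_fc, neg_log10_p, label))
    return points


# Made-up example values, for illustration only.
example = [
    ("GeneA", 10.0, 40.0, 0.001),  # fourfold increase, small p-value
    ("GeneB", 32.0, 8.0, 0.01),    # fourfold decrease, small p-value
    ("GeneC", 20.0, 22.0, 0.6),    # little change, large p-value
]
points = volcano_points(example)
for gene, x, y, label in points:
    print(f"{gene}: x={x:.2f}, y={y:.2f}, {label}")
```

Genes with large changes and small p-values land in the upper left and upper right corners of the plot, which is what gives it its erupting shape.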

OmicsWeb is trained on 2% of all the rat RNA sequencing data ever generated – data that Biostate AI is making open access for worldwide use. This corresponds to several terabytes of bulk RNAseq data, or about 3,400 samples. As of its launch, OmicsWeb is free for academics and nonprofit organizations.

The start-up behind OmicsWeb, Biostate AI, was co-founded in June 2023 by Ashwin Gopinath, former MIT Assistant Professor of Mechanical Engineering, and David Zhang, former Rice University Assistant Professor of Bioengineering. Gopinath, who described himself as “stumbling around from different topics” throughout his career, previously founded a proteomics company, Palamedrix, whose goal was to scale single-molecule protein quantification. His research in large language models (LLMs) centered on machine-based introspection: how to take the output of an LLM, have it reflect on its own thinking, and then refine its output. Zhang, who previously founded two startups in cancer diagnostics and PCR instruments, saw an opportunity when ChatGPT rose into the public eye near the end of 2022. “This is now the right time to build a biology AI,” he said.

Thus, the two Caltech labmates became co-founders of Biostate AI. The start-up comprises around 15 full-time employees, including Orion Li ’24. 

Li explains that OmicsWeb is not Biostate AI’s main product but rather a serendipitous result of their broader goal: to build an AI personally tailored to one’s health needs. “Your personal AI will know how your health will evolve over time,” Zhang said. Is your body on track to get a cold? Will your head hurt tomorrow because of that all-nighter? Are you going to have a heart attack next week? These cases can be particularly important, Zhang says, because “there's a golden 60-90 minute window in which you can rescue someone from death. But if you miss that period, if you happen to be on an airplane, then you might die.”

The term “biostate” refers to the body’s biological state, which is informed by the DNA, RNA, proteins, and other biological components within us. Traditionally, this data is collected just once for analysis, providing a snapshot of information at a given point. However, in order to predict a person’s health trajectory, Biostate AI wants to insert the dimension of time.

“There, the temporal data becomes more important,” Gopinath stated. If the AI predicts an adverse reaction based on the biological data, then the course of action can be stopped before harming the patient.

These types of temporal studies are not completely new to the research scene. What’s new, Gopinath stated, is adding the dimension of time to an already complex analysis of multiple biological components, including DNA, RNA, and proteins. This increases the number of assays required, which is where AI minimizes the burden.

As with many AI pursuits, refining these predictions requires vast amounts of data and time for analysis. Their solution was to offer low-cost data collection services and to create OmicsWeb. Li stated that Biostate AI offers to conduct experiments for research teams at a lower cost than many competitors. “It's usually a factor of three lower than everybody else,” Zhang said. In exchange, Biostate AI uses the generated data to train its AI models. OmicsWeb then cuts down on analysis time. Rather than waiting for a bioinformatician to handle the analysis, biologists can draw conclusions from the data without writing a single line of code.

It’s a “research assistant in a box,” Gopinath said. “It became very clear that we are not the only ones who will make use of this. So we decided, ‘let's open this up to the world.’”

Li says that researchers across the field will benefit from OmicsWeb, including seasoned researchers looking to accelerate their analysis, newcomers unfamiliar with bioinformatic analyses, and biologists with limited programming experience. “This will be a tool for them to basically flatten out the skill difference,” Li said. “We do want people to get into this research area. Better, faster, and more fluent.”

At MIT, Li majored in Course 6-3 but became interested in biology after taking the 7.01X GIR. Though biology is a chaotic system, “there are so many relationships that we can discover and eventually forge,” he stated. He was a sophomore when ChatGPT came out and was studying language processing in school. “I didn't know that AI [in genomics] was a thing before I came across Biostate AI. It introduced me to this entire field, and I'm grateful for that.” The opportunity was enough to convince him to graduate a year early from MIT to pursue AI research at Biostate AI full-time. Having a role with more independence and opportunities for diverse tasks was exciting for Li.

Zhang supports this mindset while leading the start-up and states, “The benefit of autonomy is that people really learn to grow.” He adds, “The opposite of a good idea is another good idea, and the opposite of a bad idea is common sense.”

He recognizes many students’ concerns that AI will become a saturated field but argues that “when you have a lot of massive changes, there's the most opportunity. With the advent of AI becoming mature, there's going to be a lot of things becoming available now that were never available before.”

“Now, it's the time to take advantage of these opportunities.”