In part two of this series, we demonstrated that knowledge products derive their value from unique insights about a therapeutic target, disease or new molecular entity. Great information architecture is a fundamental building block that enables a company to engineer its insight production process and accelerate the creation of valuable knowledge products. In this post, we will describe how innovative IT infrastructure can accelerate turning unknowns into knowns by significantly increasing the amount of data and information that a company can process.
Why Aren’t We Getting Any Better?
Something isn’t right in life sciences R&D. Powerful new research methodologies created over the past thirty years enable scientists to interrogate pathophysiology and create therapeutic interventions like never before. Recombinant DNA, combinatorial chemistry, high-throughput screening, DNA sequencing, RNA sequencing, proteomics and countless other advances have produced an avalanche of new data and significant findings. One would think that corporate R&D departments would have voraciously consumed all that data, leading to a large increase in the number of pipeline programs and significantly more approved therapeutics. Part of this is true. Data is being consumed, and R&D pipelines have expanded at the industry level, but the rate of new therapeutic approvals has not materially changed.
It’s perplexing that clinical development success rates in life sciences have remained virtually unchanged over the past twenty years. A new therapeutic that begins Phase I clinical trials in 2017 has approximately the same 10% probability of gaining FDA approval as a new therapeutic that began clinical trials in 1997. We are still experiencing 9 failures out of every 10 tries. These are familiar statistics to industry veterans, but still daunting odds for any company that is in the business of discovering and developing new therapeutics. Why do we still fail so often?
Turning Unknowns Into Knowns
It’s possible that factors other than the amount of research data could be contributing to the persistently high failure rates. Perhaps the industry is struggling to finance R&D initiatives, new corporate formation has slowed leading to a dearth of innovative approaches, regulations have become more onerous, or we are pursuing ever more complex disease states that are inherently riskier. All of these factors could undermine successful clinical development efforts. The data, however, doesn’t support any of this as causative.
Excluding the large pharmaceutical companies, from 2000 to 2016 the life sciences industry more than doubled its R&D expenditures, expanded the universe of publicly traded companies by nearly 200 and added almost $500 billion in market capitalization, closing in on the $1 trillion level. Also, venture capital investments in the life sciences increased from approximately $4 billion per year in 2000 to over $10 billion per year in 2016. A quick analysis of therapeutic approvals in 2016 shows that, with a few exceptions, the industry is primarily pursuing the same diseases with similar therapeutic intervention strategies and clinical trial designs as it did during the early 2000s. Both standard and accelerated review times have trended lower at the FDA during this timeframe. Based on these statistics, corporate formation, R&D spending, and public market valuations are hovering near all-time highs while development complexity and regulatory morass have remained consistent. What is causing the flat-lined productivity?
Each failure is no doubt a surprise. In vitro, preclinical and even early clinical trial data always looks promising and supports the decision to advance compounds into the next and more expensive stage of testing. Then, seemingly out of nowhere, adverse safety events or a lack of efficacy emerges to kill the program. Unfortunately, an unknown has become a known only after a considerable amount of time and money was spent on the compound. Despite generating ever-increasing amounts of new data, the life sciences industry isn't improving its ability to convert unknowns into knowns before embarking on expensive development programs. It could be that critical data remains to be produced and companies are persistently placing bets with incomplete information. Alternatively, it could be that sufficient data exists, but companies cannot locate, consume and process the available information to turn unknowns into knowns.
Life Sciences Companies Are Like Enzymes
If we think about the R&D cycle like an enzymatic reaction, then a possible explanation emerges. Experimental data is the substrate, corporate R&D departments are the enzymes consuming the substrate, and the resulting products are FDA approved therapeutics. The concentrations of substrate (data) and enzyme (corporate R&D) will determine the velocity of a reaction. If the amount of enzyme is held constant and the amount of substrate is increased, then reaction velocity will increase. However, this is only true until the enzyme becomes saturated by substrate at which point the reaction velocity plateaus. Adding more substrate once enzyme saturation has occurred will produce no further effect on reaction velocity. Perhaps corporate R&D (enzyme) has been saturated by data (substrate) for quite some time, and no amount of additional data will increase the reaction velocity required to yield more products.
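The saturation behavior described above can be sketched with the standard Michaelis-Menten rate law, v = Vmax * [S] / (Km + [S]). This is only an illustration of the analogy: the function name `reaction_velocity` and the parameter values (`v_max = 1.0`, `k_m = 1.0`, and the substrate levels) are assumptions chosen for readability, not measurements of any real R&D organization.

```python
def reaction_velocity(substrate: float, v_max: float = 1.0, k_m: float = 1.0) -> float:
    """Michaelis-Menten rate for a fixed amount of enzyme.

    substrate: amount of substrate (here: available data), arbitrary units
    v_max:     plateau velocity reached when the enzyme is saturated (assumed)
    k_m:       substrate level at which velocity is half of v_max (assumed)
    """
    if substrate < 0:
        raise ValueError("substrate must be non-negative")
    return v_max * substrate / (k_m + substrate)


if __name__ == "__main__":
    # Each step multiplies the substrate (data) by ten while the enzyme stays fixed.
    for data in (0.1, 1.0, 10.0, 100.0, 1000.0):
        print(f"data={data:>7.1f}  velocity={reaction_velocity(data):.4f}")
```

At low substrate levels a tenfold increase in data raises velocity substantially, but once the enzyme is saturated the same tenfold increase barely moves it, which is the plateau the analogy relies on.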
In this analogy, corporate R&D departments represent a limited concentration of enzyme. The number of companies and researchers employed can’t grow fast enough to consume all of the available data. It’s the consumption of data by researchers that initiates the unique insight cycle that we wrote about in our last blog post. If consumption plateaus, then insight generation and new product approvals will plateau as well. Given the pool of corporate researchers is mostly fixed, and it requires a long time to educate and train new ones, the potential to significantly increase the concentration of ‘R&D enzyme’ looks bleak.
There is an alternative to increasing the concentration of enzyme, however, and that’s to increase the catalytic efficiency of the enzyme that’s present. In other words, if systems are engineered that enable corporate R&D departments to consume (not produce) larger amounts of data while holding resources constant, then the reaction velocity will increase, yielding more products. Getting more R&D completed and achieving milestones faster with the resources on hand today should be the goal of all senior leadership in life sciences. How can this be accomplished?
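In textbook enzyme kinetics, the difference between adding enzyme and improving the enzyme can be written out explicitly. The notation below is the standard Michaelis-Menten form, brought in here as an assumption to make the analogy concrete; mapping its symbols onto R&D quantities is illustrative rather than a measured model.

```latex
v = \frac{k_{\mathrm{cat}}\,[E]\,[S]}{K_m + [S]},
\qquad
\lim_{[S] \to \infty} v = V_{\max} = k_{\mathrm{cat}}\,[E]
```

Here [S] stands for available data, [E] for the pool of researchers, and k_cat for how much data each researcher can turn over per unit time. When [S] is already saturating and [E] is effectively fixed, the plateau can only rise if k_cat rises, which is the role the analogy assigns to better systems for consuming data.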
Where Isn’t Innovation Happening?
As discussed, innovative research methodologies have fundamentally changed the type, amount, and quality of experimental data. Likewise, innovative therapeutics that attack disease in fundamentally new ways have entered the clinic and in some cases are now FDA approved products treating patients. Antisense, RNAi, and gene-modified cell therapies all represent entirely new classes of prescription therapeutics that have emerged over the past twenty years. Innovation in life sciences R&D is thriving, and that’s what makes this industry special, but innovation in life sciences IT has lagged badly. Going back to our enzyme analogy, the lack of innovation in life sciences IT is preventing the industry from achieving better catalytic efficiency in R&D.
The two major trends in life sciences IT over the past ten years have been transitioning from paper to digital and migrating business processes to the cloud. While developments such as electronic lab notebooks and cloud-based services have improved access, archiving, the total cost of ownership and collaboration, they haven't fundamentally changed how a researcher works. A spreadsheet that used to live only on a researcher’s laptop now lives in a Box or Dropbox folder, but it’s still the same old spreadsheet used in the same way. Likewise, moving a genomics pipeline from on-premises to the cloud, or being able to access a third-party cloud-based genomics pipeline, certainly lowers cost of ownership and increases access. But simply living in the cloud doesn’t make that pipeline more likely to help researchers derive insights from the data. A distinction needs to be drawn between software and IT, as well as between data production and data consumption. A lot of innovative bioinformatics and computational biology software has been developed, but where is the IT innovation that will amplify the benefits these applications could bring to a life sciences company? Where is the consumption platform that’s built to match the volume being pumped out by the various data production platforms?
Configurable Resources and Composable Workflows
To be successful, development-stage life sciences companies must become great at solving complex scientific problems. Researchers solve problems (uncertainties) by generating one hard-earned insight after another, chipping away until answers are known. As we discussed in our last post, great information architecture is required to identify and procure all of the resources, both internal and external, that researchers need to perform their jobs effectively. But unless these resources are comprehensive, findable, usable, interoperable, easily exposed to analytic services and truly networked across an R&D department, they won’t materially change how researchers work.
Life sciences IT must evolve beyond moving point solutions and data-generating pipelines to the cloud. A new type of platform is needed that re-engineers the workflows researchers use to create insights and produce corporate knowledge. Otherwise, the enzyme remains saturated and all the additional substrate will not affect the rate of new product approvals. Building this new type of platform requires an in-depth understanding of how corporations conduct R&D and new thinking about how to architect technology solutions. IT solutions that manage the production and storage of data, documents and other corporate content have emerged, but IT solutions that facilitate turning those assets into knowledge are absent.
Effective life sciences IT solutions can’t be a one-size-fits-all proposition. Not all companies pursue the same science or operate at the same stage of development. A company’s business and research objectives should define its information architecture, and the IT solutions it implements should specifically address those requirements. In other words, the infrastructure and technology stack must enable resources (digital research objects) to be configurable and scientific workflows to be composable. Companies and researchers shouldn’t be forced to use tools that impose generic “best practices” onto their workflows. It should be the other way around, with companies determining what resources they need, how their teams work and their own best practices. Corporate IT needs to lead the way in delivering a platform to the business that amplifies consumption of resources and increases the pace of knowledge production.
IT Architecture and Infrastructure
Data, information, and analytics required to produce unique insights come from a nearly endless array of sources. Publications, patents, regulatory documents, press releases, news feeds, open source data sets, proprietary datasets, open source analytics, proprietary analytics, etc. are scattered across countless unlinked repositories both inside and outside a company. These extremely valuable resources have been produced at high cost, but are underutilized by most companies because they are hard to find, nearly impossible to use at scale and often reside in repositories or web services that have questionable security and privacy policies.
A purpose-built life sciences platform designed to accelerate the consumption of data and information should possess the following attributes:
- Private, secure and dedicated environment
- Ingestion engine that handles all types of unstructured data
- Data normalization pipelines
- Containerization to deploy a broad array of analytic applications
- Data upload portals to integrate internal and external resources
- Infrastructure that enables configurable resources
- Infrastructure that enables composable workflows
- Infrastructure that facilitates internal and external collaboration
- Networked communications capability
- Granular content controls that govern sharing, permissions and archiving
- Enterprise search enabled with domain specific language
- Notification systems
- Data architectures that support machine learning and artificial intelligence applications
- Ongoing customization
- Access from any device anywhere
IT infrastructure that delivers these capabilities in a secure and scalable manner will materially transform how a life sciences company consumes resources. IT can and should be a source of competitive differentiation. Companies that think strategically and implement innovative IT platforms will enable researchers to rapidly find what they need, combine previously disconnected data sets at will, expose their data to analytic services on the fly and compose workflows that fit the task at hand rather than trying to retrofit a generic solution. Data and workflows can be saved, reused, combined and shared without the need for an informatics specialist. This capability creates a true corporate network effect because all data and capabilities are discoverable and consumable by all researchers on the network who in turn create and contribute new content back to the platform through their work-related activities. The platform gets richer and more powerful as more researchers use it over time.
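As a rough illustration of what "composable workflows" could mean in practice, the sketch below treats each processing step as a small function and builds workflows by chaining steps in whatever order a task needs. Everything here is hypothetical: the record layout, the step names (`normalize_gene_symbols`, `keep_source`, `deduplicate`) and the `compose` helper are invented for the example and do not describe any particular platform.

```python
from functools import reduce
from typing import Callable

# A record is a plain dict; a step maps a list of records to a new list of records.
Step = Callable[[list[dict]], list[dict]]


def compose(*steps: Step) -> Step:
    """Chain steps left to right into a single reusable workflow."""
    def workflow(records: list[dict]) -> list[dict]:
        return reduce(lambda data, step: step(data), steps, records)
    return workflow


def normalize_gene_symbols(records: list[dict]) -> list[dict]:
    """Trim whitespace and upper-case the (hypothetical) 'gene' field."""
    return [{**r, "gene": r["gene"].strip().upper()} for r in records]


def keep_source(source: str) -> Step:
    """Build a step that keeps only records from one (hypothetical) source."""
    def step(records: list[dict]) -> list[dict]:
        return [r for r in records if r["source"] == source]
    return step


def deduplicate(records: list[dict]) -> list[dict]:
    """Drop repeated (gene, source) pairs, keeping the first occurrence."""
    seen: set[tuple[str, str]] = set()
    unique = []
    for r in records:
        key = (r["gene"], r["source"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Made-up example records combining an internal and a public source.
records = [
    {"gene": " tp53 ", "source": "internal"},
    {"gene": "TP53", "source": "internal"},
    {"gene": "brca1", "source": "public"},
]

# Two workflows assembled from the same steps for two different tasks.
internal_genes = compose(normalize_gene_symbols, keep_source("internal"), deduplicate)
all_genes = compose(normalize_gene_symbols, deduplicate)

if __name__ == "__main__":
    print(internal_genes(records))
    print(all_genes(records))
```

The point of the design is that steps are saved once and recombined freely, so a researcher can assemble a workflow that fits the task at hand instead of adapting the task to a fixed pipeline.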
Cloud services platforms like AWS provide secure and scalable compute power, databases, data storage, back-up/archiving and disaster recovery capabilities. When these are combined with a microservices-based architecture, container image systems and infrastructure orchestration services, a new type of life sciences IT platform can be envisioned. A team composed of cloud system architects and back-end, front-end, database and DevOps engineers can design, deploy and scale the required applications, data stores, and infrastructure to meet a company’s current and future needs. AI engineers specializing in natural language processing, machine learning, and data science can build applications that enable data ingestion, enterprise search, advanced analytics, configurable architectures, composable workflows, collaboration and ongoing customization that fit the specific needs of a particular life sciences company. It’s time to innovate in life sciences IT and build an integrated consumption platform that can keep pace with the output of production platforms that are overwhelming our ability to use and make sense of all that data.
Tying it All Together
So, how does all of this lead to better R&D and increased rates of development success? Companies that implement powerful resource consumption platforms will enable their teams to reason over much larger information spaces than previously possible. Giving your researchers better processing tools will decrease the cost and accelerate the pace at which they derive unique insights and turn unknowns into knowns. More knowns will lead to better decision-making, particularly in early-stage development programs. By getting to Go/No Go decisions more rapidly and with more confidence, senior decision makers can marshal capital more efficiently by concentrating bets on programs more likely to win. Importantly, decision makers will also be able to identify and discontinue lower probability bets before costly and time-consuming programs are initiated. Over time, a well-designed resource consumption platform should enable experts in a particular field to more rapidly generate unique insights and create valuable knowledge products that can be monetized to fund business objectives. Stacking pipelines with therapeutic candidates that start the development cycle with fewer unknowns will ultimately lead to higher rates of clinical success. To achieve this goal, innovation in life sciences IT must catch up to the innovation occurring in life sciences R&D.