Amgen Improves Shipping Logistics with MarkLogic
Posted by Fiona Ehret-Kayser on 10 July 2019 08:00 AM
When biopharma company Amgen transports drugs around the world, it needs to know that the product is getting to the right people as quickly as possible and in a safe condition. So when product is left out on the docks for too long and spoils, the company needs instant access to rich data to investigate and prevent future mishaps.
Amgen came to MarkLogic with more than 100 million records governing its drug development and distribution networks, many of them in formats such as PDFs that were unreadable by its database. Data points included carrier, packaging, temperature, multiple addresses and more than 4,000 individual routes. Each record covered part of the distribution puzzle, but none carried contextual data, which meant it was difficult to link them and understand how they fit together.
When a problem arose, the relevant business unit had to manually sift through disparate data sets to piece together what happened and why, costing Amgen product, money and time.
Diagnosing a Sick System
Amgen knew it had valuable data in documents, but this data also had huge semantic gaps, including different structures and inconsistent terminology and descriptions. And with differing IT systems, no master data oversight and no platform to connect data, Amgen teams were unable to iterate to improve shipping logistics.
Amgen Taxonomy Management Lead, Dr. Alice Augustine, said every piece of the Amgen pipeline created a data silo, simply by using different naming conventions for the same drug.
“A product name can go through several morphologies as it moves through the pipeline,” she told a group at this year’s MarkLogic World in May.
“Imagine at the beginning of the pipeline, you might have a molecule named (by a researcher) but as it moves through the pipeline, you might get a generic name. As it moves to commercialization, it gets a trade name and then if you’re using a device, it has a device name. So you have this terminology around a single concept, which is product (but even that) changes over time.”
The company’s data was difficult to understand for several reasons: there were many data sources, different words were used to describe the same thing, the same word was used to describe different things, and the data itself did not carry the provenance information needed to understand where it came from, or other important context.
According to Dr. Augustine, the Amgen team “could not ask a simple question like: ‘Give me a route between Thousand Oaks, California and the Netherlands.’ They couldn’t say: ‘Give me a route that used a shipper with temperature 8-10 degrees.’ They couldn’t ask very simple questions.”
Making Critical Data Connections
To standardize the data for input, the team used the semantic AI platform Semaphore to extract information including carrier, shipper, origin and destination. They built a taxonomy based on a consistent, simple vocabulary, with reference data arranged in domains. The new modular, linked taxonomy allowed the MarkLogic® engine to understand data points in context, clearing the way for the creation of a new data hub.
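To make the idea of a shared vocabulary concrete, here is a minimal sketch in plain Python. It is not Semaphore's or MarkLogic's API; the domains, concept IDs and labels are all invented for illustration. It simply shows how several names for the same thing can resolve to one concept within a domain.

```python
# Hypothetical taxonomy: every ID and label below is invented for illustration.
# Each domain maps a concept ID to the different labels used along the pipeline.
TAXONOMY = {
    "product": {
        "PROD-001": ["mol-1234", "examplumab", "Examplia", "Examplia autoinjector"],
    },
    "carrier": {
        "CARRIER-A": ["Carrier A", "Carrier A Inc.", "CARR-A"],
    },
}


def build_lookup(taxonomy):
    """Invert the taxonomy into a (domain, lowercase label) -> concept ID map."""
    lookup = {}
    for domain, concepts in taxonomy.items():
        for concept_id, labels in concepts.items():
            for label in labels:
                lookup[(domain, label.strip().lower())] = concept_id
    return lookup


def normalize(domain, raw_label, lookup):
    """Return the concept ID for a raw label, or None if the taxonomy lacks it."""
    return lookup.get((domain, raw_label.strip().lower()))


LOOKUP = build_lookup(TAXONOMY)
print(normalize("product", "  EXAMPLUMAB ", LOOKUP))    # research-stage or generic name
print(normalize("product", "Examplia", LOOKUP))         # trade name, same concept
print(normalize("carrier", "Unknown Freight", LOOKUP))  # not in the taxonomy
```

Keeping the labels grouped by domain mirrors the "reference data arranged in domains" described above: the same string can mean different things in different domains without colliding.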
Due to the multitude of data types, Amgen could not build a primary key. Instead, MarkLogic built a complex key to harmonize datasets with semantics. It also implemented a scoring system that gave greater visibility into problems in the company’s data, such as high temperatures that put product at risk or missing addresses that could result in shipment rebounds. The process incorporated 100 million records, representing 12 years of data, within eight weeks.
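The article does not describe how the scoring system works, so the following is only a rough sketch of the general idea: assign points to a shipment record for each data problem found. The field names, point weights and the 8-degree threshold are assumptions made for this example, not Amgen's or MarkLogic's actual rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Shipment:
    # Hypothetical record shape; the real records are not described in detail.
    carrier: Optional[str]
    origin: Optional[str]
    destination: Optional[str]
    temperatures: List[float] = field(default_factory=list)


def risk_score(shipment: Shipment, max_temp: float = 8.0) -> Tuple[int, List[str]]:
    """Return (score, reasons); a higher score means more problems.

    The weights and the max_temp default are illustrative assumptions.
    """
    score = 0
    reasons = []
    if not shipment.origin:
        score += 1
        reasons.append("missing origin address")
    if not shipment.destination:
        score += 2
        reasons.append("missing destination address")
    if any(t > max_temp for t in shipment.temperatures):
        score += 3
        reasons.append("temperature excursion")
    return score, reasons


example = Shipment("Carrier A", "Thousand Oaks, California", None, [5.0, 9.5])
print(risk_score(example))
```

Returning the reasons alongside the number matters as much as the score itself: it is what lets a business unit see which gap in the data produced the warning.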
Today, Amgen has access to two billion records and is able to ask ad hoc questions that return meaningful answers. With the ability to search across data by type, business units can now create dashboards and integrate them with data visualization tools to spot patterns and potential problems before they occur.
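As an illustration of the kind of ad hoc question Dr. Augustine mentioned, such as a route between Thousand Oaks and the Netherlands or a route using a shipper rated for 8-10 degrees, the sketch below filters a handful of made-up route records in plain Python. The record fields and route IDs are hypothetical, and this is not MarkLogic's query interface; it only shows that such questions become simple filters once records share a consistent structure.

```python
# Invented sample records; field names are assumptions for this example.
ROUTES = [
    {"route_id": "R1", "origin": "Thousand Oaks, California",
     "destination": "Netherlands", "shipper_temp_range": (2, 8)},
    {"route_id": "R2", "origin": "Thousand Oaks, California",
     "destination": "Netherlands", "shipper_temp_range": (8, 10)},
    {"route_id": "R3", "origin": "Thousand Oaks, California",
     "destination": "Ireland", "shipper_temp_range": (8, 10)},
]


def find_routes(routes, origin=None, destination=None, shipper_temp_range=None):
    """Return IDs of routes matching every criterion that is not None."""
    matches = []
    for route in routes:
        if origin is not None and route["origin"] != origin:
            continue
        if destination is not None and route["destination"] != destination:
            continue
        if (shipper_temp_range is not None
                and route["shipper_temp_range"] != shipper_temp_range):
            continue
        matches.append(route["route_id"])
    return matches


# "Give me a route between Thousand Oaks, California and the Netherlands."
print(find_routes(ROUTES, origin="Thousand Oaks, California",
                  destination="Netherlands"))
# "Give me a route that used a shipper with temperature 8-10 degrees."
print(find_routes(ROUTES, shipper_temp_range=(8, 10)))
```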
Dr. Augustine said that although the company had used data visualization before, data silos made it difficult to see a true picture of the company’s health.
“Tableau can really give you some pretty pictures, right? So everybody was looking at the reports and saying: ‘Wow, this is great, you know, it looks so good.’ But this was the first time you could actually see these scores and see that those Tableau reports were really not true,” she said.
“The picture was pretty, but the data underlying it was not, and this was the first time people had visibility into the data and the gaps in data: addresses missing, packages going to the wrong destinations or temperature excursions … so many things came out because data became visible.”
With a consistent vocabulary across the pipeline and the ability to search across all of its data, Amgen now has greater visibility into every aspect of its production and shipping cycle, helping ensure that medications get where they need to go quickly and safely.