Supported Server Versions
Version               Original Release Date   Current Release   Windows Version   End of Life Date
MarkLogic Server 8.0  February 6, 2015        8.0-3.2           8.0-3.3           In Circulation
MarkLogic Server 7.0  November 14, 2013       7.0-5.4           7.0-5.5           In Circulation
MarkLogic Server 6.0  September 12, 2012      6.0-6             6.0-6.1           June 26, 2016
Latest Updates
It’s Cyber Monday Blues for Retail E-Commerce
Posted by Donald Soares on 30 November 2015 09:00 AM

E-commerce is set for significant growth this year. According to the Adobe Digital Index 2015 Holiday Shopping Predictions [1], 2015 U.S. holiday season sales (that is, sales during November and December) will grow to an all-time high of $83 billion, an 11% year-over-year increase.

Meanwhile, U.S. online sales on November 30 (Cyber Monday) are forecast to see a 12% year-over-year increase, climbing to a record $3 billion in one day. That is small compared with 11/11, or Singles’ Day [2], in China this year, when $14.3 billion in goods was sold through Alibaba’s online sales platform, but not bad!

Yet for most retailers, e-commerce continues to face significant challenges and missed opportunities, because the end game boils down to converting online visitors into buyers. According to Monetate [3], in Q4 2014 a mere 2.84% of online visitors to retail e-commerce websites actually bought anything. Analysis of traffic patterns also showed that less than 1% of shoppers on mobile devices actually made a purchase. Even when online shoppers added items to their carts, two out of three (roughly 66%) did not complete the transaction [4].

Worse, despite billion-dollar investments by Wal-Mart and Target in building up their retail e-commerce infrastructure, most U.S. consumers prefer Amazon as a shopping destination. A recent Reuters/Ipsos poll [5] found that 51% plan to do most of their online shopping at Amazon this holiday season, compared to 16% at Wal-Mart, 3% at Target and 2% at Macy’s. So Amazon is not just winning the war online; it is also managing to steal in-store sales from brick-and-mortar retailers. According to a November 18 New York Times article, “Amazon has built more than 100 warehouses from which to package and ship goods and it hasn’t really slowed its pace in establishing more.” [6] Not to mention the success of its Prime service, which has garnered millions of subscribers with its unlimited “free” shipping at $99 per year.

At MarkLogic, we’ve identified three major pain points that are holding back store-based and pure-play e-commerce retailers:

  • Handling product data complexity: The inability to provide consumers with the right selection they seek and dealing with the “Long-Tail” as efficiently as
  • Dealing with the nightmarish search experience most consumers face: Simply put, the average consumer can’t find what they are looking for online. Most retailers restrict the number of attributes or facets a consumer can search on to around 25, while consumers schooled on Google want to look for products a million different ways.
  • Adding context to data online: Consumers are looking for solutions, whether that’s finding a recipe for dinner or buying an entertainment system online. Most e-commerce sites offer little context and few sensible recommendations about product linkages (e.g., what cable or sound box goes with the Samsung HDTV I’m considering?), and rarely allow comparisons between products along the attributes consumers consider important.

Finally, it’s all about the conversion – and the data integration. My colleagues put together this infographic and slide show that illuminate the pains of winning digital consumers and the benefits of using Enterprise NoSQL to quickly connect online customers with the information and solutions they need.

We’re hosting a webinar that will cover the challenges of personalization on December 2 at 11:00 EST.  To sign up, click here. Don’t miss out on this opportunity to re-invent your business for true profits!


[1] Adobe Digital Index 2015 Holiday Shopping Predictions, October 28, 2015
[2] China’s Single Day Shopping, by Ana Swanson, Washington Post, November 18, 2015
[3] Monetate Q4 2014 Report
[4] Wal-Mart’s annual report, 2014
[5] Reuters/Ipsos poll: U.S. consumers favor Amazon for online holiday shopping, Reuters article by Nathan Layne
[6] New York Times article by Farhad Manjoo, November 18, 2015


A New Way to Master MDM
Posted by Christy Haragan on 25 November 2015 08:10 AM


Why can master data management (MDM) be so hard to achieve? Something I’ve learned about problem solving is this: if you find yourself spiraling into a pit of complexity, often the right approach is to go back to the start and look for another way.

For centuries, astronomers struggled to map the paths of the stars and the planets. The more they tried to fit their models to the actual data, the more complex these models became, involving wild calculations that would make even the most hardened mathematician cry. That was until Copernicus came along and asked: What if the Earth wasn’t the center of the universe, with everything revolving around it? What if, instead, the Earth revolved around the Sun? And just like that, these absurd and complex models were reduced to a beautiful simplicity.

Of course, astronomy is still a difficult subject; but by finding the right start, vast amounts of unnecessary complexity were removed from the problem.

Data Sources As Astral Bodies

MDM projects are essentially the same thing. Instead of astral bodies you have data sources. Instead of geocentrism (the model with the Earth at the center of the universe), you have relational technology; whole IT budgets have been wasted trying to make it fit this problem. The question, then, is: what is the corresponding heliocentric model (the Earth revolving around the Sun)?

First, a definition: MDM comprises the processes, governance, policies, standards and tools that consistently define and manage the critical data of an organization to provide a single point of reference.

For critical (master) data, two criteria are required:

  • A single version of the truth: If you have a customer Christy Haragan registered as living in Winchester in one system, and Christy Haragan registered as living in London in another, it is advantageous to know which is the correct address.
  • Clean data: If you have a street address residing in a city field, this is unclean data.

As you can imagine, those two criteria are aspirational, because organizations frequently have a myriad of overlapping data arising from both organic growth and immature data management.

To overcome this, organizations invest in huge squads of people who seek to consolidate, clean and de-duplicate their master data, and then manage this single version of the truth going forward. They may do this using a registry style (leave the data where it is, and instead maintain a registry of which data sits where) or a hub style (move the data into a single repository and manage it from there). In either case, because this is the most critical data to the business, a fully enterprise-grade solution is required, with ACID transactions, high availability (HA), and disaster recovery (DR) to ensure data is always consistent, never lost, and always available.

There are, however, a number of challenges associated with the pursuit of this managed single version of the truth, which often results in very large, multi-year, multi-million-dollar projects (in the best case) or outright failure (in the worst). Over two-thirds of all MDM projects fail!

Registry vs Hub

Registry approaches have the best chance of succeeding, but if the Christy Haragan entity is referenced (and perhaps duplicated) across multiple systems, the processes involved in maintaining a single version (conflict resolution, for example) can become expensive and error prone. This approach works for smaller projects with smaller, more localized data sets (e.g., spanning a single data center or perhaps geographic location) but, as you can imagine, doesn’t scale.

Registry Style

MDM Registry style: Data is left in the source systems and managed centrally.

The hub approach does scale, but requires a domain model (e.g., Customer, Product, Account). Due to the inherent intricacy of any of these domains, these models are extremely large and complex. The process of mapping the source data onto these domain models is extremely challenging, expensive, and error prone. Moreover, as data is changed and re-arranged (shape-shifting at its finest), the risk of breaking existing processes increases, adding further risk and expense to the project.

Hub Style

MDM Hub style: Data is mapped to a domain model and moved to the central hub to be managed.
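The difference between the two styles can be sketched in a few lines of Python. This is only an illustration under assumed names: the `crm` and `billing` systems, their field names, the `registry` structure and the `Customer` domain model are all hypothetical and do not come from any particular MDM product.

```python
from dataclasses import dataclass

# Two hypothetical source systems, each holding its own copy of a customer.
crm = {"c-17": {"full_name": "Christy Haragan", "town": "Winchester"}}
billing = {"b-42": {"name": "Christy Haragan", "city": "London"}}
sources = {"crm": crm, "billing": billing}

# Registry style: the data stays where it is; the registry only records
# which source system and key hold each copy of a master entity.
registry = {"customer-1": [("crm", "c-17"), ("billing", "b-42")]}


def resolve(master_id):
    """Fetch every source copy of one master entity via the registry."""
    return [sources[system][key] for system, key in registry[master_id]]


# Hub style: every source must first be mapped onto one domain model,
# and the mapped copies are then moved into the central hub.
@dataclass
class Customer:
    name: str
    city: str


def from_crm(rec):
    return Customer(name=rec["full_name"], city=rec["town"])


def from_billing(rec):
    return Customer(name=rec["name"], city=rec["city"])


hub = [from_crm(crm["c-17"]), from_billing(billing["b-42"])]
```

Note that the registry needs no knowledge of field names, whereas the hub needs one mapping function per source, which is where the modeling cost described above comes from.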

On top of those constraints, a challenge common to both approaches is the inherent problem of data cleansing and de-duplication. A successful data cleansing exercise might automate 80 percent of records, but that can still leave you with quite a bit! With a modest one million records to process (a low number in MDM terms), 200,000 records would still have to be cleansed manually. And in the case of multiple addresses, how would the system know which is correct? Maybe both are, maybe neither.
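The arithmetic behind that figure is easy to check. The sketch below is illustrative only; the function name `manual_cleansing_backlog` is made up for this example.

```python
def manual_cleansing_backlog(total_records: int, automated_percent: int) -> int:
    """Return how many records are left for manual cleansing."""
    if not 0 <= automated_percent <= 100:
        raise ValueError("automated_percent must be between 0 and 100")
    # Integer arithmetic keeps the result exact for whole-number percentages.
    return total_records * (100 - automated_percent) // 100


backlog = manual_cleansing_backlog(1_000_000, 80)
print(backlog)  # 200000
```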

Heliocentric Solution for MDM

There are, however, two approaches that can be taken to address these fundamental and challenging problems with a traditional MDM solution:

  • A schema-agnostic solution will ensure that the effort of consolidating data is reduced by orders of magnitude. Instead of having to develop, or buy, expensive, complex, and often insufficient domain models (which require expensive and brittle ETL processes to map to), a schema-agnostic approach allows the data to be brought together painlessly, ingesting it as-is.

    Heliocentric Solution for MDM

    Operational Data Hub: Data is loaded as-is to the central hub to be managed.

    And instead of the big-bang approach required by traditional MDM, which demands that all data be mapped before the system is useful, a schema-agnostic approach is more flexible and responsive. Iterative transformation of data after ingest allows businesses to focus on high-value tasks first, test each change for correctness, and respond to business changes quickly. For data in invalid fields, a schema-agnostic approach means leaving the data in its original form (reducing the risk of breaking existing processes). The entry can be enriched with metadata to indicate whether it’s a street address, zip code, etc.

  • Instead of trying to reduce the truth to a single version, semantics can be used to link data. Instead of having to decide which address is correct, a link can be made between the two entries. Only when a business process is executed (or about to be executed) and requires a valid address would the business need to take on the potentially expensive task of ascertaining the correct one.
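The two ideas above (load as-is and enrich later, and link rather than reconcile) can be illustrated with a minimal Python sketch. It is not MarkLogic's API: the `Envelope` class, the `ingest`, `annotate` and `link_same_entity` helpers, the `sameCustomerAs` predicate, the record identifiers and the sample `orders` record are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Envelope:
    """Wraps a source record unchanged and carries enrichment beside it."""
    source: str
    original: dict[str, Any]  # loaded as-is, never rewritten
    metadata: dict[str, str] = field(default_factory=dict)


def ingest(source: str, record: dict[str, Any]) -> Envelope:
    # No domain model or up-front mapping: keep whatever shape arrives.
    return Envelope(source=source, original=dict(record))


def annotate(doc: Envelope, field_name: str, meaning: str) -> None:
    # Enrich after ingest: note what a field really holds, leave the value alone.
    if field_name not in doc.original:
        raise KeyError(field_name)
    doc.metadata[field_name] = meaning


# Links are (subject, predicate, object) triples kept apart from the records.
Triple = tuple[str, str, str]


def link_same_entity(a: str, b: str) -> Triple:
    return (a, "sameCustomerAs", b)


crm = ingest("crm", {"name": "Christy Haragan", "city": "Winchester"})
billing = ingest("billing", {"name": "Christy Haragan", "city": "London"})
orders = ingest("orders", {"name": "Christy Haragan", "city": "1 Example Street"})

# The street address stays in the city field; metadata records what it is.
annotate(orders, "city", "street-address")

# Neither address is declared correct; the two entries are simply linked.
links = [link_same_entity("crm/1", "billing/7")]
```

Deciding between Winchester and London is deferred until a business process actually needs a valid address; until then both records remain intact and connected.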

Of course, most schema-agnostic approaches won’t provide many of the enterprise-grade features such as ACID transactions, HA and DR.

MarkLogic provides a schema-agnostic database platform that allows data to be stored in its original form and enriched as necessary. Its semantics capability allows flexible links to be created between entities. And it has those highly sought-after enterprise features, so the business doesn’t sacrifice anything in adopting this new approach to managing its most critical data.

MDM, like astronomy, is still a difficult topic. But by starting with the right approach, we can remove large amounts of unnecessary complexity.

MarkLogic is MDM’s heliocentric solution.


Copyright © 2015 MarkLogic Corporation. All Rights Reserved. MARKLOGIC® is a registered trademark of MarkLogic Corporation.