Briefing Insights: Zeenea – The Adaptive Data Catalog
Timm Grosser, BARC’s Senior Analyst for Data Management, looks at Zeenea, the young and growing data catalog software provider.
Data cataloging is one of the hype topics of the moment and the market is awash with solutions from established providers and new players. The market is exciting and is beginning to take shape. The first promising providers of catalog offerings such as Datum, Podium Data, Semanta, Unifi Software and Waterline Data have already been bought up. But many newer specialists are also using the hype to successfully put forward their ideas of data cataloging. One of them is Zeenea.
Zeenea was founded in 2017 and currently employs around 40 people. The privately managed, Paris-based software vendor now has 50 customers and counting. Having initially focused on the European market, Zeenea continues to extend its reach. For example, a German subsidiary is next on the roadmap. The company is also starting to expand into the US market and its presence is building noticeably. Zeenea’s major customers include Renault, Société Générale, Thales and CBP Group. It targets all industry sectors. According to its own figures, the company is currently enjoying triple-digit growth.
Zeenea is primarily a software vendor that also offers consulting services for implementing its data catalog product. The goal is to create a smart data catalog for all data users. To achieve this, Zeenea focuses primarily on ease of use for business users, an adaptive metadata model and an aggressive pricing model. With this, Zeenea is competing directly with the big players in the industry such as Alation, Collibra, IBM, Informatica et al. and is attempting to replace existing installations. Zeenea’s idea is to grow along with the data requirements of the company and scale more flexibly in terms of content and price.
The use cases for the platform are diverse and range from data governance, data stewardship, regulation and PII detection to support for analytics. The young company has already demonstrated that its product can be used on a large scale. Renault, for example, uses the technology with 200 data stewards and over 1000 users worldwide.
The cloud product consists of a common repository and two user tools:
- Zeenea Studio is used for the integration, documentation and maintenance of metadata, for example by a data steward.
- The Zeenea Explorer is primarily for business analysts or data scientists to support them in their data search and discovery.
Both tools are principally designed for business users and are simple and intuitive to use.
Zeenea focuses exclusively on the storage and analysis of metadata. In its latest version, Zeenea has made a paradigm shift and now uses a knowledge graph for mapping metadata, rather than relational database technology. Above all, the graph helps to simplify the adaptation of the metadata model during operation. In graphs, structures (objects and relationship types) can be added or adapted more easily than in relational models. The graph offers two other significant advantages:
- The relationships and dependencies between individual objects are inherent in the graph model and can be used for dependency analyses and linear evaluations;
- The context of the metadata objects is described and machine-readable. This machine-readability opens up the potential to use ML.
Unfortunately, the extent to which Zeenea already uses these advantages could not be examined in more detail in my initial review.
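To make the argument for a graph-based metadata model more concrete, here is a minimal sketch in Python. It is not Zeenea's implementation or API; the class `MetadataGraph`, the object names and the relationship types `feeds` and `describes` are all hypothetical. It only illustrates the two points above: new object and relationship types can be added at runtime without a schema migration, and dependencies can be read directly from the graph by traversal.

```python
from collections import deque


class MetadataGraph:
    """Toy in-memory metadata graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.nodes = {}  # node id -> {"type": ..., plus free-form properties}
        self.edges = {}  # source id -> list of (relationship type, target id)

    def add_node(self, node_id, node_type, **props):
        # Any node type is accepted; there is no fixed schema to migrate.
        self.nodes[node_id] = {"type": node_type, **props}

    def add_edge(self, source, rel_type, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.setdefault(source, []).append((rel_type, target))

    def downstream(self, start, rel_types=None):
        """Breadth-first list of objects reachable from `start`.

        If `rel_types` is given, only edges of those types are followed.
        """
        seen, queue, order = {start}, deque([start]), []
        while queue:
            current = queue.popleft()
            for rel_type, target in self.edges.get(current, ()):
                if rel_types is not None and rel_type not in rel_types:
                    continue
                if target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


g = MetadataGraph()
g.add_node("crm.customers", "table")
g.add_node("etl.load_customers", "job")
g.add_node("dwh.dim_customer", "table")
g.add_node("bi.churn_dashboard", "report")
g.add_edge("crm.customers", "feeds", "etl.load_customers")
g.add_edge("etl.load_customers", "feeds", "dwh.dim_customer")
g.add_edge("dwh.dim_customer", "feeds", "bi.churn_dashboard")

# A new object type and a new relationship type, added during operation.
g.add_node("glossary.customer", "glossary_term")
g.add_edge("glossary.customer", "describes", "dwh.dim_customer")

# Dependency analysis: what is affected if the source table changes?
impacted = g.downstream("crm.customers", rel_types={"feeds"})
print(impacted)
```

In a relational model, the glossary term and the `describes` relationship would typically require new tables or columns; in the sketch they are just additional nodes and edges.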
To fill the repository, Zeenea offers all kinds of scanners for the import of metadata. This includes not only databases. BI tools such as Tableau and Qlik are supported, as well as ETL and data quality tools (e.g., from Informatica) and packaged applications including products from the SAP portfolio. For the integration of unstructured data or unknown data sources, an API and SDK are available. The tool also monitors the connected metadata sources and reacts to changes. This should ensure that the content is always up to date in the catalog.
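How the scanners detect changes was not part of the briefing. As a rough illustration of what "reacting to changes" can mean for a catalog, the following Python sketch compares two snapshots of a source's schema and lists the differences. The function `diff_schemas`, the snapshot format and the change labels are assumptions made for this example, not Zeenea functionality.

```python
def diff_schemas(previous, current):
    """Compare two {table: {column: type}} snapshots and list the changes.

    Returns a list of (change kind, table, column or None) tuples,
    ordered by table name and then by column name.
    """
    changes = []
    for table in sorted(previous.keys() | current.keys()):
        if table not in previous:
            changes.append(("table_added", table, None))
            continue
        if table not in current:
            changes.append(("table_removed", table, None))
            continue
        old_cols, new_cols = previous[table], current[table]
        for column in sorted(old_cols.keys() | new_cols.keys()):
            if column not in old_cols:
                changes.append(("column_added", table, column))
            elif column not in new_cols:
                changes.append(("column_removed", table, column))
            elif old_cols[column] != new_cols[column]:
                changes.append(("type_changed", table, column))
    return changes


# Hypothetical snapshots taken by two consecutive scans of the same source.
previous = {
    "customers": {"id": "int", "email": "varchar"},
    "orders": {"id": "int"},
}
current = {
    "customers": {"id": "bigint", "email": "varchar", "segment": "varchar"},
    "invoices": {"id": "int"},
}

changes = diff_schemas(previous, current)
for change in changes:
    print(change)
```

A catalog could use such a change list to flag outdated documentation or to notify the responsible data steward.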
[Figure: The Zeenea architecture]
Typically, the data steward curates the metadata in the Studio. The tool offers support for generating templates for metadata acquisition, designing information around metadata, and metadata generation (e.g., through data profiling) as well as supporting the data steward with functions and automations for metadata documentation and maintenance. The ML capabilities of the tool are set to be enhanced in upcoming versions, especially for the simplification of metadata maintenance. Support will be provided on a content/semantic level (e.g., through similarity analyses).
The core of any data catalog is its search functionality. Zeenea offers direct and context-related search options. Hit lists can be made more specific through extensive filter mechanisms. I’m particularly excited about the dynamic profiling, due for release soon, which will enable the customized presentation of catalog content.
In addition to searching, the Zeenea data catalog supports business glossary, data lineage and data profiling, thus focusing on the core functions of a data catalog. Users who also want a tool for policy management, workflows or data access will need to look to external providers. The open architecture of Zeenea offers the possibility of integration with third-party tools.
Zeenea is a fairly young product and has already demonstrated quite a lot considering it has just three years of market experience and 50 customers. The company is concentrating on the core of data cataloging, thereby setting itself apart from the competition, whose offerings are becoming broader and broader from my point of view. I find the tool clearly laid out. The essential core functions run stably, even in larger scenarios. In terms of look and feel, it is clearly aimed at business users. Detailed technical information about, for example, ETL processes was not given in the briefing. Metadata curation workflows (approval processes) were not covered either.
I do not feel that there is sufficient support for collaboration. Commenting is supported, but I would like to see chat or other collaboration functions, as well as crowd knowledge (e.g., ratings or reviews), to help users better evaluate data sets. Zeenea has recognized that the adoption of a catalog is based on trust. However, trust doesn’t come from the tool alone, but is based on the content that is generated and used by all users. In this sense, user opinion is an important part of a catalog.
The roadmap features many ML-based, innovative functions, such as improved similarity detection, pattern recognition and identifying relevant attribute values.
I think the idea and approaches have succeeded in making the data catalog grow adaptively with the company. I am of the opinion that we do not need another battleship catalog. Instead, it is important to take the company and users on a journey towards a data-centric enterprise. And this journey must be feasible. It requires consideration of the organization, the capabilities within the company, and a flexible, simple structure for catalog content. The idea is there and the first technological groundwork has been laid. I’m curious to see what Zeenea will make of it.