Briefing Insights: Dremio – The “Data Lake Engine”
Reflections on BARC’s recent briefing with Dremio by Timm Grosser, BARC’s Senior Analyst for Data Management
What is a data lake engine?
In short, a data lake engine should help users find data in their (cloud) data lake quickly and easily, and analyze it with high query performance. Technically speaking, it is an SQL-based query engine with a semantic layer that enables queries across different data storage systems (on-premises or cloud-based). It acts as a central access point for JDBC/ODBC-compatible user tools.
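To make the idea of a single SQL access point with a semantic layer more concrete, here is a minimal sketch in Python. It does not use Dremio or any Dremio API: two in-memory SQLite databases stand in for separate storage systems, and a view plays the role of the semantic layer. The names lake, warehouse, orders, customers and revenue_by_customer are hypothetical and chosen purely for illustration.

```python
import sqlite3


def build_engine() -> sqlite3.Connection:
    """Create one connection that can query two separate stores.

    Assumption: SQLite is only a stand-in for a real query engine, and
    'lake' / 'warehouse' are hypothetical source names.
    """
    conn = sqlite3.connect(":memory:")
    # Attach two independent in-memory databases as separate "sources".
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

    # Source 1: raw order records, as they might sit in a data lake.
    conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?)",
        [(1, 40.0), (1, 60.0), (2, 25.0)],
    )

    # Source 2: customer master data held in a different system.
    conn.execute("CREATE TABLE warehouse.customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO warehouse.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )

    # "Semantic layer": a business-friendly view that joins both sources
    # without copying any data. A TEMP view is used because SQLite only
    # lets temporary views reference tables in other attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW revenue_by_customer AS
        SELECT c.name AS customer, SUM(o.amount) AS revenue
        FROM lake.orders AS o
        JOIN warehouse.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        """
    )
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> list:
    """Query the view the way a BI tool would, unaware of the sources."""
    cursor = conn.execute(
        "SELECT customer, revenue FROM revenue_by_customer ORDER BY customer"
    )
    return cursor.fetchall()
```

A client tool only needs to know the view name, not where the underlying tables live. In a real deployment that client would connect over JDBC or ODBC rather than through an in-process library as in this sketch.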
Dremio was established in 2015 and is headquartered in Santa Clara, California, USA. The technology supplier currently has around 120 employees. Its customers include Diageo, Microsoft, NCR, PayPal, Standard Chartered and TransUnion. In the DACH region, DATEV, DB Cargo, Henkel and Software AG (Cumulocity IoT) already use Dremio, with DATEV, DB Cargo and Henkel serving as showcase customers. Dremio is suitable for use by companies from all industry sectors. A dedicated team was established in early 2020 to focus on the German-speaking market, and the company plans to expand this team in the future.
In 2018, Dremio Enterprise Edition was launched as a supplement to the open-source Dremio Community Edition. The Enterprise Edition primarily adds enterprise functions related to data protection and security, as well as services. Dremio can be deployed on-premises and/or in the customer's own cloud account (AWS, Azure).
Dremio is available in the AWS and Azure marketplaces and is a co-sell partner of both these providers.
Another strong global partner is Tableau. Tableau uses Dremio primarily for SQL data access to distributed file systems and has already convinced several customers to work with Dremio.
Dremio is externally financed and recently received a US$70 million cash injection.
With its technology, Dremio aims to simplify and accelerate access to data for analytical workflows, and to do so more cost-effectively than other players in the market. The cost savings are not limited to license fees: because data is not moved or duplicated within the overall architecture, storage and integration costs are avoided as well. Dremio's approach is to provide fast, flexible access to (distributed) data via a user-friendly interface. The intention is to avoid additional persistent layers, such as pre-built aggregations, and to give users a platform for ad hoc analyses.
Its main users are business analysts, data scientists and data engineers. Dremio considers it important to be seen as a technology-agnostic tool: it can query different data storage technologies in different locations (cross-cloud, on-premises/cloud, etc.).