
Anyone have experience with data lakes

Brad Burlingham

May 28, 2009
I'm hoping to set up APIs from all of our dealerships into a data lake so that we can create our own reporting. Does anyone have any suggestions on vendors or overall advice?

The first stage is to use it to develop marketing dashboards to evaluate trends, performance, KPIs, etc. The next step would be to use it with a CDP to develop audiences, etc.
 
Snowflake is still the industry leader by far, but ClickHouse is quickly stealing market share as a more innovative solution:
Fast Open-Source OLAP DBMS - ClickHouse

Getting marketing data like GA4 into a data warehouse can sometimes be easier with Google BigQuery, but tools like Supermetrics can help load that data into Snowflake as well.

For reporting, I would highly suggest looking into Sigma. We’re in the process of moving off Tableau and Google Data / Looker Studio and over to Sigma:
Sigma. Unified AI apps and analytics

But honestly, the most challenging piece is going to be setting up the pipelines to extract data out of your DMS, CRM, etc., because as I'm sure you already know, automotive still tends to be a pretty walled garden.
 
I went through this process previously at my 4-rooftop group. I documented the process and put it in the DR Resources tab here, as well as on GitHub with a more detailed architectural breakdown.

We used:
  • CDK (DMS): The Data Your Way export tool for daily pipeline cadence. This was super helpful.
  • DriveCentric (CRM): They offered direct API connection
  • Infrastructure: AWS API Gateway, S3 storage, Glue for dedup/cleaning, and Redshift for storage
  • Reporting: Data Studio
  • Marketing Activation: Segment CDP
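
For anyone wondering what the dedup/cleaning step in the list above amounts to, here is a minimal sketch in plain Python. This is not actual Glue code, and the field names (`store`, `deal_id`, `updated_at`, `customer`) are made up for illustration; your DMS export will have its own. The idea is simply to tidy each row, then keep the most recent version of each record across daily exports.

```python
def clean(rec):
    """Trim whitespace from string values and turn empty strings into None."""
    cleaned = {}
    for name, value in rec.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[name] = value
    return cleaned


def dedupe_latest(records, key_fields=("store", "deal_id"), version_field="updated_at"):
    """Keep only the most recent record for each key.

    Assumes version_field holds ISO dates (YYYY-MM-DD), which sort
    correctly as plain strings.
    """
    latest = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        current = latest.get(key)
        if current is None or rec[version_field] > current[version_field]:
            latest[key] = rec
    return list(latest.values())


# Hypothetical rows from two days of exports across two stores.
daily_export = [
    {"store": "A", "deal_id": "1001", "updated_at": "2024-05-01", "customer": " Jane Doe "},
    {"store": "A", "deal_id": "1001", "updated_at": "2024-05-02", "customer": "Jane Doe"},
    {"store": "B", "deal_id": "1001", "updated_at": "2024-05-01", "customer": ""},
]

rows = dedupe_latest([clean(r) for r in daily_export])
```

Keying on store plus deal ID matters in a multi-rooftop group, since the same ID can show up at more than one store.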
What I'd do differently now:
  • I'd go with BigQuery/Google Cloud over Redshift/AWS for the native integrations with GA4 and Data Studio as Ryan mentioned ^
  • Supermetrics (they recently released a marketing intelligence platform for activation) or Airbyte if you want more technical control
  • Evidence.dev for reporting. They have NLP built in so non-technical folks can build reports too, plus permission controls if you want to create reports per department.

Depending on the DMS and CRM, the pipeline will be the largest obstacle. But DriveCentric and CDK had these export tools, which made it a lot easier to get things started. And as always, get comfortable with the data dictionary provided.
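
One way to get comfortable with a data dictionary is to turn it into a check that runs on every export. The sketch below is only an illustration: the fields and types in `DATA_DICTIONARY` are hypothetical, not taken from any CDK or DriveCentric documentation, so swap in whatever your vendor actually provides.

```python
# Hypothetical data dictionary: field name -> expected Python type.
DATA_DICTIONARY = {
    "deal_id": str,
    "sale_date": str,
    "sale_price": float,
}


def validate_row(row, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one exported row (empty if none)."""
    problems = []
    for field, expected_type in dictionary.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(row[field]).__name__}"
            )
    # Flag anything the dictionary does not describe, e.g. a new vendor column.
    for field in row:
        if field not in dictionary:
            problems.append(f"unexpected field: {field}")
    return problems


good = validate_row({"deal_id": "1001", "sale_date": "2024-05-01", "sale_price": 25000.0})
bad = validate_row({"deal_id": "1001", "sale_price": "25000"})
```

Running something like this before loading the warehouse surfaces schema drift early, instead of as a broken dashboard later.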

It was a fun and beneficial project for our group and I hope some of this helps you in your quest!
 

✨ AI Highlights

A dealer looking to build a data lake for multi-store reporting and marketing dashboards asks for vendor recommendations and practical advice. Snowflake emerges as the consensus industry leader, with ClickHouse flagged as a rising alternative, Google BigQuery noted for GA4 integration, and Sigma recommended for reporting over Tableau or Looker Studio. The thread's key takeaway is that the hardest and most expensive part won't be the data lake itself but extracting data from DMS and CRM vendors, who are often resistant or charge fees for API access.
