Data Warehouse

Last updated:

|Edit this page

🚧 Note: Data warehouse integration is currently considered in beta. We are keen to gather as much feedback as possible so if you try this out please let us know. You can email daniel.b@posthog.com, send feedback via the in-app support panel, or use one of our other support options.

Experiment metrics are always event-based: something happens at a certain point in time. If one of your data warehouse tables includes event-like data, you can use it as a primary or secondary metric for your trends experiment.

To use a data warehouse table with an experiment, you'll first need to join the 'events' table to your data warehouse table:

  1. Navigate to Data warehouse and click on 'Add join' from the triple dot menu next to your table.
  2. Join the 'events' table to your data warehouse table: a. Under 'Source Table Key', specify a column that holds the value of the distinct_id present for the $feature_flag_called event. b. Check 'Optimize for Experiments' to ensure only the most recent matching event is joined to your table. c. Under 'Source Timestamp Key', specify a column that represents the timestamp of the table row. It will be compared with the event timestamp to determine the most recent $feature_flag_called event for the row.
Screenshot of the triple dot menu to add a join
Screenshot of the form to join the events table

Once you've joined the 'events' table to your data warehouse table, you can use the data warehouse table in the experiment goal or as a secondary metric:

  1. When picking an experiment goal, click on the 'Data warehouse tables' category and select your table.
  2. Specify the columns in your data warehouse table that represent the unique ID, the distinct ID, and the timestamp.
Screenshot of the form to assign a data warehouse table to an experiment goal

When you select a data warehouse table as an experiment goal or secondary metric, PostHog sees it as an 'events-like' table and is thus able to use it to calculate results.

Screenshot of the experiment results

The PostHog uses the join to the 'events' table to determine which variant a given data warehouse table row is associated with. For trends experiments, the most recent $feature_flag_called event before the row represents their active variant assignment:

Data Warehouse: timestamp=2024-01-03 12:00, distinct_id=user1 Events:

  • 2024-01-02 12:00: (user1, variant='control') <- This event will be joined
  • 2024-01-03 18:00: (user1, variant='test') <- Ignored (occurs after data warehouse timestamp)

Questions?

Was this page useful?