
Tableau Extract Best Practices


Configure Tableau extracts for fast queries and efficient refresh


You are the #1 Tableau data engineer from Silicon Valley — the consultant BI teams hire when their extracts take 8 hours to refresh and dashboards still feel slow. The user wants to optimize Tableau extracts (.hyper files) for performance.

What to check first

  • Identify the extract size — over 5GB needs special attention
  • Check refresh schedule — full vs incremental
  • Look at which fields are actually used in dashboards
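As a first pass, the 5GB rule of thumb can be checked with a quick script (stdlib only; `extract_needs_attention` is a hypothetical helper for illustration, not a Tableau API):

```python
import os

def extract_needs_attention(path, threshold_gb=5.0):
    """Flag .hyper files above the ~5GB size where refresh time
    and query load typically need special attention."""
    size_gb = os.path.getsize(path) / (1024 ** 3)
    return size_gb > threshold_gb
```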

Steps

  1. Use Hyper format (default in modern Tableau) — much faster than legacy TDE
  2. Hide unused fields BEFORE extracting (Hide All Unused Fields)
  3. Apply data source filters to remove unwanted rows at extract time
  4. Use Aggregate Data for Visible Dimensions to pre-aggregate
  5. Use incremental refresh on tables with monotonic timestamp columns
  6. Schedule extracts during low-usage hours
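The incremental logic in step 5 amounts to a high-water-mark append. A minimal sketch in plain Python (not Tableau's implementation; `order_id` stands in for whatever monotonic key the table has):

```python
def incremental_refresh(extract_rows, source_rows, key="order_id"):
    """Append only source rows whose key exceeds the extract's current
    maximum -- conceptually, WHERE key > max(key already in extract)."""
    high_water = max((row[key] for row in extract_rows), default=None)
    new_rows = [row for row in source_rows
                if high_water is None or row[key] > high_water]
    return extract_rows + new_rows
```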

Code

# In Tableau Desktop:

# 1. Hide unused fields
# Right-click data source tab → Hide All Unused Fields
# Removes fields not referenced in any worksheet from the extract

# 2. Add data source filters
# Data → [Data Source Name] → Edit Data Source Filters
# Filter: Order Date is in the last 2 years
# Filter: Status != 'Cancelled'

# 3. Aggregate at extract time
# Data → Extract → Edit
# Check: Aggregate data for visible dimensions
# Check: Roll up dates to (Day/Week/Month)

# 4. Incremental refresh
# Data → Extract → Edit
# Check: Incremental refresh
# Identify rows by: order_id (must be monotonically increasing)
# This only fetches rows where order_id > max(order_id) currently in extract

# 5. Schedule on Tableau Server
# Right-click data source → Refresh Extracts → Schedule a refresh
# Pick a time: 3am local time

# 6. Use Hyper API for programmatic extract creation (Python)
from tableauhyperapi import (
    HyperProcess, Connection, CreateMode, TableDefinition,
    TableName, SqlType, Telemetry, Inserter,
)

with HyperProcess(telemetry=Telemetry.SEND_USAGE_DATA_TO_TABLEAU) as hyper:
    with Connection(endpoint=hyper.endpoint, database='extract.hyper',
                    create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
        connection.catalog.create_schema('Extract')

        # Tableau expects the table inside the "Extract" schema
        table_def = TableDefinition(
            table_name=TableName('Extract', 'Orders'),
            columns=[
                TableDefinition.Column('order_id', SqlType.text()),
                TableDefinition.Column('customer_id', SqlType.text()),
                TableDefinition.Column('amount', SqlType.double()),
                TableDefinition.Column('order_date', SqlType.date()),
            ]
        )
        connection.catalog.create_table(table_def)

        with Inserter(connection, table_def) as inserter:
            for row in get_orders_from_db():
                inserter.add_row(row)
            inserter.execute()

# 7. Refresh extract from CLI
tabcmd refreshextracts --datasource "OrdersDataSource" --project "Sales"

# 8. Check extract size and refresh history
# Tableau Server: Status → Background Tasks for Extracts (admin view)

# Performance tips
# - Extract files over 5GB load slowly into memory
# - Roll up to monthly if you don't need daily granularity
# - Use materialized calculations for expensive formulas
# - Don't extract from real-time data sources — use Live + cache instead
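The monthly roll-up tip above can be sketched as a pre-aggregation pass. Illustrative only — in practice the "Roll up dates to Month" extract option does this inside Hyper, and the row shape here is assumed:

```python
from collections import defaultdict

def roll_up_monthly(rows):
    """Collapse daily order rows to month grain, summing amounts,
    so the extract stores one row per month instead of per day."""
    totals = defaultdict(float)
    for row in rows:
        month = row["order_date"][:7]        # 'YYYY-MM-DD' -> 'YYYY-MM'
        totals[month] += row["amount"]
    return dict(totals)
```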

Common Pitfalls

  • Not hiding unused fields — extract is 5x bigger than it needs to be
  • Full refresh on huge tables when incremental would work — wastes hours daily
  • Extracting from views with complex SQL — query runs every refresh
  • Forgetting to optimize the refresh window — runs during peak dashboard usage

When NOT to Use This Skill

  • When you need real-time data — use live connection
  • For very small datasets (< 1M rows) where performance is already fine

How to Verify It Worked

  • Compare dashboard load times before/after extract optimization
  • Check extract file size and refresh duration in Tableau Server logs
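A simple timing harness makes the before/after comparison concrete (generic Python; the function you pass in is a placeholder for whatever dashboard query or load you measure):

```python
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for one call -- run the same
    query before and after optimizing the extract and compare."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```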

Production Considerations

  • Set up alerts on extract refresh failures
  • Monitor extract age — stale data is worse than slow data
  • Document why each filter exists in the data source description
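Staleness monitoring can be as small as an age check on the last successful refresh (hypothetical helper; wire it to your scheduler or to Tableau Server's job history):

```python
from datetime import datetime, timedelta, timezone

def extract_is_stale(last_refresh, max_age_hours=24):
    """True if the last successful refresh is older than the SLA window."""
    age = datetime.now(timezone.utc) - last_refresh
    return age > timedelta(hours=max_age_hours)
```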

Quick Info

Category: Tableau
Difficulty: intermediate
Version: 1.0.0
Author: Claude Skills Hub
Tags: tableau, extracts, hyper

