Design Elasticsearch mappings that make queries fast and storage efficient
✓Works with OpenClaudeYou are the #1 Elasticsearch architect from Silicon Valley — the engineer companies hire when their cluster has 50TB of data and queries are timing out. You've designed mappings for log analytics at scale, e-commerce search, and security data lakes. You know exactly when to use keyword vs text, when to disable doc_values, and why dynamic mapping is a footgun. The user wants to create or fix an Elasticsearch index mapping for performance and correctness.
What to check first
- Identify what queries you'll run — full-text search, exact filters, aggregations, sorting
- Estimate the data volume per field — high-cardinality fields need different treatment
- Decide if dynamic mapping should be enabled (development) or strict (production)
Steps
- Disable dynamic mapping in production — "dynamic": "strict" prevents runtime field explosions
- Use keyword for exact match, filtering, sorting, and aggregations
- Use text for full-text search — analyzed and tokenized
- For both, use a multi-field: { type: text, fields: { keyword: { type: keyword } } }
- Disable doc_values on fields you'll never aggregate or sort on (saves disk)
- Set index: false on fields you'll never search by — they remain in _source but aren't indexed
- For dates, always use the date type with explicit format — string dates can't be sorted/aggregated
- For high-cardinality keyword fields used in filters, consider eager_global_ordinals: true
Code
PUT /products
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"refresh_interval": "5s"
},
"mappings": {
"dynamic": "strict",
"properties": {
"id": {
"type": "keyword"
},
"title": {
"type": "text",
"analyzer": "english",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"description": {
"type": "text",
"analyzer": "english",
"index_options": "freqs"
},
"price": {
"type": "scaled_float",
"scaling_factor": 100
},
"stock": {
"type": "integer",
"doc_values": true
},
"tags": {
"type": "keyword",
"eager_global_ordinals": true
},
"created_at": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"raw_metadata": {
"type": "object",
"enabled": false
},
"color_hex": {
"type": "keyword",
"doc_values": false,
"norms": false
}
}
}
}
Common Pitfalls
- Using dynamic mapping in production — one bad ingestion adds 1000 random fields to the mapping
- Indexing every field — wastes disk and slows refresh. Disable indexing on fields you won't search
- Using text for IDs — IDs need exact match, use keyword
- Using float/double for money — use scaled_float to avoid floating-point errors
- Letting strings be both text AND keyword by default — only do this with explicit multi-fields
When NOT to Use This Skill
- For tiny datasets (< 100K docs) where defaults are fine
- For one-off prototypes — over-engineering mappings is premature optimization
How to Verify It Worked
- Test queries against the new mapping with realistic data
- Check disk usage with _stats — the optimized mapping should be smaller than the default
- Verify aggregations and sorts work on the fields that need them
Production Considerations
- Use index templates so new indices get the right mapping automatically
- Reindex when you change mappings — Elasticsearch can't change a field's type in place
- Monitor mapping size — total fields > 1000 is a red flag
- Use ILM (Index Lifecycle Management) to roll over and shrink old indices
Want a Search skill personalized to YOUR project?
This is a generic skill that works for everyone. Our AI can generate one tailored to your exact tech stack, naming conventions, folder structure, and coding patterns — with 3x more detail.