SQL WHERE


SQL
POSTGRES
INTERACTIVE

SQL WHERE Clause: Complete Technical Guide

1. Fundamental Syntax
SELECT column1, column2, ... 
FROM table_name 
WHERE condition 
[AND|OR additional_conditions];

you can stack AND and OR in SQL WHERE clauses as much as you like!

2. Internal Execution Process
  1. Predicate Evaluation
    1. Conditions are parsed into expression trees
    2. Short-circuit evaluation for AND/OR logic
    3. Three-valued logic (TRUE/FALSE/UNKNOWN) for NULL handling
  2. Access Path Selection
    1. Index Seek (optimal): Direct lookup via B-tree when WHERE matches index
    2. Index Scan: Sequential index reading when partial match exists
    3. Table Scan: Full data read when no usable index exists
  3. Filter Application
    1. Storage engine applies remaining filters after index access
    2. Predicate pushdown in modern DBs (filters applied during storage retrieval)
3. Condition Types and Optimization
Condition Type Index Usage Performance Notes
column = value Excellent Best case for B-tree index seeks
column > value Good Range scan from index
column LIKE 'prefix%' Good Uses index for prefix matches
column LIKE '%suffix' None Forces full scan
column IN (list) Good Treated as multiple OR conditions
column IS NULL Varies Depends on NULL storage in index
function(column) = value None Prevents index usage
4. Advanced WHERE Patterns
Subquery Conditions
SELECT * FROM products 
WHERE category_id 
IN (SELECT id FROM categories WHERE active = 1);

Can be rewritten as JOINs for better performance

JSON Path Queries
SELECT * FROM orders 
WHERE JSON_VALUE(order_data, '$.status') = 'shipped';
Full-Text Search
SELECT * FROM articles 
WHERE CONTAINS(content, 'database NEAR optimization');
Pattern Matching
SELECT * FROM users 
WHERE phone REGEXP '^[0-9]{3}-[0-9]{3}-[0-9]{4}$';
💡 Performance Pro Tips
  1. Place most selective conditions first in AND chains
  2. Use EXISTS() instead of IN() for large subqueries
  3. Avoid functions on indexed columns in WHERE clauses
  4. Consider computed columns with indexes for frequent expressions
  5. Use partial indexes for filtered queries (e.g., WHERE status = 'active')
5. Database-Specific Optimizations
MySQL
  1. Index Condition Pushdown: Filters during index reads
  2. Range Optimizer: Combines multiple range conditions
  3. Use FORCE INDEX hints when optimizer chooses poorly
PostgreSQL
  1. Bitmap Index Scan: Combines multiple indexes
  2. JIT Compilation: Accelerates complex WHERE clauses
  3. Expression indexes support functional conditions
SQL Server
  1. Predicate Pushdown: Into columnstore indexes
  2. Parameter Sniffing: Caches plans for parameterized queries
  3. Filtered indexes for targeted conditions
Oracle
  1. Bloom Filters: For large table joins
  2. Result Cache: For repeated identical queries
  3. Virtual columns with indexes for computed conditions
⚠️ Common Anti-Patterns
  1. WHERE CAST(date_column AS TEXT) LIKE '2023%'
    (Prevents index use)
  2. Overusing OR conditions (consider UNION ALL rewrite)
  3. Implicit type conversions in comparisons
  4. Negated conditions (NOT IN, !=) often perform poorly
  5. Overly complex conditions that prevent predicate pushdown

Key technical aspects that distinguish WHERE from other clauses:

  1. Early Filtering: WHERE clauses execute before GROUP BY/SELECT, reducing processed data volume
  2. Index Utilization: Directly impacts access path selection (index vs scan)
  3. Predicate Evaluation Order: Affects short-circuiting behavior
  4. SARGability: Search ARGument ability determines if conditions can use indexes
  5. Parameterization: Prepared statements handle WHERE values differently than literals

Advanced WHERE Clause Techniques

1. Multi-Column Index Optimization
-- Optimal index for this query would be (last_name, first_name) 
SELECT * FROM employees 
WHERE last_name = 'Smith' AND first_name = 'John';

Key Insight:

The order of columns in a compound index must match your query patterns. The most selective columns should come first.

-- Index ordering matters! This won't use the index efficiently: 
SELECT * FROM employees 
WHERE first_name = 'John' AND last_name = 'Smith';
2. Partial Indexes (Filtered Indexes)
-- PostgreSQL/SQL Server syntax 
CREATE INDEX idx_active_users
ON users(email) WHERE active = true; 

-- MySQL equivalent 
CREATE INDEX idx_active_users ON users(email) 
WHERE (active = 1);

When to Use:

When you frequently query a subset of data (e.g., active users, recent orders). Can reduce index size by 90%+ in some cases.

3. Function-Based Indexes
-- Oracle/PostgreSQL 
CREATE INDEX idx_name_lower 
ON employees(LOWER(last_name)); 
                
-- Now this query can use the index: 
SELECT * FROM employees 
WHERE LOWER(last_name) = 'smith';
MySQL/SQL Server Workaround:
ALTER TABLE employees 
ADD COLUMN last_name_lower VARCHAR(100) AS (LOWER(last_name)); 
CREATE INDEX idx_name_lower 
ON employees(last_name_lower);
4. JSON Path Query Optimization
-- PostgreSQL (with GIN index) 
CREATE INDEX idx_product_tags 
ON products USING gin((data->'tags')); 

-- Efficient JSON query:
SELECT * FROM products WHERE data->'tags' @> '["sale"]';

Performance Tip:

JSONB (binary JSON) typically performs better than JSON for query operations.

5. Advanced Date Filtering
-- Index-friendly date range (with index on event_date) 
SELECT * FROM events 
WHERE event_date BETWEEN '2023-01-01' AND '2023-01-31'; 

-- Problematic pattern (can't use index): 
SELECT * FROM events 
WHERE YEAR(event_date) = 2023 AND MONTH(event_date) = 1;

Pro Tip:

For rolling date windows, use:

WHERE event_date >= CURRENT_DATE - INTERVAL '30 days'

6. Optimizing OR Conditions
-- Problematic OR pattern: 
SELECT * FROM orders 
WHERE status = 'shipped' OR customer_id = 1001; 

-- Better approaches: 
SELECT * FROM orders WHERE status = 'shipped' 
UNION ALL SELECT * FROM orders 
WHERE customer_id = 1001 AND status != 'shipped'; 
--Avoids duplicates

Warning:

OR conditions often prevent index usage. Test with EXPLAIN to verify.

7. Materialized Path Pattern (Hierarchies)
-- With path column like '/1/4/7/' (indexed) 

SELECT * FROM categories 
WHERE path LIKE '/1/%' AND path LIKE '%/7/'; 

-- Finds all children of 1 with ancestor 7

Alternative:

Consider recursive CTEs in modern databases for hierarchical queries.

⏱️ Performance Checklist
  1. Always check EXPLAIN output for WHERE clause execution
  2. Prefer = over IN when possible (better optimizer estimates)
  3. Use covering indexes to avoid table lookups
  4. Consider index-only scans when all columns are in the index
  5. Monitor cardinality estimates vs actual rows