SQL PARTITIONING


SQL
POSTGRES
INTERACTIVE

Complete SQL Partitioning Technical Guide

1. Partitioning Fundamentals
-- PostgreSQL Range Partitioning
CREATE TABLE sales (
    id SERIAL,
    sale_date DATE NOT NULL,
    amount DECIMAL(10,2),
    region VARCHAR(50)
) PARTITION BY RANGE (sale_date);

-- MySQL List Partitioning
CREATE TABLE employees (
    id INT AUTO_INCREMENT,
    name VARCHAR(100),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    PRIMARY KEY (id, department)
) PARTITION BY LIST COLUMNS(department) (
    PARTITION p_engineering VALUES IN ('dev', 'qa', 'ops'),
    PARTITION p_sales VALUES IN ('sales', 'marketing'),
    PARTITION p_other VALUES IN (DEFAULT)
);
  1. Horizontal Partitioning: Splits table by rows into physical segments
  2. Partition Key: Column(s) determining row distribution
  3. Partition Types: Range, List, Hash, Composite
  4. Benefits: Improved query performance, easier maintenance, better I/O distribution
2. Partitioning Strategies
Range Partitioning (Time-Series Data)
-- Oracle Time-Based Partitioning
CREATE TABLE sensor_data (
    sensor_id NUMBER,
    reading_time TIMESTAMP,
    value NUMBER
) PARTITION BY RANGE (reading_time) (
    PARTITION p_2023_q1 VALUES LESS THAN (TO_DATE('01-APR-2023', 'DD-MON-YYYY')),
    PARTITION p_2023_q2 VALUES LESS THAN (TO_DATE('01-JUL-2023', 'DD-MON-YYYY')),
    PARTITION p_future VALUES LESS THAN (MAXVALUE)
);
Hash Partitioning (Even Distribution)
-- SQL Server Hash Partitioning
CREATE TABLE user_sessions (
    session_id UNIQUEIDENTIFIER,
    user_id INT,
    start_time DATETIME2,
    end_time DATETIME2
) PARTITION BY HASH(user_id)
PARTITIONS 4;
3. Advanced Partition Operations
Partition Maintenance
-- PostgreSQL Add/Drop Partitions
ALTER TABLE sales ADD PARTITION p_2023_q3
VALUES FROM ('2023-07-01') TO ('2023-10-01');

ALTER TABLE sales DROP PARTITION p_2022_q4;

-- MySQL Partition Exchange
ALTER TABLE sales 
EXCHANGE PARTITION p_2023_q1 WITH TABLE sales_archive;
Subpartitioning
-- Oracle Composite Partitioning
CREATE TABLE customer_orders (
    order_id NUMBER,
    customer_id NUMBER,
    order_date DATE,
    amount NUMBER
) PARTITION BY RANGE (order_date)
SUBPARTITION BY HASH (customer_id) SUBPARTITIONS 4 (
    PARTITION p_2022 VALUES LESS THAN (TO_DATE('01-JAN-2023', 'DD-MON-YYYY')),
    PARTITION p_2023 VALUES LESS THAN (TO_DATE('01-JAN-2024', 'DD-MON-YYYY'))
);
4. Partition-Aware Queries
Partition Pruning
-- PostgreSQL (only scans relevant partitions)
EXPLAIN ANALYZE SELECT * FROM sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-03-31';

-- SQL Server Partition Elimination
SELECT * FROM sensor_data
WHERE reading_time >= '2023-01-01'
AND reading_time < '2023-02-01';
Partition-Wise Joins
-- Join partitioned tables on partition key
SELECT c.customer_name, SUM(o.amount)
FROM customers c JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY c.customer_name;

-- Oracle Parallel Partition Processing
SELECT /*+ PARALLEL(4) */ * FROM sales
WHERE sale_date > SYSDATE - 30;
5. Partitioning Best Practices
  1. Choose Right Key: Select columns frequently used in WHERE clauses
  2. Balance Partition Size: Target 1GB-10GB per partition (adjust based on DBMS)
  3. Monitor Partition Usage: Track query performance and storage distribution
  4. Plan Growth: Automate partition creation for time-series data
  5. Index Strategically: Local indexes per partition vs global indexes
Maintenance Automation
-- PostgreSQL Automatic Partition Creation
CREATE OR REPLACE FUNCTION create_monthly_partition()
RETURNS TRIGGER AS $$
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS sales_%s PARTITION OF sales '
        'FOR VALUES FROM (%L) TO (%L)',
        to_char(NEW.sale_date, 'YYYY_MM'),
        date_trunc('month', NEW.sale_date),
        date_trunc('month', NEW.sale_date) + interval '1 month'
    );
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sales_partition_trigger
BEFORE INSERT ON sales
FOR EACH ROW EXECUTE FUNCTION create_monthly_partition();

Partitioning vs Sharding

  1. Partitioning: Logical division within single database instance
  2. Sharding: Physical distribution across multiple servers
  3. Some databases offer hybrid approaches (e.g., Citus for PostgreSQL)

Partition Key Selection

  1. Time columns for time-series data
  2. Geographic regions for location-based apps
  3. Tenant ID for multi-tenant SaaS applications
  4. Avoid frequently updated columns

Cross-Database Support

  1. PostgreSQL: Declarative partitioning (10+), inheritance-based
  2. MySQL: Partitioning limited to certain storage engines
  3. Oracle: Advanced features like interval partitioning
  4. SQL Server: Partition functions and schemes

Performance Considerations

  1. Partition pruning dramatically improves query speed
  2. Parallel query execution across partitions
  3. Reduced index size per partition
  4. Consider partition overhead for small tables

Common Pitfalls

  1. Over-partitioning (too many small partitions)
  2. Under-partitioning (giant partitions)
  3. Choosing wrong partition key
  4. Forgetting to maintain partition constraints

Advanced Techniques

  1. Partitioning by expression (e.g., hash of multiple columns)
  2. Foreign table partitions (PostgreSQL FDW)
  3. Zero-downtime partition reorganization
  4. Partitioning combined with table compression