Partitioning

I would like to start some posts series on my blog regarding techniques connected with high data volume. One of these techniques is partitioning. There are lots of sources describing partitioning approach. I will concentrate on practical usage in my posts. I will start this theme with partitioning views, since they are available in earlier versions of SQL server standard editions and they help to better illustrate principles of partitioning.

Partitioning helps when you have huge table with lots of historical data. Even when this table has designed indexes in good shape, large table has large indexes which has impact on seek operations and reading is not so fast.  It could cause other problems too, for example with data modification or data maintenance. Full-scan of very large statistic created with the index could be very time consuming. 

Splitting such a large objects to smaller partitions will help with all above mentioned aspects.  Even better, each partition could be part of filegroup on separate disk storage which leads to better I/O operations when data could be read in parallel. It will brings other advantages when you dont want to touch historical data or use them just for reading – you can bring them to filegroups and change them to read-only state.

There are few approaches how partitioning can be designed. I would like to describe them in several posts. In short:

from point of view of partitioned direction.

  • Horizontal partitioning – data are partitioned based on attribute/attributes defined in table. For example, data partitioned by month in separated tables. Each partition has the same structure – columns.
  • Vertical partitioning – there is defined attribute which connects more tables with fewer columns. Structure of each partition can be different. For example, let’s have transaction info, one table partition stores about customers another one regarding region. Each partition has unique key transaction_id creating relations between such a partitions.

from point of view of partition data distribution

  • Equally distributed – for example partitions based on monthly period connected to date dimension
  • Floating partitions – partition with data for last tree months. Older data are moved to archive partition with historical data

From point of view of data source

  • Partitions created in the same database
  • Partitions created across databases on the same server
  • Distributed partitions created across servers

There are more perspectives how we could look on partitoining. Lets go trough some sammples and implementation styles to get better overview of these basic terms.

Next post will be concentrated on partitioned views so stay tuned.


Leave a Reply

Your email address will not be published. Required fields are marked *