
Sharding Data Across Nodes

  • Sharding strategies
    • Hash sharding
    • Range sharding
  • Try it out
    • Create a universe
    • Create a table
    • Explore tablets
    • Insert and query a table
    • Automatic sharding when adding nodes
    • [Optional] Clean up
  • Further reading

YugabyteDB automatically splits user tables into multiple shards, called tablets, using either a hash- or range-based strategy.

The primary key for each row in the table uniquely determines the tablet the row lives in. This is shown in the following figure.

Sharding a table into tablets

By default, YugabyteDB creates eight tablets per node in the cluster for each table and automatically distributes the data across the various tablets, which in turn are distributed evenly across the nodes. In the Try it out section, you'll explore how automatic sharding is done internally for tables. The system Redis table works in exactly the same way.

Sharding strategies

Sharding is the process of breaking up large tables into smaller chunks called shards that are spread across multiple servers. Essentially, a shard is a horizontal data partition that contains a subset of the total data set, and hence is responsible for serving a portion of the overall workload. The idea is to distribute data that can’t fit on a single node onto a cluster of database nodes.

Sharding is also referred to as horizontal partitioning. The distinction between horizontal and vertical comes from the traditional tabular view of a database. A database can be split vertically (storing different table columns in separate databases) or horizontally (storing rows of the same table across multiple database nodes).
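
As a minimal illustration of the distinction, here is a plain Python sketch using made-up rows (nothing YugabyteDB-specific):

rows = [
    {"id": 1, "name": "Ada",   "city": "London"},
    {"id": 2, "name": "Grace", "city": "New York"},
    {"id": 3, "name": "Linus", "city": "Helsinki"},
]

# Horizontal partitioning (sharding): whole rows are placed on different nodes.
shard_even = [r for r in rows if r["id"] % 2 == 0]
shard_odd  = [r for r in rows if r["id"] % 2 == 1]

# Vertical partitioning: different columns are placed in different stores.
names  = [{"id": r["id"], "name": r["name"]} for r in rows]
cities = [{"id": r["id"], "city": r["city"]} for r in rows]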

YugabyteDB currently supports two ways of sharding data: hash (also known as consistent hash) sharding, and range sharding.

Hash sharding

With (consistent) hash sharding, a partitioning algorithm distributes data evenly and randomly across shards. The algorithm places each row of the table into a shard determined by computing a consistent hash on the partition column values of that row.

The hash space for hash-sharded YugabyteDB tables is the 2-byte range from 0x0000 to 0xFFFF. A table may therefore have at most 65,536 tablets. We expect this to be sufficient in practice even for very large data sets or cluster sizes.

For example, for a table with 16 tablets the overall hash space of 0x0000 to 0xFFFF is divided into 16 sub-ranges, one for each tablet: 0x0000 to 0x1000, 0x1000 to 0x2000, and so on up to 0xF000 to 0xFFFF. Read and write operations are processed by converting the primary key into an internal key and its hash value, and determining the tablet to which the operation should be routed.
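
The routing logic can be pictured with a short Python sketch. This is illustrative only: it uses a generic 16-bit hash (the first two bytes of SHA-256), whereas YugabyteDB computes its own internal hash over the encoded partition key.

import hashlib

HASH_SPACE = 0x10000  # 65,536 possible hash values (2 bytes)

def hash16(key: str) -> int:
    # Stand-in 16-bit hash; YugabyteDB uses its own hash function internally.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:2], "big")

def tablet_for_hash(hash_value: int, num_tablets: int) -> int:
    # Each tablet owns an equal sub-range of the hash space,
    # for example 0x1000 values per tablet when there are 16 tablets.
    return hash_value // (HASH_SPACE // num_tablets)

for key in ("customer-1", "customer-2", "customer-3"):
    h = hash16(key)
    print(f"{key}: hash=0x{h:04X} -> tablet {tablet_for_hash(h, 16)}")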

Hash sharding is ideal for massively scalable workloads, as it distributes data evenly across all the nodes in the cluster, while retaining ease of adding nodes into the cluster.

With consistent hash sharding, there are many more shards than nodes, and an explicit mapping table is maintained to track the assignment of shards to nodes. When adding new nodes, a subset of shards from existing nodes can be efficiently moved to the new nodes without requiring a massive data reassignment.
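
The following sketch (not YugabyteDB's actual load balancer, just the general idea) shows why this is efficient: with 12 tablets spread over 3 nodes, adding a fourth node only requires moving 3 of the 12 tablets.

def rebalance(assignment, nodes):
    # Move the minimum number of tablets so each node ends up with
    # roughly len(assignment) / len(nodes) tablets.
    target = len(assignment) // len(nodes)
    counts = {n: 0 for n in nodes}
    for node in assignment.values():
        counts[node] += 1
    new_assignment = dict(assignment)
    for tablet, node in assignment.items():
        under_loaded = [n for n in nodes if counts[n] < target]
        if counts[node] > target and under_loaded:
            dest = under_loaded[0]
            new_assignment[tablet] = dest
            counts[node] -= 1
            counts[dest] += 1
    return new_assignment

# 12 tablets on 3 nodes (4 each); add node-4 and only 3 tablets move.
tablets = {i: f"node-{(i % 3) + 1}" for i in range(12)}
after = rebalance(tablets, ["node-1", "node-2", "node-3", "node-4"])
moved = [t for t in tablets if tablets[t] != after[t]]
print(f"tablets moved: {len(moved)} of {len(tablets)}")  # tablets moved: 3 of 12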

A potential downside of hash sharding is that performing range queries could be inefficient. Examples of range queries are finding rows greater than a lower bound or less than an upper bound (as opposed to point lookups).

Hash sharding example

In YSQL, create a table with hash sharding:

CREATE TABLE customers (
    customer_id bpchar NOT NULL,
    company_name character varying(40) NOT NULL,
    contact_name character varying(30),
    contact_title character varying(30),
    address character varying(60),
    city character varying(15),
    region character varying(15),
    postal_code character varying(10),
    country character varying(15),
    phone character varying(24),
    fax character varying(24),
    PRIMARY KEY (customer_id HASH)
);

In YCQL, you can only create tables with hash sharding, so explicit syntax for setting hash sharding is not necessary.

CREATE TABLE items (
    supplier_id INT,
    item_id INT,
    supplier_name TEXT STATIC,
    item_name TEXT,
    PRIMARY KEY((supplier_id), item_id)
);

Range sharding

Range sharding involves splitting the rows of a table into contiguous ranges, based on the primary key column values. Range-sharded tables usually start out with a single shard. As data is inserted into the table, it is dynamically split into multiple shards because it is not always possible to know the distribution of keys in the table ahead of time.

This type of sharding allows efficiently querying a range of rows by the primary key values. An example of such a query is looking up all keys that lie between a lower bound and an upper bound.
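
A rough Python sketch of the idea (simplified, and not YugabyteDB's internal representation): a range-sharded table keeps a sorted list of split points, and a key or key range is routed to the shard or shards whose bounds contain it.

import bisect

# Hypothetical split points: shard 0 holds keys < 100, shard 1 holds 100..199,
# shard 2 holds 200..299, shard 3 holds keys >= 300.
split_points = [100, 200, 300]

def shard_for_key(key: int) -> int:
    return bisect.bisect_right(split_points, key)

def shards_for_range(lo: int, hi: int):
    # A range scan only touches the contiguous run of shards covering [lo, hi].
    return list(range(shard_for_key(lo), shard_for_key(hi) + 1))

print(shard_for_key(150))          # 1
print(shards_for_range(150, 250))  # [1, 2]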

Range sharding has a few issues at scale, including:

  • Starting out with a single shard means that a single node is handling all user queries.



    This often results in a database “warming” problem, where all queries are handled by a single node even when there are multiple nodes in the cluster. The user would have to wait for enough splits to happen, and for those shards to be redistributed, before all the nodes in the cluster are utilized. This can be a big issue in production workloads. It can be mitigated in some cases where the distribution of keys is known ahead of time, by pre-splitting the table into multiple shards; however, this is hard to do in practice.

  • Globally ordering keys across all the shards often generates hot spots, in which some shards get much more activity than others.



    Nodes hosting hot spots are overloaded relative to others. You can mitigate this to some extent with active load balancing, but this does not always work well in practice: by the time hot shards are redistributed across nodes, the workload may have changed and introduced new hot spots.

Range sharding example

In YSQL, create a table with range sharding:

CREATE TABLE order_details (
    order_id smallint NOT NULL,
    product_id smallint NOT NULL,
    unit_price real NOT NULL,
    quantity smallint NOT NULL,
    discount real NOT NULL,
    PRIMARY KEY (order_id ASC, product_id),
    FOREIGN KEY (product_id) REFERENCES products,
    FOREIGN KEY (order_id) REFERENCES orders
);

In YCQL, you can't create tables with range sharding. YCQL tables are always hash sharded.

Try it out

In this tutorial, you'll explore automatic sharding inside YugabyteDB. First, you'll create some tables with a replication factor (RF) of 1. This makes it easier to understand how automatic sharding works, independent of data replication. Then, you'll insert entries one by one, and examine how the data gets distributed across the various nodes.

This tutorial uses the yb-ctl local cluster management utility.

Create a universe

To create a universe, do the following:

  1. If you have a currently running local universe, destroy it using the following command.

    $ ./bin/yb-ctl destroy
    
  2. Start a new local universe with a replication factor of 1 (RF=1), passing the following flags:

    • --rf creates a universe with a replication factor of 1.
    • --num_shards_per_tserver controls the number of tablets (or partitions) created per YB-TServer for each new table. Setting the value to 4 creates 12 tablets on a 3-node cluster.
    • --tserver_flags "memstore_size_mb=1" sets the total size of memstores on the tablet-servers to 1MB. This will force a flush of the data to disk when a value greater than 1MB is added, so that you can observe which tablets the data is written to.

    Warning

    Setting memstore_size_mb to such a low value is not recommended in production, and is used here only to force flushes to happen quickly and illustrate the point.
    $ ./bin/yb-ctl --rf 1 --num_shards_per_tserver 4 create \
                --tserver_flags "memstore_size_mb=1"
    
  3. Add two more nodes to make this a 3-node, RF=1 universe. You need to pass the memstore_size_mb flag to each of the added YB-TServer servers.

    $ ./bin/yb-ctl add_node --tserver_flags "memstore_size_mb=1"
    
    $ ./bin/yb-ctl add_node --tserver_flags "memstore_size_mb=1"
    
  4. Check the status of the cluster to confirm that you have three YB-TServer servers.

    $ ./bin/yb-ctl status
    
    ----------------------------------------------------------------------------------------------------
    | Node Count: 3 | Replication Factor: 1                                                            |
    ----------------------------------------------------------------------------------------------------
    | JDBC                : jdbc:postgresql://127.0.0.1:5433/yugabyte                                  |
    | YSQL                : ./bin/ysqlsh                                                               |
    | YCQL                : ./bin/ycqlsh 127.0.0.1 9042                                                |
    | YEDIS               : ./bin/redis-cli -h 127.0.0.1 -p 6379                                       |
    | Web UI              : http://127.0.0.1:7000/                                                     |
    | Cluster Data        : /Users/schoudhury/yugabyte-data                                            |
    ----------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------
    | Node 1: yb-tserver (pid 72053), yb-master (pid 72050)                                            |
    ----------------------------------------------------------------------------------------------------
    | JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
    | YSQL                : ./bin/ysqlsh                                                               |
    | YCQL                : ./bin/ycqlsh 127.0.0.1 9042                                                |
    | YEDIS               : ./bin/redis-cli -h 127.0.0.1 -p 6379                                       |
    | data-dir[0]         : /Users/schoudhury/yugabyte-data/node-1/disk-1/yb-data                      |
    | TServer Logs        : /Users/schoudhury/yugabyte-data/node-1/disk-1/yb-data/tserver/logs         |
    | Master Logs         : /Users/schoudhury/yugabyte-data/node-1/disk-1/yb-data/master/logs          |
    ----------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------
    | Node 2: yb-tserver (pid 72128)                                                                   |
    ----------------------------------------------------------------------------------------------------
    | JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
    | YSQL                : ./bin/ysqlsh                                                               |
    | YCQL                : ./bin/ycqlsh 127.0.0.2 9042                                                |
    | YEDIS               : ./bin/redis-cli -h 127.0.0.2 -p 6379                                       |
    | data-dir[0]         : /Users/schoudhury/yugabyte-data/node-2/disk-1/yb-data                      |
    | TServer Logs        : /Users/schoudhury/yugabyte-data/node-2/disk-1/yb-data/tserver/logs         |
    ----------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------
    | Node 3: yb-tserver (pid 72166)                                                                   |
    ----------------------------------------------------------------------------------------------------
    | JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
    | YSQL                : ./bin/ysqlsh -U postgres -h 127.0.0.3 -p 5433                              |
    | YCQL                : ./bin/ycqlsh 127.0.0.3 9042                                                |
    | YEDIS               : ./bin/redis-cli -h 127.0.0.3 -p 6379                                       |
    | data-dir[0]         : /Users/schoudhury/yugabyte-data/node-3/disk-1/yb-data                      |
    | TServer Logs        : /Users/schoudhury/yugabyte-data/node-3/disk-1/yb-data/tserver/logs         |
    ----------------------------------------------------------------------------------------------------
    

Create a table

Once your nodes are set up, you can create a YCQL table. Since you'll be using the YugabyteDB workload generator to write data into this table, create the keyspace and table exactly as shown below.

$ ./bin/ycqlsh
ycqlsh> CREATE KEYSPACE ybdemo_keyspace;
ycqlsh> CREATE TABLE ybdemo_keyspace.cassandrakeyvalue (k text PRIMARY KEY, v blob);

For each table, YugabyteDB creates four tablets on each YB-TServer in this universe, per the --num_shards_per_tserver setting. Because you have three nodes, this should create 12 tablets for the ybdemo_keyspace.cassandrakeyvalue table.

Explore tablets

The tablets are evenly balanced across the various nodes. You can see the number of tablets per node on the Tablet Servers page of the YB-Master Admin UI (the Web UI address, http://127.0.0.1:7000/, is shown in the cluster status output). The page should look something like the following image.

Number of tablets in the table

Notice that each node has 4 tablets, and the total number of tablets is 12 as expected.

The table has 12 tablets, each owning a range of the keyspace. Navigate to the table details page to examine the various tablets. This page should look as follows.

Tablet details of the table

There are 12 tablets as expected, and the key ranges owned by each tablet are shown. This page also shows which node is currently hosting (and is the leader for) each tablet. Note here that tablet balancing across nodes happens on a per-table basis, with each table scaled out to an appropriate number of nodes.

Each tablet has a separate directory dedicated to it for data. List out all the tablet directories and check their sizes, as follows:

$ du -hs /yugabyte-data/node*/disk*/yb-data/tserver/data/rocksdb/table*/* | grep -v '0B'
 20K    /yugabyte-data/node-1/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-439ae3bde90049d6812e198e76ad29a4
 20K    /yugabyte-data/node-1/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-eecd01f0a7cd4537ba571bdb85d0c094
 20K    /yugabyte-data/node-1/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-4ea334056a3845518cc6614baef96966
 20K    /yugabyte-data/node-1/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-52642a3a9d7b4d38a103dff5dd77a3c6
 20K    /yugabyte-data/node-2/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-4e31e26b3b204e34a1e0cfd6f7500525
 20K    /yugabyte-data/node-2/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-b7ac08a983aa45a3843ab92b1719799a
 20K    /yugabyte-data/node-2/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-22c349a07afb48e3844b570c24455431
 20K    /yugabyte-data/node-2/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-8955db9e1ec841f3a30535b77d707586
 20K    /yugabyte-data/node-3/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-adac9f92466b4d288a4ae346aaad3880
 20K    /yugabyte-data/node-3/disk-1/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-f04a6d5113a74ba79a04f01c80423ef5
 20K    /yugabyte-data/node-3/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-1c472c1204fe40afbc7948dadce22be8
 20K    /yugabyte-data/node-3/disk-2/yb-data/tserver/data/rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-5aaeb96381044aa2b09ed9973830bb27

Insert and query a table

In this section, you'll use a sample app to insert a key-value entry with a value size of around 10MB. Since the memstores are configured to be 1MB, this causes the data to be flushed to disk immediately.

The key flags you pass to the sample app are:

  • --num_unique_keys 1 to write exactly one key. Keys are numbers converted to text, and typically start from 0.
  • --num_threads_read 0 to not perform any reads (hence 0 read threads).
  • --num_threads_write 1 to use a single writer thread. Since you're not writing a lot of data, one writer thread is sufficient.
  • --value_size 10000000 to generate the value being written as a random byte string of around 10MB size.
  • --nouuid to not prefix a UUID to the key. A UUID allows multiple instances of the load tester to run without interfering with each other.

Let's get started:

  1. Download the YugabyteDB workload generator JAR file (yb-sample-apps.jar):

    $ wget https://github.com/yugabyte/yb-sample-apps/releases/download/1.3.6/yb-sample-apps.jar
    
  2. Run the CassandraKeyValue workload application.

    $ java -jar ./yb-sample-apps.jar --workload CassandraKeyValue \
                                        --nodes 127.0.0.1:9042 \
                                        --nouuid \
                                        --num_unique_keys 1 \
                                        --num_writes 2 \
                                        --num_threads_read 0 \
                                        --num_threads_write 1 \
                                        --value_size 10000000
    
    2018-02-05 07:33:33,525 [INFO|...] Num unique keys to insert: 1
    ...
    2018-02-05 07:33:36,899 [INFO|...] The sample app has finished
    
  3. Check what you've inserted using ycqlsh.

    $ ./bin/ycqlsh
    
    ycqlsh> SELECT k FROM ybdemo_keyspace.cassandrakeyvalue;
    
    k
    -------
    key:0
    
    (1 rows)
    
  4. Next, check the sizes of the various tablets:

    $ du -hs /yugabyte-data/node*/disk*/yb-data/tserver/data/rocksdb/table*/* | grep -v '0B'
    
    20K    .../rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-439ae3bde90049d6812e198e76ad29a4
    9.6M    .../rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-eecd01f0a7cd4537ba571bdb85d0c094
    20K    .../rocksdb/table-9987797012ce4c1c91782c25e7608c34/tablet-4ea334056a3845518cc6614baef96966
    ...
    
  5. Here, the key has been written to one of the tablets. In this example, the tablet's UUID is eecd01f0a7cd4537ba571bdb85d0c094. Check the table details page to determine which node this tablet belongs to, and in this case it's node-1.

    Tablet ownership with auto-sharding

  6. You can also easily confirm that node-1 does have about 10MB of storage being used.

    Inserting values with auto-sharding

Automatic sharding when adding nodes

  1. Add one more node to the universe for a total of 4 nodes:

    $ ./bin/yb-ctl add_node --tserver_flags "memstore_size_mb=1"
    
  2. Check the Tablet Servers page to see that the tablets are redistributed evenly among the 4 nodes, with each node now hosting 3 tablets:

    Auto-sharding when adding one node

  3. Add 2 more nodes to the universe, making it a total of 6 nodes:

    $ ./bin/yb-ctl add_node --tserver_flags "memstore_size_mb=1"
    
    $ ./bin/yb-ctl add_node --tserver_flags "memstore_size_mb=1"
    
  4. Verify that the tablets are evenly distributed across the 6 nodes. Each node now has 2 tablets.

    Auto-sharding when adding three nodes

[Optional] Clean up

If you're done experimenting, run the following command to shut down the local cluster:

$ ./bin/yb-ctl destroy

Further reading

To learn more about sharding in YugabyteDB, you may wish to read the architecture documentation, or the following blog posts:

How Data Sharding Works in a Distributed SQL Database

Four Data Sharding Strategies We Analyzed in Building a Distributed SQL Database

Overcoming MongoDB Sharding and Replication Limitations with YugabyteDB
