吹拉弹唱


  • Home
  • Archive
  • Categories
  • Tags
  • Books
  •  

© 2022 Kleon

Theme Typography by Makito

Proudly published with Hexo

System - Unique Id Generator

Posted at 2022-05-30Updated at 2022-05-30 interview  interview system 

常见的分布式ID生成方案浅析及大厂方案调研

  • Requirements
    • Functional Features
    • Non-Functional Features
    • Functional Features
    • Non-Functional Features
  • Solutions
    • UUID
    • Primary Key Auto Increment
    • Multi Database Primary Key Auto Increment
    • 数据库分段发号生成唯一id。(例如美团的Leaf框架中的segement模式)
    • Snowflake
  • Domain Model
    • Single Worker
    • Cluster
  • Practical
    • Snowflake

# Requirements

# Functional Features

🔹 Globally unique - If IDs are not globally unique, there could be collisions.
🔹 Roughly sorted by time - So user IDs, post IDs can be sorted by time without fetching additional info.
🔹 Numerical values only - Naturally sortable by time.
🔹 64 bits - 2^32 = ~4 billion -> not enough IDs | 2^64 is big enough | 2^128 waste space
🔹 Highly scalable, low latency | Ability to generate a lot of IDs per second in low latency fashion is critical.

  1. Capable of generating ten million sequential UIDs per second
  2. UIDs must preserve some level of ordering information
  3. capable of adjusting generation rate based on consumption

# Non-Functional Features

  1. Availability
  2. Scalability
  3. Low Latency

# Functional Features

  1. Generate a globally unique id
  2. Chronological orderable

# Non-Functional Features

# Solutions

# UUID

pros: Independent
cons: Too Long, Not Int
字符串作为主键id,插入数据时会是在聚集索引中是随机插入,容易造成页分离。而且字符串的比较比数字类型的开销更大,字符串作为主键id查询效率会低于数字类型的主键。

The string is used as the primary key id. When inserting data, it will be inserted randomly in the clustered index, which is easy to cause page separation. Moreover, the comparison of strings is more expensive than that of numeric types, and the query efficiency of strings as the primary key id will be lower than that of the primary key of numeric types.

通常可以作为一些临时性唯一标识,例如用户登陆后,生成一个UUID作为登录的会话ID,作为key存储在Redis中,Value是用户相关的信息。

# Primary Key Auto Increment

pros: Easy, Monotonically Increasing
cons:

  1. Rely on autoincrement mechanism, cannot support large amount of id concurrency.
  2. Rely on database. Single Point Failover. Master-Slave Switching may introduce duplicated key, if master database haven’t received the latest update of id from master.
  3. Id is continuously, which is not recommended for business usage.

适用于并发量不高的业务。

# Multi Database Primary Key Auto Increment

Set the step size for different databases.

1
CREATE TABLE table (...) AUTO_INCREMENT = n; alter table <table name> auto_increment=2;

pros: more efficient than single database
cons: hard to scale

# 数据库分段发号生成唯一id。(例如美团的Leaf框架中的segement模式)

使用数据库当发号器
biz_tag字段用于区分每种id应用的业务, max_id字段记录了当前已生成的最大的id, step字段代表每次可以获取id的数量

id生成项目每次使用下面这条语句从数据库获取step数量的id,并且更新max_id的值,将step数量的id存储在内存中,供业务方通过HTTP,RPC,Client等方式来获取。

1
UPDATE leaf_alloc SET max_id = max_id + step WHERE biz_tag = #{tag}

pros: depends on step, rather than the number of databases.
cons:

  1. Rely on database
  2. id is still continuous.

# Snowflake

snowflake是推特开源分布式ID生成算法,一共有64位,

第一位是0,是标志位,因为在二进制数中,第一位是0,代表是正数。

接下来41位是13位的毫秒时间戳,最大可以到2039年9月。

再接下来10个二进制位是workID,也就是服务器的id。

最后面12位是业务序列号

意味着每毫秒最大可以生成2的12次方个id,4096个,支持每个机器每毫秒生成4096个id,每秒可以生成409.6万的id

pros:

  1. efficient, 2^12=4096 ids in 1ms.
  2. relatively independent, only rely on local system time
  3. flexible, allocate bits according to business

cons:

  1. data skew.
  2. relay on clock. May generate duplicated ids if clock rewind clock back.

mongodb objectID snowflake-like id
通过“时间+机器码+pid+inc”共12个字节,通过4+3+2+3的方式最终标识成一个24长度的十六进制字符。

# Domain Model

# Single Worker

# Cluster

Generate Unique Ids per thread

Registration Service for per snowflake worker

# Practical

https://github.com/bwmarrin/snowflake

# Snowflake

Snowflake
Ace the System Design Interview — Distributed ID Generator

Share 

 Previous post: System - Rate Limiter Next post: System - URL Shortener 

© 2022 Kleon

Theme Typography by Makito

Proudly published with Hexo