MyBatis-Plus Snowflake Algorithm Implementation: A Source Code Walkthrough

Updated: December 14, 2023, 09:06:06 · Author: 护发师兄

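Snowflake is a distributed algorithm for generating unique identifiers (IDs). Its design goal is to produce globally unique IDs in a distributed system while keeping them ordered and trend-increasing. This article walks through the source code of the Snowflake implementation in MyBatis-Plus.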

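1. Snowflake Algorithm

The Snowflake algorithm is a distributed algorithm for generating unique identifiers (IDs). It was originally developed at Twitter to generate unique IDs inside its distributed systems. Its design goal is to produce globally unique IDs in a distributed system while keeping them ordered and trend-increasing.

A Snowflake ID is a 64-bit integer made up of the following parts: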

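Sign bit (1 bit): Some languages have no unsigned integer type, so this bit is left unused and is normally 0.
Timestamp (41 bits): The current time in milliseconds, stored as an offset from a fixed start epoch. This keeps IDs unique over time. With 41 bits, the algorithm can generate unique IDs for about 69.7 years (2^41 milliseconds).
Machine ID (10 bits): A unique identifier assigned to the machine that generates the ID. It ensures that IDs generated on different machines within the same millisecond do not collide. Normally each machine's identifier is configured in advance and read at runtime.
Sequence number (12 bits): A counter for IDs generated on the same machine within the same millisecond. When several IDs are requested in one millisecond, the sequence number is incremented to tell them apart.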

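1 bit	41 bits	5 bits	5 bits	12 bits
0	0000000000 0000000000 0000000000 0000000000 0	00000	00000	0000000000 00
Sign bit (normally 0)	Timestamp in ms, covers about 69.7 years	Derived from the MAC address	Derived from the MAC address and the JVM PID	Sequence number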
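
To make the layout concrete, here is a minimal sketch that packs the four fields into a long and unpacks them again. The class name SnowflakeLayout and its method names are hypothetical and used only for illustration; the bit widths are taken from the table above, and the shift amounts (12, 17, 22) follow from those widths.

```java
/** Hypothetical helper mirroring the 1 + 41 + 5 + 5 + 12 bit layout (illustration only). */
final class SnowflakeLayout {
    static final long SEQUENCE_BITS = 12L;
    static final long WORKER_ID_BITS = 5L;
    static final long DATACENTER_ID_BITS = 5L;
    static final long TIMESTAMP_BITS = 41L;

    // Each field is shifted left by the total width of the fields below it.
    static final long WORKER_ID_SHIFT = SEQUENCE_BITS;                            // 12
    static final long DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;       // 17
    static final long TIMESTAMP_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS; // 22

    /** Returns a mask with the lowest {@code bits} bits set, e.g. 0xFFF for 12. */
    static long mask(long bits) {
        return ~(-1L << bits);
    }

    /** Packs the fields; elapsedMillis is the time since the chosen start epoch. */
    static long compose(long elapsedMillis, long datacenterId, long workerId, long sequence) {
        return (elapsedMillis << TIMESTAMP_SHIFT)
            | (datacenterId << DATACENTER_ID_SHIFT)
            | (workerId << WORKER_ID_SHIFT)
            | sequence;
    }

    static long elapsedMillis(long id) {
        return (id >>> TIMESTAMP_SHIFT) & mask(TIMESTAMP_BITS);
    }

    static long datacenterId(long id) {
        return (id >>> DATACENTER_ID_SHIFT) & mask(DATACENTER_ID_BITS);
    }

    static long workerId(long id) {
        return (id >>> WORKER_ID_SHIFT) & mask(WORKER_ID_BITS);
    }

    static long sequence(long id) {
        return id & mask(SEQUENCE_BITS);
    }

    /** Years covered by a 41-bit millisecond counter, counting 365-day years. */
    static double timestampRangeYears() {
        return (double) (1L << TIMESTAMP_BITS) / (1000.0 * 60 * 60 * 24 * 365);
    }
}
```

Because the timestamp sits in the highest bits below the sign bit, comparing two IDs as plain numbers compares their timestamps first, which is where the trend-increasing property comes from. timestampRangeYears() evaluates to roughly 69.7, matching the figure in the table.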

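IDs generated by the Snowflake algorithm have the following properties:

Global uniqueness: Every generated ID is unique across the whole distributed system.
Ordering: Because the timestamp occupies the high-order bits, IDs are trend-increasing, which gives them good performance in database indexes.
Distributed: IDs generated on different machines do not collide, so the algorithm can be used in a distributed system.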

2. Workflow

2.1 Initializing the MyBatis-Plus globally unique ID generator

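After MyBatis-Plus starts, the project is auto-configured through the IdentifierGeneratorAutoConfiguration class.

Note: IdentifierGeneratorAutoConfiguration is annotated with @Lazy, so it is loaded lazily. For this reason, some projects insert a record into a log table right after startup to warm up MyBatis-Plus.

The auto-configuration registers a Bean in the project whose main job is to generate globally unique IDs. The argument passed in is the InetAddress of the first non-loopback address.

Note: IdentifierGenerator is an interface, and DefaultIdentifierGenerator is one of its implementations.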

@Bean
@ConditionalOnMissingBean
public IdentifierGenerator identifierGenerator(InetUtils inetUtils) {
    return new DefaultIdentifierGenerator(inetUtils.findFirstNonLoopbackAddress());
}

The constructor simply creates a Sequence:

public DefaultIdentifierGenerator(InetAddress inetAddress) {
    this.sequence = new Sequence(inetAddress);
}

This is the Sequence constructor. It sets datacenterId and workerId.

public Sequence(InetAddress inetAddress) {
    this.inetAddress = inetAddress;
    this.datacenterId = getDatacenterId(maxDatacenterId);
    this.workerId = getMaxWorkerId(datacenterId, maxWorkerId);
    // Log the initialization message
    initLog();
}

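This is how datacenterId is obtained. As the code shows, it is derived mainly by mixing bits of the MAC address.

Note: the value is held in a 64-bit long, but the modulo at the end already limits it to the range 0 to maxDatacenterId, so it fits the 5-bit field in the layout above. It is only shifted into position later, in nextId.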

/**
 * Data center ID part
 */
protected long getDatacenterId(long maxDatacenterId) {
    long id = 0L;
    try {
        if (null == this.inetAddress) {
            this.inetAddress = InetAddress.getLocalHost();
        }
        NetworkInterface network = NetworkInterface.getByInetAddress(this.inetAddress);
        if (null == network) {
            id = 1L;
        } else {
            // Get the MAC address
            byte[] mac = network.getHardwareAddress();
            // Mix the last two bytes of the MAC address
            if (null != mac) {
                id = ((0x000000FF & (long) mac[mac.length - 2]) | (0x0000FF00 & (((long) mac[mac.length - 1]) << 8))) >> 6;
                id = id % (maxDatacenterId + 1);
            }
        }
    } catch (Exception e) {
        logger.warn(" getDatacenterId: " + e.getMessage());
    }
    return id;
}
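
The bit mixing is easier to follow on concrete bytes. The sketch below isolates the same expression in a hypothetical helper, DatacenterIdDemo.datacenterIdFromMac, and applies it to made-up MAC addresses; the value 31 for maxDatacenterId is an assumption based on the 5-bit field in the layout table.

```java
/** Hypothetical helper that reproduces the MAC mixing step on a given byte array. */
final class DatacenterIdDemo {
    static long datacenterIdFromMac(byte[] mac, long maxDatacenterId) {
        // Second-to-last byte becomes bits 0..7 of a 16-bit value.
        long low = 0x000000FF & (long) mac[mac.length - 2];
        // Last byte becomes bits 8..15; the mask also removes sign extension.
        long high = 0x0000FF00 & (((long) mac[mac.length - 1]) << 8);
        // Drop the lowest 6 bits, then reduce to the allowed range.
        long id = (low | high) >> 6;
        return id % (maxDatacenterId + 1);
    }
}
```

For a made-up MAC ending in 3A:5C, the 16-bit value is 0x5C3A (23610), shifting right by 6 gives 368, and 368 % 32 is 16. For one ending in E4:9F the result is 31. With maxDatacenterId = 31, only the top two bits of the second-to-last byte and the low three bits of the last byte influence the outcome.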

This is the method that obtains workerId. As the code shows, workerId is derived by mixing the MAC-based datacenterId with the JVM PID.

/**
 * Get maxWorkerId
 */
protected long getMaxWorkerId(long datacenterId, long maxWorkerId) {
    StringBuilder mpid = new StringBuilder();
    mpid.append(datacenterId);
    String name = ManagementFactory.getRuntimeMXBean().getName();
    if (StringUtils.isNotBlank(name)) {
        /*
         * Get the JVM PID
         */
        mpid.append(name.split(StringPool.AT)[0]);
    }
    /*
     * Take the low 16 bits of the hashCode of MAC + PID
     */
    return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
}
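
As a rough illustration of this step, the sketch below reproduces the hash-and-reduce logic with the PID passed in as a parameter, so the result does not depend on the running process. WorkerIdDemo and its method names are hypothetical; "@" stands in for StringPool.AT, on the assumption that the runtime name has the usual pid@hostname form.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper that reproduces the workerId derivation for a given PID string. */
final class WorkerIdDemo {
    static long workerId(long datacenterId, String pid, long maxWorkerId) {
        // Concatenate the datacenter ID and the PID, as the StringBuilder does above.
        String mpid = datacenterId + pid;
        // Masking with 0xffff keeps the value non-negative before the modulo.
        return (mpid.hashCode() & 0xffff) % (maxWorkerId + 1);
    }

    /** Reads the PID part of the runtime name (typically "pid@hostname"). */
    static String currentPid() {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        return name.split("@")[0];
    }
}
```

With a 5-bit field there are only 32 possible worker IDs, so two JVMs on the same host can end up with the same value. When that matters, it is worth checking the IDs logged at startup rather than relying on the hash alone.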

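2.2 Obtaining a globally unique ID

Note: if no globally unique ID has been requested before, the first request runs through the entire initialization flow described in 2.1.

With MyBatis-Plus's IdType.ASSIGN_ID, the globally unique ID is obtained from the IdWorker class.

There, the following method is called to obtain the globally unique ID (a long):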

/**
 * Get a unique ID
 *
 * @return id
 */
public static long getId(Object entity) {
    return IDENTIFIER_GENERATOR.nextId(entity).longValue();
}

Stepping into the implementation of nextId, we find that it delegates to the nextId method of sequence:

@Override
public Long nextId(Object entity) {
    return sequence.nextId();
}

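The code below includes some comments of my own.

Note: nextId is declared synchronized, so only one thread at a time can execute it on a given Sequence instance.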

/**
 * Get the next ID
 *
 * @return the next ID
 */
public synchronized long nextId() {
    long timestamp = timeGen();
    // Leap second
    // Check whether the clock has moved backwards. If the offset is within 5 ms, wait for twice
    // the offset and read the time again to see whether the clock has caught up,
    // because leap seconds occasionally occur.
    if (timestamp < lastTimestamp) {
        long offset = lastTimestamp - timestamp;
        if (offset <= 5) {
            try {
                wait(offset << 1);
                timestamp = timeGen();
                if (timestamp < lastTimestamp) {
                    throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", offset));
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        } else {
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", offset));
        }
    }
    if (lastTimestamp == timestamp) {
        // Same millisecond: increment the sequence number
        sequence = (sequence + 1) & sequenceMask;
        if (sequence == 0) {
            // The sequence for this millisecond is exhausted.
            // The sequence (per-millisecond counter) is 12 bits, so at most 4096 IDs per millisecond.
            // Once it wraps around, wait until the next millisecond before assigning a timestamp.
            timestamp = tilNextMillis(lastTimestamp);
        }
    } else {
        // Different millisecond: reset the sequence number to a random value of 1 or 2.
        // Randomizing the start keeps hashing more even if the data is later sharded across databases and tables.
        sequence = ThreadLocalRandom.current().nextLong(1, 3);
    }
    lastTimestamp = timestamp;
    // twepoch is the start epoch used as the baseline; it is usually set to a recent point in time
    // and must never change once chosen.
    // As noted earlier, a 41-bit timestamp covers about 69.7 years; counting from 1970-01-01,
    // the timestamp would eventually no longer fit in 41 bits.
    // timestamp part | data center part | worker ID part | sequence part
    return ((timestamp - twepoch) << timestampLeftShift)
        | (datacenterId << datacenterIdShift)
        | (workerId << workerIdShift)
        | sequence;
}
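
The three branches of nextId (same millisecond, new millisecond, clock moved backwards) can be exercised deterministically if the clock is injectable. The following is a simplified sketch under stated assumptions, not the library's class: MiniSequence and its constructor are hypothetical, the shift constants are derived from the layout table, the TWEPOCH value is assumed to be Twitter's original epoch, and the busy-wait loop stands in for tilNextMillis, whose body is not shown above.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.LongSupplier;

/** Simplified, hypothetical re-creation of the nextId flow with an injectable clock. */
final class MiniSequence {
    private static final long TWEPOCH = 1288834974657L; // assumed start epoch
    private static final long SEQUENCE_MASK = ~(-1L << 12);
    private static final long WORKER_ID_SHIFT = 12L;
    private static final long DATACENTER_ID_SHIFT = 17L;
    private static final long TIMESTAMP_SHIFT = 22L;

    private final LongSupplier clock;
    private final long datacenterId;
    private final long workerId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    MiniSequence(LongSupplier clock, long datacenterId, long workerId) {
        this.clock = clock;
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long timestamp = clock.getAsLong();
        if (timestamp < lastTimestamp) {
            long offset = lastTimestamp - timestamp;
            if (offset > 5) {
                throw new IllegalStateException("Clock moved backwards by " + offset + " ms");
            }
            try {
                // Small rollback: wait twice the offset, then read the clock again.
                wait(offset << 1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            timestamp = clock.getAsLong();
            if (timestamp < lastTimestamp) {
                throw new IllegalStateException("Clock is still behind after waiting");
            }
        }
        if (timestamp == lastTimestamp) {
            // Same millisecond: bump the counter; 0 means all 4096 values were used.
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Busy-wait until the clock reaches the next millisecond.
                while ((timestamp = clock.getAsLong()) <= lastTimestamp) {
                    // spin
                }
            }
        } else {
            // New millisecond: start from 1 or 2, chosen at random.
            sequence = ThreadLocalRandom.current().nextLong(1, 3);
        }
        lastTimestamp = timestamp;
        return ((timestamp - TWEPOCH) << TIMESTAMP_SHIFT)
            | (datacenterId << DATACENTER_ID_SHIFT)
            | (workerId << WORKER_ID_SHIFT)
            | sequence;
    }
}
```

With a fake clock such as `long[] now = {1700000000000L}` passed as `() -> now[0]`, two calls in the same millisecond return consecutive values, advancing the clock by 1 ms yields a larger ID, and moving it back by 10 ms makes the next call throw.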

This is the method that produces the time. It uses SystemClock, which is an interesting implementation in its own right.

protected long timeGen() {
    return SystemClock.now();
}

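The main idea of the SystemClock class is to have a scheduled thread pool read the system time at a fixed rate and cache it. Calls that arrive within the same interval return the cached value directly instead of querying the system time again. The motivation given is that System.currentTimeMillis() is a native method, and native calls are considered relatively expensive because of the memory copying and data conversion involved. The trade-off is that the returned time can lag the real clock by up to roughly one update period.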

import java.sql.Timestamp;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Optimization for the performance problem of System.currentTimeMillis() under high concurrency
 *
 * <p>A call to System.currentTimeMillis() takes much longer than creating an ordinary object (I have not measured exactly how much; some say about 100 times)</p>
 * <p>System.currentTimeMillis() is slow because it has to talk to the operating system</p>
 * <p>The clock is updated periodically in the background; when the JVM exits, the thread is reclaimed automatically</p>
 * <p>1 billion: 43410,206,210.72815533980582%</p>
 * <p>100 million: 4699,29,162.0344827586207%</p>
 * <p>10 million: 480,12,40.0%</p>
 * <p>1 million: 50,10,5.0%</p>
 *
 * @author hubin
 * @since 2016-08-01
 */
public class SystemClock {
    // Interval, in milliseconds, at which the cached timestamp is refreshed
    private final long period;
    // Atomic holder for the current timestamp, since concurrent threads may read it
    private final AtomicLong now;
    private SystemClock(long period) {
        this.period = period;
        this.now = new AtomicLong(System.currentTimeMillis());
        scheduleClockUpdating();
    }
    private static SystemClock instance() {
        return InstanceHolder.INSTANCE;
    }
    public static long now() {
        return instance().currentTimeMillis();
    }
    public static String nowDate() {
        return new Timestamp(instance().currentTimeMillis()).toString();
    }
    // Periodic update method.
    // A single-thread scheduled executor refreshes the cached timestamp at a fixed interval (period).
    private void scheduleClockUpdating() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "System Clock");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleAtFixedRate(() -> now.set(System.currentTimeMillis()), period, period, TimeUnit.MILLISECONDS);
    }
    // Get the cached time
    private long currentTimeMillis() {
        return now.get();
    }
    // The default update interval is 1 ms
    private static class InstanceHolder {
        public static final SystemClock INSTANCE = new SystemClock(1);
    }
}
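
To experiment with the accuracy trade-off of this pattern, the sketch below is a hypothetical variant, CachedClock, with a configurable period and an explicit shutdown. It is not part of MyBatis-Plus; it only uses standard java.util.concurrent calls and follows the same idea of one background writer and many cheap readers.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical cached clock: one daemon thread refreshes the value, readers only do a volatile read. */
final class CachedClock implements AutoCloseable {
    private final AtomicLong now = new AtomicLong(System.currentTimeMillis());
    private final ScheduledExecutorService scheduler;

    CachedClock(long periodMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "cached-clock");
            thread.setDaemon(true); // do not keep the JVM alive
            return thread;
        });
        // Refresh the cached value every periodMillis milliseconds.
        scheduler.scheduleAtFixedRate(
            () -> now.set(System.currentTimeMillis()),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns the cached time; it may lag the real clock by about one period plus scheduling jitter. */
    long now() {
        return now.get();
    }

    /** Stops the background refresh thread. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A longer period means fewer background wake-ups but a staler timestamp. For an ID generator that distinguishes IDs by millisecond, a period much larger than 1 ms would make many calls observe the same timestamp and use up the 12-bit sequence sooner.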

That concludes the walkthrough of how MyBatis-Plus generates globally unique IDs. If you spot any mistakes, please point them out.
