TypeScript 基礎(chǔ)數(shù)據(jù)結(jié)構(gòu)哈希表 HashTable教程

更新時(shí)間：2023年02月05日 09:14:15 作者：前端要努力QAQ

這篇文章主要為大家介紹了TypeScript 基礎(chǔ)數(shù)據(jù)結(jié)構(gòu)哈希表 HashTable教程詳解，有需要的朋友可以借鑒參考下，希望能夠有所幫助，祝大家多多進(jìn)步，早日升職加薪

前言

哈希表是一種非常重要的數(shù)據(jù)結(jié)構(gòu)，幾乎所有的編程語言都有直接或者間接的應(yīng)用這種數(shù)據(jù)結(jié)構(gòu)。

很多學(xué)習(xí)編程的人一直搞不懂哈希表到底是如何實(shí)現(xiàn)的，在這一章中，我們就一點(diǎn)點(diǎn)來實(shí)現(xiàn)一個(gè)自己的哈希表。通過實(shí)現(xiàn)來理解哈希表背后的原理和它的優(yōu)勢。

1. 哈希表介紹和特性

哈希表通常是基于數(shù)組進(jìn)行實(shí)現(xiàn)的，相對于數(shù)組，他有很多優(yōu)勢：

他可以提供非?？焖俚?插入-刪除-查找操作
無論多少數(shù)據(jù)，插入和刪除都接近常量的時(shí)間：即 O(1) 的時(shí)間復(fù)雜度
哈希表的速度比樹還要快，基本可以瞬間查找到想要的元素
哈希表相對于樹來說編碼要容易很多

PS: 樹也是一種基礎(chǔ)的數(shù)據(jù)結(jié)構(gòu)，js 中沒有樹，我們下一章就會(huì)講樹

哈希表相對于數(shù)組的一些不足：

哈希表中的數(shù)據(jù)沒有順序，所以不能以一種固定的方式來遍歷其中的元素
通常情況下，哈希表中的 key 是不允許重復(fù)的。

那么哈希表到底是什么？

我們上面說了哈希表的優(yōu)勢和不足，似乎還是沒說它到底長什么樣子？

這也是哈希表不好理解的地方，它不像數(shù)組、鏈表和樹等可通過圖形的形式表示其結(jié)構(gòu)和原理。

哈希表結(jié)構(gòu)就是數(shù)組，但它神奇之處在于對下標(biāo)值的一種變換，這種變換我們可以稱之為哈希函數(shù)，通過哈希函數(shù)可以獲取HashCode。

2. 哈希表的一些概念

哈?；簩⒋髷?shù)字轉(zhuǎn)化成數(shù)組范圍內(nèi)下標(biāo)的過程，稱之為哈?；?/li>
哈希函數(shù)：我們通常會(huì)將單詞轉(zhuǎn)化成大數(shù)字，把大數(shù)字進(jìn)行哈?；拇a實(shí)現(xiàn)放在一個(gè)函數(shù)中，該函數(shù)就稱為哈希函數(shù)；
哈希表：對最終數(shù)據(jù)插入的數(shù)組進(jìn)行整個(gè)結(jié)構(gòu)的封裝，得到的就是哈希表。

仍然需要解決的問題：

哈希化過后的下標(biāo)依然可能重復(fù)，如何解決這個(gè)問題呢？這種情況稱為沖突，沖突是不可避免的，我們只能解決沖突。

3. 地址沖突解決方案

解決沖突常見的兩種方案：

3.1 方案一：鏈地址法

如下圖所示：

鏈地址法解決沖突的辦法是每個(gè)數(shù)組單元中存儲(chǔ)的不再是單個(gè)數(shù)據(jù)，而是一個(gè)鏈條
這個(gè)鏈條使用什么數(shù)據(jù)結(jié)構(gòu)呢？常見的是數(shù)組或者鏈條
比如鏈表，也就是每個(gè)數(shù)組單元中存儲(chǔ)著一個(gè)鏈表。一旦發(fā)現(xiàn)重復(fù)，將重復(fù)的元素插入到鏈表的首端或者末端即可。
當(dāng)查詢時(shí)，先根據(jù)哈?；蟮南聵?biāo)值找到對應(yīng)的位置，再取出鏈表，依次查詢要尋找的數(shù)據(jù)

數(shù)組還是鏈表？

數(shù)組或者鏈表在這其實(shí)都可以，效率上也差不多
因?yàn)楦鶕?jù)哈?；?index 找出這個(gè)數(shù)組或者鏈表時(shí)，通常就會(huì)使用線性查找，這個(gè)時(shí)候數(shù)組和鏈表的效率是差不多的。
當(dāng)然在某些實(shí)現(xiàn)中，會(huì)將新插入的數(shù)據(jù)放在數(shù)組或者鏈表的最前面，因?yàn)橛X得新插入的數(shù)據(jù)用于取出的可能性更大
這種情況最好采用鏈表，因?yàn)閿?shù)組在首位插入數(shù)據(jù)是需要所有其他項(xiàng)后移的，鏈表就沒有這樣的問題

3.2 方案二：開放地址法

開放地址法的主要工作方式是尋找空白的單元格來放置沖突的數(shù)據(jù)項(xiàng)。（了解即可，現(xiàn)在很少用到了）

據(jù)探測空白單元格位置方式的不同，可分為三種方法：

線性探測：如果發(fā)現(xiàn)要插入的位置已經(jīng)被占據(jù)，則 +1，直到探測到空白位置
二次探測：二次探測就是在線性探測的基礎(chǔ)上把步長修改了（+1 +4 +9）
再哈希法：如果發(fā)現(xiàn)要插入的位置已經(jīng)被占據(jù)，通過算法將這個(gè)位置再次哈?；?，直到得到新的位置沒被占據(jù)

4. 哈希函數(shù)代碼實(shí)現(xiàn)

好的哈希函數(shù)應(yīng)該盡可能的讓計(jì)算的過程變得簡單，提高計(jì)算的效率。

哈希表的主要優(yōu)點(diǎn)是它的速度，所以在速度上不能滿足，那么就打不到設(shè)計(jì)的目的了。
提高速度的一個(gè)辦法就是讓哈希函數(shù)中盡量少的有乘法和除法。因?yàn)樗鼈兊男阅苁潜容^低的。

設(shè)計(jì)好的哈希函數(shù)應(yīng)該具有的優(yōu)點(diǎn)：

快速的計(jì)算（霍納法則）
均勻的分布

下面我們就來實(shí)現(xiàn)一個(gè)哈希函數(shù)：

/**
 * 哈希函數(shù)，將 key 映射成 index
 * @param key 要轉(zhuǎn)換的 key
 * @param max 數(shù)組的長度（最大的數(shù)值）
 * @returns
 */
function hashFunc(key: string, max: number): number {
  // 1. 計(jì)算 hashCode cats => 60337 （27為底的時(shí)候）
  let hashCode = 0;
  const length = key.length;
  for (let i = 0; i < length; i++) {
    // 霍納法則計(jì)算 hashCode
    hashCode = 31 * hashCode + key.charCodeAt(i);
  }
  // 2. 求出索引值
  const index = hashCode % max;
  return index;
}

5. 哈希表封裝

經(jīng)過前面這么多內(nèi)容的鋪墊，我們現(xiàn)在終于可以真正實(shí)現(xiàn)自己的哈希表了

5.1 整體框架 v1 版

class HashTable<T = any> {
  // 創(chuàng)建一個(gè)數(shù)組，用來存放鏈地址法中的鏈
  storage: [string, T][][] = [];
  // 定義數(shù)組的長度
  private length: number = 7;
  // 記錄已經(jīng)存放元素的個(gè)數(shù)
  private count: number = 0;
}

在上面代碼中，我們定義了三個(gè)屬性

storage：創(chuàng)建一個(gè)數(shù)組，用來存放鏈地址法中的鏈
length：定義數(shù)組的長度
count：記錄已經(jīng)存放元素的個(gè)數(shù)

5.2 添加 put 方法 v2 版

put 方法的作用是插入&修改數(shù)據(jù)

在哈希表中，插入和修改操作是同一個(gè)函數(shù)。

因?yàn)椋?dāng)使用者傳入一個(gè)<key,value>時(shí)
如果原來不存在該 key，那么就是插入操作
如果已經(jīng)存在該 key，那么就是修改操作

put(key: string, value: T) {
  // 1.根據(jù)key 獲取數(shù)組中對應(yīng)的索引值
  const index = this.hashFunc(key, this.length);
  // 2. 取出索引值對應(yīng)位置的數(shù)組
  let bucket = this.storage[index];
  // 3. 判斷 bucket 是否有值
  if (!bucket) {
    bucket = [];
    this.storage[index] = bucket;
  }
  // 4. 確定已經(jīng)有一個(gè)數(shù)組了，但是數(shù)組中是否已經(jīng)存在 key 時(shí)不確定的
  let isUpdate = false;
  for (let i = 0; i < bucket.length; i++) {
    const tuple = bucket[i];
    const tupleKey = tuple[0];
    if (tupleKey === key) {
      // 修改/更新的操作
      tuple[1] = value;
      isUpdate = true;
    }
  }
  // 5. 如果上面的代碼沒有進(jìn)行覆蓋，那么在該位置進(jìn)行添加
  if (!isUpdate) {
    bucket.push([key, value]);
    this.count++
  }
}

測試：

const hashTable = new HashTable();
hashTable.put("aaa", 200);
hashTable.put("bbb", 300);
hashTable.put("ccc", 400);

上面 hashTable 的結(jié)構(gòu)可以用下圖來表示：（aaa、bbb、ccc 可能在一個(gè) bucket 中，也可能不在，具體要看哈希函數(shù)得到的 index）

5.3 添加 get 方法 v3 版

get 方法的作用是根據(jù) key 獲取對應(yīng)的值

  get(key: string): T | undefined {
    // 1. 根據(jù) key 獲取索引值 index
    const index = this.hashFunc(key, this.length);
    // 2. 獲取 bucket（桶）
    const bucket = this.storage[index];
    if (!bucket) return undefined;
    // 3. 對 bucket 進(jìn)行遍歷
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      const tupleKey = tuple[0];
      const tupleValue = tuple[1];
      if (tupleKey === key) {
        return tupleValue;
      }
    }
    return undefined;
  }

測試：

  get(key: string): T | undefined {
    // 1. 根據(jù) key 獲取索引值 index
    const index = this.hashFunc(key, this.length);
    // 2. 獲取 bucket（桶）
    const bucket = this.storage[index];
    if (!bucket) return undefined;
    // 3. 對 bucket 進(jìn)行遍歷
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      const tupleKey = tuple[0];
      const tupleValue = tuple[1];
      if (tupleKey === key) {
        return tupleValue;
      }
    }
    return undefined;
  }

5.4 添加 delete 方法 v4 版

delete 方法的作用是根據(jù) key 刪除對應(yīng)的值

  delete(key: string): T | undefined {
    // 1. 獲取索引值的位置
    const index = this.hashFunc(key, this.length);
    // 2. 獲取 bucket（桶）
    const bucket = this.storage[index];
    if (!bucket) return undefined;
    // 3. 對 bucket 進(jìn)行遍歷
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      const tupleKey = tuple[0];
      const tupleValue = tuple[1];
      if (tupleKey === key) {
        bucket.splice(i, 1);
        this.count--;
        return tupleValue;
      }
    }
    return undefined;
  }

測試：

const hashTable = new HashTable();
hashTable.put("aaa", 200);
hashTable.put("bbb", 300);
hashTable.put("ccc", 400);
hashTable.delete('aaa')
console.log(hashTable.get("aaa")); // undefined
console.log(hashTable.get("bbb")); // 300
console.log(hashTable.get("ccc")); // 400

6. 哈希表的自動(dòng)擴(kuò)容

目前，我們是將所有的數(shù)據(jù)項(xiàng)放在長度為 7 的數(shù)組中，因?yàn)槲覀兪褂玫氖擎湹刂贩ǎ?code>loadFactor 可以大于 1，所以這個(gè)哈希表是可以無限制的插入新數(shù)據(jù)的。

但是，隨著數(shù)據(jù)量的增多，每一個(gè) index 對應(yīng)的 bucket 會(huì)越來越長，也就造成效率的降低，所以在合適的情況對數(shù)組進(jìn)行擴(kuò)容是很有必要的。

loadFactor 譯為裝載因子。裝載因子用來衡量 HashMap 滿的程度

如何進(jìn)行擴(kuò)容？

擴(kuò)容可以簡單的將容量增大兩倍，但是這種情況下，所有的數(shù)據(jù)項(xiàng)一定要同時(shí)進(jìn)行修改（重新調(diào)用哈希函數(shù)）

1. 調(diào)整代碼，添加 resize 方法

private resize(newLength) {
  // 設(shè)置新的長度
  this.length = newLength;
  // 獲取原來所有的數(shù)據(jù)，并且重新放入到新的容量數(shù)組中
  // 1. 對數(shù)據(jù)進(jìn)行的初始化操作
  const oldStorage = this.storage;
  this.storage = [];
  this.count = 0;
  // 2. 獲取原來數(shù)據(jù)，放入新的數(shù)組中
  oldStorage.forEach((bucket) => {
    if (!bucket) return;
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      this.put(tuple[0], tuple[1]);
    }
  });
}

上面的 resize 方法的作用就對哈希表進(jìn)行擴(kuò)容，但是我們應(yīng)該如何使用 resize 方法呢?

答案就是，我們可以在 put 和 delete 方法的最后判斷一下當(dāng)前哈希表的 loadFactor，比如：

當(dāng) loadFactor 大于 0.75 時(shí)，對哈希表進(jìn)行兩倍的擴(kuò)容
當(dāng) loadFactor 小于 0.25 時(shí)，對哈希表進(jìn)行兩倍的縮容

2. 修改 put 方法

  put(key: string, value: T) {
    // 1.根據(jù)key 獲取數(shù)組中對應(yīng)的索引值
    const index = this.hashFunc(key, this.length);
    console.log({ index });
    // 2. 取出索引值對應(yīng)位置的數(shù)組
    let bucket = this.storage[index];
    // 3. 判斷 bucket 是否有值
    if (!bucket) {
      bucket = [];
      this.storage[index] = bucket;
    }
    // 4. 確定已經(jīng)有一個(gè)數(shù)組了，但是數(shù)組中是否已經(jīng)存在 key 時(shí)不確定的
    let isUpdate = false;
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      const tupleKey = tuple[0];
      if (tupleKey === key) {
        // 修改/更新的操作
        tuple[1] = value;
        isUpdate = true;
      }
    }
    // 5. 如果上面的代碼沒有進(jìn)行覆蓋，那么在該位置進(jìn)行添加
    if (!isUpdate) {
      bucket.push([key, value]);
      this.count++;
+     // 發(fā)現(xiàn) loadFactor 比例已經(jīng)大于 0.75，那么就直接擴(kuò)容
+     const loadFactor = this.count / this.length;
+     if (loadFactor > 0.75) {
+       this.resize(this.length * 2);
+     }
    }
  }

3. 修改 delete 方法

  delete(key: string): T | undefined {
    // 1. 獲取索引值的位置
    const index = this.hashFunc(key, this.length);
    // 2. 獲取 bucket（桶）
    const bucket = this.storage[index];
    if (!bucket) return undefined;
    // 3. 對 bucket 進(jìn)行遍歷
    for (let i = 0; i < bucket.length; i++) {
      const tuple = bucket[i];
      const tupleKey = tuple[0];
      const tupleValue = tuple[1];
      if (tupleKey === key) {
        bucket.splice(i, 1);
        this.count--;
+       // 發(fā)現(xiàn) loadFactor 比例已經(jīng)小于 0.75，那么就直接縮容
+       const loadFactor = this.count / this.length;
+       if (loadFactor < 0.25 && this.length > 7) {
+         this.resize(Math.floor(this.length / 2));
+       }
        return tupleValue;
      }
    }
    return undefined;
  }

測試：

const hashTable = new HashTable();
hashTable.put("aaa", 200);
hashTable.put("bbb", 300);
hashTable.put("ccc", 400);
hashTable.put("ddd", 500);
hashTable.put("eee", 600);
console.log(hashTable.storage.length); // 7
hashTable.put("fff", 600);
console.log(hashTable.storage.length); // 14
hashTable.delete("fff");
hashTable.delete("eee");
console.log(hashTable.storage.length); // 14
hashTable.delete("ddd");
console.log(hashTable.storage.length); // 7

以上就是TypeScript 基礎(chǔ)數(shù)據(jù)結(jié)構(gòu)哈希表 HashTable教程的詳細(xì)內(nèi)容，更多關(guān)于TypeScript 數(shù)據(jù)結(jié)構(gòu)HashTable的資料請關(guān)注腳本之家其它相關(guān)文章！

您可能感興趣的文章: