A comparison of the HashMap source code in Java 8 and Java 7

Updated: 2017-01-21 17:05:57  Author: alvading

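This article compares the HashMap source code in Java 8 and Java 7. It briefly introduces how HashMap works, walks through the Java 7 source to show how hash collisions are resolved and what the strengths and weaknesses of that approach are, and then walks through the Java 8 source to show how collisions are resolved there, including the balanced-tree code.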

1. How HashMap works

This is well-trodden ground, so I will not explain it in detail.

In one sentence: HashMap is a hash table that stores key-value mappings.

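2. HashMap source analysis in Java 7

We start with the HashMap constructor in code block 1, which initializes the map from the initial capacity and the load factor (loadFactor). Note that the constructor only records the requested capacity in threshold; the table itself is allocated later, on the first put, by inflateTable(threshold).

// Code block 1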
 public HashMap(int initialCapacity, float loadFactor) {
 if (initialCapacity < 0)
  throw new IllegalArgumentException("Illegal initial capacity: " +
      initialCapacity);
 if (initialCapacity > MAXIMUM_CAPACITY)
  initialCapacity = MAXIMUM_CAPACITY;
 if (loadFactor <= 0 || Float.isNaN(loadFactor))
  throw new IllegalArgumentException("Illegal load factor: " +loadFactor);

 this.loadFactor = loadFactor;
 threshold = initialCapacity;
 init();
 }

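In Java 7, put for <key1, value1> is relatively simple. It first computes the hash of key1, then uses that hash and the table length to determine the bucket index for the key. If the Entry at that position is not null, it walks the Entry chain; if it finds an entry whose hash and key both match, it overwrites the value and returns oldValue. If no matching entry exists, including the case where the bucket is null, it calls addEntry.

// Code block 2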
public V put(K key, V value) {
 if (table == EMPTY_TABLE) {
  inflateTable(threshold);
 }
 if (key == null)
  return putForNullKey(value);
 int hash = hash(key);
 int i = indexFor(hash, table.length);
 for (Entry<K,V> e = table[i]; e != null; e = e.next) {
  Object k;
  if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
  V oldValue = e.value;
  e.value = value;
  e.recordAccess(this);
  return oldValue;
  }
 }

 modCount++;
 addEntry(hash, key, value, i);
 return null;
 }

// addEntry checks whether the table needs to be resized
 void addEntry(int hash, K key, V value, int bucketIndex) {
 if ((size >= threshold) && (null != table[bucketIndex])) {
  resize(2 * table.length); // if size has reached the threshold and the target bucket is already occupied, double the table length
  hash = (null != key) ? hash(key) : 0;
  bucketIndex = indexFor(hash, table.length);
 }

 createEntry(hash, key, value, bucketIndex);
 }
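
The index computation used by put can be tried out in isolation. The following is a minimal sketch, assuming the table length is always a power of two (as it is in HashMap); the class name IndexForSketch is made up for illustration, and indexFor here simply repeats the `h & (length - 1)` masking that Java 8 writes inline as `(n - 1) & hash`.

```java
class IndexForSketch {
    // Masks off the low bits of the hash to pick a bucket.
    // Only equivalent to a modulo when length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        int[] hashes = {1, 17, 33, -7, 123456789};
        for (int h : hashes) {
            // For power-of-two lengths the mask agrees with Math.floorMod,
            // including for negative hash values.
            System.out.println(h + " -> bucket " + indexFor(h, length)
                    + " (floorMod gives " + Math.floorMod(h, length) + ")");
        }
    }
}
```

Hashes 1, 17 and 33 all land in bucket 1 of a 16-slot table, which is exactly the kind of collision the Entry chain has to absorb.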

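Java 7's resize() is fairly simple: it allocates a new table with the larger capacity and moves every Entry from oldTable into newTable. Each entry's index is recomputed for the new length, and its hash is recomputed as well when the rehash flag is set.

The code is shown in code block 3.

// Code block 3 -- HashMap.resize() in JDK 7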
void resize(int newCapacity) {
 Entry[] oldTable = table;
 int oldCapacity = oldTable.length;
 if (oldCapacity == MAXIMUM_CAPACITY) {
  threshold = Integer.MAX_VALUE;
  return;
 }

 Entry[] newTable = new Entry[newCapacity];
 transfer(newTable, initHashSeedAsNeeded(newCapacity));
 table = newTable;
 threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
 }

 /**
 * Transfers all entries from the current table to the new table
 */
 void transfer(Entry[] newTable, boolean rehash) {
 int newCapacity = newTable.length;
 for (Entry<K,V> e : table) {
  while(null != e) {
  Entry<K,V> next = e.next;
  if (rehash) {
   e.hash = null == e.key ? 0 : hash(e.key);
  }
  int i = indexFor(e.hash, newCapacity);
  e.next = newTable[i];
  newTable[i] = e;
  e = next;
  }
 }
 }
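
One consequence of transfer() that is easy to miss: each entry is inserted at the head of its new bucket, so entries that end up in the same bucket appear in reverse order. The sketch below reproduces only the pointer moves of the inner while loop, under the simplifying assumption that every node lands in the same new bucket; Node and transferChain are hypothetical stand-ins, not JDK classes.

```java
import java.util.ArrayList;
import java.util.List;

class TransferOrderSketch {
    // Hypothetical minimal node, standing in for HashMap.Entry.
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Moves every node of the chain to a new chain by head insertion,
    // the same four pointer moves as the while loop in transfer().
    static Node transferChain(Node head) {
        Node newHead = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;  // remember the rest of the old chain
            e.next = newHead;    // link e in front of the new chain
            newHead = e;
            e = next;
        }
        return newHead;
    }

    // Collects the keys of a chain in traversal order.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) out.add(e.key);
        return out;
    }

    public static void main(String[] args) {
        Node chain = new Node("a", new Node("b", new Node("c", null)));
        System.out.println(keys(chain));                 // order before the move
        System.out.println(keys(transferChain(chain)));  // order after the move
    }
}
```

The Java 8 resize shown later appends at the tail instead (note its "preserve order" comment), so the relative order within a bucket is kept.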

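Two parameters affect the performance of a HashMap: the initial capacity (initialCapacity) and the load factor (loadFactor). The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the table is created. The load factor is a measure of how full the table is allowed to get before its capacity is automatically increased. When the number of entries exceeds the product of the load factor and the current capacity, the table is rehashed (its internal data structures are rebuilt) so that it has roughly twice as many buckets.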
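
As a worked example of that product, here is a small sketch that prints the resize threshold for a few capacities with the default load factor of 0.75. The helper name threshold is made up for illustration, and the sketch ignores the clamping against MAXIMUM_CAPACITY that the real code performs.

```java
class ThresholdSketch {
    // threshold = capacity * loadFactor, truncated to an int.
    // Assumption: capacity is far below MAXIMUM_CAPACITY, so no clamping is needed.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        // Capacities double on every resize: 16, 32, 64, 128.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + threshold(capacity, loadFactor));
        }
    }
}
```

With the defaults, a 16-bucket table has a threshold of 12 entries, and both numbers double together on each resize.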

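The source shows that in Java 7 a HashMap stores its entries in an array addressed by bucket index, and resolves hash collisions by separate chaining: colliding keys and values are inserted into a linked list.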

This approach has a drawback: if all the keys collide, the HashMap degenerates into a single linked list and the cost of get becomes O(n).

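To improve this worst case, Java 8 stores the colliding key-value pairs in a balanced tree once a bucket grows large enough, which brings the worst-case lookup cost down to O(log n).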
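
The worst case is easy to provoke on purpose. The sketch below uses a hypothetical key class, BadKey, whose hashCode is constant, so every key falls into the same bucket. The map still behaves correctly; only the cost of each lookup changes between Java 7 and Java 8. BadKey implements Comparable here on the assumption that this helps a Java 8 tree bin order keys with equal hashes; no timings are claimed.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionSketch {
    // Hypothetical key whose hashCode is constant, so all keys collide.
    static final class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Fills a map with n colliding keys, mapping each key's id to itself.
    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(10_000);
        System.out.println(map.size());
        System.out.println(map.get(new BadKey(9_999)));
    }
}
```

Wrapping the get calls in a simple timer and running the same class on a Java 7 and a Java 8 runtime is one way to observe the difference the article describes.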

// Code block 4 -- constant definitions in the JDK 8 HashMap
 static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; 
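 static final int TREEIFY_THRESHOLD = 8; // chain length at which a list bin is converted to a tree
 static final int UNTREEIFY_THRESHOLD = 6; // during resize, a split tree bin no larger than this is converted back to a list
 static final int MIN_TREEIFY_CAPACITY = 64; // minimum table capacity before any bin may be converted to a tree
 static final float DEFAULT_LOAD_FACTOR = 0.75f; // default load factor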

In Java 8, HashMap's put method is implemented as shown in code block 5.

// Code block 5 -- HashMap.put in JDK 8
 public V put(K key, V value) {
 return putVal(hash(key), key, value, false, true);
 }

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
   boolean evict) {
 Node<K,V>[] tab; Node<K,V> p; int n, i;
 if ((tab = table) == null || (n = tab.length) == 0)
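  n = (tab = resize()).length; // table not allocated yet: resize() creates it, and n is its length
 if ((p = tab[i = (n - 1) & hash]) == null)
  tab[i] = newNode(hash, key, value, null); // (n - 1) & hash is the same computation as indexFor in Java 7; if bucket i is empty, create a new Node and point table[i] at it
 else {
  // bucket i is not empty: check whether the node p there has the same hash and key as the key being inserted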
  Node<K,V> e; K k;
  if (p.hash == hash &&
  ((k = p.key) == key || (key != null && key.equals(k))))
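  e = p; // same key: remember the node so that its value is overwritten below
  else if (p instanceof TreeNode)
  // different key, and the node p in this bucket is already a TreeNode: insert the new node into the tree
  e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
  else {
  // walk the list in bucket i until p.next is null; binCount counts the nodes visited so far, and if appending the colliding node pushes the list past the treeify threshold, the list is converted to a tree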
  for (int binCount = 0; ; ++binCount) {
   if ((e = p.next) == null) {
   p.next = newNode(hash, key, value, null);
   if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
    treeifyBin(tab, hash);
   break;
   }
   if (e.hash == hash &&
   ((k = e.key) == key || (key != null && key.equals(k))))
   break;
   p = e;
  }
  }
  if (e != null) { // existing mapping for key
  V oldValue = e.value;
  if (!onlyIfAbsent || oldValue == null)
   e.value = value;
  afterNodeAccess(e);
  return oldValue;
  }
 }
 ++modCount;
 if (++size > threshold)
  resize();
 afterNodeInsertion(evict);
 return null;
 }
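
The onlyIfAbsent flag of putVal is what separates put from putIfAbsent: put passes false, as shown above, while putIfAbsent in JDK 8 passes true. Here is a small sketch of the observable difference using only the public Map API; the class name PutSemanticsSketch and the run helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutSemanticsSketch {
    // Records the return value of each call so the behavior can be inspected.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));          // no previous mapping: returns null
        results.add(map.put("a", 2));          // overwrites, returns the old value 1
        results.add(map.putIfAbsent("a", 3));  // key present: keeps 2, returns 2
        results.add(map.get("a"));             // still 2
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Both calls return the previous value, which corresponds to the `return oldValue` branch in putVal; only the `e.value = value` assignment is skipped when onlyIfAbsent is true and the old value is non-null.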

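Now look at the resize method. Because a bin may resolve collisions with either a linked list or a balanced tree, resize is somewhat more complex than in JDK 7.

 // Code block 6 -- the resize method in JDK 8
 final Node<K,V>[] resize() {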
 Node<K,V>[] oldTab = table;
 int oldCap = (oldTab == null) ? 0 : oldTab.length;
 int oldThr = threshold;
 int newCap, newThr = 0;
 if (oldCap > 0) {
  if (oldCap >= MAXIMUM_CAPACITY) {
  threshold = Integer.MAX_VALUE; // already at the maximum capacity: the table cannot grow any further
  return oldTab;
  }
  else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
   oldCap >= DEFAULT_INITIAL_CAPACITY)
  newThr = oldThr << 1; // double the threshold
 }
 else if (oldThr > 0) // initial capacity was placed in threshold
  newCap = oldThr;
 else {  // zero initial threshold signifies using defaults
  newCap = DEFAULT_INITIAL_CAPACITY;
  newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
 }
 if (newThr == 0) {
  float ft = (float)newCap * loadFactor;
  newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
   (int)ft : Integer.MAX_VALUE);
 }
 threshold = newThr;
 @SuppressWarnings({"rawtypes","unchecked"})
  Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // create newTab with capacity newCap and migrate the nodes from oldTab; both list bins and tree bins have to be handled
 table = newTab;
 if (oldTab != null) {
  for (int j = 0; j < oldCap; ++j) {
  Node<K,V> e;
  if ((e = oldTab[j]) != null) {
   oldTab[j] = null;
   if (e.next == null)
   newTab[e.hash & (newCap - 1)] = e;
   else if (e instanceof TreeNode)
   ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); 
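   // split divides the tree bin into a lower tree and an upper tree;
   // if a resulting tree has no more than UNTREEIFY_THRESHOLD nodes, it is untreeified back into a list. Either way the nodes are stored in newTab.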
   else { // preserve order
   Node<K,V> loHead = null, loTail = null;
   Node<K,V> hiHead = null, hiTail = null;
   Node<K,V> next;
   do {
    next = e.next;
    if ((e.hash & oldCap) == 0) {
    if (loTail == null)
     loHead = e;
    else
     loTail.next = e;
    loTail = e;
    }
    else {
    if (hiTail == null)
     hiHead = e;
    else
     hiTail.next = e;
    hiTail = e;
    }
   } while ((e = next) != null);
   if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead;
   }
   if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead;
   }
   }
  }
  }
 }
 return newTab;
 }
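
The lo/hi split works because doubling the capacity adds exactly one bit to the index mask. A node in old bucket j therefore either stays at j (when the bit `hash & oldCap` is 0) or moves to j + oldCap. The sketch below checks this against the direct computation; the class and method names are made up for illustration.

```java
class SplitSketch {
    // Index in the doubled table, computed directly from the new mask.
    static int newIndexDirect(int hash, int oldCap) {
        return hash & ((oldCap << 1) - 1);
    }

    // Index in the doubled table, computed the way the lo/hi split does it:
    // test the single bit oldCap, then either stay at j or move to j + oldCap.
    static int newIndexBySplit(int hash, int oldCap) {
        int j = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53};  // all in bucket 5 while the capacity is 16
        for (int h : hashes) {
            System.out.println("hash " + h
                    + ": old bucket " + (h & (oldCap - 1))
                    + ", new bucket " + newIndexBySplit(h, oldCap));
        }
    }
}
```

This is why the Java 8 resize never needs to recompute a hash: a single bit test per node is enough to decide between the two possible destinations.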

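Now look at the tree-related methods treeifyBin and putTreeVal; the underlying structure is a red-black tree.

 // Code block 7
 // MIN_TREEIFY_CAPACITY is 64; if the current table is shorter than that, resize() instead of treeifying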
 final void treeifyBin(Node<K,V>[] tab, int hash) {
 int n, index; Node<K,V> e;
 if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
  resize();
 else if ((e = tab[index = (n - 1) & hash]) != null) {
  TreeNode<K,V> hd = null, tl = null;
  do {
  TreeNode<K,V> p = replacementTreeNode(e, null);
  if (tl == null)
   hd = p;
  else {
   p.prev = tl;
   tl.next = p;
  }
  tl = p;
  } while ((e = e.next) != null);
  if ((tab[index] = hd) != null)
  hd.treeify(tab);
 }
 }
// tree version of putVal
 final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
     int h, K k, V v) {
  Class<?> kc = null;
  boolean searched = false;
  TreeNode<K,V> root = (parent != null) ? root() : this;
  for (TreeNode<K,V> p = root;;) {
  int dir, ph; K pk;
  if ((ph = p.hash) > h)
   dir = -1;
  else if (ph < h)
   dir = 1;
  else if ((pk = p.key) == k || (k != null && k.equals(pk)))
   return p;
  else if ((kc == null &&
    (kc = comparableClassFor(k)) == null) ||
    (dir = compareComparables(kc, k, pk)) == 0) {
   if (!searched) {
   TreeNode<K,V> q, ch;
   searched = true;
   if (((ch = p.left) != null &&
    (q = ch.find(h, k, kc)) != null) ||
    ((ch = p.right) != null &&
    (q = ch.find(h, k, kc)) != null))
    return q;
   }
   dir = tieBreakOrder(k, pk);
  }
  TreeNode<K,V> xp = p;
  if ((p = (dir <= 0) ? p.left : p.right) == null) {
   Node<K,V> xpn = xp.next;
   TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
   if (dir <= 0)
   xp.left = x;
   else
   xp.right = x;
   xp.next = x;
   x.parent = x.prev = xp;
   if (xpn != null)
   ((TreeNode<K,V>)xpn).prev = x;
   moveRootToFront(tab, balanceInsertion(root, x));
   return null;
  }
  }
 }
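
Putting putVal and treeifyBin together, what happens after a node is appended to a list bin depends on two numbers: how long the chain already was, and how large the table is. The following sketch condenses that decision into one hypothetical helper, afterAppend; it is a reading of the code above, not JDK code. It assumes chainLengthBeforeAppend equals binCount + 1 at the moment the new node is linked in.

```java
class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // Decides what follows an append to a list bin.
    // putVal calls treeifyBin when binCount >= TREEIFY_THRESHOLD - 1, i.e. when
    // the chain already held at least TREEIFY_THRESHOLD nodes before the append;
    // treeifyBin then resizes instead of treeifying if the table is too small.
    static Action afterAppend(int chainLengthBeforeAppend, int tableLength) {
        if (chainLengthBeforeAppend < TREEIFY_THRESHOLD) {
            return Action.NONE;
        }
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64));  // chain still short
        System.out.println(afterAppend(8, 16));  // long chain, small table
        System.out.println(afterAppend(8, 64));  // long chain, large enough table
    }
}
```

The design choice behind the RESIZE branch is that in a small table a long chain is more likely a symptom of too few buckets than of genuinely colliding hashes, so spreading the nodes over a larger table is tried first.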

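Having read through this source and compared the two versions side by side, I was impressed by how well crafted it is and learned a great deal.

Summary

That is all for this article. I hope it is of some help in your study or work; if you have questions, feel free to leave a comment.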
