nikoo posted on 2017-4-19 11:45:28

[java] Thoughts prompted by the ZooKeeper source code

  1. The trigger
  While reading the ZooKeeper source code, I noticed that the server-side NIO*Factory, after collecting the selected SelectionKeys, shuffles them to keep the processing order fair. The code is as follows:

ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(selected);
Collections.shuffle(selectedList);
for (SelectionKey k : selectedList) {
    ....
}
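To make the fairness idea concrete, here is a minimal sketch of the same copy-then-shuffle pattern. It is not ZooKeeper's code: the class name, the `shuffledCopy` helper, and the "conn-*" labels are made up for illustration. Each round processes the same ready items in a fresh random order, so no item is always served first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ShuffleFairnessDemo {
    // Hypothetical helper: returns a shuffled copy and leaves the
    // caller's collection untouched, like the snippet above.
    static <T> List<T> shuffledCopy(List<T> ready, Random rnd) {
        List<T> copy = new ArrayList<>(ready);
        Collections.shuffle(copy, rnd);
        return copy;
    }

    public static void main(String[] args) {
        // Stand-ins for ready connections; the names are illustrative only.
        List<String> ready = Arrays.asList("conn-A", "conn-B", "conn-C", "conn-D");
        Random rnd = new Random();
        for (int round = 0; round < 3; round++) {
            // The order usually differs between rounds.
            System.out.println("round " + round + ": " + shuffledCopy(ready, rnd));
        }
    }
}
```

Without the shuffle, whichever key the underlying set happens to return first would be handled first on every pass.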
  To be honest, even as a supposedly senior engineer I had never used this method before, so let's see what it does:

/**
 * Randomly permutes the specified list using a default source of
 * randomness.  All permutations occur with approximately equal
 * likelihood.
(Roughly: it randomly permutes the elements of the list, and every permutation is about equally likely.)
 * If the specified list does not
 * implement the {@link RandomAccess} interface and is large, this
 * implementation dumps the specified list into an array before shuffling
 * it, and dumps the shuffled array back into the list.  This avoids the
 * quadratic behavior that would result from shuffling a "sequential
 * access" list in place.
 */
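  Note that if the list does not implement the RandomAccess interface and is large, shuffle copies the list into an array, shuffles the array, and then dumps the result back into the list. Why go to this trouble? Because it avoids quadratic behavior: on a sequential-access list such as LinkedList, every indexed get or set costs O(n), so n in-place swaps cost O(n^2) overall. (I suspect many people's first instinct would be to loop over the list and swap random elements directly, which shows a gap in the fundamentals.)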

int size = list.size();
if (size < SHUFFLE_THRESHOLD || list instanceof RandomAccess) { // the list is small or implements RandomAccess: shuffle it in place
    for (int i=size; i>1; i--)
        swap(list, i-1, rnd.nextInt(i));
} else {
    Object arr[] = list.toArray();
    // Shuffle array
    for (int i=size; i>1; i--)
        swap(arr, i-1, rnd.nextInt(i));
    // Dump array back into list
    ListIterator it = list.listIterator();
    for (int i=0; i<arr.length; i++) {
        it.next();
        it.set(arr[i]);
    }
}
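The excerpt above depends on private JDK helpers, so here is a self-contained sketch of the same two-branch idea that can be run directly. It is an illustration, not the JDK implementation: the class name, the `THRESHOLD` constant, and its value are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;
import java.util.Random;
import java.util.RandomAccess;

class ShuffleSketch {
    // Assumed cutoff for this sketch: below it, in-place swaps are cheap enough.
    static final int THRESHOLD = 5;

    static <T> void shuffle(List<T> list, Random rnd) {
        int size = list.size();
        if (size < THRESHOLD || list instanceof RandomAccess) {
            // Indexed access is cheap (or the list is tiny): swap in place.
            for (int i = size; i > 1; i--) {
                Collections.swap(list, i - 1, rnd.nextInt(i));
            }
            return;
        }
        // Sequential-access list: copy out once, shuffle the array ...
        @SuppressWarnings("unchecked")
        T[] arr = (T[]) list.toArray();
        for (int i = size; i > 1; i--) {
            int j = rnd.nextInt(i);
            T tmp = arr[i - 1];
            arr[i - 1] = arr[j];
            arr[j] = tmp;
        }
        // ... then write back in a single pass through the iterator.
        ListIterator<T> it = list.listIterator();
        for (T value : arr) {
            it.next();
            it.set(value);
        }
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < 10; i++) {
            linked.add(i);
        }
        shuffle(linked, new Random());
        System.out.println(linked);

        List<Integer> array = new ArrayList<>(linked);
        shuffle(array, new Random());
        System.out.println(array);
    }
}
```

In the second branch the list is only ever walked sequentially (once by `toArray`, once by the iterator), so the total work stays linear in the list size.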
  Why?
  Let's look at RandomAccess. I have picked a few sentences from its Javadoc to explain:

/**
 * Generic list algorithms are encouraged to check whether the given list is
 * an <tt>instanceof</tt> this interface before applying an algorithm that
 * would provide poor performance if it were applied to a sequential access
 * list, and to alter their behavior if necessary to guarantee acceptable
 * performance.
 *
 * <p>It is recognized that the distinction between random and sequential
 * access is often fuzzy.  For example, some <tt>List</tt> implementations
 * provide asymptotically linear access times if they get huge, but constant
 * access times in practice.  Such a <tt>List</tt> implementation
 * should generally implement this interface.  As a rule of thumb, a
 * <tt>List</tt> implementation should implement this interface if,
 * for typical instances of the class, this loop:
 * <pre>
 *     for (int i=0, n=list.size(); i &lt; n; i++)
 *         list.get(i);
 * </pre>
 * runs faster than this loop:
 * <pre>
 *     for (Iterator i=list.iterator(); i.hasNext(); )
 *         i.next();
 * </pre>
 */
  The JDK's advice is: for a random access list, use for(int i=0;i<list.size();i++){}; for a sequential access list, use an Iterator. So the best approach is to check list instanceof RandomAccess and pick the algorithm accordingly.
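As a minimal sketch of that advice (the class and method names here are made up for illustration), a generic helper can choose its traversal strategy from the `instanceof RandomAccess` check:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class TraversalSketch {
    // Sums the elements, choosing the loop style based on the list type.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed loop: get(i) is cheap for lists such as ArrayList.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // Iterator loop: one sequential pass, e.g. for LinkedList.
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                total += it.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sum(new ArrayList<>(values)));
        System.out.println(sum(new LinkedList<>(values)));
    }
}
```

Both calls return the same result; only the access pattern differs.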
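  With a large amount of data the difference between the two is real: calling get(i) in a loop over a LinkedList costs O(n^2) in total, because each call has to walk the list to reach position i, while an Iterator makes a single O(n) pass.
  The JDK source code is well worth reading closely.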