Java容器类(1)-List

主要源码分析,容器类关系图

Posted by zhidaliao on October 22, 2016

This document is not completed and will be updated anytime.

点击查看大图

普通容器类

Collections关系图1

以下为 List 的子类,当然也实现了其他的接口,比如 LinkedList实现了 Queue接口

All Known Implementing Classes: AbstractList, AbstractSequentialList, ArrayList, AttributeList, CopyOnWriteArrayList, LinkedList, RoleList, RoleUnresolvedList, Stack, Vector

List

接口方法如下:提供了增删改查,截取、遍历等操作

public interface List<E> extends Collection<E> {
 
    int size();
    
    boolean isEmpty();
    
    boolean contains(Object o);
    
    Iterator<E> iterator();
    
    Object[] toArray();
    
    <T> T[] toArray(T[] a); 
    
    boolean add(E e);
    
    boolean remove(Object o); 
    
    boolean containsAll(Collection<?> c);

    boolean addAll(Collection<? extends E> c);

    boolean addAll(int index, Collection<? extends E> c);

    boolean removeAll(Collection<?> c);

    boolean retainAll(Collection<?> c);

    void clear(); 

    boolean equals(Object o);

    int hashCode(); 

    E get(int index);

    E set(int index, E element);

    void add(int index, E element);

    E remove(int index); 

    int indexOf(Object o);

    int lastIndexOf(Object o);

    ListIterator<E> listIterator();

    ListIterator<E> listIterator(int index);
 
    List<E> subList(int fromIndex, int toIndex);
}

AbstractList

List的抽象类,提供了绝大部分的方法实现,删改操作留给子类去实现。

public abstract class AbstractList<E> extends AbstractCollection<E> implements List<E> {
    
    protected AbstractList() {
    }

    
    public boolean add(E e) {
        add(size(), e);
        return true;
    }
    
    abstract public E get(int index);

    
    public E set(int index, E element) {
        throw new UnsupportedOperationException();
    }

    
    public void add(int index, E element) {
        throw new UnsupportedOperationException();
    }

    
    public E remove(int index) {
        throw new UnsupportedOperationException();
    }

    ...
}
  • modCount 主要作用是防止在遍历的时候,集合有新增或者删除。
  • 遍历的内部类,在输出下一个元素的时候会先检查计数器是否一致,如果不一致会抛出 并发修改异常。
  • 如何避免?
    • 对容器克隆之后再进行迭代
    • 客户端加锁:对修改和遍历加锁
private class Itr implements Iterator<E> {
  
    int expectedModCount = modCount;

    public E next() {
        checkForComodification();
        try {
            int i = cursor;
            E next = get(i);
            lastRet = i;
            cursor = i + 1;
            return next;
        } catch (IndexOutOfBoundsException e) {
            checkForComodification();
            throw new NoSuchElementException();
        }
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
    ...
}
subList(int fromIndex, int toIndex)

重点分析 subList 方法

public List<E> subList(int fromIndex, int toIndex) {
    return new SubList<>(this, fromIndex, toIndex);
}

实际场景: 新建一个长度为8的数组,截取从下标为1的数组开始,一直到下标为6(不包括)的区间元素。

List<String> listDemo = Collections.synchronizedList(Arrays.asList("0","1", "2","3","4","5","6","7"));

List<String> subList = listDemo.subList(1, 6);

for(String str : subList){
    System.out.println(str);
}

output:
1
2
3
4
5

AbstractList 有一个内部类 SubList

class SubList<E> extends AbstractList<E> {
    private final AbstractList<E> l;
    private final int offset;
    private int size;

    SubList(AbstractList<E> list, int fromIndex, int toIndex) {
        if (fromIndex < 0)
            throw new IndexOutOfBoundsException("fromIndex = " + fromIndex);
        if (toIndex > list.size())
            throw new IndexOutOfBoundsException("toIndex = " + toIndex);
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                                               ") > toIndex(" + toIndex + ")");
        l = list;
        offset = fromIndex;
        size = toIndex - fromIndex;
        this.modCount = l.modCount;
    }

    public void add(int index, E element) {
        rangeCheckForAdd(index);
        checkForComodification();
        l.add(index+offset, element);
        this.modCount = l.modCount;
        size++;
    }
    
    private void rangeCheckForAdd(int index) {
        if (index < 0 || index > size)
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    private void checkForComodification() {
        if (this.modCount != l.modCount)
            throw new ConcurrentModificationException();
    }


    ...
}

SubList 内部持有的集合是 原集合list, 所以对子集的修改会影响到原集合,示例如下:

List<String> list = new ArrayList<String>();
        
list.add("a");
list.add("b");
list.add("c");
list.add("d");
list.add("e");

List<String> subList = list.subList(0, 3);

for(String str:list){
    System.out.println(str);
}

System.out.println("-----------");

for(String str:subList){
    System.out.println(str);
}


subList.add("新增元素");


System.out.println("-----分割线-----");

for(String str:list){
    System.out.println(str);
}

System.out.println("-----------");

for(String str:subList){
    System.out.println(str);
}

Output :
a
b
c
d
e
-----------
a
b
c
-----分割线-----
a
b
c
新增元素
d
e
-----------
a
b
c
新增元素

ArrayList

父接口:Serializable, Cloneable, Iterable<E>, Collection<E>, List<E>, RandomAccess

public class ArrayList<E> extends AbstractList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable{

    private static final int DEFAULT_CAPACITY = 10;

    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private static final Object[] EMPTY_ELEMENTDATA = {};

    private transient Object[] elementData;

    // 初始化逻辑
    public ArrayList(int initialCapacity) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity];
    }

    public ArrayList() {
        super();
        this.elementData = EMPTY_ELEMENTDATA;
    }
   
    ...
}

扩容场景:


List<String> list = new ArrayList<String>();
list.add("a");
list.add(1,"b");

for(String str:list){
    System.out.println(str);
}

output:
a
b
public void add(int index, E element) {
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

// 是否越界等
private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

// modCount + 1
private void ensureCapacityInternal(int minCapacity) {

    if (elementData == EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}


// 开始扩容
private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

// 对数组元素进行迁移,扩容数组长度
private void grow(int minCapacity) {
         
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);

    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);

    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
}

// 对数组元素进行迁移
public static <T> T[] copyOf(T[] original, int newLength) {
    return (T[]) copyOf(original, newLength, original.getClass());
}

// 对数组元素进行迁移
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

// 原生方法,操作内存
public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

  • 每次插入前都会进行容量检查,检查是否超出了容量,如果超出了,则进行扩容

  • 默认大小是10,容量拓展,是创建一个新的数组,然后将旧数组上的数组copy到新数组,这是一个很大的消耗,所以在我们使用ArrayList时,最好能预计数据的大小,在第一次创建时就申请够内存。

  • clear方法并不是把整个数组都删除,因为毕竟已经申请了内存,这样删了,很可惜,因为可能以后还用得着,这就免去了再次去申请内存的麻烦。这里只是把每个元素的都置为null,并把size设为0.

  • 容量进行扩展的时候,其实例如整除运算将容量扩展为原来的1.5倍加1,而jdk1.7是利用位运算oldCapacity >> 1,从效率上,jdk1.7就要快于jdk1.6。

  • 在算出newCapacity时,其没有和ArrayList所定义的MAX_ARRAY_SIZE作比较,为什么没有进行比较呢,原因是jdk1.6没有定义这个MAX_ARRAY_SIZE最大容量,也就是说,其没有最大容量限制的,但是jdk1.7做了一个改进,进行了容量限制。

ArrayList实现原理以及其在jdk1.6和jdk1.7的实现区别

LinkedList

父接口:Serializable, Cloneable, Iterable<E>, Collection<E>, Deque<E>, List<E>, Queue<E>

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    transient Node<E> first;

    transient Node<E> last;

     public LinkedList() {
    }

    public LinkedList(Collection<? extends E> c) {
        this();
        addAll(c);
    }

    private void linkFirst(E e) {
        final Node<E> f = first;
        final Node<E> newNode = new Node<>(null, e, f);
        first = newNode;
        if (f == null)
            last = newNode;
        else
            f.prev = newNode;
        size++;
        modCount++;
    }

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

    public boolean add(E e) {
        linkLast(e);
        return true;
    }

    public E getFirst() {
        final Node<E> f = first;
        if (f == null)
            throw new NoSuchElementException();
        return f.item;
    }


    private class ListItr implements ListIterator<E> {
        private Node<E> lastReturned = null;
        private Node<E> next;
        private int nextIndex;
        private int expectedModCount = modCount;

        ListItr(int index) {
            // assert isPositionIndex(index);
            next = (index == size) ? null : node(index);
            nextIndex = index;
        }

        public boolean hasNext() {
            return nextIndex < size;
        }

        public E next() {
            checkForComodification();
            if (!hasNext())
                throw new NoSuchElementException();

            lastReturned = next;
            next = next.next;
            nextIndex++;
            return lastReturned.item;
        }

        ...
    }   

    // 获取指定元素 折中查找元素
    public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }

    Node<E> node(int index) {
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

    private void checkElementIndex(int index) {
        if (!isElementIndex(index))
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

    private boolean isElementIndex(int index) {
        return index >= 0 && index < size;
    }
}

查看一个简单添加元素的源码

public void add(int index, E element) {\
    // 检查索引是否越界等
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}


void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    final Node<E> pred = succ.prev;
    final Node<E> newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode;
    size++;
    modCount++;
}

ArrayList vs LinkedList

ArrayList底层数据结构为数组, LinkedList 是双向列表。

LinkedList迭代器的next函数只是通过next指针快速得到下一个元素并返回。而get方法会从折中开始遍历直到index下标,查找一个元素时间复杂度为哦O(n/2),get()遍历的时间复杂度就达到了O(n2),next()方式遍历时间复杂度是O(n)

ArrayList 和 LinkedList遍历方式结果对比分析

从上面的数量级来看,同样是foreach循环遍历,ArrayList和LinkedList时间差不多,可将本例稍作修改加大list size会发现两者基本在一个数量级上。

但ArrayList get函数直接定位获取的方式时间复杂度为O(1),而LinkedList的get函数时间复杂度为O(n)。 再结合考虑空间消耗的话,建议首选ArrayList。对于个别插入删除非常多的可以使用LinkedList。

结论总结

通过上面的分析我们基本可以总结下:

(1) 无论ArrayList还是LinkedList,遍历建议使用foreach,尤其是数据量较大时LinkedList避免使用get遍历。

(2) List使用首选ArrayList。对于个别插入删除非常多的可以使用LinkedList。

(3) 可能在遍历List循环内部需要使用到下标,这时综合考虑下是使用foreach和自增count还是get方式。

同步容器类

每个操作都上锁,效率比较慢,相对于同步容器类,并发容器更高的效率和多样化的设计显得更加的主流。

迭代操作中,修改元素,可能会抛出ConcurrentModificationException , 删除元素元素可能抛出ArrayIndexOutBoundsException

Vector

Vector和ArrayList的实现方式可以看出非常的相像,每个方法中都添加了synchronized的关键字来保证同步

概览:

public synchronized int capacity() {
    return elementData.length;
}


public synchronized int size() {
    return elementCount;
}


public synchronized boolean isEmpty() {
    return elementCount == 0;
}

collections.synchronizedlist

源码:查看是否是快速随机列表


public static <T> List<T> synchronizedList(List<T> list) {
    return (list instanceof RandomAccess ?
            new SynchronizedRandomAccessList<>(list) :
            new SynchronizedList<>(list));
}

内部持有一个普通集合 List , 所有的方法都加锁

static class SynchronizedList<E> extends SynchronizedCollection<E> implements List<E> {
    private static final long serialVersionUID = -7754090372962971524L;

    final List<E> list;

    SynchronizedList(List<E> list) {
        super(list);
        this.list = list;
    }
    SynchronizedList(List<E> list, Object mutex) {
        super(list, mutex);
        this.list = list;
    }

    public int hashCode() {
        synchronized (mutex) {return list.hashCode();}
    }

    public E get(int index) {
        synchronized (mutex) {return list.get(index);}
    }
    public E set(int index, E element) {
        synchronized (mutex) {return list.set(index, element);}
    }
    public void add(int index, E element) {
        synchronized (mutex) {list.add(index, element);}
    }
    ...
}
List<String> list = Collections.synchronizedList(Arrays.asList("1", "a"));

并发容器类

CopyOnWriteArrayList

public class CopyOnWriteArrayList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    private static final long serialVersionUID = 8673264195747942595L;

    transient final ReentrantLock lock = new ReentrantLock();

    private volatile transient Object[] array;

    // 对写操作进行加锁处理
    public boolean add(E e) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            // 对数组的新增,先复制一份修改,然后用复制的替换旧的
            Object[] newElements = Arrays.copyOf(elements, len + 1);
            newElements[len] = e;
            setArray(newElements);
            return true;
        } finally {
            lock.unlock();
        }
    }

    public E get(int index) {
        return get(getArray(), index);
    }

    final Object[] getArray() {
        return array;
    }

    public Iterator<E> iterator() {
        return new COWIterator<E>(getArray(), 0);
    }

    private static class COWIterator<E> implements ListIterator<E> {
        private final Object[] snapshot;
        private int cursor;

        private COWIterator(Object[] elements, int initialCursor) {
            cursor = initialCursor;
            snapshot = elements;
        }

        public boolean hasNext() {
            return cursor < snapshot.length;
        }

        public boolean hasPrevious() {
            return cursor > 0;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            if (! hasNext())
                throw new NoSuchElementException();
            return (E) snapshot[cursor++];
        }

        @SuppressWarnings("unchecked")
        public E previous() {
            if (! hasPrevious())
                throw new NoSuchElementException();
            return (E) snapshot[--cursor];
        }

        public int nextIndex() {
            return cursor;
        }

        public int previousIndex() {
            return cursor-1;
        }

        
        public void remove() {
            throw new UnsupportedOperationException();
        }

        
        public void set(E e) {
            throw new UnsupportedOperationException();
        }

        
        public void add(E e) {
            throw new UnsupportedOperationException();
        }
    }

    ...
}

CopyOnWriteArrayList实现原理及源码分析

区别于读写锁,写锁是通过 ReentrantLock 实现的 , 读是没有任何锁 ,不保证数据的强一致性

理论

CopyOnWriteArrayList是Java并发包中提供的一个并发容器,它是个线程安全且读操作无锁的ArrayList,写操作则通过创建底层数组的新副本来实现,是一种读写分离的并发策略,我们也可以称这种容器为”写时复制器”, Java并发包中类似的容器还有CopyOnWriteSet。

集合框架中的ArrayList是非线程安全的,Vector虽是线程安全的,但由于简单粗暴的锁同步机制,性能较差。而CopyOnWriteArrayList则提供了另一种不同的并发处理策略

很多时候,我们的系统应对的都是读多写少的并发场景。CopyOnWriteArrayList容器允许并发读,读操作是无锁的,性能较高。至于写操作,比如向容器中添加一个元素,则首先将当前容器复制一份,然后在新副本上执行写操作,结束之后再将原容器的引用指向新容器。

优点:

读操作性能很高,因为无需任何同步措施,比较适用于读多写少的并发场景。Java的list在遍历时,若中途有别的线程对list容器进行修改,则会抛出ConcurrentModificationException异常。而CopyOnWriteArrayList由于其”读写分离”的思想,遍历和修改操作分别作用在不同的list容器,所以在使用迭代器进行遍历时候,也就不会抛出ConcurrentModificationException异常了

缺点:

内存可见性不能得到保证 、 内存的使用比较高

参考网址

Java Collections

http://www.falkhausen.de/Java-8/