ArrayList | AlexYoung's blog

文章目录

1. ArrayList概述
2. 字段和构造函数
3. add方法
4. remove方法
5. 扩容
6. 迭代器

在上一篇博客中，我们基本了解了Java容器类的架构。本篇博客重点剖析继承Collection接口中的ArrayList类，JDK源码版本基于1.8.0_25。

ArrayList概述

ArrayList 是一个数组队列，相当于动态数组。与Java中的数组相比，它的容量能动态增长。它继承于AbstractList，实现了List, RandomAccess, Cloneable, java.io.Serializable这些接口。

java.lang.Object
   ↳     java.util.AbstractCollection<E>
         ↳     java.util.AbstractList<E>
               ↳     java.util.ArrayList<E>
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {}

继承AbstractList：实现数组基本的添加、删除、修改、遍历等功能
RandmoAccess接口：support fast (generally constant time) random access
Cloneable接口：覆盖clone()方法，Returns a shallow copy of this ArrayList instance
Serializable接口：ArrayList支持序列化，能通过序列化去传输

RandmoAccess为List提供快速访问功能。在ArrayList中，我们即可以通过元素的序号快速获取元素对象，这就是快速随机访问。题外话：random and sequential access is often fuzzy. For example, some List implementations provide asymptotically linear access times if they get huge, but constant access times in practice.

稍后，我们会比较List的“快速随机访问”和“通过Iterator迭代器访问”的效率。

JDK 1.8.0_25：Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.

和Vector不同，ArrayList中的操作不是线程安全的，所以为了防止意外的非同步访问，最好在创建时声明：

1	List list = Collections.synchronizedList(new ArrayList(...));

ArrayList与Collection关系如下图：

ArrayList包含了两个重要的对象：elementData 和 size。

transient Object[] elementData : ArrayList容器，保存添加到ArrayList中的元素。
private int size ：The size of the ArrayList (the number of elements it contains).

transient：Java关键字，变量修饰符。如果用transient声明一个实例变量，当对象存储时，它的值不需要维持。具体讲解点击 → here

字段和构造函数

ArrayList有七个字段加一个定义在AbstractList的modCount：

private static final long serialVersionUID = 8683452581122892189L;
private static final int DEFAULT_CAPACITY = 10;
private static final Object[] EMPTY_ELEMENTDATA = {};
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
transient Object[] elementData; // non-private to simplify nested class access
private int size;
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
protected transient int modCount = 0; //定义在AbstractList中

ArrayList的默认容量DEFAULT_CAPACITY为10，EMPTY_ELEMENTDATA 和 DEFAULTCAPACITY_EMPTY_ELEMENTDATA是两个常量。

当使用有参构造函数，并且initialCapacity为0或者Colletion中没有元素的时候，返回的就是EMPTY_ELEMENTDATA。当使用默认构造函数public ArrayList()，返回的就是DEFAULTCAPACITY_EMPTY_ELEMENTDATA。

这两个数组都是空的并不会存放值。当第一次往ArrayList添加元素的时候，其实是将元素存放到elementData中，所以真正用来存放元素的是elementData。

MAX_ARRAY_SIZE是array允许分配的最大空间，在Integer最大值减去8是因为 Some VMs reserve some header words in an array.

modCount：The number of times this list has been structurally modified.

在一个改变ArrayList的结构的方法中需要对modCount进行自增，比如一些添加，删除的方法中。在ArrayList的迭代器中需要使用这个字段，用来查看是否在迭代的时候，使用了非该迭代器的方法修改了ArrayList的结构，如果被修改了，在迭代的时候就会抛出ConcurrentModificationException异常。

在多次使用默认构造函数进行实例化ArrayList的时候，其实并不是每一个实例都创建一个了Object[]，见下面的程序：

public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
		
	List<Integer>list1 = new ArrayList<>();
	List<Integer>list2 = new ArrayList<>();
	List<Integer>list3 = new ArrayList<>(20);
		
	Field f1 = list1.getClass().getDeclaredField("elementData");//根据变量名elementData获得字段
	f1.setAccessible(true);	//设置字段可访问，即暴力反射
	Object o1 = f1.get(list1);	//在那个对象上获取此字段的值
	System.out.println(o1);
		
	Field f2 = list2.getClass().getDeclaredField("elementData");	
	f2.setAccessible(true);	
	Object o2 = f2.get(list2);	
	System.out.println(o2);
		
	Field f3 = list3.getClass().getDeclaredField("elementData");	
	f3.setAccessible(true);	
	Object o3 = f3.get(list3);	
	System.out.println(o3);
}

在我的机器上输出如下

1
2
3

[Ljava.lang.Object;@15db9742
[Ljava.lang.Object;@15db9742
[Ljava.lang.Object;@6d06d69c

ArrayList提供了三个构造函数：

// 默认构造函数
public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }
// initialCapacity是ArrayList的默认容量大小。当由于增加数据导致容量不足时，容量会添加上一次容量大小的一半。
public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }
// 创建一个包含collection的ArrayList
public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if ((size = elementData.length) != 0) {
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }

add方法

ArrayList提供了add(E e)、add(int index, E element)、addAll(Collection<? extends E> c)、addAll(int index, Collection<? extends E> c)、set(int index, E element)这个五个方法来实现ArrayList增加。

add(E e)：将指定的元素添加到此列表的尾部。

public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

这里ensureCapacity()方法是对ArrayList集合进行扩容操作

add(int index, E element)：将指定的元素插入此列表中的指定位置。

public void add(int index, E element) {
        //判断索引位置是否正确
        rangeCheckForAdd(index);
        
        //扩容
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        //对源数组进行复制处理（位移），向右移动当前位于该位置的元素以及所有后续元素。
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }

在这个方法中最根本的方法就是System.arraycopy()方法，该方法的根本目的就是将index位置空出来以供新数据插入，这里需要进行数组数据的右移，这是非常麻烦和耗时的，所以如果指定的数据集合需要进行大量插入（中间插入）操作，需要考虑性能的消耗。

addAll(Collection<? extends E> c)：按照指定 collection 的迭代器返回的元素顺序，将该 collection 中的所有元素添加到此列表的尾部。

public boolean addAll(Collection<? extends E> c) {
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
        System.arraycopy(a, 0, elementData, size, numNew);
        size += numNew;
        return numNew != 0;
    }

这个方法无非就是使用System.arraycopy()方法将C集合(先准换为数组a)里面的数据复制到elementData数组中。

System.arraycopy()原型为：public static void arraycopy(Object src, int srcPos, Object dest, int destPos, int length)。

它的根本目的就是进行数组元素的复制。将源数组src从srcPos位置开始复制到dest数组中，复制长度为length，数据从dest的destPos位置开始粘贴。

addAll(int index, Collection<? extends E> c)：从指定的位置开始，将指定 collection 中的所有元素插入到此列表中。

public boolean addAll(int index, Collection<? extends E> c) {
        rangeCheckForAdd(index); //判断索引位置是否正确
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
         //将ArrayList容器从index开始的所有元素向右移动到index+numNew的位置，从而腾出numNew长度的空间放c
        int numMoved = size - index;
        if (numMoved > 0)
            System.arraycopy(elementData, index, elementData, index + numNew,
                             numMoved);
        System.arraycopy(a, 0, elementData, index, numNew);
        size += numNew;
        return numNew != 0;
    }

set(int index, E element)：用指定的元素替代此列表中指定位置上的元素。

public E set(int index, E element) {
        rangeCheck(index);
        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }

remove方法

ArrayList提供了remove(int index)、remove(Object o)、removeRange(int fromIndex, int toIndex)、removeAll()四个方法进行元素的删除。

remove(int index)：移除此列表中指定位置上的元素。

public E remove(int index) {
        rangeCheck(index);
        modCount++;
        E oldValue = elementData(index);
        //向左移的位数，下标从0开始，需要再多减1
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        //置空最后一个元素                     
        elementData[--size] = null; // clear to let GC do its work
        return oldValue;
    }

remove(Object o)：移除此列表中首次出现的指定元素（如果存在）。

public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    //fastRemove()方法用于移除指定位置的元素，和remove方法类似，区别是void类型
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }

removeRange(int fromIndex, int toIndex)：移除列表中索引在 fromIndex（包括）和 toIndex（不包括）之间的所有元素。

protected void removeRange(int fromIndex, int toIndex) {
        modCount++;
        int numMoved = size - toIndex;
        System.arraycopy(elementData, toIndex, elementData, fromIndex,
                         numMoved);
        // clear to let GC do its work
        int newSize = size - (toIndex-fromIndex);
        for (int i = newSize; i < size; i++) {
            elementData[i] = null;
        }
        size = newSize;
    }

removeAll()：移除ArrayList中所有在c中出现的元素。

public boolean removeAll(Collection<?> c) {
        //Checks that the specified object reference is not null
        Objects.requireNonNull(c);
        Retains only the elements in this list that are contained in the specified collection
        //false是移除相同元素，方法retainAll中置为true，是保留相同元素
        return batchRemove(c, false);
    }
    
private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++)
                // 此判断是移除或保留的分水岭，complement为false则移除相同元素
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
            if (r != size) {
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }

扩容

在上面的add方法的源码中我们发现每个方法中都存在这个方法：ensureCapacity()，该方法就是ArrayList的扩容方法。

在前面就提过ArrayList每次新增元素时都会需要进行容量检测判断，若新增元素后元素的个数会超过ArrayList的容量，就会进行扩容操作来满足新增元素的需求。所以当我们清楚知道业务数据量或者需要插入大量元素前，可以使用ensureCapacity来手动增加ArrayList实例的容量，以减少递增式再分配的数量。

//外部接口
public void ensureCapacity(int minCapacity) {
    int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
        // any size if not default element table
        ? 0
        // larger than default for default empty table. It's already
        // supposed to be at default size.
        : DEFAULT_CAPACITY;
    if (minCapacity > minExpand) {
        ensureExplicitCapacity(minCapacity);
    }
}
//内部接口
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    ensureExplicitCapacity(minCapacity);
}
//内部接口
private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    //扩容到1.5倍
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

对比网上的其他教程，可以发现Java 8中ArrayList的代码具有更好的鲁棒性，逻辑也更加清晰。

当然也许你会质疑为毛ArrayList的增长因子是1.5，而HashMap是2.0呢，可以参考这里。

大致就是说HashMap增长因子更大是为了避免哈希碰撞，而ArrayList一方面是根据经验设定，另一方面可能是为了避免增长因子过大而造成空间浪费。

当然你可以手动最小化ArrayList实例的存储量：trimToSize()

public void trimToSize() {
        modCount++;
        if (size < elementData.length) {
            elementData = (size == 0)
              ? EMPTY_ELEMENTDATA
              : Arrays.copyOf(elementData, size);
        }
    }

迭代器

主要与两种迭代器，Itr和ListItr，在java8中还加入了一个ArrayListSpliterator迭代器。

Itr和ListItr的主要区别就是Itr只能往后遍历，而ListItr可以前后遍历。

Itr

该迭代器从下标为0的地方开始遍历，直到ArrayList的最后一个元素。在取值的时候会检查modCount是否被修改了，即ArrayList的结构是否被修改了，是则抛出异常。如果要删除元素，只能使用迭代器中的remove方法。

private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;
        public boolean hasNext() {
            return cursor != size;
        }
        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }
        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
        // 检查modCount是否修改
        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

ListItr

ListItr继承自Itr，添加了向前遍历的功能，set和add方法，代码如下：

public void set(E e) {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();
        try {
            ArrayList.this.set(lastRet, e);
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }
    public void add(E e) {
        checkForComodification();
        try {
            int i = cursor;
            ArrayList.this.add(i, e);
            cursor = i + 1;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

使用示例

public static void main(String[] args) {
        list.add(1);
		list.add(2);
		.add(3);
		
		Iterator<Integer>it = list.iterator();
		
		//list.add(2); throw java.util.ConcurrentModificationException
		
		while (it.hasNext()) {
			System.out.println(it.next());
		}
		
		ListIterator<Integer>listIterator = list.listIterator();
		listIterator.add(4); //   调用自身的方法修改 
		while (listIterator.hasNext()) {
			System.out.println(listIterator.next());
		}
	}

程序输出的结果：
1,2,3
1,2,3
注意listIterator输出中并没有4

在获取了iterator实例后，list就不可改变。当ArrayList使用了ListIterator()方法产生自身对应的Iterator后，只能使用Iterator自身的remove()和add()方法来修改ArrayList的结构，因为这些方法对expectedModCount和modCount变量自动同步。

迭代效率

ArrayList支持3种遍历方式:

第一种，通过迭代器遍历。即通过Iterator去遍历。

Integer value = null;
Iterator iter = list.iterator();
while (iter.hasNext()) {
    value = (Integer)iter.next();
}

第二种，随机访问，通过索引值去遍历。

Integer value = null;
int size = list.size();
for (int i=0; i<size; i++) {
    value = (Integer)list.get(i);        
}

第三种，for循环遍历。

Integer value = null;
for (Integer i : list) {
    value = i;
}

程序测试遍历十万数字，使用时间如下：
iteratorThroughIterator：7 ms
iteratorThroughRandomAccess：5 ms
iteratorThroughFor2：6 ms
由此可见，遍历ArrayList时，使用随机访问(即通过索引序号访问)效率最高，而使用迭代器的效率最低。

最后来一道测试题：

ArrayList list = new ArrayList(20); 中的list扩充几次（）
A. 0
B. 1
C. 2
D. 3

解释：默认ArrayList的长度是10个，所以如果你要往list里添加20个元素肯定要扩充一次（newCapacity 扩充为原来的1.5倍，但和输入的minCapacity相比发现小于minCapacity，于是 newCapacity = minCapacity，所以只扩容一次，具体见扩容里的grow方法），但是这里显示指明了需要多少空间，所以就一次性为你分配这么多空间，也就是不需要扩充了。