算法复杂度不等于性能,list的性能陷阱

deltamaster posted @ Oct 21, 2014 10:36:06 PM in C++ with tags c++ 算法 数据结构 性能 缓存 , 10649 阅读

在学习数据结构和算法的时候,我们一般都会通过分析算法复杂度,来对某个问题选择合适的数据结构和算法。

下面就是一个经典问题:

我们需要一个容器,容器中存放的是int数据。容器中的元素是有序的,对容器有一个插入操作,要求是插入后容器中的元素仍然是有序的。我们应该选择list还是vector?

让我们先从算法复杂度的角度来简单看一下这个问题。首先一点是有序,所以找到要插入的位置,对于list的算法复杂度是O(n),因为平均需要遍历一半的链表长度,对于vector的算法复杂度是log^2(n),因为可以进行二分查找确定位置。然后是插入,list的插入操作是O(1),非常理想的常数级操作,而vector的插入操作是O(n),因为平均要移动一半的元素。综合下来看,list是要更胜一筹,所以应该选择list。但是事实却与这个推测截然相反?

让我们用C++写一段程序来进行验证:

#include <iostream>
#include <vector>
#include <list>
#include <random>

using namespace std;

template <typename T>
typename T::iterator insert(T& c, typename T::value_type elem)
{
    auto i = c.begin();
    for (; i != c.end(); i++) {
        if (elem <= *i) {
            break;
        }
    }
    return c.insert(i, elem);
}

int main(int argc, char** argv)
{
    vector<int> v;
    list<int> l;
    auto engine = default_random_engine();

    if (argc < 3) return 1;

    if (argv[1] && argv[1][0] == 'v') {
        for (int i = 0; i < atoi(argv[2]); i++) {
            insert(v, engine());
        }
    } else {
        for (int i = 0; i < atoi(argv[2]); i++) {
            insert(l, engine());
        }
    }

    return 0;
}

在这里为了说明问题方便起见,在查找阶段我们让vector和list都使用O(n)的方法进行查找。那么对于这段程序(下面表里的算法复杂度是针对每次插入操作而言的):

  locate insert
vector O(n) O(n)
list O(n) O(1)

这样,我们通过生成一系列随机数的方式来进行测试。让我们看看不同数据规模下的测试结果如何,表格中的时间单位是秒。

规模 vector list
100 0.002 0.003
1000 0.003 0.004
10000 0.019 0.142
100000 1.153 29.504

为什么list的性能反而差,而且相差如此悬殊?因为我们之前没有站在体系结构的角度看问题。同样是O(n)的操作,对于计算机而言实际执行的效率可能有着天壤之别。

让我们回顾一下前面程序中的insert函数模板,对于两种容器,我们使用了完全相同的线性查找算法。对于计算机而言,list版本的步进操作(i++),需要多进行一次寻址(根据指针找到下一个节点的位置),而更重要的是,找到的下一个节点,很有可能没有命中缓存,因为通过list中的相邻元素在内存中并没有相邻关系;而vector的步进操作,几乎能够保证命中缓存,因为每次访问的,都是内存中相邻的元素。对于插入操作而言,list虽然只有O(1)复杂度,而且确实非常快,但却要每次进行一次很小的内存分配,而又因为list的每个节点都要包含其他指针信息,对于内存空间也是很大的浪费。

我们从这个测试中得到的启示是:

  • 性能分析要用实际测试数据说话,主观推测容易以偏概全;
  • 对于体系结构的理解非常重要,缓存命中率是影响性能的重要因素;
  • 数据规模小的时候,算法复杂度无关紧要;
  • 不要过度优化;
  • 节约使用内存!
* 本文在CC BY-SA(署名-相同方式共享)协议下发布。
Avatar_small
hikui 说:
Nov 22, 2014 04:02:31 PM

这跟Bjarne Stroustrup说的一模一样啊。。。
https://www.youtube.com/watch?v=YQs6IC-vgmo

Avatar_small
deltamaster 说:
Dec 01, 2014 09:02:21 PM

@hikui: 一样嘛就对了

Avatar_small
哆啦比猫 说:
Dec 12, 2014 09:25:02 PM

早就听 BS 说了,但是一直懒得做 benchmark……活生生的证据,可以用来说服同学和老师了

f88tw┃华歌尔 说:
Mar 04, 2020 02:53:17 PM

敬启者:个人小网站希望大家多多支持 感谢您对我们热心的支持 f88tw┃华歌尔┃I appreciate your kind assistance. f88tw|墓园|捡骨流程|捡骨费用|捡骨时间|禁忌|捡骨颜色|捡骨师|新竹|时间|台北|桃园|苗栗|头份|火化|晋塔|安葬|法事|捡骨|看日子|墓穴|墓园|坟墓|看日子|乞丐|http://mypaper.m.pchome.com.tw/f88tw

f88tw 说:
Apr 09, 2021 07:16:00 PM

敬启者:个人小网站希望大家多多支持 感谢您对我们热心的支持 f88tw┃华歌尔┃I appreciate your kind assistance. f88tw| 粗工| 粗工内容 | 粗工| 粗工内容 |墓园|捡骨流程|捡骨费用|捡骨时间|禁忌|捡骨颜色|捡骨师|新竹|时间|台北|桃园|苗栗|头份|https://mypaper.m.pchome.com.tw/f88tw

mega888 apk download 说:
Aug 30, 2021 12:05:38 AM

It is rather very good, nevertheless glance at the data with this handle.

Satta king 说:
Sep 25, 2021 05:17:11 AM

These you will then see the most important thing, the application provides you a website a powerful important internet page:

JAC 11th Previous Pa 说:
Aug 16, 2022 10:31:39 PM

The Jharkhand Academic Council was established by the Jharkhand State Legislature and given the state governor's approval on the Formatted, the day the State of Jharkhand was created. They can check their JAC Important Question Paper 2023 of 11th class from the main web portal site, and we also maintain the direct link to check Jharkhand 1st Inter Important Question Paper 2023 with ease. JAC 11th Previous Paper 2023 It was established for holding and organising a variety of examinations, including those at the conclusion of Intermediate, Secondary Education, Education, and Madrasa Education. The JAC 11th Class Important Question Paper 2023 is provided here. Please follow the instructions provided below to view Jharkhand 1st Inter 11th Class Important Question Paper 2023 for students.

charlly 说:
Dec 18, 2022 03:41:17 PM

There is a common misconception that algorithm complexity is the same as performance. However, this is not the case. Algorithm complexity only refers to the worst-case scenario, whereas performance can vary depending on the inputs. Therefore, it is possible for an algorithm with a lower complexity to have a worse performance than one with a higher complexity. cbd benefits This is often referred to as the "performance trap" of lists.

NCERT 10th Exemplar 说:
Jul 28, 2023 10:21:12 PM

For All class 10 Students Preparing for their Maths Exam 2024, Solving the NCERT Exemplar Problems 2024 is the best way to Tackle the Complex Problems asked in Mathematics Exam, we are Providing the NCERT Class 10 Maths Exemplar Problems and Solutions 2024 in Pdf Format which Students can Download to Enhance their Preparation level for CBSE Board and Competitive Examination.Exemplar NCERT 10th Exemplar Solutions for Maths 2024 Problems and Solutions Required for CBSE Exam, as well as useful to form the Foundation in the NCERT Examination, Students Download the latest NCERT Exemplar Problems Solutions Maths Class 10 in Pdf Format, in both Hindi and English. Urdu Medium You can also buy them from the links given


登录 *


loading captcha image...
(输入验证码)
or Ctrl+Enter