普林斯顿大学算法第一部分

Untitled
Untitled
Untitled
Untitled
Untitled
Untitled

recurrence:

\[ \begin{aligned} T(N)&\leq T(\frac N2)+1\\ &\leq T(\frac N4)+1+1\\ &...\\ &\leq T(\frac NN) +1+...+1\\ &=1+\log(N) \end{aligned} \]

二分查找解决 3-SUM 问题:

遍历前两个数,查找第三个能让和为 0 的数,复杂度为 N^2logN

Dynamic Connectivity

Quick Find

Untitled
Untitled

union 时需要遍历整个 array,改变 id

Untitled
Untitled

\(N^2\) 的复杂度是不能容忍的(当算力和数据量同时增长一倍,耗时也会增长一倍)

分析:

  • find () 操作的时间复杂度为𝑂(1),因为只需要访问一次 id 数组
  • union () 操作的时间复杂度为𝑂(𝑁),因为每一个合并都需要遍历整个 id 数组
  • 如果有 M 对连接𝑂(𝑀𝑁),那么构建出所有的分量的时间复杂度为。

Quick Union

Untitled
Untitled

只改变两个需要链接节点的根节点

Untitled
Untitled
Untitled
Untitled

时快时慢,当树很深的时候,会导致 find 变慢,而每一次 union 都需要 find。

分析:

  • 最好情况下,find () 操作的时间复杂度是𝑂(1),最坏情况下,find () 操作的时间复杂度是𝑂(𝑁)
  • union () 操作的时间复杂度应该与 find () 操作是一致的。
改进 Weighted Quick Union
Untitled
Untitled

永远将小树的根节点添加到大树的根节点上(大小指的是深度)

Untitled
Untitled
Untitled
Untitled

为什么是 \(\log(N)\)?

思考 N 个节点时最多有多少层?

节点数量 高度
1 0
2 1
4 2
8 3
2^k=n k

如需要扩展到第三层,就需要两个相同的最小的两层结构,即 4*4=8

log 基数为 2

path compression(在寻找 root 节点时做压缩)

Untitled
Untitled
Untitled
Untitled

Fundamental Data Type

stack

image-20211202162117614
image-20211202162117614
linked-list implementation
image-20211202162930889
image-20211202162930889
array implementation
image-20211202163017159
image-20211202163017159

Resizing Arrays

Grow array

单个扩容和两倍扩容的对比:

image-20211202163402199
image-20211202163402199
shink arrays
image-20211202163604993
image-20211202163604993
Resizing array vs. linked list
image-20211202163835647
image-20211202163835647

Queues

linked-list implementation
image-20211202164256480
image-20211202164256480
array implementation
image-20211202164322328
image-20211202164322328

Sorting

Sorting Complexity

image-20211203000329052
image-20211203000329052
image-20211203204731199
image-20211203204731199

Stability

不打乱原有顺序

Selection Sort

image-20211202170607464
image-20211202170607464

遍历 i,寻找后续最小的元素并交换

implementation:

image-20211202170726228
image-20211202170726228

复杂度:

比较次数:\((N-1)+(N-2)+...+1+0\;\sim\;\frac{N^2}2\)

交换次数:\(N\)

Insertion Sort

image-20211202171213713
image-20211202171213713

迭代 i,当 i 小于前一个元素时交换,交换后继续判断与前一个元素的大小,如果小于,则继续交换,直到大于为止。

implementaion

image-20211202171454934
image-20211202171454934

复杂度(平均):

比较次数:\(\frac14N^2\)

交换次数:\(\frac14N^2\)

Best case (排好序):

Compares: \(N-1\)

Exchanges: \(0\)

Worst case (倒序):

Compares: \(\frac12N^2\)

Exchanges: \(\frac12N^2\)

Partially-sorted arrays
image-20211202172427331
image-20211202172427331

Insertion sort 在 Partially-sorted arrays 中时间复杂度是线性的

image-20211202172445321
image-20211202172445321
image-20211202172519497
image-20211202172519497

Shell Sort

image-20211202224651579
image-20211202224651579

insertion sort with stride length h.

Why insertion sort?

Big increments -> small subarray -> nearly in order

image-20211202225029843
image-20211202225029843

implementation

image-20211202225154257
image-20211202225154257
image-20211202225215162
image-20211202225215162

vworst-case:

Compares used: \(O(N^{\frac32})\) (With 3x+1 increments)

Property:

No accurate model. Maybe $N lg N $

image-20211202230253866
image-20211202230253866

Shuffling

Shuffle sort:

image-20211202230431540
image-20211202230431540

Knuth Shuffle:

image-20211202230634507
image-20211202230634507
image-20211202230726440
image-20211202230726440

Merge Sort

分治,递归

image-20211202231824439
image-20211202231824439
image-20211202231911010
image-20211202231911010
image-20211202232018404
image-20211202232018404
image-20211202233118960
image-20211202233118960

worst case:

Compares: \(NlgN\)

Array accesses: \(6NlgN\)

image-20211202232441365
image-20211202232441365

递归次数:

image-20211202232636127
image-20211202232636127

memory: Extra space proportional to N

improvement

The cutoff to insertion sort for 7 items.

image-20211202232909946
image-20211202232909946

stop merge if already sorted

image-20211202233020483
image-20211202233020483
Bottom-up Mergesort

No Recursion Needed!

image-20211202235906782
image-20211202235906782
image-20211202235950563
image-20211202235950563

Quick Sort

image-20211203000814682
image-20211203000814682
image-20211203000952586
image-20211203000952586
image-20211203000914548
image-20211203000914548
image-20211203001153937
image-20211203001153937

Shuffling is needed for performance guarantee

image-20211203002033933
image-20211203002033933

best case: \(NlgN\)

Worst case: \(\frac12 N^2\)

Quick-select

Given an array of N items, find a \(k^{th}\) smallest item.

image-20211203193223299
image-20211203193223299
image-20211203193541948
image-20211203193541948

Duplicate keys

Mergesort with DUPLICATE KEYS. Always between \(\frac12NlgN\) and \(NlgN\) compares.

Quicksort with DUPLICATE KEYS. goes quadratic.

image-20211203195130182
image-20211203195130182
image-20211203195212179
image-20211203195212179
image-20211203195248501
image-20211203195248501
image-20211203195629792
image-20211203195629792

Priority Queue

Find the largest M items in a stream of N items.

image-20211203200955694
image-20211203200955694
image-20211203203401559
image-20211203203401559

Unorder array implementation

image-20211203201017612
image-20211203201017612

Binary Heaps

Array representation of a heap-ordered complete binary tree.

  • keys in nodes.
  • Parent’s no smaller than children’s keys.

Array representation:

  • indices start at 1.
  • Take nodes in level order.
  • No explicit links needed!
image-20211203201824853
image-20211203201824853

Root: a[1]

Parent of node at k: k/2

Children of node at k: 2k and 2k+1

Minimum-oriented priority queue: Replace less() with greater().

swim
image-20211203202552732
image-20211203202552732
insert
image-20211203202641697
image-20211203202641697
sink
image-20211203203211443
image-20211203203211443
delete
image-20211203203300032
image-20211203203300032

Heap Sort

image-20211203204242960
image-20211203204242960
  1. 从倒数第二排向上从右到左做 sink,构造二叉树。
  2. 依次取出最大值。
image-20211203204645526
image-20211203204645526

Symbol Tables

image-20211204003821156
image-20211204003821156

Sequential Search (unordered list):

image-20211203233459296
image-20211203233459296

Binary Search (order list)

image-20211203233727084
image-20211203233727084
image-20211203233813284
image-20211203233813284

Binary Search Tree (BST)

  • Larger than all keys in its left subtree
  • Smaller than all keys in its right subtree
image-20211203235000143
image-20211203235000143
image-20211203235040809
image-20211203235040809
image-20211203235200022
image-20211203235200022
Insert
image-20211203235324192
image-20211203235324192
image-20211203235426256
image-20211203235426256
Floor

Largest key less to a given key.

image-20211212184837937
image-20211212184837937
subtree count

In each node, we store the number of nodes in the subtree rooted at that node;

remark. This facilitates efficient implementation of rank() and select()

implementation:

image-20211212185046094
image-20211212185046094
Rank

How much keys < \(k\)

image-20220926123353216
image-20220926123353216
Inorder traversal

from small to large

image-20220926123730615
image-20220926123730615
Deletion
  • Lazy approach (Tombstone)

    image-20220926124150541
  • Hibbard deletion

    • case 0: 0 children

      image-20220926125026950
    • case 1: 1 child

      image-20220926125101190
    • case 2: 2 children

      image-20220926125129935
      image-20220926125129935

    Implementation:

    image-20220926125219834
    image-20220926125219834

    NOT SYMMETRIC:

    Hibbard algorithm lack of symmetry that tends to lead to difficulties.

    image-20220926125508278
    image-20220926125508278
    image-20220926125543012
    image-20220926125543012
Summary
image-20220926123828411
image-20220926123828411

Balance Search Tree

2-3 Search Tree

  • Perfect balance

    Every path from root to null link has same length.
  • Symmetric order

    inoder traversal yields keys in ascending order.

image-20220926131541098
image-20220926131541098
Insert
  • Insert into a 2-node at bottom

    • Search for key.
    • Replace 2-node with 3-node
    image-20220926132043575
    image-20220926132043575
    image-20220926132206184
  • Insert into a 3-node at bottom

    • Add new key to 3-node to create temporary 4-node
    • Move middle key in 4-node into parent
    image-20220926132324553
    image-20220926132324553
    image-20220926132344007
    image-20220926132344007
    image-20220926132411904
  • Summary

    image-20220926132541204
    image-20220926132541204
Performance
  • Worst case: lg N
  • Best case: log3 N

Red-black Tree

  • Represent 2-3 tree as a BST
  • Use “internal” left-leaning links as “glue” for 3-nodes.
image-20220926151236026
image-20220926151236026
  • No node has two red links connected to it.
  • Every path from root to null link has the same number of black links.
  • Red links lean left.
search

Search is the same as for elementary BST (ignore color).

image-20220926151539680
image-20220926151539680
Representation
image-20220926151634311
image-20220926151634311
Operations
  • Left rotation

    image-20220926151732206
    image-20220926151732206
    image-20220926151746431
    image-20220926151746431
    image-20220926151759266
  • Right rotation

    image-20220926151845637
    image-20220926151845637
    image-20220926151858743
    image-20220926151858743
    image-20220926151908383
  • Color flip

    image-20220926151929745
    image-20220926151929745
    image-20220926151944111
    image-20220926151944111
    image-20220926151954953
    image-20220926151954953
Strategy
  • Insert into a tree with exactly 1 node.

    image-20220926152244170
  • Insert into a tree with exactly 2 nodes.

    image-20220926152420871
    image-20220926152420871
Implementation
  • Right child red, left child black: rotate left
  • Left child, left-left grandchild red: rotate right
  • Both children red: flip colors
image-20220926152519770
image-20220926152519770
image-20220926152717821
image-20220926152717821

B-trees

  • At least 2 key-link pairs in other nodes.
  • At least M/2 key-link pairs in other nodes.
  • External nodes contain client keys.
  • Internal nodes contain copies of keys to guide search.

我怎么觉得这是 B + 树(我也不懂啊)。。。

1664254650448
1664254650448
Search
  • Start at root.
  • Find interval for search key and take corresponding link.
  • Search terminates in external node.
1664254773321
1664254773321
Insertion
  • Search for new key.
  • Insert at bottom.
  • Split nodes with M key-link pairs on the way up the tree.
1664254963465
1664254963465

Geometric Appliations of BSTs

1664453671625
1664453671625
Task
1664255494991
1664255494991
Goal
1664255540222
1664255540222
BST implementation
1664255657703
1664255657703
Line segment intersection
1664256021977
1664256021977

KD-Trees

1664256135494
1664256135494
1664257233198
1664257233198
Implementation
1664257298366
1664257298366
1664257523751
1664257523751

Typical case: R + logN

Worst case (assuming tree is balanced): R + N^(1/2)

1664257663004
1664257663004

Typical case: logN

Worst case (even if tree is balanced): N

Interval search trees

1664452720484
1664452720484
1664452760980
1664452760980
Insert

To insert an interval (lo, hi):

  • Insert into BST, using lo as the key.
  • Update max in each node on search path.
1664452921481
1664452921481
Search

To search for any one interval that intersects query interval (lo, hi):

  • If interval in node intersects query interval, return it.
  • Else if left subtrees is null, go right.
  • Else if max endpoint in left subtrees is less than lo, go right.
  • Else go left.
1664452964320
1664452964320
Inplementation

Use a RB BST to guarantee performance.

1664453201119
1664453201119

Rectangle Intersection

1664453459724
1664453459724
1664453507299
1664453507299

(x 维换到时间维度,转换到 1d 问题了?)

1664453614919
1664453614919

Hash table

Hash function

Method for computing array index from key.

1664454080690
1664454080690
1664454145911
1664454145911

Separate Chaining

1664454357261
1664454357261
Implementation
  • get

    1664454550102
  • put

    1664454595670
    1664454595670
Analysis
1664454644486
1664454644486

Linear Probing

  • Hash. Map key to integer i between 0 and M-1.
  • Insert. Put at table index i if free; if not try i+1, i+2, etc.
  • Search. Search table index i; if occupied but no match, try i+1, i+2, etc.
  • Note. Array size M must be greater than number of key-value pairs N.
1664454726971
1664454726971
Impletementation
1664455063921
1664455063921
Analysis
1664455183290
1664455183290
1664455211599
1664455211599
1664455234515
1664455234515
Context
1664455659998
1664455659998
1664455708875
1664455708875
1664455680207
1664455680207
1664455733947
1664455733947