Chapter 2: Unsorted/Sorted Lists

CS/자료구조

Chapter 2: Unsorted/Sorted Lists

arsenic-dev 2025. 3. 23. 18:21

경희대학교 박제만 교수님의 자료구조 수업을 기반으로 정리한 글입니다.

List Definitions

Linear relationship:

Each element except the first has a unique predecessor(전임자), and each element except the last has a unique successor(후임자).

Length:

The number of items in a list; the length can vary over time. (e.g., append, remove)

Unsorted list vs Sorted list

※ 배열(Array) vs 리스트(List)

C 같은 언어에서 사용하는 배열(Array)은 크기가 고정되어 있어서 한 번 선언하면 크기를 바꿀 수 없음
Python의 list나 C++의 std::vector 같은 리스트(List) 구조는 크기를 동적으로 변경할 수 있음

Unsorted List

▶ A list in whitch data items are placed in no particular order.

Basic ADT Operations

Constructor:

creates a new instance (object) of an ADT

Transformer:

changes the state of one or more of the data values of an instance

Observer:

observes the state of the data values without changing them

ADT 연산	설명	C++ 예제
Constructor (생성자)	새로운 객체 생성	BankAccount("Alice", 1000.0); // 계좌 초기 상태 설정
Transformer (변경 연산)	객체 내부 데이터 변경	deposit(), withdraw() // 입금, 출금 -> balance 변경
Observer (조회 연산)	객체 내부 데이터 변경 없이 조회	showBalance() // 잔고 조회

▶ Basic ADT Operations

Unsorted List

Operations - Logical Level

▶ Operations

Definition

▶ Implementation Level

※ .h (헤더 파일): 선언 / .cpp: 구현

※ typedef: 기존 타입에 별칭(새 이름)을 붙여주는 기능으로, type을 바꿀 때 그냥 int, float 이런 식으로 지정하면 코드 수정시 여러 부분을 바꿔야 하지만, typedef를 사용하면 typedef 부분만 수정하면 되기에, 코드가 훨씬 flexible해진다.

▶ Basic Structure - Logical Level

Class Constructor

A special member function of a class that is implicitly invoked(호출되는) when a class object is created.

It cannot return a function value and has no return value type.
A class can have multiple constructors (with different parameter sets).
Parameterless constructor is the default constructor.

▶ Implementation Level

Observer

size()

▶ Implementation Level

isFull()

▶ Implementation Level

isEmpty()

▶ Implementation Level

Transformer

appendItem()

▶ Logical Level

▶ Implementation Level

insertItem()

▶ Logical Level

▶ Implementation Level

appendItem() vs insertItem()

appendItem() adds an item "value" at the end of the list.

insertItem() adds an item "value" at the specific position.

What is faster?

More Computations take more time!

Time Complexity

What is an efficient algorithm/program?

An algorithm/program that uses smaller resources and takes a shorter time.

How do we know?

More commputations take more time!

If we can estimate the number of computations, we can find a more efficient algorithm/program.

appendItem()

▶ appendItem(7)

1. add Item at the end of the list - 1 computation

2. add 1 to the length - 1 computation

-> 2 computations

If "length" == 1000?

It still takes 2 computations. (constant)

insertItem()

▶ insertItem(pos=1, value=7)

1. move data[2] to data[3] - 1 computation

2. move data[1] to data[2] - 1 computation

3. insert "value" to data[1] - 1 computation

4. add 1 to the length - 1 computation

-> 4 computations

If "length" == 1000 and "pos" ==1?

It should perform "move" 999 times, "insert" one time, and "add" one time.
It takes 1001 computations.

Time Complexity (Big-O)

How many computations should be done?

appendItem(): always 2 computations
insertItem(): depending on "length" and "pos"

When we compute an efficiency, we (commonly) consider the worst case.

Big-O notation - O(?)

What is the worst case for:

appendItem(): All cases are the same. (2 computations)
- O(2) -> O(1)
insertItem(): If the "length" == N and "pos" == 0, it should perform (N + 2 computations)
- O(N+2) -> O(N)

Theorems

Theorem 1: Ignore all coefficients(계수).

O(2N) -> O(N)
O(3N^2) -> O(N^2)
O(10) -> O(1)

Theorem 2: Ignore significantly smaller terms(항).

O(N+1) -> O(N)
O(N^2 + 3N + 5) -> O(N^2)
O(2^N + 4N^3 + 5N^2 + 7N + 6) -> O(2^N)

※ 지수 함수(2^N)는 다항 함수(N^3)보다 훨씬 빠르게 증가한다.

Comparison

▶ Big-O examples

▶ 연산이 가장 적은 O(1)이 가장 효율적이고, 연산이 가장 많은 O(2^N)이 가장 비효율적이다.

▶ P = NP problem

P = NP가 참이라면 NP 문제를 풀기 위한 알고리즘도 다항 시간 내에 해결이 가능하게 된다.

P = NP가 증명되면, 당신의 모든 암호화폐(디지털 화폐) 가치가 0원이 된다.

P = 컴퓨터가 풀기 쉬운 문제들의 집합

ex) 1+1

NP = 컴퓨터가 검증하기 쉬운 문제들의 집합

ex) 누군가의 아이폰 비밀번호, 스도쿠

쉬운 문제란?

input의 길이가 증가할 때 계산 시간이 적당히 같이 커지면 쉬운 문제이고,

급격하게 너무 커지면 어려운 문제이다.

즉, input의 길이가 N일 때 계산 시간이 N의 다항식에 비례하여 증가하게 되면 쉬운 문제라고 하고,

P 집합 안에 있다고 생각한다.

반대로, 지수함수로 커지게 되면 어려운 문제라고 한다.

P = NP인가?

풀기 쉬운 문제는 검증도 쉽다. (당연) NP ⊃ P

검증하기 쉬운 문제는 풀기도 쉬운가? (난제) NP ⊂ P?

P = NP이면, 즉 검증하기 쉬운 문제가 풀기도 쉬우면,

누군가의 아이폰 비밀번호가 맞는지 검증하기 쉬우니 풀기도 쉽다.

단지 알고리즘, 푸는 방법을 모를 뿐이다.

즉, P = NP이면 모든 암호체계를 무너뜨리는 알고리즘이 이론적으로 존재한다.

뿐만 아니라, 컴퓨터가 인간의 창의력을 알고리즘으로 대체할 수 있다는 소리이기도 하다.

ex) 좋은 노래인지 검증하는 건 쉽지만, 작곡하는 것은 어렵다.

물론 현재로서는 P ≠ NP가 일반적으로 받아들여지고 있지만, 이 문제는 아직 해결되지 않았다.

Exercises

O(5N + 3M^2) -> O(N + M^2)
Copy the Array A into empty Array B (A's length = N) -> O(N)
Find the max/min item in Array C (C's length = N) -> O(N)
Sort the array D (D's length = N) (e.g., [7, 2, 5, 6, 3] -> [2, 3, 5, 6, 7]) -> O(N^2)
- 최소값 찾는 걸 N번 한다고 생각하면 됨 (selection sort)
Find the N-th item in the Fibonacci series -> O(2^N)

피보나치수열의 시간복잡도

▶ 시간 복잡도: O(2^N)

▶ 시간 복잡도: O(N)

중복되는 부분에 대해, 값을 가지고 있면 재귀 호출을 하지 않고 값만 반환하는 방식으로 시간 복잡도 개선

You should be aware of how to calculate the time complexity of your program.

Rethink: insertItem()

O(N), is it the best? We can improve it.

▶ improved_insertItem(pos=1, value=7)

1. move data[1] to data[3] - 1 computation

2. insert "value" to data[1] - 1 computation

3. add 1 to the length - 1 computation

-> 3 computations

How about "length" == 1000?

Still, 3 computations. -> O(1)

Unsorted list, 즉 정렬되지 않은 리스트이기에 순서가 바뀌어도 상관이 없다.

▶ Implementation Level

removetItem()

▶ removeItem(2)

1. find "value" in the list - k computation (assume "value" is at [k])

2. move data[2] to data[1] - 1 computation

3. move data[3] to data [2] - 1 computation

2, 3 -> N-k moves

1, 2, 3 -> O(N)

4. subtract 1 from "length" - 1 computation

4 -> O(1)

-> O(N+1) = O(N)

We can make it shorter. N-k moves -> 1 move

▶ improved_removeItem(2)

1. find "value" in the list - k computation (assume "value" is at [k])

2. move the last item (data[3]) to data [1] - 1 computation

3. subtract 1 from "length" - 1 computation

-> O(k+1+1) = O(N)

Theoretically, removeItem() and improved_removeItem() are equally fast.

물론, 실질적으론 improved_removeItem()이 removeItem()보다 computation이 더 적다.

▶ Implementation Level

※ 일반적으로 for문이 1개 있으면 시간 복잡도는 O(N), 이중 for문이면 O(N^2)이다.

updateItem()

▶ Implementation Level

시간 복잡도: O(1)

clear()

▶ Implementation Level

시간 복잡도: O(1)

data[] still has values, but appendItem() will overwrite them.

위 코드 방식으로 하면, 리스트에 실제로 값들이 남아 있기는 하지만, 더 이상 유효한 값으로 보지 않는다.

이 방식 말고도, 리스트에 있는 값들을 하나하나 0 혹은 -1로 바꿔주어도 되기는 하다.

시간 복잡도: O(N)
- 값을 찾고, 값을 바꾸는 작업이 for문에 의해 반복되어 시간 복잡도가 O(N^2)이 되는 것이 아니라, for문(O(N)) 안에서 각 리스트의 값을 변경하는 작업은 시간 복잡도가 O(1) 이므로 전체 시간 복잡도는 O(N)이 됨

getItem()

▶ Implementation Level

시간 복잡도: O(1)

※ throw는 예외를 발생시키고, try-catch 구문으로 그 예외를 처리할 수 있다.

findItem()

▶ Implementation Level

시간 복잡도: O(N)

Sorted List

▶ A list that is sorted by the value in the key;

there is a semantic(의미론적인) relationship among the keys of the items in the list.

What is Key?

The attributes(속성) that are used to determine the logical order of the list.

▶ Sort by Name (Ascending, 오름차순)

▶ Sort by GPA (Descending, 내림차순)

Sorted List

Definition

▶ Implementation Level

appendItem()과 updateItem()은 Sorted List에선 활용성이 떨어져 굳이 사용하지 않는다.

insetItem()은 포지션을 찍어서 넣어준다기 보단, 특정한 값을 정해진 order에 맞게 넣어주는 식으로 바꿔주어야 한다.

Time Complexity (Big-O)

insertItem()

Can we use improved_insertItem() of Unsorted List for Sorted List?

No (violation, 위반)

▶ insertItem(5)

1. find the proper location for a new item (5) in the list

2. move all the later items by one slot

3. put an item in the list

4. add 1 to "length"

-> O(N+N+1+1) = O(N)

▶ Implementation Level

시간 복잡도의 경우 O(N)이 최선이지만, while문 부분을 binary search 방식을 사용하여 좀 더 개선할 수는 있다.

removetItem()

Can we use improved_removeItem() of Unsorted List for Sorted List?

No (violation, 위반)

▶ Implementation Level

시간 복잡도: O(N+N+1) = O(N)

findItem()

Can we use findItem of Unsorted List for Sorted List?

Yes, but.. inefficient

▶ Implementation Level (ver. 2)

정렬된 리스트에서 탐색할 때, 찾는 값보다 큰 값이 나오면 이후 요소들도 더 클 것이므로,

굳이 끝까지 탐색하며 불필요한 연산을 할 필요 없이 바로 탐색을 종료하면 된다.

물론, 이 방법도 Linear Search 방식이기에 시간 복잡도는 동일하게 O(N)이 걸린다.

다만, 뒤에 의미없는 연산을 줄여 실질적인 연산 횟수가 기존 방식에 비해 좀 더 효율적일 수 있다.

※ Linear Search: 직선의 형태로 앞에서부터 하나씩 비교하는 방식

It looks better. Is it the best?

Search in Sorted List

Ver. 2 is better than Ver. 1.

But, in both ver. 1 and ver. 2, we use Linear Search.

Improving Searching

We want to improve the search algorithm.

How?

Narrowing down the search scope.

Binary Search

▶ findItem(45)

한 번 비교할 때마다 한 개씩만 버리는 Linear Search와는 달리,

Binary Search는 한 번 비교할 때마다 리스트 길이의 절반씩 버려 탐색 범위를 기하급수적으로 줄인다.

findItem()

▶ Implementation Level (Binary)

How efficient is binary search?

Let's think about the worst case when the length is 1000.

R1. Mid 500.
R2. Mid 250.
R3. Mid 375.
R4. Mid 312.
R5. Mid 281.
R6. Mid 265.
R7. Mid 257.
R8. Mid 253.
R9. Mid 251.

-> 251 or 252

We can decide if the value is in the list in 10 computations.

이렇게 계속해서 범위가 반으로 나뉘어지는 방식의 시간 복잡도는 O(log N)이다.

※ 일반적인 수학에선 log 밑이 생략되면 10이 생략된 상용로그로 보지만, 컴퓨터 쪽에선 2가 생략된 걸로 본다.

⭐ Big-O Comparison of List Operations

▶ Time Complexity (Big-O)

참고자료

https://www.youtube.com/watch?v=vMjO4M4hFq8

https://www.youtube.com/watch?v=VcCkPrGaKrs

'CS > 자료구조' 카테고리의 다른 글

Chapter 4: Queue (0)	2025.03.31
Chapter 3: Stack (1)	2025.03.28
Programming Exercise: Lab #1 (5)	2025.03.25
Chapter 1: Software, Data Structure, and C++ (2)	2025.03.22
Chapter 0: Introduction (1)	2025.03.21

현재글Chapter 2: Unsorted/Sorted Lists

arsenic 알쓰닉?arsenic-dev 님의 블로그입니다.

arsenic 알쓰닉?

array, 연관 관계, Memory, stack, computer architecture, Branch, http, addition, shellcode, control signal, direct-mapped cache, 소프트웨어 개발 생명주기 모델, write-back, reordering, 실체화 관계, memread, regwrite, burp suite, XSS, 일반화 관계,

Today :
Yesterday :

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

arsenic 알쓰닉?