Statistical Signal Processing Overview

Let's look at what statistical signal processing is, then go over the definitions of estimation and detection and the classical and Bayesian approaches.

Statistical signal processing

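Statistical signal processing is a branch of digital signal processing that deals with detection, estimation, and time-series analysis. Its goal is to use noisy observations to estimate the best value of an unknown state or parameter that cannot be observed directly. Within this field, signal detection and estimation aims to extract meaningful information from a signal by processing it.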

Statistics vs Machine learning

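Statistics and machine learning both start from the question "How do we learn from data?" Statistics analyzes data and draws conclusions using statistical inference methods such as hypothesis testing, estimation, and confidence intervals. Machine learning, in contrast, mostly solves prediction problems by learning patterns in high-dimensional data with many variables.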

Estimation / Detection

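Estimation: has a continuous set of hypotheses → a parameter or value is assumed to lie in a continuous range. Because the exact value is hard to find in practice, the goal is to minimize the error with respect to the true value.
- Ex) Radar (estimating aircraft position), Sonar (estimating submarine position), Speech recognition (estimating phonemes), Image analysis (estimating object position/orientation), Biomedicine (estimating heart rate), Communication (estimating the carrier frequency so that the original information can be extracted from the transmitted signal)
Detection: has a discrete set of hypotheses → the possible hypotheses are clearly separated, discrete values, usually split into the case where a hypothesis is true and the case where it is false.
- Ex) Radar (detecting whether an aircraft is present), Digital communication (detecting whether a 0 or a 1 was transmitted), Speech recognition (recognizing a spoken digit), Sonar (detecting whether a submarine is present), Image processing (detecting whether an aircraft is present using infrared), Biomedicine (detecting arrhythmia), Seismology (detecting whether an underground oil field is present)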

Classical / Bayesian

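Classical: the hypothesis/parameter is fixed and non-random
Bayesian: the hypothesis/parameter is treated as a random variable with an assumed prior probability distribution (random value, random variable, random parameter)

- Hypothesis: a function that predicts an output for given input data
- Parameter: an adjustable element inside the hypothesis function (e.g. weight, bias)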

Mathematical estimation problem

For a discrete data set of N samples that depends on $\theta$, the problem can be written as below. Here $g$ is called the estimator, and the goal is to find the $g$ that best estimates $\theta$.

$x[0], x[1], \ldots, x[N-1] \to \hat{\theta} = g(x[0], x[1], \ldots, x[N-1])$

When the data are random and cannot be predicted exactly, the probability of observing particular values is described by a probability density function (PDF) parameterized by $\theta$.

$p(x[0], x[1], \ldots, x[N-1]; \theta)$

As an example, suppose the observation $x[0]$ follows a Gaussian distribution. $\theta$ is estimated from the observation $x[0]$. If the observed value is 20, the estimate of $\theta$ should also be close to 20. Under this assumption $\theta$ is fixed, so once $\theta$ is found the probability distribution is known. In practice, however, the PDF is not given, so we have to choose a mathematical model that is consistent with the constraints and prior knowledge and is also mathematically tractable.
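The single-observation Gaussian example can be sketched in a few lines. This is only an illustration: it assumes $x[0] \sim \mathcal{N}(\theta, \sigma^2)$ with $\sigma = 1$ (the source does not specify the variance), and the names `gaussian_pdf` and `candidates` are hypothetical. It evaluates $p(x[0]; \theta)$ at a few candidate values of $\theta$ for the observation $x[0] = 20$ and picks the most plausible one.

```python
import math

def gaussian_pdf(x0, theta, sigma=1.0):
    """Evaluate p(x[0]; theta) for x[0] ~ N(theta, sigma^2). sigma=1 is an assumption."""
    norm = math.sqrt(2.0 * math.pi * sigma ** 2)
    return math.exp(-((x0 - theta) ** 2) / (2.0 * sigma ** 2)) / norm

x0 = 20.0                                    # the observed value from the text
candidates = [18.0, 19.0, 20.0, 21.0, 22.0]  # hypothetical candidate values of theta

# For a fixed observation, the PDF viewed as a function of theta peaks at theta = x[0].
likelihoods = {theta: gaussian_pdf(x0, theta) for theta in candidates}
best_theta = max(likelihoods, key=likelihoods.get)

for theta, value in likelihoods.items():
    print(f"theta={theta:5.1f}  p(x[0]=20; theta)={value:.4f}")
print("most plausible candidate:", best_theta)
```

Among the candidates, the PDF is largest at the one equal to the observation, which matches the intuition that the estimate should be close to 20.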

$x[n]$ is made up of the unknown parameters $A$ and $B$ and the noise $w[n]$. Because $w[n]$ is assumed to be white Gaussian noise, its samples are zero-mean Gaussian and iid (independent and identically distributed). The PDF therefore describes how likely the observed signal $\mathbf{x}$ is under given parameters $A$ and $B$. In this example, $\Theta$ is the parameter vector to be estimated in the model; it contains the constant $A$ and the slope component $B$, and it expresses the relationship between the model and the data.
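To make the two-parameter example concrete, here is a minimal sketch that assumes the model has the form $x[n] = A + Bn + w[n]$, which is consistent with $A$ being a constant and $B$ a slope. It simulates data and recovers $A$ and $B$ with an ordinary least-squares line fit; under white Gaussian noise this fit also maximizes the PDF over $(A, B)$. The function names and the numeric values are hypothetical.

```python
import random

def simulate_line(A, B, sigma, N, seed=0):
    """Generate x[n] = A + B*n + w[n], with w[n] iid zero-mean Gaussian (assumed model)."""
    rng = random.Random(seed)
    return [A + B * n + rng.gauss(0.0, sigma) for n in range(N)]

def fit_line(x):
    """Closed-form least-squares estimates of (A, B) for x[n] = A + B*n + noise."""
    N = len(x)
    n_mean = (N - 1) / 2.0
    x_mean = sum(x) / N
    s_nn = sum((n - n_mean) ** 2 for n in range(N))
    s_nx = sum((n - n_mean) * (x[n] - x_mean) for n in range(N))
    B_hat = s_nx / s_nn
    A_hat = x_mean - B_hat * n_mean
    return A_hat, B_hat

x = simulate_line(A=1.0, B=0.5, sigma=1.0, N=200)
A_hat, B_hat = fit_line(x)
print(f"A_hat={A_hat:.3f}, B_hat={B_hat:.4f}")
```

With noise-free data the fit returns the true parameters exactly; with noise, the estimates scatter around them.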

Estimation type

Classical estimation
- The parameter is assumed to be fixed (deterministic)
Bayesian estimation
- The parameter is treated as a random variable, so prior knowledge can be used for more flexible estimation
- The joint PDF that describes the data is defined by combining the prior $p(\theta)$ with the conditional PDF $p(\mathbf{x} \mid \theta)$, that is, $p(\mathbf{x}, \theta) = p(\mathbf{x} \mid \theta)\, p(\theta)$
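As a rough illustration of how a prior changes the estimate, the sketch below assumes a constant level $A$ observed in zero-mean Gaussian noise of known variance, with a Gaussian prior on $A$. For that setup the posterior mean is a weighted average of the sample mean and the prior mean. This specific model, the function name `posterior_mean`, and all numbers are assumptions made for the example, not something stated in the source.

```python
def posterior_mean(x, noise_var, prior_mean, prior_var):
    """Posterior mean of A for x[n] = A + w[n], w[n] ~ N(0, noise_var), A ~ N(prior_mean, prior_var).

    Assumed Gaussian-Gaussian model: the result blends the sample mean and the prior mean.
    """
    N = len(x)
    sample_mean = sum(x) / N
    # Weight on the data grows as N grows or as the prior becomes less certain.
    alpha = prior_var / (prior_var + noise_var / N)
    return alpha * sample_mean + (1.0 - alpha) * prior_mean

x = [1.2, 0.8, 1.1, 0.9]  # hypothetical observations, sample mean 1.0

classical = sum(x) / len(x)  # classical view: A is fixed, use the data only
bayesian = posterior_mean(x, noise_var=1.0, prior_mean=0.0, prior_var=1.0)
print(f"sample mean={classical:.3f}, posterior mean={bayesian:.3f}")
```

The Bayesian estimate is pulled from the sample mean toward the prior mean; with a very broad prior it approaches the classical sample mean.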

- Estimator: a function or rule that takes the random variable $\mathbf{x}$ as input and estimates the parameter $\theta$ (RV → RV)
- Estimate: the value $\hat{\theta}$ obtained from a specific set of data

$\hat{\theta} = g(\mathbf{x})$
- $g$ : estimator
- $\hat{\theta}$ : estimate
- $\mathbf{x}$ : particular value

Assessing estimator performance

Let's see how the performance of an estimator can be evaluated. First, assume that a DC component is present in the noise. (A typical sound signal has only an AC component with zero mean; adding a DC component makes the mean nonzero.)

Suppose the candidate estimators are the sample mean and the first sample. Computing the mean and variance of each estimator tells us which one is better to use.

Looking at the graph, the average of several samples varies less than the value of a single sample. As $\mathrm{var}(\hat{A}) = \frac{\sigma^2}{N}$ also shows, the variance shrinks as the number of samples grows, so we can expect a more stable and reliable value.
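The comparison between the two candidate estimators can be checked with a small Monte Carlo experiment. The sketch assumes the data model $x[n] = A + w[n]$ with $w[n]$ iid zero-mean Gaussian of variance $\sigma^2$; the values of $A$, $\sigma$, $N$, and the number of trials are arbitrary choices for illustration.

```python
import random
import statistics

def sample_mean_estimator(x):
    """Estimate A by averaging all N samples."""
    return sum(x) / len(x)

def first_sample_estimator(x):
    """Estimate A using only x[0]."""
    return x[0]

def monte_carlo(estimator, A, sigma, N, trials, seed=0):
    """Return the empirical mean and variance of an estimator over many noise realizations."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]  # assumed model x[n] = A + w[n]
        estimates.append(estimator(x))
    return statistics.mean(estimates), statistics.pvariance(estimates)

A, sigma, N, trials = 1.0, 1.0, 50, 2000
mean_avg, var_avg = monte_carlo(sample_mean_estimator, A, sigma, N, trials)
mean_first, var_first = monte_carlo(first_sample_estimator, A, sigma, N, trials)

print(f"sample mean : mean={mean_avg:.3f}, var={var_avg:.4f} (theory {sigma**2 / N:.4f})")
print(f"first sample: mean={mean_first:.3f}, var={var_first:.4f} (theory {sigma**2:.4f})")
```

Both estimators are centered on $A$, but the empirical variance of the sample mean is close to $\sigma^2 / N$ while that of the first sample stays near $\sigma^2$.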

Next, let's look at a binary hypothesis test example. The case where there is no signal and only noise is taken as the null hypothesis, and the case where a signal is present as the alternative hypothesis.

Suppose we decide that a signal is present when the observation exceeds the threshold 1/2, and that only noise is present otherwise. $P_{FA}$ and $P_D$ are then defined as follows.

$P_{FA}$ : the probability of wrongly deciding that a signal is present when only noise is actually present
$P_D$ : the probability of correctly deciding that a signal is present when it actually is present

An ideal detector would have a high $P_D$ and a low $P_{FA}$, but as the ROC curve shows, most systems have a trade-off: as $P_D$ increases, $P_{FA}$ increases as well.
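The trade-off can be seen numerically with a sketch under stated assumptions: a single observation $x[0]$, noise $w[0] \sim \mathcal{N}(0, \sigma^2)$ with $\sigma = 1$, and a signal of constant amplitude 1 (so that 1/2 lies midway between the two hypotheses). None of these specifics are given in the source. Under them, $P_{FA} = Q(\gamma / \sigma)$ and $P_D = Q((\gamma - 1) / \sigma)$, where $\gamma$ is the threshold and $Q$ is the Gaussian right-tail probability.

```python
import math

def q_function(z):
    """Right-tail probability of a standard Gaussian, Q(z) = P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def detector_performance(threshold, amplitude=1.0, sigma=1.0):
    """Return (P_FA, P_D) for the rule 'decide signal present if x[0] > threshold'.

    Assumed model: H0: x[0] = w[0], H1: x[0] = amplitude + w[0], w[0] ~ N(0, sigma^2).
    """
    p_fa = q_function(threshold / sigma)
    p_d = q_function((threshold - amplitude) / sigma)
    return p_fa, p_d

p_fa_half, p_d_half = detector_performance(0.5)
print(f"threshold=0.5: P_FA={p_fa_half:.4f}, P_D={p_d_half:.4f}")

# Sweeping the threshold traces out points on the ROC curve.
thresholds = [2.0, 1.0, 0.5, 0.0, -1.0]
roc = [detector_performance(g) for g in thresholds]
for g, (p_fa, p_d) in zip(thresholds, roc):
    print(f"threshold={g:5.1f}  P_FA={p_fa:.4f}  P_D={p_d:.4f}")
```

Lowering the threshold raises $P_D$ and $P_{FA}$ together, which is the trade-off the ROC curve describes.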

This post is based on the lecture materials for Prof. 황의석's 'Detection and Estimation' (검출및추정) course at GIST.

