Machine Learning - Hypothesis
Hypothesis
H is a set of hypotheses, h is a member of H
Hypothesis h
An idea or a proposed explanation
Verified through experiment and investigation
Hypothesis space H
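The set of all possible hypotheses
Defined through some representation, for example linear functions
A learning algorithm searches H for the h that best fits the observed data set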
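As a small illustration of searching a hypothesis space, the sketch below takes H to be the linear functions h(x) = w*x + b and picks the member with the smallest squared error. The data points and the choice of squared error as the fit criterion are assumptions made for this example only.

```python
# Hypothetical observations; the values are made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def fit_linear(xs, ys):
    """Return (w, b) minimizing the sum of squared errors of h(x) = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single input variable.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

w, b = fit_linear(xs, ys)
print(w, b)  # 2.0 1.0
```

Each pair (w, b) is one hypothesis h, and the set of all such pairs is the hypothesis space H.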
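- Most Specific Hypotheses: consistent with the observed training examples; narrowing the hypothesis any further would make it inconsistent
- Most General Hypotheses: consistent with the observed training examples; widening the hypothesis any further would make it inconsistent
General Boundary: the loosest edge of the version space; it holds the hypotheses that cover as much as possible without including any negative example
Specific Boundary: the strictest edge of the version space; it holds the hypotheses that cover just enough to include every positive example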
Version Space
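Given a hypothesis space H and a training data set, the version space is the subset of H made up of every hypothesis consistent with that data set, that is, the rectangular regions in the figure above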
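The definition can be checked by brute force on a tiny case. The sketch below assumes a hypothetical two-attribute domain and two made-up training examples, enumerates every conjunctive hypothesis, and keeps the consistent ones. The names `domains`, `satisfies`, and `consistent` are illustrative only.

```python
from itertools import product

DONT_CARE = "?"

# Hypothetical two-attribute domain, chosen only to keep the enumeration small.
domains = [("Low", "High"), ("2", "4&more")]

def satisfies(h, x):
    """True if every slot of h is a don't care or equals the value in x."""
    return all(hv == DONT_CARE or hv == xv for hv, xv in zip(h, x))

def consistent(h, examples):
    """h is consistent if it labels every training example correctly."""
    return all(satisfies(h, x) == label for x, label in examples)

# Every conjunctive hypothesis: each slot is a concrete value or a don't care.
# The hypothesis that rejects everything is left out; it cannot be consistent
# once a positive example is present.
hypothesis_space = list(product(*[values + (DONT_CARE,) for values in domains]))

# Made-up training data: (instance, is_positive).
training = [
    (("High", "4&more"), True),
    (("Low", "2"), False),
]

version_space = [h for h in hypothesis_space if consistent(h, training)]
print(version_space)
# [('High', '4&more'), ('High', '?'), ('?', '4&more')]
```

Enumeration only works for very small spaces, which is why the boundaries are tracked instead of the full set.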
X: The Input Instance Space
Target function:
Six input attributes:
Feature | Category | Number of categories |
---|---|---|
Price | | 3 |
Engine Power | | 2 |
Maintenance | | 2 |
Doors | $\{2, \text{4\&more}\}$ | 2 |
Trunk Size | | 2 |
Safety | | 2 |
The data set has 6 input attributes
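Size of the input instance space: $|X| = 3 \times 2 \times 2 \times 2 \times 2 \times 2 = 96$
Syntactically distinct hypotheses: $(3+2)(2+2)^5 = 5 \times 4^5 = 5120$ (each attribute may also take the don't care value or the no value allowed value)
Semantically distinct hypotheses: $1 + (3+1)(2+1)^5 = 1 + 4 \times 3^5 = 973$ (the 1 stands for the case in which no value is selected at all)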
Don't care value: ? (any value of the attribute is accepted)
No value allowed:
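The three sizes follow directly from the per-attribute counts in the table. A minimal sketch of the arithmetic, where the dictionary name `value_counts` is an assumption of this example:

```python
from math import prod

# Number of values per attribute, taken from the table above.
value_counts = {
    "Price": 3,
    "Engine Power": 2,
    "Maintenance": 2,
    "Doors": 2,
    "Trunk Size": 2,
    "Safety": 2,
}

# Every instance picks exactly one value per attribute.
instance_space_size = prod(value_counts.values())

# Syntactically, each slot may also hold "don't care" or "no value allowed".
syntactic_hypotheses = prod(n + 2 for n in value_counts.values())

# Semantically, any hypothesis with a "no value allowed" slot rejects every
# instance, so all of those collapse into a single hypothesis (the leading 1).
semantic_hypotheses = 1 + prod(n + 1 for n in value_counts.values())

print(instance_space_size, syntactic_hypotheses, semantic_hypotheses)
# 96 5120 973
```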
General-to-Specific Ordering over H
Let h1 = <?, ?, ?, 4&more, ?, High>
h2 = <?, Moderate, ?, 4&more, ?, High>
Then any example that satisfies h2 also satisfies h1, so h1 is more general than or equal to h2
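The ordering can be tested slot by slot. The sketch below assumes hypotheses are tuples whose slots are either a concrete value or `?`; the function names and the sample instance are hypothetical.

```python
DONT_CARE = "?"

def satisfies(hypothesis, example):
    """True if every slot is a don't care or equals the example's value."""
    return all(h == DONT_CARE or h == x for h, x in zip(hypothesis, example))

def more_general_or_equal(ha, hb):
    """True if every instance accepted by hb is also accepted by ha.

    Assumes neither hypothesis contains a "no value allowed" slot.
    """
    return all(a == DONT_CARE or a == b for a, b in zip(ha, hb))

h1 = ("?", "?", "?", "4&more", "?", "High")
h2 = ("?", "Moderate", "?", "4&more", "?", "High")

print(more_general_or_equal(h1, h2))  # True
print(more_general_or_equal(h2, h1))  # False

# Hypothetical instance; values other than those in h1 and h2 are made up.
example = ("Low", "Moderate", "Low", "4&more", "Big", "High")
print(satisfies(h2, example), satisfies(h1, example))  # True True
```

h2 fixes one more attribute than h1, so it accepts a subset of the instances that h1 accepts.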
Candidate-Elimination Algorithm
Moves the boundaries of the version space, making them more specific or more general as examples arrive
- A positive example makes the specific boundary more general
- A negative example makes the general boundary more specific
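The two update rules can be sketched for conjunctive hypotheses as follows. This is a simplified illustration, not a full treatment: the `"0"` marker for "no value allowed", the two-attribute domain, and the training examples are all assumptions of this example.

```python
DONT_CARE, EMPTY = "?", "0"  # "0" is an assumed marker for "no value allowed"

def satisfies(h, x):
    return all(hv == DONT_CARE or hv == xv for hv, xv in zip(h, x))

def more_general_or_equal(ha, hb):
    return all(a == DONT_CARE or a == b or b == EMPTY for a, b in zip(ha, hb))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [tuple([EMPTY] * n)]      # specific boundary: accepts nothing
    G = [tuple([DONT_CARE] * n)]  # general boundary: accepts everything
    for x, positive in examples:
        if positive:
            # Drop general hypotheses that reject the positive example.
            G = [g for g in G if satisfies(g, x)]
            # Minimally generalize each s so that it covers x.
            S = [tuple(xv if sv in (EMPTY, xv) else DONT_CARE
                       for sv, xv in zip(s, x)) for s in S]
            S = [s for s in S if any(more_general_or_equal(g, s) for g in G)]
        else:
            # Drop specific hypotheses that accept the negative example.
            S = [s for s in S if not satisfies(s, x)]
            new_G = []
            for g in G:
                if not satisfies(g, x):
                    new_G.append(g)
                    continue
                # Minimally specialize g so that it excludes x.
                for i, values in enumerate(domains):
                    if g[i] != DONT_CARE:
                        continue
                    for v in values:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general_or_equal(h, s) for s in S):
                                new_G.append(h)
            new_G = list(dict.fromkeys(new_G))  # remove duplicates
            # Keep only the maximally general members.
            G = [g for g in new_G
                 if not any(o != g and more_general_or_equal(o, g)
                            for o in new_G)]
    return S, G

# Hypothetical domain and training data, made up for illustration.
domains = [("Low", "High"), ("2", "4&more")]
examples = [(("High", "4&more"), True), (("Low", "2"), False)]

S, G = candidate_elimination(examples, domains)
print(S)  # [('High', '4&more')]
print(G)  # [('High', '?'), ('?', '4&more')]
```

The positive example pulls S up from the empty hypothesis, and the negative example pushes G down from the all-`?` hypothesis; every hypothesis lying between the two boundaries is in the version space.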
Reference
- Professor 黃貞瑛's Machine Learning course - Concept Learning