Sear 469, Blue, S9
- 1c i) typo The answer is 21/5, as many of you obtained. I have mistyped 2 with 4.
Week 1
iid: Independent means one event's outcome doesn't provide any information about another event. Identically distributed means that if subsets of data are sampled from different parts of the dataset, their distribution is the same (identical parameters). A deterministic relationship between samples means they are not independent.
Models make a lot of assumptions and many of these assumptions are not met on test data.
The variable types define how we conduct data pre-processing
Data Scaling, Data Format: JSON is schema-flexible, semi-structured data.
Week 2
In logistic regression, a one-unit increase in a feature multiplies the odds by a constant factor, exp(β), where β is that feature's coefficient.
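A minimal numeric check of the odds statement above, assuming a one-feature logistic model. The coefficients `beta0` and `beta1` are hypothetical values chosen only for illustration.

```python
import math

def odds(beta0: float, beta1: float, x: float) -> float:
    """Odds p / (1 - p) under a one-feature logistic model."""
    p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
    return p / (1.0 - p)

# Hypothetical coefficients, not taken from the course material.
beta0, beta1 = -1.0, 0.7

# Ratio of the odds at x = 3 to the odds at x = 2 (a one-unit increase).
ratio = odds(beta0, beta1, 3.0) / odds(beta0, beta1, 2.0)

# The ratio should match exp(beta1) regardless of the starting x.
print(ratio, math.exp(beta1))
```

The same ratio appears for any starting value of x, which is why the coefficient is read as a multiplicative effect on the odds.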
Week 3
Week 4
Garbage in, garbage out: Data Processing
- (Text Parsing: paragraph level)
- Text Tokenizer: word level
- Text Normalization: generalization
- (Stop word removal: information level)
- (Text Lemmatization: information level)
- (Text Stemming: information level)
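The pipeline steps above can be sketched in plain Python. This is only an illustration: the stop-word set is a tiny hypothetical list, and `naive_stem` is a crude suffix stripper standing in for a real stemmer or lemmatizer.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}

def tokenize(text):
    """Word-level tokenization: keep runs of letters, digits, apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)

def normalize(tokens):
    """Normalization: here just lowercasing."""
    return [t.lower() for t in tokens]

def remove_stop_words(tokens):
    """Drop very common words that carry little information."""
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """Crude stemming: strip one common suffix if enough of the word remains."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = remove_stop_words(normalize(tokenize(text)))
    return [naive_stem(t) for t in tokens]

print(preprocess("The cats are chasing the mice"))
```

Note that the crude stemmer turns "chasing" into "chas", which shows why stemming is lossy compared with lemmatization.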
Week 5
For better generalization performance, we often prefer slightly biased models in the bias-variance trade-off, because they avoid the high variance that unbiased models tend to have.
k-fold cross-validation: Divide the training data into k mutually exclusive folds. Use k-1 folds to train and validate on the remaining fold, rotating so that each fold is used for validation once. It gives us multiple estimates, which provide the standard deviation of the performance metric, so the estimate is more statistically robust. However, it is more computationally expensive than using a single validation set.
LOOCV (leave-one-out cross-validation) gives an estimate with low bias but high variance. Each data point is used as a test instance exactly once, and the remaining data points are used for training.
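A minimal sketch of how the folds can be generated, assuming the data is addressed by index and no shuffling is needed. The helper name `k_fold_indices` is made up for this example. Setting k equal to the number of samples gives leave-one-out.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k mutually exclusive folds over range(n)."""
    # Spread the remainder over the first n % k folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

# 5-fold split of 10 samples: each sample is validated on exactly once.
splits = list(k_fold_indices(10, 5))

# LOOCV is the special case k == n: every validation fold has one sample.
loo = list(k_fold_indices(4, 4))

for train, val in splits:
    print("train:", train, "val:", val)
```

In practice one would train and score a model inside the loop, then report the mean and standard deviation of the k scores.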
Confusion Matrix
Precision - numerator fixed
Recall score - denominator fixed
For a random classifier, the precision-recall trade-off shows that precision converges to the actual positive-class ratio of the dataset, while recall can range from 0 to 1 depending on how the prediction threshold affects the proportions in the confusion matrix.
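The random-classifier behaviour above can be checked with expected confusion-matrix counts. This sketch assumes a classifier that predicts positive with probability q independently of the true label. The dataset sizes are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def expected_random_counts(n_pos, n_neg, q):
    """Expected TP, FP, FN when each sample is predicted positive with probability q."""
    tp = q * n_pos
    fp = q * n_neg
    fn = (1 - q) * n_pos
    return tp, fp, fn

# Hypothetical dataset: 100 positives, 900 negatives (positive ratio 0.1).
n_pos, n_neg = 100, 900

results = {}
for q in (0.2, 0.5, 0.9):
    results[q] = precision_recall(*expected_random_counts(n_pos, n_neg, q))
    print(q, results[q])
```

Precision stays at the positive ratio for every q, while recall tracks q itself.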
Week 6
- Elastic-net Regression: L1 + L2
The elastic net generally performs at least as well as LASSO Regression. While L1 regularization helps with sparsity, using no regularization at all can negatively impact accuracy.
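As a small illustration of the combined L1 + L2 penalty, here is one common way to write it. The parameter names `l1` and `l2` are assumptions for this sketch. Libraries parameterize the mix differently, for example with an overall strength and a mixing ratio.

```python
def elastic_net_penalty(weights, l1, l2):
    """L1 term (sum of |w|) plus L2 term (sum of w^2), each with its own strength."""
    l1_term = sum(abs(w) for w in weights)
    l2_term = sum(w * w for w in weights)
    return l1 * l1_term + l2 * l2_term

weights = [1.0, -2.0, 0.0]

# Both penalties active: 0.5 * 3 + 0.1 * 5
both = elastic_net_penalty(weights, l1=0.5, l2=0.1)

# Setting l2 = 0 recovers a pure LASSO-style penalty.
lasso_only = elastic_net_penalty(weights, l1=0.5, l2=0.0)

print(both, lasso_only)
```

Because LASSO is the special case with the L2 strength set to zero, a tuned elastic net has LASSO among its candidate settings.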
Week 7
Data at rest: batch. Data in transit: streaming.
Regret minimisation is used as the main objective when training online learning algorithms
The Perceptron is one of the first learning algorithms that use online learning by updating weights when mistakes are made.
Neural networks form a family of online learning algorithms that learn via SGD.
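A minimal sketch of the mistake-driven Perceptron update, processing one example at a time as it arrives. Labels are assumed to be in {-1, +1}, and the toy stream below is made up for illustration.

```python
def perceptron_online(stream, n_features, lr=1.0):
    """Process (x, y) pairs one at a time; update weights only on mistakes."""
    w = [0.0] * n_features
    b = 0.0
    mistakes = 0
    for x, y in stream:  # y is assumed to be -1 or +1
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score <= 0:  # wrong side of (or exactly on) the boundary
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
            mistakes += 1
    return w, b, mistakes

# Hypothetical linearly separable stream, repeated a few times.
stream = [([2, 1], 1), ([-1, -2], -1), ([1, 2], 1), ([-2, -1], -1)] * 5

w, b, mistakes = perceptron_online(stream, n_features=2)
print(w, b, mistakes)
```

The number of mistakes is the quantity that regret-style analyses of online learners bound.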
Week 8
- BFS - Time O(b^d), Space O(b^d) for a tree with branching factor b and depth d
- DFS - Time O(b^m), Space O(bm) for a tree with branching factor b and maximum depth m
- Graph: O(V + E) with an adjacency list or O(V^2) with an adjacency matrix for time complexity, with O(V) space complexity
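A minimal sketch of BFS and DFS on an adjacency-list graph. The only structural difference is the frontier: BFS takes from the front of a queue, DFS takes from the top of a stack. The graph below is a made-up example.

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search with a FIFO queue; returns a path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

def dfs(graph, start, goal):
    """Depth-first search with a LIFO stack; returns a path or None."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Reverse so the first-listed neighbor is explored first.
        for nxt in reversed(graph.get(node, [])):
            if nxt not in visited:
                stack.append(path + [nxt])
    return None

# Hypothetical graph: the goal G is two steps away via C, three via B and D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"], "G": []}

print(bfs(graph, "A", "G"))
print(dfs(graph, "A", "G"))
```

On this graph BFS returns the two-step path through C, while DFS commits to the B branch first and returns a longer path.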
Week 9
- BFS uses a queue while DFS uses a stack. BFS is better when the goal is close to the source, while DFS is better when the goal is far from the source
- BFS can find the optimal (shortest) solution when all steps cost the same, while DFS cannot guarantee it
- Sometimes BFS explores all states (Uninformed Search BFS, DFS), so
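- GBFS (greedy best-first search) - Time O(b^m), Space O(b^m) in the worst case for a tree with branching factor b and maximum depth m, without an optimality guarantee
- A* guarantees an optimal solution (given an admissible heuristic) using Time O(b^d), Space O(b^d) in the worst case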
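A minimal sketch contrasting the two informed searches, assuming the optimal method in the last bullet is A*. The function name, the `greedy` flag, the weighted graph and the heuristic table `h` are all made up for this example. The heuristic is chosen so that it never overestimates the true remaining cost.

```python
import heapq

def best_first(graph, h, start, goal, greedy=False):
    """graph: node -> list of (neighbor, cost); h: node -> heuristic estimate.

    greedy=False orders the frontier by g + h (A*-style);
    greedy=True orders it by h alone (greedy best-first).
    """
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for nxt, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                priority = h[nxt] if greedy else new_g + h[nxt]
                heapq.heappush(frontier, (priority, new_g, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical weighted graph and heuristic values.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 2, "B": 1, "G": 0}

astar_path, astar_cost = best_first(graph, h, "S", "G")
greedy_path, greedy_cost = best_first(graph, h, "S", "G", greedy=True)
print(astar_path, astar_cost)
print(greedy_path, greedy_cost)
```

Here the greedy variant jumps to B because it looks closest to the goal and ends with a more expensive path, while the g + h ordering finds the cheaper route through A.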
Week 10
reduces variance e.g.
The weighting can be unequal between weak learners (based on accuracy, confidence, or context). Weak learners are combined iteratively to create a strong learner.
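A small sketch of unequal weighting between weak learners. The weight formula used here is the AdaBoost-style one, 0.5 * ln((1 - err) / err), picked as one concrete example of accuracy-based weighting. The notes do not say which scheme the course uses. The error rates and votes are hypothetical.

```python
import math

def learner_weight(error_rate):
    """AdaBoost-style weight: lower error gives a larger say in the vote."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

def weighted_vote(predictions, weights):
    """Combine {-1, +1} predictions from weak learners by weighted sum."""
    score = sum(w * p for w, p in zip(weights, predictions))
    return 1 if score >= 0 else -1

# Hypothetical error rates of three weak learners on training data.
errors = [0.1, 0.4, 0.4]
weights = [learner_weight(e) for e in errors]

# The accurate learner says +1, the two weaker ones say -1.
predictions = [1, -1, -1]

weighted = weighted_vote(predictions, weights)
majority = weighted_vote(predictions, [1.0, 1.0, 1.0])
print(weighted, majority)
```

With equal weights the two weaker learners win the vote, while accuracy-based weights let the single more accurate learner decide.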
Seonglae Cho