IR CW2

Creator
Seonglae Cho
Created
2025 Apr 9 14:13
Edited
2025 Apr 10 13:9
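  1. Divide the code into per-task files first, with the shared code in shared.py.
Build the entry point with Google Fire so that common.py runs the BM25 evaluation only when --task1 true is passed, the logistic regression only with --task2, and the task 3 code only with --task3. Cache loading should likewise be limited to what each task needs.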
 
Remove all unnecessary comments and code while maintaining the same features and output.
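
The per-task flag layout in the notes above could look roughly like the sketch below. Everything here is hypothetical: the `run` and `cli` functions, the `TASK_CACHES` mapping, and the placeholder `load_cache` are illustrative names, and the cache names are only loosely modelled on the ones that appear in the logs. The one real API used is `fire.Fire`, which turns a function's keyword arguments into command-line flags.

```python
"""Sketch of a per-task entry point (all names below are hypothetical)."""


def load_cache(name):
    # Placeholder for a real cache loader; here it only records the request.
    return f"<cache:{name}>"


# Hypothetical mapping from task flag to the caches that task needs.
TASK_CACHES = {
    "task1": ["bm25_fit"],
    "task2": ["glove_embeddings", "embedding_features"],
    "task3": ["bm25_fit", "tfidf_vectors", "lm_features"],
}


def run(task1=False, task2=False, task3=False):
    """Run only the requested tasks and load only the caches they need."""
    flags = (("task1", task1), ("task2", task2), ("task3", task3))
    requested = [name for name, enabled in flags if enabled]
    loaded = {}
    for task in requested:
        for cache in TASK_CACHES[task]:
            if cache not in loaded:  # caches shared between tasks load once
                loaded[cache] = load_cache(cache)
    return {"tasks": requested, "caches": sorted(loaded)}


def cli():
    # Imported lazily so this module still imports without python-fire.
    import fire
    fire.Fire(run)
```

Calling `cli()` from an `if __name__ == "__main__":` guard in common.py would expose the flags on the command line; with Fire, a bare `--task1` is enough to set that flag to `True`.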
 

Results

  • task1
❯ python task1.py
Using device: cuda
Loading validation data...
Loaded 1103039 rows from validation_data.tsv
Fitting BM25 on validation data...
Fitting BM25...
Tokenizing passages: 100%|█████████████████████████████| 955211/955211 [00:32<00:00, 29173.89it/s]
Calculating IDF: 100%|███████████████████████████████| 692567/692567 [00:00<00:00, 2904682.59it/s]
Calculating TF: 100%|█████████████████████████████████| 955211/955211 [00:06<00:00, 137888.89it/s]
Saving doc_tf_955211 to cache took 18.84 seconds
Saving bm25_fit_955211 to cache took 25.29 seconds
BM25 fit complete.
--- Task 1: Evaluation Metrics ---
Evaluating BM25 on validation data...
Scoring validation with BM25: 100%|███████████████████████████| 1148/1148 [00:17<00:00, 66.52it/s]
BM25 Validation Performance:
AP: 0.2152
NDCG: 0.3566
NDCG@10: 0.2657
NDCG@20: 0.2923
NDCG@30: 0.3046
NDCG@40: 0.3111
NDCG@50: 0.3156
NDCG at different cutoffs:
NDCG@10: 0.2657
NDCG@20: 0.2923
NDCG@30: 0.3046
NDCG@40: 0.3111
NDCG@50: 0.3156
ap: 0.21519320769731304
ndcg: 0.3566259066774915
ndcg@10: 0.2657309767687027
ndcg@20: 0.2923048464994246
ndcg@30: 0.30463093629540466
ndcg@40: 0.31113522220035617
ndcg@50: 0.31564234358348847
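
For reference, the AP and NDCG@k figures reported above can be computed per query along the lines of the sketch below. This is an assumption-laden illustration, not the coursework code: it assumes binary relevance labels listed in ranked order, the common `2^rel - 1` gain with a `log2(rank + 1)` discount, and AP normalised by the number of relevant passages that appear in the ranking. The reported numbers would then be means over the 1148 validation queries.

```python
import math


def average_precision(relevances):
    """AP for one query; `relevances` is the ranked list of 0/1 labels."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def dcg(relevances, k=None):
    """Discounted cumulative gain over the top k (all items if k is None)."""
    top = relevances if k is None else relevances[:k]
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(top, start=1))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def mean_metric(metric, ranked_lists, **kwargs):
    """Average a per-query metric over several queries."""
    return sum(metric(r, **kwargs) for r in ranked_lists) / len(ranked_lists)
```

For example, `mean_metric(ndcg, rankings, k=10)` would give an NDCG@10 in the sense assumed here, where `rankings` holds one ranked label list per query.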
  • task2
❯ python task2.py
Using device: cuda
Loading data...
Loaded 4364339 rows from train_data.tsv
Loaded 1103039 rows from validation_data.tsv
Loaded 200 rows from test-queries.tsv
Loaded 189877 rows from candidate_passages_top1000.tsv
--- Task 2: Logistic Regression ---
Loading GloVe embeddings from cache
Loading glove_embeddings_100 from cache took 0.36 seconds
Loading embedding features from cache
Loading embedding_features_4364339 from cache took 154.88 seconds
Loading embedding features from cache
Loading embedding_features_1103039 from cache took 42.93 seconds
Training Logistic Regression models...

Training with Learning Rate: 0.1
wandb: Currently logged in as: seonglae (texonom). Use `wandb login --relogin` to force relogin
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_023530-mrf47xlw
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run lr_learning_rate=0.1_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/mrf47xlw
Training LR (LR=0.1, Batch Size=512): 100%|█████| 8525/8525 [00:21<00:00, 388.06it/s, Batch Loss=0.0268]
Evaluating LR (LR=0.1) on validation data...
Ranking validation (LR=0.1): 100%|████████████████████████████████| 1148/1148 [00:00<00:00, 1829.73it/s]
LR (LR=0.1) Validation Performance:
AP: 0.0115
NDCG: 0.1317
NDCG@10: 0.0112
NDCG@20: 0.0139
NDCG@30: 0.0151
NDCG@40: 0.0182
NDCG@50: 0.0207
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss ▄▆▂▄▄▄▄▂▂▃▁▃▄▃▆▅▁▄▃▂▄▆▆▄▂▄▂▂▁▃▃▁▃▂▆▅█▇▄▆
wandb: Run summary:
wandb: eval/ap 0.01154
wandb: eval/ndcg 0.13166
wandb: eval/ndcg@10 0.01115
wandb: eval/ndcg@20 0.01395
wandb: eval/ndcg@30 0.01506
wandb: eval/ndcg@40 0.01823
wandb: eval/ndcg@50 0.02074
wandb: train/loss 0.02432
wandb: 🚀 View run lr_learning_rate=0.1_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/mrf47xlw
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_023530-mrf47xlw/logs

Training with Learning Rate: 0.01
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_023557-ihrtm85p
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run lr_learning_rate=0.01_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/ihrtm85p
Training LR (LR=0.01, Batch Size=512): 100%|████| 8525/8525 [00:23<00:00, 356.62it/s, Batch Loss=0.0196]
Evaluating LR (LR=0.01) on validation data...
Ranking validation (LR=0.01): 100%|███████████████████████████████| 1148/1148 [00:00<00:00, 1683.86it/s]
LR (LR=0.01) Validation Performance:
AP: 0.0115
NDCG: 0.1315
NDCG@10: 0.0109
NDCG@20: 0.0135
NDCG@30: 0.0150
NDCG@40: 0.0179
NDCG@50: 0.0204
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss ▂▅▂▅▂▄█▂▄▂▅▂▄▄▅▂▄▂█▂▂▂▂▄▂▅▂▂▂▂▃▁▄▂▆▂▄▄▄▂
wandb: Run summary:
wandb: eval/ap 0.01149
wandb: eval/ndcg 0.13152
wandb: eval/ndcg@10 0.01091
wandb: eval/ndcg@20 0.01353
wandb: eval/ndcg@30 0.01503
wandb: eval/ndcg@40 0.01788
wandb: eval/ndcg@50 0.0204
wandb: train/loss 0.02445
wandb: 🚀 View run lr_learning_rate=0.01_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/ihrtm85p
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_023557-ihrtm85p/logs

Training with Learning Rate: 0.001
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_023629-v0pkfknj
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run lr_learning_rate=0.001_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/v0pkfknj
Training LR (LR=0.001, Batch Size=512): 100%|███| 8525/8525 [00:30<00:00, 278.92it/s, Batch Loss=0.0206]
Evaluating LR (LR=0.001) on validation data...
Ranking validation (LR=0.001): 100%|██████████████████████████████| 1148/1148 [00:00<00:00, 1600.86it/s]
LR (LR=0.001) Validation Performance:
AP: 0.0114
NDCG: 0.1315
NDCG@10: 0.0111
NDCG@20: 0.0139
NDCG@30: 0.0150
NDCG@40: 0.0180
NDCG@50: 0.0207
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss █▆▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb: Run summary:
wandb: eval/ap 0.01145
wandb: eval/ndcg 0.13147
wandb: eval/ndcg@10 0.01111
wandb: eval/ndcg@20 0.01388
wandb: eval/ndcg@30 0.01499
wandb: eval/ndcg@40 0.01799
wandb: eval/ndcg@50 0.02067
wandb: train/loss 0.02824
wandb: 🚀 View run lr_learning_rate=0.001_n_iterations=1_weight_decay=0.1_embedding_dim=100_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/v0pkfknj
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_023629-v0pkfknj/logs

Learning Rate Analysis (Loss):
Loss plot saved to ./output/lr_loss_analysis.png
Best LR model based on NDCG@10 has Learning Rate: 0.1
Creating LR features for test set...
Loading embedding features from cache
Loading embedding_features_189877 from cache took 8.60 seconds
Predicting with best LR model...
Ranking test (LR): 100%|████████████████████████████████████████████| 200/200 [00:00<00:00, 1230.79it/s]
Writing LR results: 100%|███████████████████████████████████████████| 200/200 [00:00<00:00, 9274.51it/s]
Results saved to ./output/LR.txt
ap: 0.011544634226747933
ndcg: 0.13166429680479658
ndcg@10: 0.011152848740558226
ndcg@20: 0.013947077461143872
ndcg@30: 0.01506142024664088
ndcg@40: 0.018232905143929216
ndcg@50: 0.020743151615587558
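
The run names above record a learning rate, a weight decay, a batch size of 512 and a single pass (`n_iterations=1`). A minimal pure-Python sketch of mini-batch logistic regression with those knobs follows. It is an assumption about the general technique, not the coursework implementation: the function names are hypothetical, the loss is taken to be mean binary cross-entropy, and weight decay is applied as an L2 term on the weights only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Probability of relevance for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, learning_rate=0.1, weight_decay=0.0,
                              batch_size=512, n_iterations=1, seed=0):
    """Mini-batch gradient descent on mean binary cross-entropy + L2 penalty."""
    rng = random.Random(seed)
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    order = list(range(len(X)))
    for _ in range(n_iterations):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            grad_w, grad_b = [0.0] * dim, 0.0
            for i in batch:
                err = predict_proba(w, b, X[i]) - y[i]  # dLoss/dlogit
                for j in range(dim):
                    grad_w[j] += err * X[i][j]
                grad_b += err
            n = len(batch)
            for j in range(dim):
                w[j] -= learning_rate * (grad_w[j] / n + weight_decay * w[j])
            b -= learning_rate * grad_b / n
    return w, b
```

With 4364339 training rows and a batch size of 512, one pass takes ceil(4364339 / 512) = 8525 steps, which matches the `8525/8525` shown in the training progress bars.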
  • task3
❯ python task3.py
Using device: cuda
Loading data...
Loaded 4364339 rows from train_data.tsv
Loaded 1103039 rows from validation_data.tsv
Loaded 200 rows from test-queries.tsv
Loaded 189877 rows from candidate_passages_top1000.tsv
Fitting BM25 on Train+Validation data...
Loading BM25 model from bm25_fit_3429679
Loading bm25_fit_3429679 from cache took 12.08 seconds
Precomputing TF-IDF vectors for validation passages...
Loading TF-IDF vectors from cache
Loading tfidf_vectors_955211_1768065 from cache took 3.57 seconds
--- Task 3: LambdaMART (XGBoost) ---
Loading GloVe embeddings from cache
Loading glove_embeddings_100 from cache took 0.37 seconds
Loading LambdaMART features from cache
Loading lm_features_-8945891972231880236 from cache took 0.60 seconds
Loading LambdaMART features from cache
Loading lm_features_-5156596684037014774 from cache took 0.16 seconds
Hyperparameter tuning for LambdaMART...

Syncing run lm_eta=0.1_max_depth=3_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.22019 train-ndcg:0.26077 eval-ndcg@10:0.20570 eval-ndcg:0.25111
[20] train-ndcg@10:0.26531 train-ndcg:0.30598 eval-ndcg@10:0.25978 eval-ndcg:0.30193
[40] train-ndcg@10:0.27352 train-ndcg:0.31471 eval-ndcg@10:0.27033 eval-ndcg:0.30994
[48] train-ndcg@10:0.27647 train-ndcg:0.31596 eval-ndcg@10:0.26947 eval-ndcg:0.30976
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:01<00:00, 1006.47it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2196
NDCG: 0.3603
NDCG@10: 0.2690
*** New best model found with NDCG@10: 0.2690 ***
Run summary:
eval/ap 0.21957
eval/ndcg 0.36031
lm_eta=0.1_max_depth=3_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=3_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.21560 train-ndcg:0.25952 eval-ndcg@10:0.19777 eval-ndcg:0.24654
[20] train-ndcg@10:0.26758 train-ndcg:0.30770 eval-ndcg@10:0.25609 eval-ndcg:0.30035
[40] train-ndcg@10:0.27384 train-ndcg:0.31425 eval-ndcg@10:0.26658 eval-ndcg:0.30771
[60] train-ndcg@10:0.28032 train-ndcg:0.32033 eval-ndcg@10:0.26733 eval-ndcg:0.30825
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2406.29it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2198
NDCG: 0.3606
NDCG@10: 0.2695
*** New best model found with NDCG@10: 0.2695 ***
Run summary:
eval/ap 0.2198
eval/ndcg 0.36058
lm_eta=0.1_max_depth=3_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=5_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.21777 train-ndcg:0.25999 eval-ndcg@10:0.18900 eval-ndcg:0.23203
[20] train-ndcg@10:0.30406 train-ndcg:0.34280 eval-ndcg@10:0.25687 eval-ndcg:0.30328
[36] train-ndcg@10:0.31730 train-ndcg:0.35624 eval-ndcg@10:0.26207 eval-ndcg:0.30335
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2480.90it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2140
NDCG: 0.3561
NDCG@10: 0.2624
Run summary:
eval/ap 0.21404
eval/ndcg 0.35613
lm_eta=0.1_max_depth=5_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=5_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.22380 train-ndcg:0.26677 eval-ndcg@10:0.19401 eval-ndcg:0.23902
[20] train-ndcg@10:0.30390 train-ndcg:0.34278 eval-ndcg@10:0.25962 eval-ndcg:0.29974
[23] train-ndcg@10:0.30544 train-ndcg:0.34488 eval-ndcg@10:0.25909 eval-ndcg:0.30129
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|███████████████████████████████████████████| 1148/1148 [00:01<00:00, 995.35it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2132
NDCG: 0.3553
NDCG@10: 0.2652
Run summary:
eval/ap 0.21316
eval/ndcg 0.3553
lm_eta=0.1_max_depth=5_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=3_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.22019 train-ndcg:0.26077 eval-ndcg@10:0.20570 eval-ndcg:0.25111
[20] train-ndcg@10:0.26508 train-ndcg:0.30537 eval-ndcg@10:0.25584 eval-ndcg:0.29945
[37] train-ndcg@10:0.26712 train-ndcg:0.30800 eval-ndcg@10:0.25649 eval-ndcg:0.29974
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2479.89it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2154
NDCG: 0.3569
NDCG@10: 0.2633
Run summary:
eval/ap 0.21535
eval/ndcg 0.35685
lm_eta=0.05_max_depth=3_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=3_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.21560 train-ndcg:0.25952 eval-ndcg@10:0.19777 eval-ndcg:0.24654
[20] train-ndcg@10:0.26551 train-ndcg:0.30606 eval-ndcg@10:0.25580 eval-ndcg:0.30035
[28] train-ndcg@10:0.26588 train-ndcg:0.30702 eval-ndcg@10:0.25241 eval-ndcg:0.29921
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2537.79it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2132
NDCG: 0.3549
NDCG@10: 0.2616
Run summary:
eval/ap 0.21317
eval/ndcg 0.35486
lm_eta=0.05_max_depth=3_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=5_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.21777 train-ndcg:0.25999 eval-ndcg@10:0.18900 eval-ndcg:0.23203
[20] train-ndcg@10:0.29526 train-ndcg:0.33497 eval-ndcg@10:0.26208 eval-ndcg:0.30177
[24] train-ndcg@10:0.29800 train-ndcg:0.33742 eval-ndcg@10:0.26368 eval-ndcg:0.30334
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|███████████████████████████████████████████| 1148/1148 [00:01<00:00, 991.07it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2116
NDCG: 0.3539
NDCG@10: 0.2611
Run summary:
eval/ap 0.21158
eval/ndcg 0.35388
lm_eta=0.05_max_depth=5_subsample=0.8_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=5_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.22380 train-ndcg:0.26677 eval-ndcg@10:0.19401 eval-ndcg:0.23902
[20] train-ndcg@10:0.29625 train-ndcg:0.33569 eval-ndcg@10:0.25923 eval-ndcg:0.30296
[25] train-ndcg@10:0.29998 train-ndcg:0.33807 eval-ndcg@10:0.25884 eval-ndcg:0.30167
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2453.44it/s]
Validation Performance (Params: {'colsample_bytree': 0.8, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2146
NDCG: 0.3562
NDCG@10: 0.2620
Run summary:
eval/ap 0.21463
eval/ndcg 0.3562
lm_eta=0.05_max_depth=5_subsample=1.0_colsample_bytree=0.8_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=3_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.16085 train-ndcg:0.20432 eval-ndcg@10:0.14774 eval-ndcg:0.19503
[20] train-ndcg@10:0.25772 train-ndcg:0.29968 eval-ndcg@10:0.24434 eval-ndcg:0.28878
[40] train-ndcg@10:0.26886 train-ndcg:0.30968 eval-ndcg@10:0.26218 eval-ndcg:0.30321
[51] train-ndcg@10:0.27459 train-ndcg:0.31535 eval-ndcg@10:0.25891 eval-ndcg:0.30009
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2470.87it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2182
NDCG: 0.3593
NDCG@10: 0.2690
Run summary:
eval/ap 0.21817
eval/ndcg 0.35932
lm_eta=0.1_max_depth=3_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=3_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.16332 train-ndcg:0.20676 eval-ndcg@10:0.14969 eval-ndcg:0.19568
[20] train-ndcg@10:0.25789 train-ndcg:0.29986 eval-ndcg@10:0.24398 eval-ndcg:0.28736
[40] train-ndcg@10:0.26933 train-ndcg:0.31059 eval-ndcg@10:0.25828 eval-ndcg:0.29862
[48] train-ndcg@10:0.27294 train-ndcg:0.31414 eval-ndcg@10:0.25657 eval-ndcg:0.29739
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|███████████████████████████████████████████| 1148/1148 [00:01<00:00, 996.76it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2120
NDCG: 0.3538
NDCG@10: 0.2614
Run summary:
eval/ap 0.21204
eval/ndcg 0.35384
lm_eta=0.1_max_depth=3_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=5_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.20338 train-ndcg:0.24589 eval-ndcg@10:0.18701 eval-ndcg:0.23185
[20] train-ndcg@10:0.29523 train-ndcg:0.33387 eval-ndcg@10:0.25883 eval-ndcg:0.30422
[31] train-ndcg@10:0.30443 train-ndcg:0.34396 eval-ndcg@10:0.26321 eval-ndcg:0.30477
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2450.34it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2154
NDCG: 0.3568
NDCG@10: 0.2627
Run summary:
eval/ap 0.21542
eval/ndcg 0.35685
lm_eta=0.1_max_depth=5_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.1_max_depth=5_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.20201 train-ndcg:0.24555 eval-ndcg@10:0.18321 eval-ndcg:0.22893
[20] train-ndcg@10:0.29706 train-ndcg:0.33638 eval-ndcg@10:0.26394 eval-ndcg:0.30550
[33] train-ndcg@10:0.31134 train-ndcg:0.34996 eval-ndcg@10:0.25742 eval-ndcg:0.30081
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2472.17it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2204
NDCG: 0.3610
NDCG@10: 0.2718
*** New best model found with NDCG@10: 0.2718 ***
Run summary:
eval/ap 0.22044
eval/ndcg 0.36105
lm_eta=0.1_max_depth=5_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=3_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.16085 train-ndcg:0.20432 eval-ndcg@10:0.14774 eval-ndcg:0.19503
[19] train-ndcg@10:0.25531 train-ndcg:0.29734 eval-ndcg@10:0.24073 eval-ndcg:0.28499
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|███████████████████████████████████████████| 1148/1148 [00:01<00:00, 987.10it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2022
NDCG: 0.3448
NDCG@10: 0.2508
Run summary:
eval/ap 0.20224
eval/ndcg 0.34477
lm_eta=0.05_max_depth=3_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=3_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.16332 train-ndcg:0.20676 eval-ndcg@10:0.14969 eval-ndcg:0.19568
[20] train-ndcg@10:0.25461 train-ndcg:0.29604 eval-ndcg@10:0.23916 eval-ndcg:0.28411
[40] train-ndcg@10:0.26027 train-ndcg:0.30192 eval-ndcg@10:0.24627 eval-ndcg:0.29084
[60] train-ndcg@10:0.26350 train-ndcg:0.30541 eval-ndcg@10:0.25016 eval-ndcg:0.29323
[65] train-ndcg@10:0.26526 train-ndcg:0.30691 eval-ndcg@10:0.25122 eval-ndcg:0.29402
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2418.63it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 3, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2094
NDCG: 0.3519
NDCG@10: 0.2576
Run summary:
eval/ap 0.20944
eval/ndcg 0.35191
lm_eta=0.05_max_depth=3_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=5_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.20338 train-ndcg:0.24589 eval-ndcg@10:0.18701 eval-ndcg:0.23185
[20] train-ndcg@10:0.28664 train-ndcg:0.32758 eval-ndcg@10:0.26051 eval-ndcg:0.30217
[26] train-ndcg@10:0.29121 train-ndcg:0.33146 eval-ndcg@10:0.26119 eval-ndcg:0.30329
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|██████████████████████████████████████████| 1148/1148 [00:00<00:00, 2395.67it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 0.8, 'tree_method': 'gpu_hist'}):
AP: 0.2147
NDCG: 0.3555
NDCG@10: 0.2621
Run summary:
eval/ap 0.21473
eval/ndcg 0.35548
lm_eta=0.05_max_depth=5_subsample=0.8_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Syncing run lm_eta=0.05_max_depth=5_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg
Training with params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
E.g. tree_method = "hist", device = "cuda"
self.starting_round = model.num_boosted_rounds()
[0] train-ndcg@10:0.20201 train-ndcg:0.24555 eval-ndcg@10:0.18321 eval-ndcg:0.22893
[20] train-ndcg@10:0.29043 train-ndcg:0.32965 eval-ndcg@10:0.25734 eval-ndcg:0.29810
[40] train-ndcg@10:0.30567 train-ndcg:0.34512 eval-ndcg@10:0.26130 eval-ndcg:0.30292
[50] train-ndcg@10:0.31354 train-ndcg:0.35216 eval-ndcg@10:0.25838 eval-ndcg:0.29965
E.g. tree_method = "hist", device = "cuda"
return func(**kwargs)
Ranking validation (LM): 100%|███████████████████████████████████████████| 1148/1148 [00:01<00:00, 982.98it/s]
Validation Performance (Params: {'colsample_bytree': 1.0, 'eta': 0.05, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}):
AP: 0.2162
NDCG: 0.3576
NDCG@10: 0.2656
Run summary:
eval/ap 0.21616
eval/ndcg 0.35763
lm_eta=0.05_max_depth=5_subsample=1.0_colsample_bytree=1.0_tree_method=gpu_hist_objective=rank:ndcg at:

Best LambdaMART model found with params: {'colsample_bytree': 1.0, 'eta': 0.1, 'eval_metric': ['ndcg@10', 'ndcg'], 'max_depth': 5, 'objective': 'rank:ndcg', 'subsample': 1.0, 'tree_method': 'gpu_hist'}
Best Validation NDCG@10: 0.2718
Best LambdaMART model saved to ./output/best_lambdamart.xgb
Fitting BM25/TF-IDF components on test candidate passages...
Loading BM25 model from bm25_fit_182469
Loading bm25_fit_182469 from cache took 11.16 seconds
Precomputing TF-IDF vectors for test passages...
Loading TF-IDF vectors from cache
Loading tfidf_vectors_182469_207533 from cache took 14.73 seconds
Creating LM features for test set...
Creating features for LambdaMART...
Generating LM Features: 100%|███████████████████████████████████████████████| 200/200 [00:11<00:00, 17.63it/s]
Saving lm_features_2325484224358829478 to cache took 2.94 seconds
Predicting with LM model...
Ranking test (LM): 100%|██████████████████████████████████████████████████| 200/200 [00:00<00:00, 2633.12it/s]
Writing LM results: 100%|█████████████████████████████████████████████████| 200/200 [00:00<00:00, 3580.21it/s]
Results saved to ./output/LM.txt
ap: 0.22044265887001221
ndcg: 0.36104720928566997
ndcg@10: 0.27184225627625747
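
The sweep in the task 3 log covers 16 configurations. The sketch below rebuilds that grid and picks the best configuration by validation NDCG@10. The grid values, the run order and the NDCG@10 figures are copied from the log; the variable and function names are hypothetical, and using `itertools.product` is an assumption that simply reproduces the observed order (`colsample_bytree` varying slowest, `subsample` fastest).

```python
from itertools import product

# Values read off the run names in the log, listed in sweep order.
GRID = {
    "colsample_bytree": [0.8, 1.0],
    "eta": [0.1, 0.05],
    "max_depth": [3, 5],
    "subsample": [0.8, 1.0],
}
# Parameters that stay fixed in every run of the log.
FIXED = {
    "objective": "rank:ndcg",
    "eval_metric": ["ndcg@10", "ndcg"],
    "tree_method": "gpu_hist",
}

# Validation NDCG@10 per configuration, copied from the log in run order.
VAL_NDCG10 = [0.2690, 0.2695, 0.2624, 0.2652, 0.2633, 0.2616, 0.2611, 0.2620,
              0.2690, 0.2614, 0.2627, 0.2718, 0.2508, 0.2576, 0.2621, 0.2656]


def iter_configs(grid=GRID, fixed=FIXED):
    """Yield one parameter dict per grid point, last key varying fastest."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield {**fixed, **dict(zip(keys, values))}


configs = list(iter_configs())
best_index = max(range(len(configs)), key=lambda i: VAL_NDCG10[i])
best = configs[best_index]
print(best_index + 1, best["eta"], best["max_depth"],
      best["subsample"], best["colsample_bytree"])
```

This selects the twelfth run (`eta=0.1`, `max_depth=5`, `subsample=1.0`, `colsample_bytree=1.0`), in agreement with the "Best LambdaMART model" line. The repeated `E.g. tree_method = "hist", device = "cuda"` fragments in the log look like XGBoost's deprecation warning for `gpu_hist`, so replacing `"tree_method": "gpu_hist"` with `"tree_method": "hist", "device": "cuda"` would probably silence them.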
  • task4
❯ python task4.py
Using device: cuda
Loading data...
Loaded 4364339 rows from train_data.tsv
Loaded 1103039 rows from validation_data.tsv
Loaded 200 rows from test-queries.tsv
Loaded 189877 rows from candidate_passages_top1000.tsv
Fitting BM25 on Train+Validation data...
Loading BM25 model from bm25_fit_3429679
Loading bm25_fit_3429679 from cache took 12.55 seconds
Precomputing TF-IDF vectors for validation passages...
Time taken to precompute TF-IDF vectors: 0.31 seconds
Loading TF-IDF vectors from cache
Loading tfidf_vectors_955211_1768065 from cache took 3.71 seconds
--- Task 4: Neural Networks (PyTorch) ---
Loading GloVe embeddings from cache
Loading glove_embeddings_100 from cache took 0.99 seconds
Building vocabulary...
Building vocabulary: 100%|████████████████████████████████████| 10934756/10934756 [01:00<00:00, 179836.47it/s]
Vocabulary size: 404937
Preparing embedding matrix...
Found 150222 / 404937 words in GloVe
--- Preparing features for RankMLP model ---
Pre-computing TF-IDF features for all datasets...
Loading TF-IDF vectors from cache
Loading tfidf_vectors_3429679_1768065 from cache took 13.26 seconds
Loading TF-IDF vectors from cache
Loading tfidf_vectors_3429679_1768065 from cache took 12.93 seconds
Loading TF-IDF vectors from cache
Loading tfidf_vectors_3429679_1768065 from cache took 12.75 seconds
Creating feature vectors for training data...
100%|████████████████████████████████████████████████████████████| 4364339/4364339 [02:25<00:00, 30087.33it/s]
Creating feature vectors for validation data...
100%|████████████████████████████████████████████████████████████| 1103039/1103039 [00:37<00:00, 29442.78it/s]
Creating feature vectors for test data...
100%|██████████████████████████████████████████████████████████████| 189877/189877 [00:06<00:00, 30543.35it/s]

--- Training RankMLP Model with mse loss ---
wandb: Currently logged in as: seonglae (texonom). Use `wandb login --relogin` to force relogin
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021434-ns1y1je9
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/ns1y1je9
RankMLP-mse with mse: 100%|█████████████████████████████████| 8525/8525 [00:58<00:00, 146.30it/s, Loss=0.0000]
RankMLP-mse with mse: Train Loss: 0.0017, Val NDCG@10: 0.2307
*** New best RankMLP model found with mse loss: Val NDCG@10: 0.2307 ***
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: train/loss ▄▃▂▂▁▄▁▄█▁▁▁▅▄▁▄▄▁▁▁▁▄▁▁▁▁▁▁▁▁█▅▁███▄▁▁▁
wandb: Run summary:
wandb: eval/ap 0.19298
wandb: eval/ndcg 0.32968
wandb: eval/ndcg@10 0.23066
wandb: train/loss 0.00168
wandb: 🚀 View run rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/ns1y1je9
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_021434-ns1y1je9/logs

--- Training RankMLP Model with bce loss ---
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021549-aunhlues
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/aunhlues
RankMLP-bce with bce: 100%|█████████████████████████████████| 8525/8525 [00:46<00:00, 184.68it/s, Loss=0.1466]
RankMLP-bce with bce: Train Loss: 0.0111, Val NDCG@10: 0.2610
*** New best RankMLP model found with bce loss: Val NDCG@10: 0.2610 ***
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: train/loss █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb: Run summary:
wandb: eval/ap 0.21314
wandb: eval/ndcg 0.3525
wandb: eval/ndcg@10 0.26101
wandb: train/loss 0.01114
wandb: 🚀 View run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/aunhlues
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_021549-aunhlues/logs

--- Training RankMLP Model with hinge loss ---
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021644-3u4mjkv6
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/3u4mjkv6 RankMLP-hinge with hinge: 100%|█████████████████████████████| 8525/8525 [00:33<00:00, 255.21it/s, Loss=0.0000] RankMLP-hinge with hinge: Train Loss: 0.0040, Val NDCG@10: 0.2488 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: train/loss ▂▁▁▁▁▃▃█▃▁▁▁▆▁▁▁▁▁▁▁▃▁▄▁▁▁▃▃▃▁▁▃▁▃▁▃▃▃▃▁ wandb: wandb: Run summary: wandb: eval/ap 0.20307 wandb: eval/ndcg 0.34325 wandb: eval/ndcg@10 0.24878 wandb: train/loss 0.00396 wandb: wandb: 🚀 View run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/3u4mjkv6 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_021644-3u4mjkv6/logs --- Training RankMLP Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021733-s9j6hsdd wandb: Run `wandb offline` to turn off syncing. 
wandb: Syncing run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/s9j6hsdd RankMLP-ranknet with ranknet: 100%|█████████████████████████| 8525/8525 [00:57<00:00, 147.38it/s, Loss=0.0012] RankMLP-ranknet with ranknet: Train Loss: 0.0121, Val NDCG@10: 0.2619 *** New best RankMLP model found with ranknet loss: Val NDCG@10: 0.2619 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: train/loss █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.21582 wandb: eval/ndcg 0.35586 wandb: eval/ndcg@10 0.26194 wandb: train/loss 0.01209 wandb: wandb: 🚀 View run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/s9j6hsdd wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_021733-s9j6hsdd/logs --- Training RankMLP Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021846-i012whjn wandb: Run `wandb offline` to turn off syncing. 
wandb: Syncing run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/i012whjn RankMLP-approx_ndcg with approx_ndcg: Train Loss: 0.0000, Val NDCG@10: 0.0083 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: train/loss ▁ wandb: wandb: Run summary: wandb: eval/ap 0.00857 wandb: eval/ndcg 0.11984 wandb: eval/ndcg@10 0.00833 wandb: train/loss 0 wandb: wandb: 🚀 View run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/i012whjn wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_021846-i012whjn/logs --- Training Siamese CNN Model with mse loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_021900-g7vzh4ue wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_mse_model=SiameseCNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=64 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/g7vzh4ue
 

Report

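The key point is that nothing in the report contradicts the facts of the implementation, such as the hyperparameters actually used.
RankNet: what is it exactly? Should it be moved down into the methodology section?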

Task1

Task2

Describe how you perform input processing & representation or features used

Task3

histogram-based splitting method
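As a reminder of what "histogram-based splitting" refers to, here is a minimal sketch, assuming equal-width bins and a squared-error gain. The notes do not say which library or binning scheme Task 3 actually used, so the function name `best_histogram_split` and its parameters are hypothetical and this is not the Task 3 code.

```python
def best_histogram_split(values, targets, n_bins=8):
    """Find a split threshold for one feature using a histogram of bins.

    Assumption: equal-width bins and a squared-error (variance reduction)
    gain. Real gradient-boosting libraries use gradient statistics and
    their own binning, so treat this only as an illustration of the idea.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return None, 0.0  # constant feature: nothing to split on

    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    # One pass over the data fills the histogram.
    for v, t in zip(values, targets):
        b = min(int((v - lo) / width), n_bins - 1)
        sums[b] += t
        counts[b] += 1

    total_sum, total_cnt = sum(sums), sum(counts)
    base = total_sum ** 2 / total_cnt
    best_threshold, best_gain = None, 0.0
    left_sum, left_cnt = 0.0, 0
    # Candidate splits are bin boundaries only, so the scan costs
    # O(n_bins) instead of O(number of distinct values).
    for b in range(n_bins - 1):
        left_sum += sums[b]
        left_cnt += counts[b]
        right_cnt = total_cnt - left_cnt
        if left_cnt == 0 or right_cnt == 0:
            continue
        right_sum = total_sum - left_sum
        gain = left_sum ** 2 / left_cnt + right_sum ** 2 / right_cnt - base
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain


threshold, gain = best_histogram_split(
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    n_bins=4,
)
print(threshold, gain)
```

The point worth making in the report is the cost argument: after binning, each split search only scans bin boundaries rather than every sorted feature value.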

Task4

Justify your choice by describing why you chose a particular architecture and how it fits to our problem.

Figures

cluster - neural net
histogram - LambdaMART
bar chart - logistic regression

Tables

All done?
 
 
 
 
There is no notion of epochs here: each run in the logs is a single pass of 8525 batches over the training data.
 
Compare the losses: MSE as now, Binary Cross-Entropy loss, pairwise hinge loss, RankNet loss, and NDCG, each of them run per architecture as well. Pull the repeated code in run out into functions.
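To keep the definitions straight for the report, here is a minimal pure-Python sketch of four of the losses being compared, computed over the candidate list of a single query. The function names, the all-pairs construction, and the default margin are assumptions made for illustration; the actual task4.py implementation (batched tensors, how pairs are sampled) is not shown in these notes. The NDCG surrogate (`approx_ndcg` in the logs) is left out because its exact form is not recorded here.

```python
import math


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mse_loss(scores, labels):
    # Pointwise: regress the score onto the relevance label.
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def bce_loss(logits, labels):
    # Pointwise: binary cross-entropy on raw logits, in the stable form
    # softplus(z) - z * y.
    return sum(_softplus(z) - z * y for z, y in zip(logits, labels)) / len(logits)


def _preference_pairs(scores, labels):
    # Assumption: every (more relevant, less relevant) pair in the list.
    for s_i, y_i in zip(scores, labels):
        for s_j, y_j in zip(scores, labels):
            if y_i > y_j:
                yield s_i, s_j


def pairwise_hinge_loss(scores, labels, margin=1.0):
    # Pairwise: penalise pairs whose score gap is below the margin.
    losses = [max(0.0, margin - (sp - sn)) for sp, sn in _preference_pairs(scores, labels)]
    return sum(losses) / len(losses) if losses else 0.0


def ranknet_loss(scores, labels):
    # Pairwise: logistic loss on the score difference, log(1 + exp(-(sp - sn))).
    losses = [_softplus(-(sp - sn)) for sp, sn in _preference_pairs(scores, labels)]
    return sum(losses) / len(losses) if losses else 0.0


scores = [2.0, 0.5, -1.0]
labels = [1, 0, 0]
for fn in (mse_loss, bce_loss, pairwise_hinge_loss, ranknet_loss):
    print(fn.__name__, round(fn(scores, labels), 4))
```

MSE and BCE score each query-passage pair independently, while the hinge and RankNet losses only look at score differences between a relevant and a non-relevant passage, which is the distinction the methodology section would need to state.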
Use not only the MLP but also SiameseCNN: train both models, then pick the best one. For fairness and compatibility, the SiameseRankingModel and SiameseCNNEncoder code also needs to be fixed so that it actually runs.
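The "train everything, then pick the best" step can be sketched as one shared loop over architectures and losses. `select_best` and `train_and_eval` are hypothetical names, not functions from task4.py. The stub below replays the validation NDCG@10 values printed in the run1 log instead of training anything, and `approx_ndcg` is left out because the RankMLP result for it is not in the log excerpt.

```python
from itertools import product


def select_best(models, losses, train_and_eval):
    """Return (model, loss, score) with the highest validation score.

    `train_and_eval(model_name, loss_name)` is a hypothetical callable that
    trains one configuration and returns its validation NDCG@10.
    """
    best = None
    for model_name, loss_name in product(models, losses):
        score = train_and_eval(model_name, loss_name)
        if best is None or score > best[2]:
            best = (model_name, loss_name, score)
    return best


# Validation NDCG@10 values copied from the run1 log.
RUN1_NDCG10 = {
    ("SiameseCNN", "mse"): 0.0145,
    ("SiameseCNN", "bce"): 0.0140,
    ("SiameseCNN", "hinge"): 0.0145,
    ("SiameseCNN", "ranknet"): 0.0114,
    ("RankMLP", "mse"): 0.2632,
    ("RankMLP", "bce"): 0.2532,
    ("RankMLP", "hinge"): 0.2522,
    ("RankMLP", "ranknet"): 0.2546,
}

best = select_best(
    ["SiameseCNN", "RankMLP"],
    ["mse", "bce", "hinge", "ranknet"],
    lambda model, loss: RUN1_NDCG10[(model, loss)],
)
print(best)
```

The logged "New best ... model found" messages track the best loss per architecture, whereas this sketch keeps one global winner; either way, routing every configuration through the same callable gives each model the same data and metric.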
 
 
run1
❯ python task4.py Using device: cuda Loading data... Loaded 4364339 rows from train_data.tsv Loaded 1103039 rows from validation_data.tsv Loaded 200 rows from test-queries.tsv Loaded 189877 rows from candidate_passages_top1000.tsv Fitting BM25 on Train+Validation data... Loading BM25 model from bm25_fit_3429679 Loading bm25_fit_3429679 from cache took 11.81 seconds Precomputing TF-IDF vectors for validation passages... Time taken to precompute TF-IDF vectors: 0.28 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_955211_1768065 from cache took 3.55 seconds --- Task 4: Neural Networks (PyTorch) --- Loading GloVe embeddings from cache Loading glove_embeddings_100 from cache took 0.90 seconds Building vocabulary... Building vocabulary: 100%|██████████████████████████████| 10934756/10934756 [00:56<00:00, 192147.43it/s] Vocabulary size: 404937 Preparing embedding matrix... Found 150222 / 404937 words in GloVe --- Preparing features for RankMLP model --- Pre-computing TF-IDF features for all datasets... Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 12.15 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 11.89 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 12.44 seconds Creating feature vectors for training data... 100%|██████████████████████████████████████████████████████| 4364339/4364339 [02:20<00:00, 31111.42it/s] Creating feature vectors for validation data... 100%|██████████████████████████████████████████████████████| 1103039/1103039 [00:35<00:00, 30929.58it/s] Creating feature vectors for test data... 100%|████████████████████████████████████████████████████████| 189877/189877 [00:05<00:00, 32054.86it/s] --- Training Siamese RNN Model with mse loss --- wandb: Currently logged in as: seonglae (texonom). Use `wandb login --relogin` to force relogin wandb: Using wandb-core as the SDK backend. 
Please refer to https://wandb.me/wandb-core for more information. wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_040743-4p2p1vpc wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_mse_model=SiameseRNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/4p2p1vpc SiameseRNN-mse with mse: 100%|█████████████████████████| 8525/8525 [09:22<00:00, 15.15it/s, Loss=0.0000] Evaluating SiameseRNN-mse: 100%|████████████████████████████████████| 2155/2155 [01:09<00:00, 31.03it/s] Ranking validation (SiameseRNN-mse): 100%|█████████████████████████| 1148/1148 [00:01<00:00, 755.90it/s] SiameseRNN-mse Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseRNN-mse with mse: Train Loss: 0.8168 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 *** New best Siamese RNN model found with mse loss: Val NDCG@10: 0.0145 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▃▂▁▁▁▁▂▂▂▁▁▁▂▁▁▁▁▂▁▁▁▂▂▂▁▁▁▁▁▁▂▂▂▁▃▁▃▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 0.81685 wandb: wandb: 🚀 View run siamese_rnn_mse_model=SiameseRNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/4p2p1vpc wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media 
file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_040743-4p2p1vpc/logs --- Training Siamese RNN Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_041824-v0zoahwu wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_bce_model=SiameseRNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/v0zoahwu SiameseRNN-bce with bce: 100%|█████████████████████████| 8525/8525 [09:18<00:00, 15.25it/s, Loss=0.0011] Evaluating SiameseRNN-bce: 100%|████████████████████████████████████| 2155/2155 [01:10<00:00, 30.70it/s] Ranking validation (SiameseRNN-bce): 100%|████████████████████████| 1148/1148 [00:00<00:00, 2598.77it/s] SiameseRNN-bce Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseRNN-bce with bce: Train Loss: 6.0640 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▂▃▁▂▂▃▃▁▁▁▁▃▁▄▃▃▁▁▁▃▂▁█▃▁▁▁▃▁▁▃▄▁▃▁▃▁▃▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 6.064 wandb: wandb: 🚀 View run siamese_rnn_bce_model=SiameseRNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/v0zoahwu wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 
0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_041824-v0zoahwu/logs --- Training Siamese RNN Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_042858-o2m91ifu wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_hinge_model=SiameseRNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/o2m91ifu SiameseRNN-hinge with hinge: 100%|█████████████████████| 8525/8525 [09:19<00:00, 15.25it/s, Loss=0.0000] Evaluating SiameseRNN-hinge: 100%|██████████████████████████████████| 2155/2155 [01:07<00:00, 31.74it/s] Ranking validation (SiameseRNN-hinge): 100%|██████████████████████| 1148/1148 [00:00<00:00, 2559.32it/s] SiameseRNN-hinge Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseRNN-hinge with hinge: Train Loss: 2.8103 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▇▂▄▄▁▇▄▄▄▄▁▃▆▁▄▁▄▃▁▁▁▄▄▁▃▄▁▃▄▁▄▁▁▁▁▁▁█▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 2.81025 wandb: wandb: 🚀 View run siamese_rnn_hinge_model=SiameseRNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/o2m91ifu wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_042858-o2m91ifu/logs --- Training Siamese RNN Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043929-r1bmiaya wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_ranknet_model=SiameseRNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/r1bmiaya SiameseRNN-ranknet with ranknet: 100%|█████████████████| 8525/8525 [09:19<00:00, 15.23it/s, Loss=0.0013] Evaluating SiameseRNN-ranknet: 100%|████████████████████████████████| 2155/2155 [01:10<00:00, 30.72it/s] Ranking validation (SiameseRNN-ranknet): 100%|████████████████████| 1148/1148 [00:00<00:00, 2574.64it/s] SiameseRNN-ranknet Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseRNN-ranknet with ranknet: Train Loss: 7.2062 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 7.20618 wandb: wandb: 🚀 View run siamese_rnn_ranknet_model=SiameseRNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/r1bmiaya wandb: ⭐️ View 
project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_043929-r1bmiaya/logs --- Training Siamese RNN Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_045004-sq88a2ng wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_approx_ndcg_model=SiameseRNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/sq88a2ng SiameseRNN-approx_ndcg with approx_ndcg: 100%|█████████| 8525/8525 [10:02<00:00, 14.14it/s, Loss=1.0000] Evaluating SiameseRNN-approx_ndcg: 100%|████████████████████████████| 2155/2155 [01:09<00:00, 31.02it/s] Ranking validation (SiameseRNN-approx_ndcg): 100%|████████████████| 1148/1148 [00:00<00:00, 2375.20it/s] SiameseRNN-approx_ndcg Validation Metrics: NDCG@10: 0.0126 NDCG@20: 0.0150 NDCG@30: 0.0180 NDCG@40: 0.0195 NDCG@50: 0.0203 SiameseRNN-approx_ndcg with approx_ndcg: Train Loss: -0.0647 NDCG Metrics: NDCG@10: 0.0126 NDCG@20: 0.0150 NDCG@30: 0.0180 NDCG@40: 0.0195 NDCG@50: 0.0203 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▇▆▇▇██▆▇███▇██▇██▆█▅█▆██▆██▁█▆██▃███████ wandb: wandb: Run summary: wandb: eval/ap 0.01224 wandb: eval/ndcg 0.12977 wandb: eval/ndcg@10 0.01261 wandb: eval/ndcg@20 0.01498 wandb: eval/ndcg@30 0.01799 wandb: eval/ndcg@40 0.0195 wandb: eval/ndcg@50 0.02029 wandb: train/loss -0.06471 wandb: wandb: 🚀 View run 
siamese_rnn_approx_ndcg_model=SiameseRNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/sq88a2ng wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_045004-sq88a2ng/logs --- Training Siamese CNN Model with mse loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_050125-8wbyzk0j wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_mse_model=SiameseCNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/8wbyzk0j SiameseCNN-mse with mse: 100%|█████████████████████████| 8525/8525 [09:26<00:00, 15.05it/s, Loss=0.0000] Evaluating SiameseCNN-mse: 100%|████████████████████████████████████| 2155/2155 [01:12<00:00, 29.75it/s] Ranking validation (SiameseCNN-mse): 100%|████████████████████████| 1148/1148 [00:00<00:00, 2652.38it/s] SiameseCNN-mse Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseCNN-mse with mse: Train Loss: 0.7950 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 *** New best Siamese CNN model found with mse loss: Val NDCG@10: 0.0145 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▅▁▁█▁▁▁▆▁▃▁▁▃▁▃▄▁▃▁▃▁▄▃▄▁▁▁▁▁▃▄▁▁▁▁▁▃▃▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 
0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 0.79496 wandb: wandb: 🚀 View run siamese_cnn_mse_model=SiameseCNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/8wbyzk0j wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_050125-8wbyzk0j/logs --- Training Siamese CNN Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_051210-gvv45w38 wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_bce_model=SiameseCNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/gvv45w38 SiameseCNN-bce with bce: 100%|█████████████████████████| 8525/8525 [09:15<00:00, 15.35it/s, Loss=0.0013] Evaluating SiameseCNN-bce: 100%|████████████████████████████████████| 2155/2155 [01:12<00:00, 29.77it/s] Ranking validation (SiameseCNN-bce): 100%|████████████████████████| 1148/1148 [00:00<00:00, 2381.61it/s] SiameseCNN-bce Validation Metrics: NDCG@10: 0.0140 NDCG@20: 0.0174 NDCG@30: 0.0190 NDCG@40: 0.0222 NDCG@50: 0.0249 SiameseCNN-bce with bce: Train Loss: 6.7011 NDCG Metrics: NDCG@10: 0.0140 NDCG@20: 0.0174 NDCG@30: 0.0190 NDCG@40: 0.0222 NDCG@50: 0.0249 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▄▂█▁▁▁▁▁▄▆▁▄▁▄▄▃█▄▁▁▁▇▁▁▅▁▁▁▁▁▆▁▁▁▁▃▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01547 wandb: eval/ndcg 0.13287 wandb: eval/ndcg@10 0.01401 wandb: eval/ndcg@20 0.01739 wandb: eval/ndcg@30 0.01903 wandb: eval/ndcg@40 0.02224 wandb: 
eval/ndcg@50 0.02491 wandb: train/loss 6.70114 wandb: wandb: 🚀 View run siamese_cnn_bce_model=SiameseCNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/gvv45w38 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_051210-gvv45w38/logs --- Training Siamese CNN Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_052242-6kujxb8e wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_hinge_model=SiameseCNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/6kujxb8e SiameseCNN-hinge with hinge: 100%|█████████████████████| 8525/8525 [09:14<00:00, 15.36it/s, Loss=0.0000] Evaluating SiameseCNN-hinge: 100%|██████████████████████████████████| 2155/2155 [01:12<00:00, 29.76it/s] Ranking validation (SiameseCNN-hinge): 100%|██████████████████████| 1148/1148 [00:00<00:00, 2564.60it/s] SiameseCNN-hinge Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseCNN-hinge with hinge: Train Loss: 2.6537 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▃▂▁▆▄▅▁▁▂▅▅▁▅▅▅▅▁▁▁▁▁▁█▅█▁▁▅▄▄▁▁▁▄▄▁▄▁▄▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: 
eval/ndcg@50 0.02841 wandb: train/loss 2.65367 wandb: wandb: 🚀 View run siamese_cnn_hinge_model=SiameseCNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/6kujxb8e wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_052242-6kujxb8e/logs --- Training Siamese CNN Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_053313-mwa0td9b wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_ranknet_model=SiameseCNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/mwa0td9b SiameseCNN-ranknet with ranknet: 100%|█████████████████| 8525/8525 [09:15<00:00, 15.36it/s, Loss=0.0011] Evaluating SiameseCNN-ranknet: 100%|████████████████████████████████| 2155/2155 [01:12<00:00, 29.81it/s] Ranking validation (SiameseCNN-ranknet): 100%|████████████████████| 1148/1148 [00:00<00:00, 2357.11it/s] SiameseCNN-ranknet Validation Metrics: NDCG@10: 0.0114 NDCG@20: 0.0152 NDCG@30: 0.0170 NDCG@40: 0.0189 NDCG@50: 0.0212 SiameseCNN-ranknet with ranknet: Train Loss: 6.8330 NDCG Metrics: NDCG@10: 0.0114 NDCG@20: 0.0152 NDCG@30: 0.0170 NDCG@40: 0.0189 NDCG@50: 0.0212 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▃▂▃▄▁▄▄▁▆▁▁▄▄▁▃▃▅▆▃▁▁▃█▃▁▃▃▁▃▁▁▃▁▁▁▁▁▁▃▁ wandb: wandb: Run summary: wandb: eval/ap 0.01187 wandb: eval/ndcg 0.13038 wandb: eval/ndcg@10 0.01142 wandb: eval/ndcg@20 0.01516 wandb: eval/ndcg@30 0.01701 wandb: eval/ndcg@40 
0.01885 wandb: eval/ndcg@50 0.02122 wandb: train/loss 6.83299 wandb: wandb: 🚀 View run siamese_cnn_ranknet_model=SiameseCNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/mwa0td9b wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_053313-mwa0td9b/logs --- Training Siamese CNN Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_054348-zjt2hk2h wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_approx_ndcg_model=SiameseCNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/zjt2hk2h SiameseCNN-approx_ndcg with approx_ndcg: 100%|█████████| 8525/8525 [09:31<00:00, 14.91it/s, Loss=1.0000] Evaluating SiameseCNN-approx_ndcg: 100%|████████████████████████████| 2155/2155 [01:12<00:00, 29.91it/s] Ranking validation (SiameseCNN-approx_ndcg): 100%|████████████████| 1148/1148 [00:00<00:00, 2382.17it/s] SiameseCNN-approx_ndcg Validation Metrics: NDCG@10: 0.0157 NDCG@20: 0.0216 NDCG@30: 0.0246 NDCG@40: 0.0276 NDCG@50: 0.0300 SiameseCNN-approx_ndcg with approx_ndcg: Train Loss: -0.2350 NDCG Metrics: NDCG@10: 0.0157 NDCG@20: 0.0216 NDCG@30: 0.0246 NDCG@40: 0.0276 NDCG@50: 0.0300 *** New best Siamese CNN model found with approx_ndcg loss: Val NDCG@10: 0.0157 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ████████▁██████████████████████▆████████ wandb: wandb: Run summary: wandb: eval/ap 0.01656 
wandb: eval/ndcg 0.13857 wandb: eval/ndcg@10 0.01572 wandb: eval/ndcg@20 0.02164 wandb: eval/ndcg@30 0.0246 wandb: eval/ndcg@40 0.02764 wandb: eval/ndcg@50 0.03001 wandb: train/loss -0.23505 wandb: wandb: 🚀 View run siamese_cnn_approx_ndcg_model=SiameseCNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/zjt2hk2h wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_054348-zjt2hk2h/logs --- Training RankMLP Model with mse loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_055438-w9fw53w0 wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/w9fw53w0 RankMLP-mse with mse: 100%|███████████████████████████| 8525/8525 [00:20<00:00, 411.62it/s, Loss=0.0000] RankMLP-mse with mse: Train Loss: 0.0016 NDCG Metrics: NDCG@10: 0.2632 NDCG@20: 0.2917 NDCG@30: 0.3017 NDCG@40: 0.3085 NDCG@50: 0.3121 *** New best RankMLP model found with mse loss: Val NDCG@10: 0.2632 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▂▂▁▁▂▁▂▁▂▂▅▁▂▂▁▂▁▁▁▁▁▃▃▁▂▁▁▁▂▁▃▁▂▂▂▁▂▂▁ wandb: wandb: Run summary: wandb: eval/ap 0.21343 wandb: eval/ndcg 0.35348 wandb: eval/ndcg@10 0.26325 wandb: eval/ndcg@20 0.29169 wandb: eval/ndcg@30 0.30174 wandb: eval/ndcg@40 0.30853 wandb: eval/ndcg@50 0.31213 wandb: train/loss 0.00163 wandb: wandb: 🚀 View run 
rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/w9fw53w0 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_055438-w9fw53w0/logs --- Training RankMLP Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_055508-dobiucbs wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/dobiucbs RankMLP-bce with bce: 100%|███████████████████████████| 8525/8525 [00:21<00:00, 395.11it/s, Loss=0.0011] RankMLP-bce with bce: Train Loss: 0.0123 NDCG Metrics: NDCG@10: 0.2532 NDCG@20: 0.2784 NDCG@30: 0.2920 NDCG@40: 0.2980 NDCG@50: 0.3037 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▂▂▂▂▁▂▁▂▁▁▁▁▁▁▁▁▁▂▁▁▂▁▁▁▂▁▁▂▁▂▁▁▁▁▁▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.20839 wandb: eval/ndcg 0.34796 wandb: eval/ndcg@10 0.2532 wandb: eval/ndcg@20 0.27839 wandb: eval/ndcg@30 0.29204 wandb: eval/ndcg@40 0.29803 wandb: eval/ndcg@50 0.30367 wandb: train/loss 0.01228 wandb: wandb: 🚀 View run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/dobiucbs wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_055508-dobiucbs/logs --- Training 
RankMLP Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_055538-v0e914lh wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/v0e914lh RankMLP-hinge with hinge: 100%|███████████████████████| 8525/8525 [00:21<00:00, 392.79it/s, Loss=0.0000] RankMLP-hinge with hinge: Train Loss: 0.0044 NDCG Metrics: NDCG@10: 0.2522 NDCG@20: 0.2764 NDCG@30: 0.2874 NDCG@40: 0.2957 NDCG@50: 0.3008 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.2049 wandb: eval/ndcg 0.34455 wandb: eval/ndcg@10 0.25218 wandb: eval/ndcg@20 0.27639 wandb: eval/ndcg@30 0.2874 wandb: eval/ndcg@40 0.29565 wandb: eval/ndcg@50 0.30083 wandb: train/loss 0.00443 wandb: wandb: 🚀 View run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/v0e914lh wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_055538-v0e914lh/logs --- Training RankMLP Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_055608-4inji73g wandb: Run `wandb offline` to turn off syncing. 
wandb: Syncing run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/4inji73g RankMLP-ranknet with ranknet: 100%|███████████████████| 8525/8525 [00:20<00:00, 406.83it/s, Loss=0.0013] RankMLP-ranknet with ranknet: Train Loss: 0.0131 NDCG Metrics: NDCG@10: 0.2546 NDCG@20: 0.2815 NDCG@30: 0.2929 NDCG@40: 0.3013 NDCG@50: 0.3060 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▄▃▂▅▃▃▂▂▃▄▃▂▂▁▄▁▂▁▁▃▂▂▂▄▁▁▄▃▁▁▁▂▂▁▂▁▁▁▂ wandb: wandb: Run summary: wandb: eval/ap 0.20852 wandb: eval/ndcg 0.34855 wandb: eval/ndcg@10 0.25462 wandb: eval/ndcg@20 0.28152 wandb: eval/ndcg@30 0.29291 wandb: eval/ndcg@40 0.3013 wandb: eval/ndcg@50 0.30596 wandb: train/loss 0.01313 wandb: wandb: 🚀 View run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/4inji73g wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_055608-4inji73g/logs --- Training RankMLP Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_055638-7hepi8u2 wandb: Run `wandb offline` to turn off syncing. 
wandb: Syncing run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/7hepi8u2
RankMLP-approx_ndcg with approx_ndcg: Train Loss: 0.0000
NDCG Metrics: NDCG@10: 0.0247 NDCG@20: 0.0314 NDCG@30: 0.0347 NDCG@40: 0.0372 NDCG@50: 0.0401
wandb:
wandb:
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss ▁
wandb:
wandb: Run summary:
wandb: eval/ap 0.02435
wandb: eval/ndcg 0.14652
wandb: eval/ndcg@10 0.02472
wandb: eval/ndcg@20 0.03137
wandb: eval/ndcg@30 0.03471
wandb: eval/ndcg@40 0.03725
wandb: eval/ndcg@50 0.04009
wandb: train/loss 0
wandb:
wandb: 🚀 View run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/7hepi8u2
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_055638-7hepi8u2/logs
=== Model Comparison ===
SiameseCNN best with approx_ndcg loss, NDCG@10: 0.0157
SiameseRNN best with mse loss, NDCG@10: 0.0145
RankMLP best with mse loss, NDCG@10: 0.2632
Best overall model: RankMLP with mse loss, NDCG@10: 0.2632
Detailed metrics for best model: NDCG@10: 0.2632 NDCG@20: 0.2917 NDCG@30: 0.3017 NDCG@40: 0.3085 NDCG@50: 0.3121
Generating test predictions with RankMLP-mse model...
RankMLP Prediction: 100%|████████████████████████████████████████████| 371/371 [00:00<00:00, 933.25it/s]
Writing RankMLP_mse results: 100%|█████████████████████████████████| 200/200 [00:00<00:00, 18147.34it/s]
Results saved to ./output/RankMLP_mse.txt
Writing Best_Model results: 100%|██████████████████████████████████| 200/200 [00:00<00:00, 17623.87it/s]
Results saved to ./output/Best_RankMLP_mse.txt
ap: 0.21343283574475128
ndcg: 0.353481648594875
ndcg@10: 0.26324605001985785
ndcg@20: 0.2916853339440412
ndcg@30: 0.3017411930961967
ndcg@40: 0.30853299503096737
ndcg@50: 0.3121346807175971
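For reading the AP and NDCG@k numbers in these logs: a minimal sketch of how such per-query metrics are commonly computed, assuming binary relevance labels and the standard log2 rank discount. The actual evaluation code behind these runs is not shown in this note, so the function names (`dcg_at_k`, `ndcg_at_k`, `average_precision`) and the gain formula are assumptions for illustration only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list.

    `relevances` holds the relevance label of each passage, in ranked order.
    Assumed gain: 2^rel - 1; assumed discount: log2(rank + 1) with 1-based rank.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG@k normalised by the DCG@k of the ideal (label-sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(relevances):
    """Mean of precision@rank taken at each rank where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# One query whose only relevant passage was ranked second.
ranked_labels = [0, 1, 0, 0]
print(ndcg_at_k(ranked_labels, 10))
print(average_precision(ranked_labels))
```

The reported figures would then be the mean of these per-query values over all validation queries (1148 in the ranking progress bars above), under the same assumptions.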
❯ python task4.py Using device: cuda Loading data... Loaded 4364339 rows from train_data.tsv Loaded 1103039 rows from validation_data.tsv Loaded 200 rows from test-queries.tsv Loaded 189877 rows from candidate_passages_top1000.tsv Fitting BM25 on Train+Validation data... Loading BM25 model from bm25_fit_3429679 Loading bm25_fit_3429679 from cache took 12.47 seconds Precomputing TF-IDF vectors for validation passages... Time taken to precompute TF-IDF vectors: 0.30 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_955211_1768065 from cache took 3.68 seconds --- Task 4: Neural Networks (PyTorch) --- Loading GloVe embeddings from cache Loading glove_embeddings_100 from cache took 0.99 seconds Building vocabulary... Building vocabulary: 100%|████████████████████████| 10934756/10934756 [00:59<00:00, 185253.85it/s] Vocabulary size: 404937 Preparing embedding matrix... Found 150222 / 404937 words in GloVe --- Preparing features for RankMLP model --- Pre-computing TF-IDF features for all datasets... Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 12.63 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 12.95 seconds Loading TF-IDF vectors from cache Loading tfidf_vectors_3429679_1768065 from cache took 12.74 seconds Creating feature vectors for training data... 100%|████████████████████████████████████████████████| 4364339/4364339 [02:25<00:00, 29943.22it/s] Creating feature vectors for validation data... 100%|████████████████████████████████████████████████| 1103039/1103039 [00:36<00:00, 29977.50it/s] Creating feature vectors for test data... 100%|██████████████████████████████████████████████████| 189877/189877 [00:06<00:00, 31037.75it/s] --- Training Siamese RNN Model with mse loss --- wandb: Currently logged in as: seonglae (texonom). Use `wandb login --relogin` to force relogin wandb: Using wandb-core as the SDK backend. 
Please refer to https://wandb.me/wandb-core for more information. wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_031945-7b5p7ph6 wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_mse_model=SiameseRNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/7b5p7ph6 SiameseRNN-mse with mse: 100%|███████████████████| 8525/8525 [07:29<00:00, 18.98it/s, Loss=0.0000] Evaluating SiameseRNN-mse: 100%|██████████████████████████████| 2155/2155 [01:03<00:00, 34.08it/s] Ranking validation (SiameseRNN-mse): 100%|███████████████████| 1148/1148 [00:01<00:00, 672.71it/s] SiameseRNN-mse Validation Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 SiameseRNN-mse with mse: Train Loss: 0.5686 NDCG Metrics: NDCG@10: 0.0145 NDCG@20: 0.0169 NDCG@30: 0.0182 NDCG@40: 0.0229 NDCG@50: 0.0284 *** New best Siamese RNN model found with mse loss: Val NDCG@10: 0.0145 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁▃▁▃▃▁▁▆▁▃▁▁▃▁▃▁▁▃▁▃▁▁▃▆▁▁▁█▁▃▁▁▁▁▁▁▃▁▃▃ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.134 wandb: eval/ndcg@10 0.01452 wandb: eval/ndcg@20 0.01692 wandb: eval/ndcg@30 0.01818 wandb: eval/ndcg@40 0.02289 wandb: eval/ndcg@50 0.02841 wandb: train/loss 0.56861 wandb: wandb: 🚀 View run siamese_rnn_mse_model=SiameseRNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/7b5p7ph6 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact 
file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_031945-7b5p7ph6/logs --- Training Siamese RNN Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_032827-6gcoxi20 wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_bce_model=SiameseRNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/6gcoxi20 SiameseRNN-bce with bce: 100%|███████████████████| 8525/8525 [07:12<00:00, 19.72it/s, Loss=0.0007] Evaluating SiameseRNN-bce: 100%|██████████████████████████████| 2155/2155 [01:04<00:00, 33.17it/s] Ranking validation (SiameseRNN-bce): 100%|██████████████████| 1148/1148 [00:00<00:00, 2332.45it/s] SiameseRNN-bce Validation Metrics: NDCG@10: 0.0117 NDCG@20: 0.0153 NDCG@30: 0.0192 NDCG@40: 0.0224 NDCG@50: 0.0243 SiameseRNN-bce with bce: Train Loss: 4.7682 NDCG Metrics: NDCG@10: 0.0117 NDCG@20: 0.0153 NDCG@30: 0.0192 NDCG@40: 0.0224 NDCG@50: 0.0243 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁▁▁▁▂▃▁▁▃▃▁▁▁▅▃▁▄▃▁▁▁▃▃▁▁▁▃▁▅▁█▃▁▁▁▁▁▇▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01357 wandb: eval/ndcg 0.13458 wandb: eval/ndcg@10 0.01173 wandb: eval/ndcg@20 0.01526 wandb: eval/ndcg@30 0.01917 wandb: eval/ndcg@40 0.02238 wandb: eval/ndcg@50 0.02427 wandb: train/loss 4.76825 wandb: wandb: 🚀 View run siamese_rnn_bce_model=SiameseRNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/6gcoxi20 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact 
file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_032827-6gcoxi20/logs --- Training Siamese RNN Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_033648-db57nmg3 wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_hinge_model=SiameseRNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/db57nmg3 SiameseRNN-hinge with hinge: 100%|███████████████| 8525/8525 [07:30<00:00, 18.90it/s, Loss=0.0000] Evaluating SiameseRNN-hinge: 100%|████████████████████████████| 2155/2155 [01:03<00:00, 34.07it/s] Ranking validation (SiameseRNN-hinge): 100%|████████████████| 1148/1148 [00:00<00:00, 2317.50it/s] SiameseRNN-hinge Validation Metrics: NDCG@10: 0.0108 NDCG@20: 0.0148 NDCG@30: 0.0177 NDCG@40: 0.0201 NDCG@50: 0.0227 SiameseRNN-hinge with hinge: Train Loss: 1.5222 NDCG Metrics: NDCG@10: 0.0108 NDCG@20: 0.0148 NDCG@30: 0.0177 NDCG@40: 0.0201 NDCG@50: 0.0227 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▅▇▁▁▁▃▁▁▃▁▃▅▃▁▁▁▃▁▃▆▁▁▁▁▁▁▇▃▁▁▃▁▁▁▃▃▃▁█▁ wandb: wandb: Run summary: wandb: eval/ap 0.01248 wandb: eval/ndcg 0.13285 wandb: eval/ndcg@10 0.01081 wandb: eval/ndcg@20 0.01476 wandb: eval/ndcg@30 0.01772 wandb: eval/ndcg@40 0.02008 wandb: eval/ndcg@50 0.02274 wandb: train/loss 1.5222 wandb: wandb: 🚀 View run siamese_rnn_hinge_model=SiameseRNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/db57nmg3 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 
artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_033648-db57nmg3/logs --- Training Siamese RNN Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_034527-8snkbban wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_ranknet_model=SiameseRNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/8snkbban SiameseRNN-ranknet with ranknet: 100%|███████████| 8525/8525 [07:15<00:00, 19.57it/s, Loss=0.0008] Evaluating SiameseRNN-ranknet: 100%|██████████████████████████| 2155/2155 [01:04<00:00, 33.28it/s] Ranking validation (SiameseRNN-ranknet): 100%|██████████████| 1148/1148 [00:00<00:00, 2329.88it/s] SiameseRNN-ranknet Validation Metrics: NDCG@10: 0.0108 NDCG@20: 0.0150 NDCG@30: 0.0183 NDCG@40: 0.0203 NDCG@50: 0.0230 SiameseRNN-ranknet with ranknet: Train Loss: 4.7202 NDCG Metrics: NDCG@10: 0.0108 NDCG@20: 0.0150 NDCG@30: 0.0183 NDCG@40: 0.0203 NDCG@50: 0.0230 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▅▄▁▅▁▅▁▁▅▅▁█▅▁▅▅▄▆▁▁▁▁▁▁▅▅▁▁▁▄▅▁▅▅▁▁▁▄█▁ wandb: wandb: Run summary: wandb: eval/ap 0.01257 wandb: eval/ndcg 0.13349 wandb: eval/ndcg@10 0.01081 wandb: eval/ndcg@20 0.01495 wandb: eval/ndcg@30 0.01831 wandb: eval/ndcg@40 0.02033 wandb: eval/ndcg@50 0.023 wandb: train/loss 4.72022 wandb: wandb: 🚀 View run siamese_rnn_ranknet_model=SiameseRNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/8snkbban wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B 
file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_034527-8snkbban/logs --- Training Siamese RNN Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_035354-iqhfinzc wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_rnn_approx_ndcg_model=SiameseRNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/iqhfinzc SiameseRNN-approx_ndcg with approx_ndcg: 100%|███| 8525/8525 [08:42<00:00, 16.31it/s, Loss=1.0000] Evaluating SiameseRNN-approx_ndcg: 100%|██████████████████████| 2155/2155 [01:06<00:00, 32.22it/s] Ranking validation (SiameseRNN-approx_ndcg): 100%|██████████| 1148/1148 [00:00<00:00, 2335.74it/s] SiameseRNN-approx_ndcg Validation Metrics: NDCG@10: 0.0102 NDCG@20: 0.0142 NDCG@30: 0.0153 NDCG@40: 0.0170 NDCG@50: 0.0184 SiameseRNN-approx_ndcg with approx_ndcg: Train Loss: -1.2648 NDCG Metrics: NDCG@10: 0.0102 NDCG@20: 0.0142 NDCG@30: 0.0153 NDCG@40: 0.0170 NDCG@50: 0.0184 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ██▇▅▁██████████████████▄████████████████ wandb: wandb: Run summary: wandb: eval/ap 0.01112 wandb: eval/ndcg 0.1255 wandb: eval/ndcg@10 0.01021 wandb: eval/ndcg@20 0.01419 wandb: eval/ndcg@30 0.01531 wandb: eval/ndcg@40 0.017 wandb: eval/ndcg@50 0.01842 wandb: train/loss -1.26481 wandb: wandb: 🚀 View run siamese_rnn_approx_ndcg_model=SiameseRNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_hidden_dim=128_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/iqhfinzc wandb: ⭐️ View 
project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_035354-iqhfinzc/logs --- Training Siamese CNN Model with mse loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_040351-0zfqtq7e wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_mse_model=SiameseCNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/0zfqtq7e SiameseCNN-mse with mse: 100%|███████████████████| 8525/8525 [04:32<00:00, 31.29it/s, Loss=0.0000] Evaluating SiameseCNN-mse: 100%|██████████████████████████████| 2155/2155 [00:44<00:00, 48.05it/s] Ranking validation (SiameseCNN-mse): 100%|███████████████████| 1148/1148 [00:01<00:00, 620.07it/s] SiameseCNN-mse Validation Metrics: NDCG@10: 0.0139 NDCG@20: 0.0161 NDCG@30: 0.0174 NDCG@40: 0.0214 NDCG@50: 0.0264 SiameseCNN-mse with mse: Train Loss: 0.6235 NDCG Metrics: NDCG@10: 0.0139 NDCG@20: 0.0161 NDCG@30: 0.0174 NDCG@40: 0.0214 NDCG@50: 0.0264 *** New best Siamese CNN model found with mse loss: Val NDCG@10: 0.0139 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁▃▃▃▁▁▃▁▃▅▁█▁▃▁▆▁▃▁▁▁▅▁▁▃▁▃▃▆▃▁▃▁▃▁▁▃▁▄▁ wandb: wandb: Run summary: wandb: eval/ap 0.01311 wandb: eval/ndcg 0.13302 wandb: eval/ndcg@10 0.01393 wandb: eval/ndcg@20 0.01609 wandb: eval/ndcg@30 0.01736 wandb: eval/ndcg@40 0.0214 wandb: eval/ndcg@50 0.02644 wandb: train/loss 0.62352 wandb: wandb: 🚀 View run siamese_cnn_mse_model=SiameseCNN_loss=mse_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: 
https://wandb.ai/texonom/ir_cw2/runs/0zfqtq7e wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_040351-0zfqtq7e/logs --- Training Siamese CNN Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_040921-67f5ngbr wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_bce_model=SiameseCNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/67f5ngbr SiameseCNN-bce with bce: 100%|███████████████████| 8525/8525 [04:22<00:00, 32.49it/s, Loss=0.0012] Evaluating SiameseCNN-bce: 100%|██████████████████████████████| 2155/2155 [00:47<00:00, 45.32it/s] Ranking validation (SiameseCNN-bce): 100%|██████████████████| 1148/1148 [00:00<00:00, 2301.84it/s] SiameseCNN-bce Validation Metrics: NDCG@10: 0.0130 NDCG@20: 0.0170 NDCG@30: 0.0200 NDCG@40: 0.0231 NDCG@50: 0.0252 SiameseCNN-bce with bce: Train Loss: 5.0111 NDCG Metrics: NDCG@10: 0.0130 NDCG@20: 0.0170 NDCG@30: 0.0200 NDCG@40: 0.0231 NDCG@50: 0.0252 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▅▁▃▁▂▁▁▁▁▁▄▇▅▁▁▁▁▁▁▁▁▁▁▁▄▃▃▆▁▄▆▁█▇▅▄▁▆▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01456 wandb: eval/ndcg 0.13419 wandb: eval/ndcg@10 0.01298 wandb: eval/ndcg@20 0.01704 wandb: eval/ndcg@30 0.01998 wandb: eval/ndcg@40 0.02315 wandb: eval/ndcg@50 0.02524 wandb: train/loss 5.01115 wandb: wandb: 🚀 View run siamese_cnn_bce_model=SiameseCNN_loss=bce_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: 
https://wandb.ai/texonom/ir_cw2/runs/67f5ngbr wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_040921-67f5ngbr/logs --- Training Siamese CNN Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_041442-bm0klm1t wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_hinge_model=SiameseCNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/bm0klm1t SiameseCNN-hinge with hinge: 100%|███████████████| 8525/8525 [04:33<00:00, 31.18it/s, Loss=0.0000] Evaluating SiameseCNN-hinge: 100%|████████████████████████████| 2155/2155 [00:44<00:00, 48.38it/s] Ranking validation (SiameseCNN-hinge): 100%|█████████████████| 1148/1148 [00:01<00:00, 612.91it/s] SiameseCNN-hinge Validation Metrics: NDCG@10: 0.0130 NDCG@20: 0.0152 NDCG@30: 0.0179 NDCG@40: 0.0193 NDCG@50: 0.0221 SiameseCNN-hinge with hinge: Train Loss: 1.4933 NDCG Metrics: NDCG@10: 0.0130 NDCG@20: 0.0152 NDCG@30: 0.0179 NDCG@40: 0.0193 NDCG@50: 0.0221 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁█▁▅▅▁▁▁▄▁▄▁▁▁▄▄▇▁▅▁▇▁▄▁▆▁▃▃▁▁▁▁▆▁▁▁▁▃▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01219 wandb: eval/ndcg 0.12987 wandb: eval/ndcg@10 0.01305 wandb: eval/ndcg@20 0.01521 wandb: eval/ndcg@30 0.01794 wandb: eval/ndcg@40 0.01929 wandb: eval/ndcg@50 0.02212 wandb: train/loss 1.49331 wandb: wandb: 🚀 View run siamese_cnn_hinge_model=SiameseCNN_loss=hinge_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: 
https://wandb.ai/texonom/ir_cw2/runs/bm0klm1t wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_041442-bm0klm1t/logs --- Training Siamese CNN Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_042012-m56a5dpy wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_ranknet_model=SiameseCNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/m56a5dpy SiameseCNN-ranknet with ranknet: 100%|███████████| 8525/8525 [04:21<00:00, 32.56it/s, Loss=0.0006] Evaluating SiameseCNN-ranknet: 100%|██████████████████████████| 2155/2155 [00:47<00:00, 45.46it/s] Ranking validation (SiameseCNN-ranknet): 100%|██████████████| 1148/1148 [00:00<00:00, 2307.31it/s] SiameseCNN-ranknet Validation Metrics: NDCG@10: 0.0112 NDCG@20: 0.0145 NDCG@30: 0.0177 NDCG@40: 0.0197 NDCG@50: 0.0225 SiameseCNN-ranknet with ranknet: Train Loss: 5.0479 NDCG Metrics: NDCG@10: 0.0112 NDCG@20: 0.0145 NDCG@30: 0.0177 NDCG@40: 0.0197 NDCG@50: 0.0225 wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▄▃▇▁▅▁▅▄▁▃▄▁▁▁▁█▁▄▁▁▁▁▄▄▁▂▁▁▄▄▁▇▄▁▃▄▄▇▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01265 wandb: eval/ndcg 0.13192 wandb: eval/ndcg@10 0.01123 wandb: eval/ndcg@20 0.01454 wandb: eval/ndcg@30 0.01767 wandb: eval/ndcg@40 0.01967 wandb: eval/ndcg@50 0.02251 wandb: train/loss 5.04786 wandb: wandb: 🚀 View run siamese_cnn_ranknet_model=SiameseCNN_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: 
https://wandb.ai/texonom/ir_cw2/runs/m56a5dpy wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_042012-m56a5dpy/logs --- Training Siamese CNN Model with approx_ndcg loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_042534-9xvp05ee wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run siamese_cnn_approx_ndcg_model=SiameseCNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/9xvp05ee SiameseCNN-approx_ndcg with approx_ndcg: 100%|███| 8525/8525 [05:45<00:00, 24.67it/s, Loss=1.0000] Evaluating SiameseCNN-approx_ndcg: 100%|██████████████████████| 2155/2155 [00:38<00:00, 55.57it/s] Ranking validation (SiameseCNN-approx_ndcg): 100%|██████████| 1148/1148 [00:00<00:00, 2351.20it/s] SiameseCNN-approx_ndcg Validation Metrics: NDCG@10: 0.0149 NDCG@20: 0.0184 NDCG@30: 0.0223 NDCG@40: 0.0243 NDCG@50: 0.0278 SiameseCNN-approx_ndcg with approx_ndcg: Train Loss: -0.2618 NDCG Metrics: NDCG@10: 0.0149 NDCG@20: 0.0184 NDCG@30: 0.0223 NDCG@40: 0.0243 NDCG@50: 0.0278 *** New best Siamese CNN model found with approx_ndcg loss: Val NDCG@10: 0.0149 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.01595 wandb: eval/ndcg 0.13555 wandb: eval/ndcg@10 0.01488 wandb: eval/ndcg@20 0.01844 wandb: eval/ndcg@30 0.02231 wandb: eval/ndcg@40 0.02433 wandb: eval/ndcg@50 0.02779 wandb: train/loss -0.26177 wandb: wandb: 🚀 View run 
siamese_cnn_approx_ndcg_model=SiameseCNN_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_vocab_size=404937_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/9xvp05ee wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_042534-9xvp05ee/logs --- Training RankMLP Model with mse loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043214-qjx8apns wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/qjx8apns RankMLP-mse with mse: 100%|█████████████████████| 8525/8525 [00:57<00:00, 149.32it/s, Loss=0.0000] RankMLP-mse with mse: Train Loss: 0.0014 NDCG Metrics: NDCG@10: 0.2453 NDCG@20: 0.2650 NDCG@30: 0.2734 NDCG@40: 0.2775 NDCG@50: 0.2817 *** New best RankMLP model found with mse loss: Val NDCG@10: 0.2453 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▁▅▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁█▅▅▁▄▄█▁▁▁▄▁▁▅▄▄▅▄█▄ wandb: wandb: Run summary: wandb: eval/ap 0.20114 wandb: eval/ndcg 0.33198 wandb: eval/ndcg@10 0.24533 wandb: eval/ndcg@20 0.26505 wandb: eval/ndcg@30 0.27339 wandb: eval/ndcg@40 0.27747 wandb: eval/ndcg@50 0.28167 wandb: train/loss 0.00144 wandb: wandb: 🚀 View run rankmlp_mse_model=RankMLP_loss=mse_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/qjx8apns wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 
0 other file(s) wandb: Find logs at: ./wandb/run-20250410_043214-qjx8apns/logs --- Training RankMLP Model with bce loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043334-89n9xnkh wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/89n9xnkh RankMLP-bce with bce: 100%|█████████████████████| 8525/8525 [00:57<00:00, 148.31it/s, Loss=0.0013] RankMLP-bce with bce: Train Loss: 0.0105 NDCG Metrics: NDCG@10: 0.2555 NDCG@20: 0.2789 NDCG@30: 0.2906 NDCG@40: 0.2994 NDCG@50: 0.3032 *** New best RankMLP model found with bce loss: Val NDCG@10: 0.2555 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss █▁▁▁▁▂▁▁▁▁▂▂▁▃▁▁▂▂▁▁▁▁▁▂▁▁▁▁▂▂▂▁▁▁▂▁▁▁▂▁ wandb: wandb: Run summary: wandb: eval/ap 0.21309 wandb: eval/ndcg 0.35021 wandb: eval/ndcg@10 0.25546 wandb: eval/ndcg@20 0.27887 wandb: eval/ndcg@30 0.29062 wandb: eval/ndcg@40 0.29939 wandb: eval/ndcg@50 0.30321 wandb: train/loss 0.01045 wandb: wandb: 🚀 View run rankmlp_bce_model=RankMLP_loss=bce_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/89n9xnkh wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_043334-89n9xnkh/logs --- Training RankMLP Model with hinge loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043451-aitp6551 
wandb: Run `wandb offline` to turn off syncing. wandb: Syncing run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2 wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/aitp6551 RankMLP-hinge with hinge: 100%|█████████████████| 8525/8525 [00:58<00:00, 145.55it/s, Loss=0.0460] RankMLP-hinge with hinge: Train Loss: 0.0051 NDCG Metrics: NDCG@10: 0.2556 NDCG@20: 0.2791 NDCG@30: 0.2938 NDCG@40: 0.3009 NDCG@50: 0.3066 *** New best RankMLP model found with hinge loss: Val NDCG@10: 0.2556 *** wandb: wandb: wandb: Run history: wandb: eval/ap ▁ wandb: eval/ndcg ▁ wandb: eval/ndcg@10 ▁ wandb: eval/ndcg@20 ▁ wandb: eval/ndcg@30 ▁ wandb: eval/ndcg@40 ▁ wandb: eval/ndcg@50 ▁ wandb: train/loss ▃█▃▁▁▁▂▃▃▁▁▁▁▂▃▁▁▁▃▄▃▃▃▁▁▁▂▂▁▂▁▁▁▁▂▁▂▁▁▁ wandb: wandb: Run summary: wandb: eval/ap 0.20871 wandb: eval/ndcg 0.34878 wandb: eval/ndcg@10 0.25564 wandb: eval/ndcg@20 0.27911 wandb: eval/ndcg@30 0.29378 wandb: eval/ndcg@40 0.3009 wandb: eval/ndcg@50 0.3066 wandb: train/loss 0.00512 wandb: wandb: 🚀 View run rankmlp_hinge_model=RankMLP_loss=hinge_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/aitp6551 wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2 wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s) wandb: Find logs at: ./wandb/run-20250410_043451-aitp6551/logs --- Training RankMLP Model with ranknet loss --- wandb: Tracking run with wandb version 0.19.4 wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043610-wm84a5gk wandb: Run `wandb offline` to turn off syncing. 
wandb: Syncing run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/wm84a5gk
RankMLP-ranknet with ranknet: 100%|█████████████| 8525/8525 [00:57<00:00, 148.28it/s, Loss=0.0012]
RankMLP-ranknet with ranknet:
  Train Loss: 0.0131
  NDCG Metrics:
  NDCG@10: 0.2551
  NDCG@20: 0.2841
  NDCG@30: 0.2971
  NDCG@40: 0.3049
  NDCG@50: 0.3100
wandb:
wandb:
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss █▆▅▄▄▃▃▃▁▁█▁▁▁▁▃▃▃▃▃▁▁▃▃▁▃▂▂▁▁▁▁▁▂▄▁▁▁▁▃
wandb:
wandb: Run summary:
wandb: eval/ap 0.21045
wandb: eval/ndcg 0.35093
wandb: eval/ndcg@10 0.25513
wandb: eval/ndcg@20 0.28406
wandb: eval/ndcg@30 0.29706
wandb: eval/ndcg@40 0.30485
wandb: eval/ndcg@50 0.31
wandb: train/loss 0.01306
wandb:
wandb: 🚀 View run rankmlp_ranknet_model=RankMLP_loss=ranknet_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/wm84a5gk
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_043610-wm84a5gk/logs
--- Training RankMLP Model with approx_ndcg loss ---
wandb: Tracking run with wandb version 0.19.4
wandb: Run data is saved locally in /cs/student/projects2/aisd/2024/seongcho/ucl/comp0084/cw2/part2/wandb/run-20250410_043730-9wpg2a4p
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512
wandb: ⭐️ View project at https://wandb.ai/texonom/ir_cw2
wandb: 🚀 View run at https://wandb.ai/texonom/ir_cw2/runs/9wpg2a4p
RankMLP-approx_ndcg with approx_ndcg:
  Train Loss: 0.0000
  NDCG Metrics:
  NDCG@10: 0.1132
  NDCG@20: 0.1253
  NDCG@30: 0.1341
  NDCG@40: 0.1414
  NDCG@50: 0.1465
wandb:
wandb:
wandb: Run history:
wandb: eval/ap ▁
wandb: eval/ndcg ▁
wandb: eval/ndcg@10 ▁
wandb: eval/ndcg@20 ▁
wandb: eval/ndcg@30 ▁
wandb: eval/ndcg@40 ▁
wandb: eval/ndcg@50 ▁
wandb: train/loss ▁
wandb:
wandb: Run summary:
wandb: eval/ap 0.09875
wandb: eval/ndcg 0.22705
wandb: eval/ndcg@10 0.11317
wandb: eval/ndcg@20 0.12535
wandb: eval/ndcg@30 0.13406
wandb: eval/ndcg@40 0.14141
wandb: eval/ndcg@50 0.14648
wandb: train/loss 0
wandb:
wandb: 🚀 View run rankmlp_approx_ndcg_model=RankMLP_loss=approx_ndcg_learning_rate=0.001_weight_decay=0.0001_input_dim=5_batch_size=512 at: https://wandb.ai/texonom/ir_cw2/runs/9wpg2a4p
wandb: ⭐️ View project at: https://wandb.ai/texonom/ir_cw2
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20250410_043730-9wpg2a4p/logs
=== Model Comparison ===
SiameseCNN best with approx_ndcg loss, NDCG@10: 0.0149
SiameseRNN best with mse loss, NDCG@10: 0.0145
RankMLP best with hinge loss, NDCG@10: 0.2556
Best overall model: RankMLP with hinge loss, NDCG@10: 0.2556
Detailed metrics for best model:
  NDCG@10: 0.2556
  NDCG@20: 0.2791
  NDCG@30: 0.2938
  NDCG@40: 0.3009
  NDCG@50: 0.3066
Generating test predictions with RankMLP-hinge model...
RankMLP Prediction: 100%|██████████████████████████████████████| 371/371 [00:01<00:00, 239.01it/s]
Writing RankMLP_hinge results: 100%|██████████████████████████| 200/200 [00:00<00:00, 4619.07it/s]
Results saved to ./output/RankMLP_hinge.txt
Writing Best_Model results: 100%|█████████████████████████████| 200/200 [00:00<00:00, 4921.65it/s]
Results saved to ./output/Best_RankMLP_hinge.txt
ap: 0.20870946785707564
ndcg: 0.348775424273482
ndcg@10: 0.25563562083042696
ndcg@20: 0.27910927430877763
ndcg@30: 0.2937806958992941
ndcg@40: 0.3009040880493508
ndcg@50: 0.3065956809292813
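
For reference, a minimal sketch of how an NDCG@k value like the ones logged above could be computed for a single query. The evaluation code that produced these logs is not shown in these notes, so this is an assumption: it uses the common exponential gain `2^rel - 1` with a `log2` rank discount, and the names `dcg_at_k` and `ndcg_at_k` are hypothetical. The reported numbers are presumably averaged over all validation queries.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items of a ranked list."""
    # rank is 0-based here, so the discount for position 1 is log2(2) = 1.
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(ranked_relevances, k):
    """NDCG@k for one query, given relevance labels in model-ranked order.

    Assumption: the ideal ranking is built from the same candidate list,
    i.e. every relevant passage for the query appears among the candidates.
    """
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant passage for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


if __name__ == "__main__":
    # Hypothetical query: the only relevant passage is ranked second.
    print(ndcg_at_k([0, 1, 0, 0], k=10))
```

With binary labels the exponential gain reduces to the label itself, so the choice between linear and exponential gain would not change the values in this setting.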
 
 

 
