
Research Note HAI July 3rd

Date: 2025 Jul 15 0:0 → 2025 Jul 17 0:0
Created by: Seonglae Cho
Created time: 2025 Jul 15 16:47
Last edited by: Seonglae Cho
Last edited time: 2025 Jul 16 13:31

Replaced before prompt

📈 METHOD PERFORMANCE (scores, time, tokens and cost per run are averages)
Method               Success  Exact  Semantic  Time     Matching  Tokens/run     Cost/run  Total cost  Model
unified_method       7/7      0.107  0.610     69.50s   0.375     250952         $0.0404   $0.2825     unknown
original_method      7/7      0.087  0.426     47.50s   0.249     212647         $0.0346   $0.2419     unknown
hybrid_method        7/7      0.106  0.646     114.85s  08 [sic]  213449         $0.0358   $0.2505     unknown
clustering_method    7/7      0.143  0.623     135.22s  0.392     231979         $0.0420   $0.2940     gpt-4o-mini
direct_llm_method    7/7      0.085  0.592     33.04s   0.363     80605          $0.0127   $0.0889     gpt-4o-mini
pydantic_ai_method   7/7      0.162  0.459     34.58s   0.292     82415          $0.0133   $0.0928     unknown
sequential_pydantic  7/7      0.080  0.316     140.56s  0.209     82568          $0.0133   $0.0666     unknown
perfect_method       7/7      1.000  1.000     0.00s    1.000     0              $0.0000   $0.0000     unknown
dumb_method          7/7      0.000  0.000     0.00s    0.000     Not available
💾 CACHE PERFORMANCE: Total cache hits: 46, Total cache misses: 17, Cache hit rate: 73.0%
📈 METHOD PERFORMANCE (scores, time, tokens and cost per run are averages)
Method               Success  Exact  Semantic  Time     Matching  Tokens/run     Cost/run  Total cost  Model
unified_method       7/7      0.122  0.700     73.35s   0.440     318206         $0.0508   $0.3558     unknown
original_method      7/7      0.106  0.489     77.57s   0.317     335146         $0.0555   $0.3883     unknown
hybrid_method        7/7      0.093  0.720     528.95s  0.445     335928         $0.0573   $0.4010     unknown
clustering_method    7/7      0.122  0.585     209.73s  0.341     360825         $0.0651   $0.4556     gpt-4o-mini
direct_llm_method    7/7      0.143  0.767     111.73s  0.579     81373          $0.0131   $0.0916     gpt-4o-mini
pydantic_ai_method   7/7      0.075  0.385     208.52s  0.231     94392          $0.0182   $0.0908     unknown
sequential_pydantic  7/7      0.126  0.593     58.05s   0.423     84436          $0.0139   $0.0976     unknown
perfect_method       7/7      1.000  1.000     0.00s    1.000     0              $0.0000   $0.0000     unknown
dumb_method          7/7      0.000  0.000     0.00s    0.000     Not available
💾 CACHE PERFORMANCE: Total cache hits: 0, Total cache misses: 63, Cache hit rate: 0.0%

Trimmed before/after prompt

📈 METHOD PERFORMANCE (scores, time, tokens and cost per run are averages)
Method               Success  Exact  Semantic  Time     Matching  Tokens/run     Cost/run  Total cost  Model
unified_method       7/7      0.069  0.422     25.94s   0.235     410098         $0.0636   $0.4451     unknown
original_method      7/7      0.069  0.375     39.26s   0.200     422211         $0.0666   $0.4661     unknown
hybrid_method        7/7      0.089  0.566     159.75s  0.339     425239         $0.0693   $0.4849     unknown
clustering_method    7/7      0.092  0.551     607.81s  0.322     454201         $0.0789   $0.5523     gpt-4o-mini
direct_llm_method    7/7      0.099  0.474     31.50s   0.244     103817         $0.0161   $0.1126     gpt-4o-mini
pydantic_ai_method   7/7      0.077  0.373     29.15s   0.185     105100         $0.0164   $0.1150     unknown
sequential_pydantic  7/7      0.058  0.346     23.68s   0.168     104385         $0.0161   $0.1127     unknown
perfect_method       7/7      1.000  1.000     0.00s    1.000     0              $0.0000   $0.0000     unknown
dumb_method          7/7      0.000  0.000     0.00s    0.000     Not available
💾 CACHE PERFORMANCE: Total cache hits: 63, Total cache misses: 0, Cache hit rate: 100.0%
📈 METHOD PERFORMANCE (scores, time, tokens and cost per run are averages)
Method               Success  Exact  Semantic  Time     Matching  Tokens/run     Cost/run  Total cost  Model
unified_method       7/7      0.057  0.531     65.45s   0.321     411793         $0.0644   $0.4505     unknown
original_method      7/7      0.050  0.351     56.29s   0.196     422546         $0.0665   $0.4658     unknown
hybrid_method        7/7      0.097  0.635     124.07s  0.412     427415         $0.0701   $0.4908     unknown
clustering_method    7/7      0.083  0.526     194.26s  0.313     459523         $0.0809   $0.5664     gpt-4o-mini
direct_llm_method    7/7      0.076  0.587     459.17s  0.408     104811         $0.0166   $0.1163     gpt-4o-mini
pydantic_ai_method   7/7      0.064  0.299     112.01s  0.146     106761         $0.0168   $0.1010     unknown
sequential_pydantic  7/7      0.067  0.380     27.02s   0.200     105026         $0.0163   $0.1142     unknown
perfect_method       7/7      1.000  1.000     0.00s    1.000     0              $0.0000   $0.0000     unknown
dumb_method          7/7      0.000  0.000     0.00s    0.000     Not available
💾 CACHE PERFORMANCE: Total cache hits: 0, Total cache misses: 63, Cache hit rate: 0.0%
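
The cache hit rates reported in the four runs above are consistent with hits / (hits + misses). A minimal Python sketch of that arithmetic, assuming this is how the benchmark computes the figure; the function name `cache_hit_rate` and the run labels are illustrative and not taken from the benchmark code:

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Return the cache hit rate as a percentage (0.0 when there were no lookups)."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# (hits, misses) copied from the four CACHE PERFORMANCE lines above.
# The labels are hypothetical names for the runs, in the order they appear.
runs = {
    "replaced, run 1": (46, 17),
    "replaced, run 2": (0, 63),
    "trimmed, run 1": (63, 0),
    "trimmed, run 2": (0, 63),
}

for name, (hits, misses) in runs.items():
    print(f"{name}: {cache_hit_rate(hits, misses):.1f}%")
```

Each run totals 63 lookups (9 methods × 7 cases), so a 0.0% run corresponds to a fully cold cache and a 100.0% run to a fully warm one, which is worth keeping in mind when comparing processing times across runs.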

Recommendations