TPCx-AI

Creator
Seonglae Cho
Created
2024 Mar 12 11:47
Edited
2024 Mar 30 13:59

Pipeline

  1. Data Acquisition
  2. Data Transformation
  3. Model Training
  4. Model Inference
  5. Model Scoring
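
The five stages above can be sketched as a small timed pipeline. This is only an illustrative toy, not code from the TPCx-AI kit: the function names (`acquire`, `transform`, `train`, `infer`, `score`, `run_pipeline`), the in-memory data, the least-squares model, and the per-stage timing are all assumptions made for the example.

```python
import time


def acquire():
    # Data acquisition: stand-in for loading generated data (hypothetical values).
    return [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def transform(rows):
    # Data transformation: split (x, y) pairs into feature and label lists.
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    return xs, ys


def train(xs, ys):
    # Model training: least-squares fit of y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def infer(model, xs):
    # Model inference: apply the fitted line to each input.
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def score(preds, ys):
    # Model scoring: mean squared error between predictions and labels.
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def run_pipeline():
    timings = {}

    def timed(name, fn, *args):
        # Record the wall-clock time of one stage under its name.
        start = time.perf_counter()
        result = fn(*args)
        timings[name] = time.perf_counter() - start
        return result

    rows = timed("acquisition", acquire)
    xs, ys = timed("transformation", transform, rows)
    model = timed("training", train, xs, ys)
    preds = timed("inference", infer, model, xs)
    mse = timed("scoring", score, preds, ys)
    return mse, timings
```

Calling `run_pipeline()` returns the score together with a dictionary of stage timings, which keeps each stage separately measurable.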

Project folder structure

  • workload - main use case scripts
    • python - workload scripts for each use case on a single-node system
    • spark - workload scripts for each use case on a multi-node system
    • spark3 - workload scripts for each use case on a multi-node system
  • data-gen
    • config
      • tpcxai-generation.xml - dataset download specification
      • tpcxai-schema.xml - dataset type schema specification
  • tools
    • python - Python setup scripts for a single-node system
      • python.yaml - default Python conda environment file for the spark engine
      • python-ks.yaml - "ks" presumably stands for "kitchen sink", i.e. an environment with additional packages
    • spark - Spark setup scripts for a multi-node system
  • lib - Java jar libraries
    • driver - project config & source code (TensorFlow & Keras based)

    Project scripts

    • setenv.sh - set the necessary environment variables and the scale factor
    • setup-python.sh - create a Python virtual environment for a single-node system
    • setup-spark.sh - create a Python virtual environment for a multi-node system
    • TPCx-AI_Benchmarkrun.sh - run the benchmark
    • TPCx-AI_Validation.sh - run validation
    • Full_TPCx-AI_Benchmarkrun - run validation and benchmark
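
One way to read the script list above is as an ordered run plan. The sketch below only builds that plan as a list of script names and does not execute anything. The ordering (environment, setup, optional validation, benchmark), the `plan_run` helper, and the `"single-node"` / `"multi-node"` labels are assumptions for illustration and are not taken from the kit documentation.

```python
# Hypothetical mapping from system type to the setup script listed above.
SETUP_SCRIPTS = {
    "single-node": "setup-python.sh",
    "multi-node": "setup-spark.sh",
}


def plan_run(system, validate=True):
    """Return the assumed order of scripts for one benchmark run."""
    if system not in SETUP_SCRIPTS:
        raise ValueError(f"unknown system type: {system!r}")
    # Assumed order: environment variables first, then environment setup.
    steps = ["setenv.sh", SETUP_SCRIPTS[system]]
    if validate:
        # Optional validation run before the benchmark itself.
        steps.append("TPCx-AI_Validation.sh")
    steps.append("TPCx-AI_Benchmarkrun.sh")
    return steps
```

For example, `plan_run("multi-node", validate=False)` lists only the environment, Spark setup, and benchmark scripts.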

    PPT

    Report

    Korean

     
     
