User Guide
Introduction
QuOptuna is a comprehensive platform for quantum-enhanced machine learning optimization. This guide walks you through the complete workflow, from dataset selection to model analysis and report generation.
Workflow Overview
The QuOptuna workflow consists of four main stages:
- Dataset Selection - Load and prepare your data
- Optimization - Find the best hyperparameters
- Model Training - Train models with optimized parameters
- SHAP Analysis - Understand and explain model behavior
Getting Started
Installation
Install QuOptuna using uv (recommended) or pip:
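The commands below assume the package is published under the name quoptuna; adjust if the distribution name differs:

```shell
# With uv (recommended) - assumes the package name is "quoptuna"
uv add quoptuna

# Or with pip
pip install quoptuna
```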
Launching the Application
Start the Streamlit interface:
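A typical invocation looks like the following; the script path is an assumption, so check the project README for the exact entry point:

```shell
# Launch the Streamlit UI (app script path is an assumption)
streamlit run app.py
```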
Or using Python:
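Assuming the package exposes a module entry point (the module name below is hypothetical; consult the API documentation for the exact name):

```shell
# Hypothetical module entry point
python -m quoptuna
```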
Dataset Selection
UCI ML Repository
QuOptuna provides easy access to datasets from the UCI Machine Learning Repository:
- Navigate to the Dataset Selection page
- Select UCI ML Repository tab
- Choose from popular datasets or enter a custom UCI ID
- Click Load UCI Dataset
Popular Datasets:
- Statlog (Australian Credit Approval) - ID: 143
- Blood Transfusion Service Center - ID: 176
- Banknote Authentication - ID: 267
- Heart Disease - ID: 45
- Ionosphere - ID: 225
Custom Dataset Upload
To use your own dataset:
- Navigate to the Upload Custom Dataset tab
- Upload a CSV file
- Configure target and feature columns
- Apply target transformation if needed
Data Configuration
Important: QuOptuna requires binary classification targets to be encoded as -1 and 1.
- Select Target Column: Choose the column you want to predict
- Select Features: Choose the features to use for prediction
- Target Transformation: Map your target values to -1 and 1
- Handle Missing Values: QuOptuna will automatically remove rows with missing values
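The -1/1 target transformation can be sketched with pandas; the column names and labels below are illustrative, not QuOptuna's API:

```python
# Minimal sketch of the -1/1 target encoding QuOptuna requires.
import pandas as pd

df = pd.DataFrame({"outcome": ["yes", "no", "yes", "no"]})

# Map the two classes to -1 and 1
mapping = {"no": -1, "yes": 1}
df["target"] = df["outcome"].map(mapping)

print(df["target"].tolist())  # [1, -1, 1, -1]
```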
Click Save Configuration to proceed to the next step.
Data Preparation & Optimization
Data Preparation
Once your dataset is configured:
- Review the dataset summary (rows, columns, target distribution)
- Click Prepare Data for Training
- QuOptuna will automatically:
  - Split data into training and test sets
  - Scale features
  - Convert to the format required by models
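Under standard scikit-learn conventions, these automatic steps look roughly like the following sketch (not QuOptuna's exact internals):

```python
# Sketch of the preparation steps: split, then scale.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))        # feature matrix
y = rng.choice([-1, 1], size=100)    # targets already encoded as -1/1

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale features: fit on the training split only to avoid leakage
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)  # (80, 4) (20, 4)
```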
Hyperparameter Optimization
Configure and run optimization:
- Database Name: Name for storing optimization results
- Study Name: Unique identifier for this optimization study
- Number of Trials: How many hyperparameter combinations to try (recommended: 50-200)
Click Start Optimization to begin. This will:
- Test multiple model types (both quantum and classical)
- Try different hyperparameter combinations
- Track the best performing configurations
Model Types Tested:
- Data Reuploading Classifier (Quantum)
- Circuit-Centric Classifier (Quantum)
- Quantum Kitchen Sinks (Quantum)
- Support Vector Classifier (Classical)
- Multi-Layer Perceptron (Classical)
- And more...
Understanding Results
After optimization completes, you'll see:
- Best Trials: Top performing configurations
- F1 Scores: Performance metrics for quantum and classical approaches
- Hyperparameters: The configuration for each trial
SHAP Analysis & Reporting
Trial Selection
- Navigate to the SHAP Analysis page
- Select a trial from the dropdown (sorted by performance)
- Review the trial details and parameters
Model Training
- Click Train Model to train the selected model
- The model will be trained on your data with the optimized hyperparameters
SHAP Analysis
Configure SHAP analysis:
- Use Probability Predictions: Use probability outputs instead of class predictions
- Use Subset of Data: Analyze a subset for faster computation
- Subset Size: Number of samples to analyze (recommended: 50-100)
Click Run SHAP Analysis to calculate SHAP values.
SHAP Visualizations
QuOptuna provides multiple visualization types:
Bar Plot
Shows the mean absolute SHAP value for each feature, indicating overall importance.
Use Case: Quick overview of feature importance
Beeswarm Plot
Shows the distribution of SHAP values, with color indicating feature value (red = high, blue = low).
Use Case: Understanding how feature values affect predictions
Violin Plot
Shows the distribution of SHAP values for each feature.
Use Case: Understanding the variability in feature impact
Heatmap
Shows SHAP values for individual instances.
Use Case: Instance-level analysis, finding patterns in predictions
Waterfall Plot
Explains how features contribute to a single prediction.
Use Case: Understanding individual predictions in detail
Confusion Matrix
Shows classification performance.
Use Case: Evaluating overall model accuracy
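The confusion matrix itself follows the standard scikit-learn convention; a minimal sketch with the -1/1 labels QuOptuna uses:

```python
# Minimal confusion-matrix sketch with -1/1 encoded labels.
from sklearn.metrics import confusion_matrix

y_true = [-1, -1, 1, 1, 1, -1]
y_pred = [-1, 1, 1, 1, -1, -1]

cm = confusion_matrix(y_true, y_pred, labels=[-1, 1])
print(cm)
# Rows are true classes, columns are predicted classes:
# [[2 1]
#  [1 2]]
```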
Report Generation
Generate comprehensive AI-powered reports:
- Select LLM Provider: Google (Gemini), OpenAI (GPT), or Anthropic (Claude)
- Enter API Key: Your API key for the selected provider
- Model Name: Specific model to use (e.g., "models/gemini-2.0-flash-exp")
- Dataset Information (optional): Add context about your dataset
Click Generate Report to create a detailed analysis report.
Report Includes:
- Performance metrics analysis
- SHAP value interpretation
- Feature importance ranking
- Risk and fairness assessment
- Governance recommendations
Best Practices
Optimization
- Start Small: Begin with 50-100 trials to get quick results
- Increase Gradually: Use 100-200 trials for production models
- Monitor Performance: Check both quantum and classical model scores
- Save Studies: Use descriptive names for databases and studies
SHAP Analysis
- Use Subsets: Analyze 50-100 samples for faster computation
- Multiple Plots: Generate several plot types for comprehensive understanding
- Document Findings: Save plots and reports for future reference
- Understand Context: Consider domain knowledge when interpreting SHAP values
Report Generation
- Provide Context: Add dataset URL and description for better AI insights
- Choose Appropriate Models:
  - Fast models (Gemini Flash): Quick exploratory reports
  - Advanced models (GPT-4, Gemini Pro): Detailed production reports
- Review Carefully: AI-generated reports should be reviewed by domain experts
Troubleshooting
Common Issues
Dataset Loading Fails
- Check that the UCI dataset ID is correct
- Ensure the CSV file is properly formatted
- Verify the file encoding (UTF-8 recommended)

Optimization Errors
- Ensure the data has no missing values
- Check that the target column has exactly 2 unique values
- Verify there are sufficient samples for the train/test split

SHAP Analysis Slow
- Reduce the subset size
- Use simpler model types
- Check available memory

Report Generation Fails
- Verify the API key is valid
- Check your internet connection
- Ensure the model name is correct
- Try a different LLM provider
Advanced Features
Loading Previous Studies
You can load and analyze previously run optimizations:
- Go to the Optimization page
- Enter the database name and study name
- Click Load Optimizer
- Results will be available for analysis
Batch Processing
For multiple datasets, you can:
1. Use the Python API directly (see API documentation)
2. Script the workflow using QuOptuna classes
3. Save results to different databases
Custom Models
Advanced users can integrate custom models by:
1. Following the model interface in quoptuna.backend.models
2. Adding model configurations to the optimizer
3. Consulting the API documentation for details
Next Steps
- Explore the API Documentation for programmatic usage
- Check out Examples for common use cases
- Contribute on GitHub
Support
- GitHub Issues: Report bugs or request features
- Documentation: Full documentation
- Community: Join our discussions on GitHub