Overview

The Quantum Thermochemical API provides programmatic access to a collection of quantum chemistry computational resources based on the W4-11 dataset. It enables researchers to:

Retrieve basis sets used for high-accuracy quantum thermochemical computations.
Submit and query research results related to molecular ground state energy calculations.
Explore community-contributed results, including accuracy metrics and metadata.

The API is designed to facilitate reproducible computational chemistry research and allow comparison of different algorithmic approaches to molecular energy computation. Key features include:

Basis Retrieval: Access curated computational basis sets for W4-11 benchmark calculations.
User Submissions: Submit algorithmic results and retrieve public research submissions.
Flexible Queries: Search by DOI, page, or submission date and aggregate results.

Endpoints

1. Fetching Basis Sets

`GET /entries/:basis_name`

Description: Retrieves the specified basis set as a downloadable file.
Parameters: basis_name is the name of the basis set to download. A comprehensive list of basis sets can be found at BasisSetExchange.org; to check whether a specific basis is offered, refer to our supported bases.
Response: Returns a basis file formatted as follows:
- Each line of the file is an individual JSON object
- The first line contains the basis name, formatted as {$basis: "basis_name"}
- The remaining lines are JSON objects, one for each of the 152 supported species:
  {name: String, ecore: int, nelecas: int, ncas: int, h1e: Tensor, h2e: Tensor, cct2: Tensor}
  - name: String - identifier for the molecule
  - ecore: int - number of electrons in the core
  - nelecas: int - number of active electrons
  - ncas: int - number of configuration state functions
  - h1e: Tensor - one-electron integrals (kinetic + nuclear attraction)
  - h2e: Tensor - two-electron integrals (Coulomb repulsion)
  - cct2: Tensor - two-particle contraction tensor
- All tensors are flattened and base64-encoded. Each is returned as a struct containing an integer list giving the shape of the tensor and a base64 string holding its data: {shape: int[], data: String}
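
The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not the service's reference decoder. It assumes that each line is strict JSON (quoted keys) and that tensor payloads are little-endian float64 values in row-major order; neither detail is stated above, so check both against a real file. The helper names `decode_tensor` and `parse_basis_lines` are hypothetical, and the sample record is made up for the round trip.

```python
import base64
import json
import math
import struct


def decode_tensor(tensor):
    """Decode {shape: int[], data: base64 str} into (shape, flat list of floats).

    Assumption: the payload is little-endian float64 in row-major order.
    """
    shape = tuple(tensor["shape"])
    raw = base64.b64decode(tensor["data"])
    count = math.prod(shape)
    if len(raw) != 8 * count:
        raise ValueError(f"expected {8 * count} bytes for shape {shape}, got {len(raw)}")
    return shape, list(struct.unpack(f"<{count}d", raw))


def parse_basis_lines(lines):
    """Split the line-delimited file into the basis name and per-species records."""
    records = [json.loads(line) for line in lines if line.strip()]
    header, species = records[0], records[1:]
    return header["$basis"], {rec["name"]: rec for rec in species}


# Round trip on a made-up 2x2 tensor (placeholder values, not real API data).
values = [0.0, 1.5, -2.25, 3.0]
sample = {
    "shape": [2, 2],
    "data": base64.b64encode(struct.pack("<4d", *values)).decode("ascii"),
}
sample_lines = [
    json.dumps({"$basis": "sto6g"}),
    json.dumps({"name": "demo", "ecore": 0, "nelecas": 2, "ncas": 2, "h1e": sample}),
]
basis_name, species = parse_basis_lines(sample_lines)
shape, flat = decode_tensor(species["demo"]["h1e"])
```

If NumPy is available, `numpy.frombuffer(raw, dtype="<f8").reshape(shape)` would give the same data as an array under the same dtype assumption.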

2. User Submissions

`POST /submission`

Description: Submit research results for review
Request Body: Contains the DOI, email and ground state energy data computed: {doi: String, email: String, data: {...species: float}}
- DOI - Valid digital object identifier available for cross-reference on Crossref REST API
- Email - A valid email to be contacted for submission confirmation and subsequent notifications
- Data - Map of string-float key value pairs, containing every species in the provided data set and its computed ground state energy
Response: JSON message on completion of submission verification and confirmation email
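
As a rough sketch of how a client might assemble the request body described above, the snippet below builds the payload and wraps it in a POST request using only the standard library. `BASE_URL` is a placeholder because the host is not given here, the helper names are hypothetical, and the DOI, email and energies are made-up values. The DOI check is only a shallow sanity test, not the Crossref lookup the service performs, and a real submission must include every species in the data set.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder: the real host is not stated here


def build_submission(doi, email, energies):
    """Return the {doi, email, data} body, with energies coerced to float."""
    if not (doi.startswith("10.") and "/" in doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    if "@" not in email:
        raise ValueError(f"not a plausible email address: {email!r}")
    data = {species: float(energy) for species, energy in energies.items()}
    return {"doi": doi, "email": email, "data": data}


def make_request(payload, base_url=BASE_URL):
    """Wrap the payload in a JSON POST request to /submission (not sent here)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/submission",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
payload = build_submission(
    "10.1000/example",
    "researcher@example.org",
    {"acetaldehyde": -153.0, "demo-species": -1},
)
request = make_request(payload)
# Sending would be: urllib.request.urlopen(request)
```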

`GET /submission/fetch?...`

Description: Multipurpose endpoint for querying public submission data
Query Parameters: There are 2 primary types of requests, SINGLE and PAGE. PAGE requests can additionally be sorted and filtered:
- SINGLE - Fetches individual submission data based on their DOI identifiers.
  - Request: GET /submission/fetch?type=single&doi=x,y,z,...
  - Description: Returns a JSON list of the entries whose DOI matches x, y, or z. Each entry contains the DOI and the submission info, which holds the submitted data and its accuracy. The data is a map from species to a two-element float tuple: the first element is the submitted value and the second is its error relative to the precomputed ground state energy
  - Response: [{doi: String, info: {data: { ...species: [value, absolute-error] }, accuracy: mean-absolute-error}}, ...]
- PAGE - Fetches submissions in bulk, responding with DOI, data and accuracy information. The maximum number of submissions per page is 200, and both page and limit must be positive integers.
  - Request: GET /submission/fetch?type=page&page=x&limit=y
  - Description: Returns a struct containing the page number x, the limit of submissions per page y, the total number of submissions, the total number of pages for y entries per page, and the result for the x'th page of y entries as a list of submissions.
  - Response: {page: int, limit: int, total: int, totalPages: int, results: [ same format as single query ]}
- SORT & FILTER - Provides query functionality for sorting and filtering submissions by page.
  - Fields: The following fields are supported for both sorting and filtering:
    - DOI
    - Date
    - Accuracy
  - Sorting: Queries can be sorted by one or more comma-separated fields, applied in the order given. The direction is specified with an :asc or :desc suffix on each field.
    Example: ?type=page&sort=date:desc,accuracy:desc yields pages sorted first by submission time, then by accuracy.
  - Filtering: Queries can be filtered
Defaults:
- An unspecified query to "/submission/fetch" is equivalent to "type=page&page=1&limit=15"
- A partially specified query to "/submission/fetch?type=single" (one without a doi) always returns a 404 with no items found
- A partially specified query to "/submission/fetch?type=page" will default to page 1 of 15 entries
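
The query forms above can be assembled programmatically. The sketch below builds the path for single, page and sorted requests, then walks every page using the `totalPages` field of the response. The names `fetch_path` and `iter_submissions` are hypothetical, and the lowercase field name `doi` for sorting is an assumption (only `date` and `accuracy` appear in the sort example). The HTTP call is injected as a callable, so the sketch makes no assumption about the host or client library.

```python
from urllib.parse import urlencode

MAX_LIMIT = 200  # documented maximum number of submissions per page


def fetch_path(kind=None, *, dois=None, page=None, limit=None, sort=None):
    """Build a /submission/fetch path.

    sort is a list of (field, order) pairs, where order is "asc" or "desc".
    """
    params = {}
    if kind is not None:
        params["type"] = kind
    if dois:
        params["doi"] = ",".join(dois)
    if page is not None:
        if page < 1:
            raise ValueError("page must be a positive integer")
        params["page"] = page
    if limit is not None:
        if not 1 <= limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
        params["limit"] = limit
    if sort:
        params["sort"] = ",".join(f"{field}:{order}" for field, order in sort)
    # Keep ',', ':' and '/' literal so the path matches the documented forms.
    query = urlencode(params, safe=",:/")
    return "/submission/fetch" + (f"?{query}" if query else "")


def iter_submissions(get_json, limit=15):
    """Yield every submission by requesting pages 1..totalPages in turn.

    get_json is any callable that maps a path to the decoded JSON response.
    """
    page = 1
    while True:
        body = get_json(fetch_path("page", page=page, limit=limit))
        yield from body["results"]
        if page >= body["totalPages"]:
            break
        page += 1
```

For example, `fetch_path("page", sort=[("date", "desc"), ("accuracy", "desc")])` produces the sorted query from the sorting example, and `fetch_path()` produces the bare path that falls back to the documented defaults.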

Examples

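Default Fetch:
GET /submission/fetch

Paginated Fetch:
GET /submission/fetch?type=page&page=x&limit=y

Specific DOI Fetch:
GET /submission/fetch?type=single&doi=x,y,z,...

Timeline Fetch:
GET /submission/fetch?type=page&sort=date:desc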

Overview

Python Package

The W4Benchmark Python library is designed for benchmarking algorithms on the W4-11 dataset. It provides a consistent framework for controlled experiments, helping ensure results are reproducible and comparable. It supports both decorator-based workflows (for quick setup and automated iteration over all molecules) and manual workflows (for fine-grained control over computation).

Repository: w4benchmark on GitHub
Deployment: w4benchmark on PyPI

Installation

Install the package from PyPI with `pip install w4benchmark`

Requirements:

Python 3.6+ (note that numpy 2.2.4 itself requires Python 3.10 or newer)
numpy >= 2.2.4
requests >= 2.32.3

Usage

This package provides two main decorators:

@W4Decorators.process(...): For processing each molecule (e.g., computing energies)
@W4Decorators.analyze(...): For analyzing the results (e.g., comparing predictions to reference data)

Each decorated function must accept exactly two parameters: the molecule name (str) and a Molecule object.

Each decorator accepts a list of runtime parameters, with a predefined set of variables set by default at decoration time:

geominfo_url: the location of the species geometry data file
resources_url: specifies the root of the resources directory for finding cached basis sets, geometries and more
api_url: the URL to query for basis set info if the specified basis is not found in resources directory
basis: the basis information to parse from resource directory (defaults to "sto6g", and if set to a value not found in resources will query from `api_url`)
debug: the debug level for logging output (defaults to `logging.WARNING`)

All parameters can be queried and modified at runtime via the `W4.parameters` object. Arbitrary parameters can also be added to keep track of other runtime specifics (e.g. `@W4Decorators.process(printValues=True)` adds a field `printValues` to `W4.parameters` with a value of `True`).

Example Script:

Create a script like compute.py:

import logging

from w4benchmark import W4Decorators, Molecule, W4

@W4Decorators.process(basis="sto6g", debug=logging.DEBUG)
def compute_energy(name: str, mol: Molecule):
    # Replace with real computation
    pass

@W4Decorators.analyze(basis="sto6g")
def analyze_results(name: str, mol: Molecule):
    # Replace with real analytics
    pass

Then run from the command line with either:
python compute.py --process
or:
python compute.py --analyze

Each command will iterate over every molecule in the dataset and apply the corresponding decorated function.

The GitHub repository contains more examples of library usage.

Advanced

Manual Iteration

If you want full control, you can manually run the W4 benchmark from within a __main__ block:

from w4benchmark import W4

if __name__ == '__main__':
    W4.parameters.basis = "sto6g"  # Set runtime parameters
    W4.init() # manual execution requires the .init() function to be called

    # Example usage
    for name, mol in W4:
        print(f"{name}: spin = {mol.spin}, charge = {mol.charge}")

This allows you to iterate through the dataset like any other iterable. You can also index W4 by species name to select a single molecule object (e.g. W4["acetaldehyde"]).

Multiple Decorations

You can apply the same decorator to multiple functions to group computations. This is useful when an algorithm is composed of sequential steps, since each decorated function runs in order. Using multiple decorated functions can make the data flow easier to follow and visually distinct.

For example, here’s how you might calculate the root-mean-square (RMS) radius of a molecule using two sequential @process functions:

from w4benchmark import W4Decorators, Molecule
import math

centroid = {}

@W4Decorators.process()
def process_a(species: str, mol: Molecule):
    coords = [pos for _, pos in mol.geom]
    centroid[species] = tuple(sum(values) / len(coords) for values in zip(*coords))

def dist2(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

@W4Decorators.process()
def process_b(species: str, mol: Molecule):
    dist_sq = [dist2(pos, centroid[species]) for _, pos in mol.geom]
    rms_radius = math.sqrt(sum(dist_sq) / len(dist_sq))
    print(f'species: "{species}", centroid: {centroid[species]}, rms: {rms_radius}')

All functions decorated with the same decorator (@W4Decorators.process in this case) will run sequentially, enabling advanced workflows that act on shared results. In this case, the centroid dict will be completely filled before any RMS calculations take place, which can help with debugging intermediary values.

Dataset Attribution

This tool builds on the W4-11 Dataset

Goerigk, L., & Grimme, S. (2011).
A thorough benchmark of density functional methods for general main group thermochemistry, kinetics, and noncovalent interactions.
Phys. Chem. Chem. Phys., 13, 6670–6688.
https://doi.org/10.1039/C0CP02984J

License

The library is licensed under CC BY-NC 4.0

You may use and adapt it for non-commercial purposes with proper attribution.

See the LICENSE for details
