API Documentation

This part of the documentation is automatically generated from the ChemSpiPy source code and comments.

chemspipy.api

Core API for interacting with ChemSpider web services.

class chemspipy.api.ChemSpider[source]

Provides access to the ChemSpider API.

Usage:

>>> from chemspipy import ChemSpider
>>> cs = ChemSpider('<YOUR-API-KEY>')
Parameters:
  • api_key (string) – Your ChemSpider API key.
  • user_agent (string) – (Optional) Identify your application to ChemSpider servers.
  • api_url (string) – (Optional) API server. Default https://api.rsc.org.
  • api_version (string) – (Optional) API version. Default v1.
request(method, api, namespace, endpoint, params=None, json=None)[source]

Make a request to the ChemSpider API.

Parameters:
  • method (string) – HTTP method.
  • api (string) – Top-level API, e.g. compounds.
  • namespace (string) – API namespace, e.g. filter, lookups, records, or tools.
  • endpoint (string) – Web service endpoint URL.
  • params (dict) – Query parameters to add to the URL.
  • json (dict) – JSON data to send in the request body.
Returns:

Web Service response JSON.

Return type:

dict
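
The request parameters above suggest how a full endpoint URL could be assembled. The sketch below is only an illustration: the helper name build_url and the path layout (api_url, then api, api_version, namespace, endpoint) are assumptions, not something this page confirms about the library internals.

```python
def build_url(api, namespace, endpoint,
              api_url="https://api.rsc.org", api_version="v1"):
    """Join the request pieces into one URL.

    Hypothetical helper: the ordering of the path segments is an assumption.
    The defaults mirror the documented api_url and api_version defaults.
    """
    parts = [api_url, api, api_version, namespace, endpoint]
    # Strip stray slashes so the segments join cleanly.
    return "/".join(part.strip("/") for part in parts)


print(build_url("compounds", "filter", "name"))
```

Under that assumed layout, the compounds API, the filter namespace and the name endpoint end up in a single path.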

get(api, namespace, endpoint, params=None)[source]

Convenience method for making GET requests.

Parameters:
  • api (string) – Top-level API, e.g. compounds.
  • namespace (string) – API namespace, e.g. filter, lookups, records, or tools.
  • endpoint (string) – Web service endpoint URL.
  • params (dict) – Query parameters to add to the URL.
Returns:

Web Service response JSON.

Return type:

dict

post(api, namespace, endpoint, json=None)[source]

Convenience method for making POST requests.

Parameters:
  • api (string) – Top-level API, e.g. compounds.
  • namespace (string) – API namespace, e.g. filter, lookups, records, or tools.
  • endpoint (string) – Web service endpoint URL.
  • json (dict) – JSON data to send in the request body.
Returns:

Web Service response content.

Return type:

dict or string

get_compound(csid)[source]

Return a Compound object for a given ChemSpider ID.

Parameters:csid (string|int) – ChemSpider ID.
Returns:The Compound with the specified ChemSpider ID.
Return type:Compound
get_compounds(csids)[source]

Return a list of Compound objects, given a list of ChemSpider IDs.

Parameters:csids (list[string|int]) – List of ChemSpider IDs.
Returns:List of Compounds with the specified ChemSpider IDs.
Return type:list[Compound]
search(query, order=None, direction='ascending', raise_errors=False)[source]

Search ChemSpider for the specified query and return the results.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • query (string|int) – Search query.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
  • raise_errors (bool) – (Optional) If True, raise exceptions. If False, store them on the Results exception property.
Returns:

Search Results list.

Return type:

Results

get_datasources()[source]

Get the list of datasources in ChemSpider.

Many other endpoints let you restrict which sources are used to look up the requested query. Restricting the sources makes queries faster.

Returns:List of datasources.
Return type:list[string]
get_details(record_id, fields=['SMILES', 'Formula', 'AverageMass', 'MolecularWeight', 'MonoisotopicMass', 'NominalMass', 'CommonName', 'ReferenceCount', 'DataSourceCount', 'PubMedCount', 'RSCCount', 'Mol2D', 'Mol3D'])[source]

Get details for a compound record.

The available fields are listed in FIELDS.

Parameters:
  • record_id (int) – Record ID.
  • fields (list[string]) – (Optional) List of fields to include in the result.
Returns:

Record details.

Return type:

dict

get_details_batch(record_ids, fields=['SMILES', 'Formula', 'AverageMass', 'MolecularWeight', 'MonoisotopicMass', 'NominalMass', 'CommonName', 'ReferenceCount', 'DataSourceCount', 'PubMedCount', 'RSCCount', 'Mol2D', 'Mol3D'])[source]

Get details for a list of compound records.

The available fields are listed in FIELDS.

Parameters:
  • record_ids (list[int]) – List of record IDs (up to 100).
  • fields (list[string]) – (Optional) List of fields to include in the results.
Returns:

List of record details.

Return type:

list[dict]
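
Because record_ids is limited to 100 IDs per call, a longer list has to be split before it is passed to get_details_batch. A minimal sketch of that splitting follows; the helper name chunked is hypothetical and not part of ChemSpiPy.

```python
def chunked(record_ids, size=100):
    """Split record IDs into consecutive lists of at most `size` items.

    Hypothetical helper; 100 matches the documented per-request limit.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    ids = list(record_ids)
    return [ids[i:i + size] for i in range(0, len(ids), size)]


batches = chunked(range(1, 251))
print([len(batch) for batch in batches])
```

Each batch could then be passed to get_details_batch in turn and the returned lists concatenated.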

get_external_references(record_id, datasources=None)[source]

Get external references for a compound record.

Optionally filter the results by data source. Use get_datasources() to get the available datasources.

Parameters:
  • record_id (int) – Record ID.
  • datasources (list[string]) – (Optional) List of datasources to restrict the results to.
Returns:

External references.

Return type:

list[dict]

get_image(record_id)[source]

Get image for a compound record.

Parameters:record_id (int) – Record ID.
Returns:Image.
Return type:bytes
get_mol(record_id)[source]

Get MOLfile for a compound record.

Parameters:record_id (int) – Record ID.
Returns:MOLfile.
Return type:string
filter_element(include_elements, exclude_elements=None, include_all=False, complexity=None, isotopic=None, order=None, direction=None)[source]

Search compounds by element.

Set include_all to True to consider only records that contain all of the elements in include_elements; otherwise, all records that contain any of the elements are returned.

A compound with a complexity of ‘multiple’ has more than one disconnected system in it or a metal atom or ion.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • include_elements (list[string]) – List of up to 15 elements that matching compounds may contain.
  • exclude_elements (list[string]) – (Optional) List of up to 100 elements that matching compounds must not contain.
  • include_all (bool) – (Optional) Whether to only include compounds that have all include_elements.
  • complexity (string) – (Optional) ‘any’, ‘single’, or ‘multiple’.
  • isotopic (string) – (Optional) ‘any’, ‘labeled’, or ‘unlabeled’.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_status and filter_results.

Return type:

string
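
The list-size limits above can be checked locally before a request is sent. The following is a rough sketch; element_filter_args is a hypothetical helper, and it checks only the two documented limits (15 included and 100 excluded elements).

```python
def element_filter_args(include_elements, exclude_elements=None,
                        include_all=False):
    """Validate list sizes and return keyword arguments for filter_element.

    Hypothetical helper; only the documented size limits are enforced.
    """
    include = list(include_elements)
    exclude = list(exclude_elements or [])
    if len(include) > 15:
        raise ValueError("include_elements accepts at most 15 elements")
    if len(exclude) > 100:
        raise ValueError("exclude_elements accepts at most 100 elements")
    return {
        "include_elements": include,
        "exclude_elements": exclude,
        "include_all": include_all,
    }


kwargs = element_filter_args(["C", "H", "O"], ["N"], include_all=True)
print(kwargs)
```

The resulting dict is meant to be unpacked into a call such as cs.filter_element(**kwargs), where cs is a ChemSpider instance.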

filter_formula(formula, datasources=None, order=None, direction=None)[source]

Search compounds by formula.

Optionally filter the results by data source. Use get_datasources() to get the available datasources.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • formula (string) – Molecular formula.
  • datasources (list[string]) – (Optional) List of datasources to restrict the results to.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_status and filter_results.

Return type:

string

filter_formula_batch(formulas, datasources=None, order=None, direction=None)[source]

Search compounds with a list of formulas.

Optionally filter the results by data source. Use get_datasources() to get the available datasources.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • formulas (list[string]) – List of molecular formulas.
  • datasources (list[string]) – (Optional) List of datasources to restrict the results to.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_formula_batch_status and filter_formula_batch_results.

Return type:

string

filter_formula_batch_status(query_id)[source]

Get formula batch filter status using a query ID that was returned by a previous filter request.

Parameters:query_id (string) – Query ID from a previous formula batch filter request.
Returns:Status dict with ‘status’, ‘count’, and ‘message’ fields.
Return type:dict
filter_formula_batch_results(query_id)[source]

Get formula batch filter results using a query ID that was returned by a previous filter request.

Each result is a dict containing a formula key and a results key.

Parameters:query_id (string) – Query ID from a previous formula batch filter request.
Returns:List of results.
Return type:list[dict]
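
Since each result is described as a dict with a formula key and a results key, the returned list can be re-indexed by formula for easier lookup. This is a small sketch with invented sample data; results_by_formula is a hypothetical helper and the record IDs shown are placeholders.

```python
def results_by_formula(batch_results):
    """Map each formula to its list of results.

    Assumes every item has the 'formula' and 'results' keys described above.
    """
    return {item["formula"]: item["results"] for item in batch_results}


# Invented sample data in the documented shape; the IDs are placeholders.
sample = [
    {"formula": "C2H6", "results": [1, 2]},
    {"formula": "C3H8", "results": [3]},
]
print(results_by_formula(sample))
```
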
filter_inchi(inchi)[source]

Search compounds by InChI.

Parameters:inchi (string) – InChI.
Returns:Query ID that may be passed to filter_status and filter_results.
Return type:string
filter_inchikey(inchikey)[source]

Search compounds by InChIKey.

Parameters:inchikey (string) – InChIKey.
Returns:Query ID that may be passed to filter_status and filter_results.
Return type:string
filter_intrinsicproperty(formula=None, molecular_weight=None, nominal_mass=None, average_mass=None, monoisotopic_mass=None, molecular_weight_range=None, nominal_mass_range=None, average_mass_range=None, monoisotopic_mass_range=None, complexity=None, isotopic=None, order=None, direction=None)[source]

Search compounds by intrinsic property, such as formula and mass.

At least one of formula, molecular_weight, nominal_mass, average_mass, monoisotopic_mass must be specified.

A compound with a complexity of ‘multiple’ has more than one disconnected system in it or a metal atom or ion.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • formula (string) – Molecular formula.
  • molecular_weight (float) – Molecular weight.
  • nominal_mass (float) – Nominal mass.
  • average_mass (float) – Average mass.
  • monoisotopic_mass (float) – Monoisotopic mass.
  • molecular_weight_range (float) – Molecular weight range.
  • nominal_mass_range (float) – Nominal mass range.
  • average_mass_range (float) – Average mass range.
  • monoisotopic_mass_range (float) – Monoisotopic mass range.
  • complexity (string) – (Optional) ‘any’, ‘single’, or ‘multiple’.
  • isotopic (string) – (Optional) ‘any’, ‘labeled’, or ‘unlabeled’.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_status and filter_results.

Return type:

string

filter_mass(mass, mass_range, datasources=None, order=None, direction=None)[source]

Search compounds by mass.

Filter to compounds within mass_range of the given mass.

Optionally filter the results by data source. Use get_datasources() to get the available datasources.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • mass (float) – Mass between 1 and 11000 Atomic Mass Units.
  • mass_range (float) – Mass range between 0.0001 and 100 Atomic Mass Units.
  • datasources (list[string]) – (Optional) List of datasources to restrict the results to.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_status and filter_results.

Return type:

string
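
The documented bounds for mass and mass_range can be checked before calling filter_mass. The sketch below assumes both bounds are inclusive, which this page does not state; check_mass_query is a hypothetical helper.

```python
def check_mass_query(mass, mass_range):
    """Raise ValueError if either value falls outside the documented bounds.

    Assumption: the bounds (1 to 11000 and 0.0001 to 100) are inclusive.
    """
    if not 1 <= mass <= 11000:
        raise ValueError("mass must be between 1 and 11000")
    if not 0.0001 <= mass_range <= 100:
        raise ValueError("mass_range must be between 0.0001 and 100")
    return float(mass), float(mass_range)


print(check_mass_query(180.04, 0.001))
```
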

filter_mass_batch(masses, datasources=None, order=None, direction=None)[source]

Search compounds with a list of masses and mass ranges.

The masses parameter should be a list of tuples, each with two elements, a mass and a mass range:

qid = cs.filter_mass_batch(masses=[(12, 0.001), (24, 0.001)])

Optionally filter the results by data source. Use get_datasources() to get the available datasources.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • masses (list[tuple[float, float]]) – List of (mass, range) tuples.
  • datasources (list[string]) – (Optional) List of datasources to restrict the results to.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_mass_batch_status and filter_mass_batch_results.

Return type:

string
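
One common way to build the (mass, range) tuples is from a relative tolerance in parts per million. The sketch below shows that arithmetic only; mass_windows and the ppm parameter are hypothetical and not part of ChemSpiPy.

```python
def mass_windows(masses, ppm):
    """Pair each mass with an absolute range derived from a ppm tolerance.

    range = mass * ppm / 1e6 (hypothetical helper for illustration).
    """
    return [(float(m), float(m) * ppm / 1e6) for m in masses]


windows = mass_windows([100.0, 250.0], ppm=10)
print(windows)
```

The resulting list has the shape expected by the masses parameter, for example cs.filter_mass_batch(masses=windows) with cs being a ChemSpider instance.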

filter_mass_batch_status(query_id)[source]

Get mass batch filter status using a query ID that was returned by a previous filter request.

Parameters:query_id (string) – Query ID from a previous mass batch filter request.
Returns:Status dict with ‘status’, ‘count’, and ‘message’ fields.
Return type:dict
filter_mass_batch_results(query_id)[source]

Get mass batch filter results using a query ID that was returned by a previous filter request.

Each result is a dict containing a formula key and a results key.

Parameters:query_id (string) – Query ID from a previous mass batch filter request.
Returns:List of results.
Return type:list[dict]
filter_name(name, order=None, direction=None)[source]

Search compounds by name.

The accepted values for order are: RECORD_ID, MASS_DEFECT, MOLECULAR_WEIGHT, REFERENCE_COUNT, DATASOURCE_COUNT, PUBMED_COUNT or RSC_COUNT.

Parameters:
  • name (string) – Compound name.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Query ID that may be passed to filter_status and filter_results.

Return type:

string

filter_smiles(smiles)[source]

Search compounds by SMILES.

Parameters:smiles (string) – Compound SMILES.
Returns:Query ID that may be passed to filter_status and filter_results.
Return type:string
filter_status(query_id)[source]

Get filter status using a query ID that was returned by a previous filter request.

Parameters:query_id (string) – Query ID from a previous filter request.
Returns:Status dict with ‘status’, ‘count’, and ‘message’ fields.
Return type:dict
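
The filter methods return a query ID, so a caller typically polls filter_status until the query finishes. The following is a sketch of such a loop under stated assumptions: the status strings 'Complete', 'Failed', 'Suspended' and 'Unknown' are guesses not confirmed by this page, and wait_for_filter is a hypothetical helper that accepts any object exposing filter_status.

```python
import time


def wait_for_filter(cs, query_id, poll_interval=0.5, timeout=30.0,
                    done=("Complete",),
                    failed=("Failed", "Suspended", "Unknown"),
                    sleep=time.sleep):
    """Poll cs.filter_status(query_id) until it reports a finished query.

    The `done` and `failed` status strings are assumptions; adjust them to
    whatever the service actually returns.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = cs.filter_status(query_id)
        if status["status"] in done:
            return status
        if status["status"] in failed:
            raise RuntimeError(status.get("message") or status["status"])
        if time.monotonic() >= deadline:
            raise TimeoutError("filter query did not finish in time")
        sleep(poll_interval)
```

Once the helper returns, the same query ID can be handed to filter_results.
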
filter_results(query_id, start=None, count=None)[source]

Get filter results using a query ID that was returned by a previous filter request.

Parameters:
  • query_id (string) – Query ID from a previous filter request.
  • start (int) – Zero-based results offset.
  • count (int) – Number of results to return.
Returns:

List of results.

Return type:

list[int]
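
The start and count parameters allow large result sets to be fetched page by page. A minimal sketch of computing the (start, count) pairs follows; page_ranges is a hypothetical helper, and the total would typically come from the 'count' field of filter_status.

```python
def page_ranges(total, page_size):
    """Return (start, count) pairs covering `total` results.

    `start` is zero-based, matching the documented start parameter.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [(start, min(page_size, total - start))
            for start in range(0, total, page_size)]


print(page_ranges(25, 10))
```

Each pair could then be passed on as cs.filter_results(query_id, start=start, count=count), where cs is a ChemSpider instance.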

filter_results_sdf(query_id)[source]

Get filter results as SDF file using a query ID that was returned by a previous filter request.

Parameters:query_id (string) – Query ID from a previous filter request.
Returns:SDF file containing the results.
Return type:bytes
convert(input, input_format, output_format)[source]

Convert a chemical from one format to another.

Supported formats: SMILES, InChI, InChIKey or Mol.

Allowed conversions: from InChI to InChIKey, from InChI to Mol file, from InChI to SMILES, from InChIKey to InChI, from InChIKey to Mol file, from Mol file to InChI, from Mol file to InChIKey, from SMILES to InChI.

Parameters:
  • input (string) – Input chemical.
  • input_format (string) – Input format.
  • output_format (string) – Output format.
Returns:

Input chemical in output format.

Return type:

string
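
The allowed conversions listed above can be captured in a lookup table to check a format pair before calling convert. This is a sketch only; ALLOWED_CONVERSIONS and can_convert are hypothetical names, and the format strings follow the spelling used in the list above.

```python
# (input_format, output_format) pairs taken from the list of allowed conversions.
ALLOWED_CONVERSIONS = {
    ("InChI", "InChIKey"),
    ("InChI", "Mol"),
    ("InChI", "SMILES"),
    ("InChIKey", "InChI"),
    ("InChIKey", "Mol"),
    ("Mol", "InChI"),
    ("Mol", "InChIKey"),
    ("SMILES", "InChI"),
}


def can_convert(input_format, output_format):
    """Return True if the pair appears in the documented list."""
    return (input_format, output_format) in ALLOWED_CONVERSIONS


print(can_convert("SMILES", "InChI"))
print(can_convert("SMILES", "InChIKey"))
```

SMILES to InChIKey is not in the list, but it can be reached in two steps, SMILES to InChI and then InChI to InChIKey.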

validate_inchikey(inchikey)[source]

Return whether inchikey is valid.

Parameters:inchikey (string) – The InChIKey to validate.
Returns:Whether the InChIKey is valid.
Return type:bool
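
A cheap local check of the InChIKey layout can catch obvious typos before validate_inchikey is called. The sketch below tests only the textual shape (14, 10 and 1 uppercase letters separated by hyphens); it is not the validation performed by validate_inchikey, and looks_like_inchikey is a hypothetical helper.

```python
import re

# 14 uppercase letters, hyphen, 10 uppercase letters, hyphen, 1 uppercase letter.
_INCHIKEY_SHAPE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text):
    """Return True if `text` has the 27-character InChIKey layout."""
    return _INCHIKEY_SHAPE.fullmatch(text) is not None


print(looks_like_inchikey("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
```
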
get_databases()[source]

Get the list of datasources in ChemSpider.

Deprecated since version 2.0.0: Use get_datasources() instead.

get_extended_compound_info(csid)[source]

Get extended record details for a CSID.

Deprecated since version 2.0.0: Use get_details() instead.

Parameters:csid (string|int) – ChemSpider ID.
get_extended_compound_info_list(csids)[source]

Get extended record details for a list of CSIDs.

Deprecated since version 2.0.0: Use get_details_batch() instead.

Parameters:csids (list[string|int]) – ChemSpider IDs.
get_extended_mol_compound_info_list(csids, mol_type='2d', include_reference_counts=False, include_external_references=False)[source]

Get extended record details (including MOL) for a list of CSIDs.

A maximum of 250 CSIDs can be fetched per request.

Deprecated since version 2.0.0: Use get_details_batch() instead.

Parameters:
  • csids (list[string|int]) – ChemSpider IDs.
  • mol_type (string) – MOL2D, MOL3D or BOTH.
  • include_reference_counts (bool) – Whether to include reference counts.
  • include_external_references (bool) – Whether to include external references.
get_record_mol(csid, calc3d=False)[source]

Get ChemSpider record in MOL format.

Deprecated since version 2.0.0: Use get_mol() instead.

Parameters:
  • csid (string|int) – ChemSpider ID.
  • calc3d (bool) – Whether 3D coordinates should be calculated before returning record data.

async_simple_search(query)[source]

Search ChemSpider with arbitrary query, returning results in order of the best match found.

This method returns a transaction ID which can be used with other methods to get search status and results.

Deprecated since version 2.0.0: Use filter_name() instead.

Parameters:query (string) – Search query - a name, SMILES, InChI, InChIKey, CSID, etc.
Returns:Transaction ID.
Return type:string
async_simple_search_ordered(query, order='csid', direction='ascending')[source]

Search ChemSpider with arbitrary query, returning results with a custom order.

This method returns a transaction ID which can be used with other methods to get search status and results.

Deprecated since version 2.0.0: Use filter_name() instead.

Parameters:
  • query (string) – Search query - a name, SMILES, InChI, InChIKey, CSID, etc.
  • order (string) – (Optional) Field to sort the result by.
  • direction (string) – (Optional) ASCENDING or DESCENDING.
Returns:

Transaction ID.

Return type:

string

get_async_search_status(rid)[source]

Check the status of an asynchronous search operation.

Deprecated since version 2.0.0: Use filter_status() instead.

Parameters:rid (string) – A transaction ID, returned by an asynchronous search method.
Returns:Unknown, Created, Scheduled, Processing, Suspended, PartialResultReady, ResultReady, Failed, TooManyRecords
Return type:string
get_async_search_status_and_count(rid)[source]

Check the status of an asynchronous search operation. If ready, a count and message are also returned.

Deprecated since version 2.0.0: Use filter_status() instead.

Parameters:rid (string) – A transaction ID, returned by an asynchronous search method.
Return type:dict
get_async_search_result(rid)[source]

Get the results from an asynchronous search operation.

Deprecated since version 2.0.0: Use filter_results() instead.

Parameters:rid (string) – A transaction ID, returned by an asynchronous search method.
Returns:A list of Compounds.
Return type:list[Compound]
get_async_search_result_part(rid, start=0, count=-1)[source]

Get a slice of the results from an asynchronous search operation.

Deprecated since version 2.0.0: Use filter_results() instead.

Parameters:
  • rid (string) – A transaction ID, returned by an asynchronous search method.
  • start (int) – The number of results to skip.
  • count (int) – The number of results to return. -1 returns all through to end.
Returns:

A list of Compounds.

Return type:

list[Compound]

get_compound_info(csid)[source]

Get SMILES, StdInChI and StdInChIKey for a given CSID.

Deprecated since version 2.0.0: Use get_details() instead.

Parameters:csid (string|int) – ChemSpider ID.
Return type:dict
get_compound_thumbnail(csid)[source]

Get PNG image as binary data.

Deprecated since version 2.0.0: Use get_image() instead.

Parameters:csid (string|int) – ChemSpider ID.
Return type:bytes

simple_search(query)[source]

Search ChemSpider with arbitrary query.

Deprecated since version 2.0.0: Use search() instead.

Parameters:query (string) – Search query - a chemical name.
Returns:Search Results list.
Return type:Results
chemspipy.api.ASCENDING = 'ascending'

Ascending sort direction

chemspipy.api.DESCENDING = 'descending'

Descending sort direction

chemspipy.api.RECORD_ID = 'record_id'

Record ID sort order

chemspipy.api.MASS_DEFECT = 'mass_defect'

Mass defect sort order

chemspipy.api.MOLECULAR_WEIGHT = 'molecular_weight'

Molecular weight sort order

chemspipy.api.REFERENCE_COUNT = 'reference_count'

Reference count sort order

chemspipy.api.DATASOURCE_COUNT = 'datasource_count'

Datasource count sort order

chemspipy.api.PUBMED_COUNT = 'pubmed_count'

PubMed count sort order

chemspipy.api.RSC_COUNT = 'rsc_count'

RSC count sort order

chemspipy.api.ORDERS = {'csid': 'recordId', 'datasource_count': 'dataSourceCount', 'mass_defect': 'massDefect', 'molecular_weight': 'molecularWeight', 'pubmed_count': 'pubMedCount', 'record_id': 'recordId', 'reference_count': 'referenceCount', 'rsc_count': 'rscCount'}

Map sort orders to strings required by REST API.
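
To illustrate how this mapping might be used, the sketch below copies the ORDERS dict from above and looks up the REST string for a sort order. The helper resolve_order is hypothetical; how the library itself consults ORDERS is not shown on this page.

```python
# Copied from the documented value of chemspipy.api.ORDERS.
ORDERS = {
    'csid': 'recordId',
    'datasource_count': 'dataSourceCount',
    'mass_defect': 'massDefect',
    'molecular_weight': 'molecularWeight',
    'pubmed_count': 'pubMedCount',
    'record_id': 'recordId',
    'reference_count': 'referenceCount',
    'rsc_count': 'rscCount',
}


def resolve_order(order):
    """Return the REST API string for a sort order, or raise ValueError."""
    try:
        return ORDERS[order]
    except KeyError:
        accepted = ", ".join(sorted(ORDERS))
        raise ValueError("unknown order %r; accepted: %s" % (order, accepted))


print(resolve_order('molecular_weight'))
```

Note that 'csid' and 'record_id' both map to the same REST string.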

chemspipy.api.DIRECTIONS = {'ascending': 'ascending', 'descending': 'descending'}

Map sort directions to strings required by REST API.

chemspipy.api.FIELDS = ['SMILES', 'Formula', 'AverageMass', 'MolecularWeight', 'MonoisotopicMass', 'NominalMass', 'CommonName', 'ReferenceCount', 'DataSourceCount', 'PubMedCount', 'RSCCount', 'Mol2D', 'Mol3D']

All available compound details fields.
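
A requested subset of fields can be checked against this list before it is passed to get_details or get_details_batch. A minimal sketch, with FIELDS copied from above and select_fields as a hypothetical helper:

```python
# Copied from the documented value of chemspipy.api.FIELDS.
FIELDS = ['SMILES', 'Formula', 'AverageMass', 'MolecularWeight',
          'MonoisotopicMass', 'NominalMass', 'CommonName', 'ReferenceCount',
          'DataSourceCount', 'PubMedCount', 'RSCCount', 'Mol2D', 'Mol3D']


def select_fields(wanted):
    """Return the wanted fields in FIELDS order, rejecting unknown names."""
    unknown = [name for name in wanted if name not in FIELDS]
    if unknown:
        raise ValueError("unknown fields: %s" % ", ".join(unknown))
    return [name for name in FIELDS if name in wanted]


print(select_fields(['Formula', 'SMILES']))
```
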

chemspipy.objects

Objects returned by ChemSpiPy API methods.

class chemspipy.objects.Compound(cs, record_id)[source]

A class for retrieving and caching details about a specific ChemSpider record.

The purpose of this class is to provide access to various parts of the ChemSpider API that return information about a compound given its ChemSpider ID. Information is loaded lazily when requested, and cached for future access.

Parameters:
  • cs (ChemSpider) – ChemSpider session.
  • record_id (int|string) – Compound record ID.
record_id

Compound record ID.

Return type:int
csid

ChemSpider ID.

Deprecated since version 2.0.0: Use record_id instead.

Return type:int
image_url

Return the URL of a PNG image of the 2D chemical structure.

Return type:string
molecular_formula

Return the molecular formula for this Compound.

Return type:string
smiles

Return the SMILES for this Compound.

Return type:string
stdinchi

Return the Standard InChI for this Compound.

Deprecated since version 2.0.0: Use inchi instead.

Return type:string
stdinchikey

Return the Standard InChIKey for this Compound.

Deprecated since version 2.0.0: Use inchikey instead.

Return type:string
inchi

Return the InChI for this Compound.

Return type:string
inchikey

Return the InChIKey for this Compound.

Return type:string
average_mass

Return the average mass of this Compound.

Return type:float
molecular_weight

Return the molecular weight of this Compound.

Return type:float
monoisotopic_mass

Return the monoisotopic mass of this Compound.

Return type:float
nominal_mass

Return the nominal mass of this Compound.

Return type:float
common_name

Return the common name for this Compound.

Return type:string
mol_2d

Return the MOL file for this Compound with 2D coordinates.

Return type:string
mol_3d

Return the MOL file for this Compound with 3D coordinates.

Return type:string
image

Return a 2D depiction of this Compound.

Return type:bytes
external_references

Return external references for this Compound.

Return type:list[dict]
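
The lazy loading and caching described for Compound can be pictured with a small stand-in class. This is a sketch of the general pattern only, not the library's implementation; LazyRecord and its fetch_details callable are hypothetical, and the 'SMILES' and 'Formula' keys are borrowed from FIELDS.

```python
class LazyRecord:
    """Fetch details on first access and reuse them afterwards."""

    def __init__(self, record_id, fetch_details):
        self.record_id = int(record_id)
        self._fetch_details = fetch_details  # callable: record_id -> dict
        self._details = None

    def _load(self):
        # Only call the fetcher the first time any detail is requested.
        if self._details is None:
            self._details = self._fetch_details(self.record_id)
        return self._details

    @property
    def smiles(self):
        return self._load()['SMILES']

    @property
    def molecular_formula(self):
        return self._load()['Formula']
```

Reading smiles and then molecular_formula on the same object triggers a single call to fetch_details; the second property is served from the cached dict.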

chemspipy.errors

Exceptions raised by ChemSpiPy.

exception chemspipy.errors.ChemSpiPyError[source]

Root ChemSpiPy Exception.

exception chemspipy.errors.ChemSpiPyHTTPError(message=None, http_code=None, *args, **kwargs)[source]

Base exception to handle HTTP errors.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
MESSAGE = 'ChemSpiPy Error'

Default message if none supplied. Override in subclasses.

exception chemspipy.errors.ChemSpiPyBadRequestError(message=None, http_code=None, *args, **kwargs)[source]

Raised for a bad request.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyAuthError(message=None, http_code=None, *args, **kwargs)[source]

Raised when API key authorization fails.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyNotFoundError(message=None, http_code=None, *args, **kwargs)[source]

Raised when the requested resource was not found.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyMethodError(message=None, http_code=None, *args, **kwargs)[source]

Raised when an invalid HTTP method is used.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyPayloadError(message=None, http_code=None, *args, **kwargs)[source]

Raised when a request payload is too large.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyRateError(message=None, http_code=None, *args, **kwargs)[source]

Raised when too many requests are sent in a given amount of time.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyServerError(message=None, http_code=None, *args, **kwargs)[source]

Raised when an internal server error occurs.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyUnavailableError(message=None, http_code=None, *args, **kwargs)[source]

Raised when the service is temporarily unavailable.

Parameters:
  • message (string|bytes) – Error message.
  • http_code – HTTP code.
exception chemspipy.errors.ChemSpiPyTimeoutError[source]

Raised when an asynchronous request times out.
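
The descriptions above line up with standard HTTP status codes, so one way to picture the hierarchy is as a lookup from status code to exception name. The pairing below is inferred from those descriptions and from standard HTTP semantics; it is an assumption, not something this page states, and error_name_for is a hypothetical helper.

```python
# Assumed pairing of HTTP status codes with the documented exception names.
STATUS_TO_ERROR_NAME = {
    400: 'ChemSpiPyBadRequestError',   # bad request
    401: 'ChemSpiPyAuthError',         # API key authorization failed
    404: 'ChemSpiPyNotFoundError',     # resource not found
    405: 'ChemSpiPyMethodError',       # invalid HTTP method
    413: 'ChemSpiPyPayloadError',      # payload too large
    429: 'ChemSpiPyRateError',         # too many requests
    500: 'ChemSpiPyServerError',       # internal server error
    503: 'ChemSpiPyUnavailableError',  # service temporarily unavailable
}


def error_name_for(http_code):
    """Return the likely exception name, falling back to the HTTP base class."""
    return STATUS_TO_ERROR_NAME.get(http_code, 'ChemSpiPyHTTPError')


print(error_name_for(429))
```

Catching ChemSpiPyHTTPError covers all of the HTTP-related subclasses, while ChemSpiPyError also covers ChemSpiPyTimeoutError.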