Saphetor Variant API

Introduction

The Saphetor Variant API allows developers to retrieve information from over 30 genomic databases in a single call. The data is returned as JSON, which is accessible from any language or platform. For example, it can be converted to native Python objects if required for processing.

This is the same data as is visualised in the VarSome genomic search engine, and leverages Saphetor's proprietary high-performance genomic database.

Batch requests are also available, allowing data for up to 1000 variants to be efficiently retrieved in a single call.

The Saphetor Variant API is free to use for registered users. Please ensure you have registered as a VarSome user, then contact us with your user login in order to receive an authentication token to use the API.

This API is currently experimental; your feedback is much appreciated.

Example Queries

Here are a few simple example queries:

All the annotation data is returned as a single dictionary, keyed by each source institution & database name. Example keys are "icgc_somatic", "iarc_tp53_germline" and "ncbi_clinvar".
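
As a rough illustration of working with that dictionary in Python, the sketch below decodes a response body and reports which of the requested sources are present. The sample body is hypothetical and heavily abbreviated: only the key names come from the text above, and the nested values are placeholders rather than the real response format.

```python
import json

# Hypothetical, abbreviated response body. Only the key names are taken from
# the documentation; the nested values are placeholders.
SAMPLE_BODY = '{"ncbi_clinvar": {"placeholder": 1}, "icgc_somatic": {"placeholder": 2}}'


def available_sources(body, wanted):
    """Return the subset of `wanted` source keys present in a JSON response body."""
    annotation = json.loads(body)  # JSON object -> native Python dict
    return [key for key in wanted if key in annotation]


if __name__ == "__main__":
    keys = ["icgc_somatic", "iarc_tp53_germline", "ncbi_clinvar"]
    print(available_sources(SAMPLE_BODY, keys))
```

Checking for a key before reading it is a cautious default here, since not every source necessarily has data for every variant.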

Variant endpoints

Retrieve variant-related data.

Schema Lookup request

[GET] [https://api.varsome.com/lookup/schema]

Retrieves the schema of a variant response object, containing relevant information for each field included in the variant lookup response.

Example

https://api.varsome.com/lookup/schema

Variant lookup

[GET] [https://api.varsome.com/lookup/{query}/{ref_genome}{?add-all-data=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases=all}]

The query parameter can be any of the following: HGVS Protein-level variant, HGVS DNA-level variant, rs_id, 4-part genomic variant specification and variant_id.

The response to all these types of queries has the same format and can be either an Array of variant response objects, or a single variant response object.

Parameters

  • query - A String representation of the variant to query. Can be any of the following:

    • HGVS Protein-level variant - Gene/transcript name followed by HGVS protein-level notation. Examples: BRAF:V600E, NM_001252678:I182T
    • HGVS DNA-level variant - Gene/transcript name followed by HGVS DNA-level notation. Example: FTO:c.46-43098T>C
    • rs_id - the dbSNP accession number. String "rs" followed by one or more digits. Example: rs113488022
    • 4-part genomic variant specification - chromosome:position:reference_allele:alternate_allele or chromosome:position:reference_length:alternate_allele. The separator may be ':' or '-', the chromosome number is optionally preceded by the string "chr", and position is the 1-based chromosomal position.
    • variant_id - our 20-digit integer value (example: '10190150730273780002'). If you call "region_variants" below, it is faster to then obtain variant data using the variant_ids returned.
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
  • add-ACMG-annotation (optional) - Can be 0 or 1. Include ACMG classification (only available in specific API plans). Requires that either the add-all-data parameter is set to 1 or that at least one of the ClinVar and UniProt source databases is enabled with the add-source-databases setting.
  • minimum-clinvar-stars (optional) - Can be 0 to 4. Define the minimum ClinVar rating to take into account when calculating the ACMG Verdict.
  • add-varsome-user-entries (optional) - Can be 0 or 1. Include VarSome's user submitted publications in the response
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
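
The query notations listed above can be told apart on the client side before a request is sent. The following is a minimal sketch of such a check, not the parser the API itself uses. The function name `classify_query` is hypothetical, and the accepted chromosome labels and allele letters in the 4-part pattern are assumptions.

```python
import re

RS_ID = re.compile(r"^rs\d+$")        # "rs" followed by one or more digits
VARIANT_ID = re.compile(r"^\d{20}$")  # 20-digit integer identifier
# chromosome, position, reference allele or length, alternate allele,
# separated by ":" or "-". The accepted chromosome and allele symbols are
# assumptions, not taken from the API documentation.
FOUR_PART = re.compile(
    r"^(?:chr)?(?P<chromosome>\d{1,2}|X|Y|M|MT)[:-](?P<position>\d+)"
    r"[:-](?P<ref>[ACGTN]*|\d+)[:-](?P<alt>[ACGTN]*)$",
    re.IGNORECASE,
)


def classify_query(query):
    """Guess which documented notation a query string uses."""
    if RS_ID.match(query):
        return "rs_id"
    if VARIANT_ID.match(query):
        return "variant_id"
    if FOUR_PART.match(query):
        return "genomic"
    if ":" in query:
        return "hgvs"  # gene or transcript name, then HGVS notation
    return "unknown"


if __name__ == "__main__":
    for example in ("rs113488022", "chr22:39777823::CAA", "BRAF:V600E"):
        print(example, "->", classify_query(example))
```

A check like this only catches obviously malformed input early; the server remains the authority on what it accepts.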

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
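
A variant lookup could be prepared in Python roughly as follows. This is a sketch using only the standard library: the helper name `build_lookup_request` is hypothetical, and percent-encoding the query (for example the ">" in HGVS DNA-level notation) is an assumption about what the server accepts. The request is built but not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://api.varsome.com"


def build_lookup_request(query, ref_genome="hg19", token=None, **options):
    """Build (but do not send) a GET request for the variant lookup endpoint."""
    # Percent-encode characters such as ">" in HGVS notation; keep ":" readable.
    url = f"{BASE_URL}/lookup/{quote(query, safe=':')}/{ref_genome}"
    if options:
        # Python keyword names use "_"; the API parameters use "-".
        params = {name.replace("_", "-"): value for name, value in options.items()}
        url += "?" + urlencode(params)
    # The token is optional for this endpoint.
    headers = {"Authorization": f"Token {token}"} if token else {}
    return Request(url, headers=headers)


if __name__ == "__main__":
    request = build_lookup_request("BRAF:V600E", "hg19", add_all_data=1)
    print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` and decoding the body with `json.load` would perform the call.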

Batch Lookup for many variants

[POST] [https://api.varsome.com/lookup/batch/{ref_genome}{?add-all-data=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases=all}]

Retrieve variant data for multiple variants, which are passed in the POST request payload, based on a reference genome id. This is currently limited to 1000 variants per request.

Parameters

  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
  • add-varsome-user-entries (optional) - Can be 0 or 1. Include VarSome's user submitted publications in the response
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (required) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Request body

  • variants (array) - an Array of strings containing any of the supported variant lookup notations as shown above. Example: `{"variants": ["rs113", "chr22:39777823::CAA"]}`
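
Because of the 1000-variant limit, a longer list has to be split across several requests. The sketch below shows one way to do that with the standard library. The helper name `build_batch_requests` is hypothetical, and the `Content-Type` header is an assumption, since the text only specifies the shape of the JSON body. Requests are built but not sent.

```python
import json
from urllib.request import Request

BATCH_LIMIT = 1000  # documented maximum number of variants per request


def build_batch_requests(variants, token, ref_genome="hg19"):
    """Yield one POST request per chunk of at most BATCH_LIMIT variants."""
    url = f"https://api.varsome.com/lookup/batch/{ref_genome}"
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",  # assumed; not stated in the text
    }
    for start in range(0, len(variants), BATCH_LIMIT):
        chunk = variants[start:start + BATCH_LIMIT]
        body = json.dumps({"variants": chunk}).encode("utf-8")
        yield Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    demo = ["rs113", "chr22:39777823::CAA"]
    for request in build_batch_requests(demo, token="<your_token>"):
        print(request.get_method(), request.full_url, request.data.decode())
```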

Get variants in a genomic region

[GET] [https://api.varsome.com/region_variants/{ref_genome}/{chromosome_id}/{position}/{length}{?add-all-data=1&add-source-databases=all}]

Retrieve all known variants inside the genomic region described, using the ref_genome, chromosome_id, position and length.

Parameters

  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • chromosome_id - A number representing the chromosome: 1-22, 23 for X and 24 for Y (example: `1`)
  • position - the 1-based chromosomal position of the start of the region.
  • length - the length of the region in base pairs
  • add-all-data (optional) - Can be 0 or 1. Include all databases in the response. Same as add-source-databases=all
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Example
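
The region URL can be assembled as in the sketch below, which also maps the labels X and Y to the numeric ids 23 and 24 described in the parameter list. The helper name `region_variants_url` and its acceptance of a "chr" prefix are conveniences assumed for illustration, not part of the API.

```python
from urllib.parse import urlencode

SEX_CHROMOSOME_IDS = {"X": 23, "Y": 24}  # numbering given in the parameter list


def region_variants_url(chromosome, position, length, ref_genome="hg19", **options):
    """Build the region_variants URL; accepts 1-22, "X" or "Y", with or without "chr"."""
    label = str(chromosome).upper()
    if label.startswith("CHR"):
        label = label[3:]
    chromosome_id = SEX_CHROMOSOME_IDS.get(label) or int(label)
    if not 1 <= chromosome_id <= 24:
        raise ValueError(f"unsupported chromosome: {chromosome!r}")
    url = (
        f"https://api.varsome.com/region_variants/"
        f"{ref_genome}/{chromosome_id}/{position}/{length}"
    )
    if options:
        # Python keyword names use "_"; the API parameters use "-".
        url += "?" + urlencode({k.replace("_", "-"): v for k, v in options.items()})
    return url


if __name__ == "__main__":
    print(region_variants_url("chrX", 100000, 500))
```

The position and length in the usage line are arbitrary illustrative values.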

Gene endpoints

Retrieve gene-related data.

Gene response schema

[GET] [https://api.varsome.com/lookup/schema/genes]

Retrieves the schema of a gene response object, containing relevant information for each field included in the gene lookup response.

Example

https://api.varsome.com/lookup/schema/genes

Gene lookup based on gene symbol

[GET] [https://api.varsome.com/lookup/gene/{gene_symbol}/{ref_genome}{?add-all-data=1&add-source-databases=all}]

Retrieve gene data for the given gene symbol, based on a reference genome id.

Parameters

  • gene_symbol - The gene's symbol
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
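
A gene lookup URL might be built as in this short sketch. The function and argument names are hypothetical; only the path and the parameter names come from the description above.

```python
from urllib.parse import quote, urlencode


def gene_lookup_url(gene_symbol, ref_genome="hg19", add_all_data=False, expand_pubmed=False):
    """Build the gene lookup URL, adding only the flags that are switched on."""
    params = {}
    if add_all_data:
        params["add-all-data"] = 1
    if expand_pubmed:
        params["expand-pubmed-articles"] = 1
    url = f"https://api.varsome.com/lookup/gene/{quote(gene_symbol, safe='')}/{ref_genome}"
    return url + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    print(gene_lookup_url("BRAF", "hg38", expand_pubmed=True))
```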

Batch Lookup for many genes

[POST] [https://api.varsome.com/lookup/genes/batch{?add-all-data=1&expand-pubmed-articles=1&add-source-databases=all}]

Retrieve gene data for multiple genes, which are passed in the POST request payload, based on a reference genome id. This is currently limited to 1000 genes per request.

Parameters

  • add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (required) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Request body

  • genes (array) - an Array of strings containing the gene symbols to look up. Example: `{"genes": ["BRAF", "TP53"]}`
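
The gene batch payload can be prepared as in the sketch below, which also drops repeated symbols so that each gene is requested once. The helper name `build_gene_batch_request` is hypothetical, the de-duplication is a convenience rather than an API requirement, and the `Content-Type` header is an assumption. The request is built but not sent.

```python
import json
from urllib.request import Request


def build_gene_batch_request(genes, token):
    """Build (but do not send) a POST request for the gene batch endpoint."""
    unique_genes = list(dict.fromkeys(genes))  # remove repeats, keep first-seen order
    body = json.dumps({"genes": unique_genes}).encode("utf-8")
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",  # assumed; not stated in the text
    }
    return Request(
        "https://api.varsome.com/lookup/genes/batch",
        data=body,
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    request = build_gene_batch_request(["BRAF", "TP53", "BRAF"], token="<your_token>")
    print(request.data.decode())
```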

Transcript endpoints

Retrieve transcript-related data.

Transcript lookup based on transcript name

[GET] [https://api.varsome.com/lookup/transcript/{transcript_name}/{ref_genome}]

Retrieve transcript data for the given transcript name, based on a reference genome id.

Parameters

  • transcript_name - The transcript name
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
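
The following sketch shows a full round trip for a transcript lookup: build the URL, send the request, and decode the JSON body into Python objects. The function name `transcript_lookup` and its `opener` argument are hypothetical; `opener` exists so the function can be exercised without network access. The demonstration uses a stub whose response body is a placeholder, not real API output.

```python
import io
import json
from urllib.request import Request, urlopen


def transcript_lookup(transcript_name, ref_genome="hg19", token=None, opener=urlopen):
    """Fetch transcript data and decode the JSON body into Python objects."""
    url = f"https://api.varsome.com/lookup/transcript/{transcript_name}/{ref_genome}"
    # The token is optional for this endpoint.
    headers = {"Authorization": f"Token {token}"} if token else {}
    with opener(Request(url, headers=headers)) as response:
        return json.load(response)


if __name__ == "__main__":
    def stub_opener(request):
        # Stand-in for urlopen: prints the URL and returns a placeholder body.
        print("GET", request.full_url)
        return io.BytesIO(b'{"placeholder": true}')

    print(transcript_lookup("NM_001252678", opener=stub_opener))
```

Leaving `opener` at its default sends the request with `urllib.request.urlopen`.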