# Query Language in TuringDB

> TuringDB uses Cypher as a query language and has some unique types of queries

TuringDB supports most of the **Cypher query language**, extended with versioning, metadata search, and flexible property matching.

This guide walks through examples of the main query types, including `MATCH`, `CREATE`, property filters, and procedures, and lists the available data types.

# Basics

Queries are built by referencing nodes and edges. TuringDB follows standard Cypher syntax: nodes are written in parentheses `()` and edges in square brackets `[]`.

For example, `(n)` denotes a node named `n`, whilst `[e]` denotes an edge named `e`.

Nodes and edges can have both **label** and **property** constraints. As in standard Cypher, property constraints are specified using curly brackets `{}`, whilst labels are specified using colon `:` syntax.

An example of a node with a label constraint would be `(n:Person)`.

An example of an edge with a property constraint would be `[e {duration: 10}]`.

Property and label constraints may be combined, for instance, `(n:Person {name: 'John'})` specifies a node which both has the label `Person`, and a `name` property with the value `John`.

Nodes and edges can carry multiple label and property constraints: labels are chained with colons and properties are given as a comma-separated list, e.g. `(n:Person:Man {name: 'John', age: 20})`. Both lists can be arbitrarily long.

By convention:

* we prefer single quotes around strings (although double quotes also work)
* node labels are written in Pascal case (no spaces): e.g. `Person`, `BankAccount`, `BloodType`
* edge labels are written in upper case (spaces replaced by underscores): e.g. `TRANSACTION`, `FRIENDS_WITH`, `IS_CLIENT_OF`

### Summary:

Queries are built around **nodes** and **edges**:

* **Nodes** are written in parentheses `()` e.g. `(n)` - a node with alias `n`
* **Edges** are written in square brackets `[]` e.g. `[e]` - an edge with alias `e`

You can add:

* **Labels** with a colon `:` - e.g. `(n:Person)`
* **Property constraints** with curly braces `{}` - e.g. `[e {duration: 10}]`
* **Both** at once - e.g. `(n:Person {name: 'John'})`

Multiple labels and properties can be specified:

```jsx theme={null}
(n:Person:Man {name: 'John', age: 20})
```
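
To show how these pieces combine, here is a minimal Python sketch that assembles a node pattern string from an alias, a list of labels, and a property map. The helper names `cypher_literal` and `node_pattern` are hypothetical and not part of any TuringDB SDK. The sketch only handles string, boolean, and numeric values, and it rejects strings containing single quotes or backslashes rather than guessing at escaping rules.

```python
def cypher_literal(value):
    """Render a Python value as a query literal (hypothetical helper)."""
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        # Escaping rules are not covered here, so refuse risky characters.
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not handled in this sketch")
        return f"'{value}'"
    if isinstance(value, (int, float)):
        return str(value)
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def node_pattern(alias="", labels=(), properties=None):
    """Build a node pattern such as (n:Person {name: 'John'})."""
    # Labels are chained with colons directly after the alias.
    head = alias + "".join(f":{label}" for label in labels)
    if not properties:
        return f"({head})"
    # Properties are rendered as a comma-separated key: value list.
    body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in properties.items())
    # Separate the head from the property map only when a head exists.
    sep = " " if head else ""
    return f"({head}{sep}{{{body}}})"


print(node_pattern("n", ["Person", "Man"], {"name": "John", "age": 20}))
# (n:Person:Man {name: 'John', age: 20})
```

Building the string in one place keeps the label and property conventions consistent across queries.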

# Queries

Queries are built up of combinations of nodes and edges, as specified above.

## `MATCH` queries

`MATCH` queries are used to retrieve nodes, edges, and their property values from the database via specification of relationships and properties.

It is easiest to demonstrate the syntax of a `MATCH` query with a concrete example:

```jsx theme={null}
MATCH (n)-[e]->(m) RETURN n, e, m
MATCH (m)<-[e]-(n) RETURN n, e, m  // also works
```

The above query matches every directed edge `e` in the graph together with its source node `n` and target node `m`. Returning a bare variable such as `n` yields that node's internal ID rather than its properties.

<Tip>
  When using `RETURN n`, other implementations of Cypher may return all properties of `n`, whilst TuringDB only returns the internal ID of `n`.
</Tip>

`MATCH` queries are flexible: they can contain a single node and no edges, or any number of node-edge pairs. For example:

```jsx theme={null}
MATCH (n) RETURN n
```

will match all nodes in the database. An example of a multi-hop `MATCH` query would be:

```jsx theme={null}
MATCH (n)-[e]->(m)-[f]->(p) RETURN n, m, p
```

<Card horizontal>
  Queries use *variables*, such as `n`, `m`, and `e` in the examples above. A variable names a node or an edge so that it can be referenced in the `RETURN` clause. If you do not need to return an edge or its properties, the edge can be left unnamed. For example:

  ```jsx theme={null}
  MATCH (n)-->(m) RETURN m
  ```

  Both nodes and edges can omit the variable name if they specify at least a label constraint:

  ```jsx theme={null}
  MATCH (:Person)-[:FRIENDS_WITH]->(m) RETURN m
  ```
</Card>

You can also return multiple properties using a comma-separated list:

```jsx theme={null}
MATCH (n:Person)-->(m:Person) RETURN n.name, m.age
```

Combining the syntax of `MATCH` queries, and the ability to specify constraints, here are a few examples of some syntactically correct TuringDB `MATCH` queries:

* `MATCH (n:Person) RETURN n.name`
* `MATCH (:Person)-->(n:Person) RETURN n.name`
* `MATCH (n:Person:Woman:SoftwareEngineer) RETURN n.name`
* `MATCH (n:Person:Woman:SoftwareEngineer)-->(m)-[e]->(p:Man)-[f]->(q) RETURN e, f`

<Warning>
  Note that whilst all of the above queries are *syntactically* valid, a query will fail to execute if it uses a node property that does not exist in the graph.
</Warning>

## `CREATE` queries

`CREATE` queries follow exactly the same syntax as `MATCH` queries when it comes to specifying nodes, edges, and property/label constraints. `CREATE` queries may have `RETURN` clauses, but they do not need them. There is also the additional requirement that all nodes and edges *must have at least one label*. This means a query such as

```jsx theme={null}
CREATE (n)
```

is not valid, whilst

```jsx theme={null}
CREATE (n:Person)
```

is a valid query.

There is no requirement to declare any names for any variables. This means we can have queries such as

```jsx theme={null}
CREATE (:Person)-[:FRIENDS_WITH]->(:Person)
```

However, naming variables is useful when you want to create multiple edges to or from a given node. For instance, a triangle pattern can be created as follows:

```jsx theme={null}
CREATE (a:Corner)-[:EDGE]->(b:Corner), (b)-[:EDGE]->(c:Corner), (c)-[:EDGE]->(a)
```
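
The same idea extends to longer cycles: each node is declared with its label the first time it appears and referenced by its bare variable afterwards. The Python sketch below generates such a query string. The function name `cycle_create_query` is hypothetical, and the sketch assumes that longer comma-separated `CREATE` patterns behave like the three-node example above.

```python
def cycle_create_query(names, node_label="Corner", edge_label="EDGE"):
    """Build a CREATE query linking the named nodes into a directed cycle."""
    if len(names) < 2 or len(set(names)) != len(names):
        raise ValueError("need at least two distinct variable names")

    declared = set()

    def node(name):
        # Declare the label on first use, then refer to the bare variable.
        if name in declared:
            return f"({name})"
        declared.add(name)
        return f"({name}:{node_label})"

    patterns = []
    for i, src in enumerate(names):
        dst = names[(i + 1) % len(names)]  # wrap around to close the cycle
        left = node(src)
        right = node(dst)
        patterns.append(f"{left}-[:{edge_label}]->{right}")
    return "CREATE " + ", ".join(patterns)


print(cycle_create_query(["a", "b", "c"]))
# CREATE (a:Corner)-[:EDGE]->(b:Corner), (b)-[:EDGE]->(c:Corner), (c)-[:EDGE]->(a)
```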

## `MATCH ... CREATE ...` and `MATCH ... CREATE ... RETURN ...` queries

`MATCH`, `CREATE` and `RETURN` statements can be used together in queries.

For example, to create an edge between two existing nodes, the following queries can be used:

```jsx theme={null}
// Create two Person nodes
CREATE (:Person {name: 'Alice', age: 24})
CREATE (:Person {name: 'John', age: 27})

// Match two nodes to create the edge between them
MATCH (n:Person {name: 'Alice'}), (m:Person {name: 'John'})
CREATE (n)-[:FRIENDS_WITH]->(m)

// RETURN clause can also be added
MATCH (n:Person {name: 'Alice'}), (m:Person {name: 'John'})
CREATE (n)-[:FRIENDS_WITH]->(m)
RETURN n.name, n.age, m.name, m.age
```

## `WHERE` queries

The `WHERE` clause filters results by node or edge labels and by node or edge properties.

To filter on node label:

```jsx theme={null}
MATCH (n)
WHERE n:Person
RETURN n, n.age
```

To filter on node property:

```jsx theme={null}
MATCH (n)
WHERE n.name = 'Alice'
RETURN n, n.age
```

To filter on edge label:

```jsx theme={null}
MATCH (n)-[e]->(m)
WHERE e:PLAYPOKER
RETURN n.name, m.age
```

To filter on edge property:

```jsx theme={null}
MATCH (n)-[e]->(m:Person)
WHERE n.name = 'Gabby'
AND e.play_poker = true
RETURN n.name, m.age
```

## Multi-pattern queries (Joins and Cartesian Products)

TuringDB supports matching multiple patterns in a single query by separating them with commas. Depending on whether the patterns share variables, this results in either a **join** or a **cartesian product**.

### Cartesian Product

When patterns in a `MATCH` clause are separated by commas and do **not** share any variables, TuringDB computes a cartesian product of the results. Every row from the first pattern is combined with every row from the second.

```cypher theme={null}
MATCH (a:Person), (b:Interest)
RETURN a.name, b.name
```

This returns all combinations of Person nodes with Interest nodes. If there are 6 persons and 5 interests, the result contains 30 rows.

You can use `WHERE` to filter the cartesian product:

```cypher theme={null}
MATCH (p:Person), (i:Interest)
WHERE p.isFrench = true AND i.name = 'MegaHub'
RETURN p.name, i.name
```

Three or more patterns can be combined:

```cypher theme={null}
MATCH (p:Person), (i:Interest), (c:Category)
WHERE p.name = 'A' AND i.name = 'Shared' AND c.name = 'Cat1'
RETURN p.name, i.name, c.name
```

<Warning>
  Cartesian products can produce very large result sets. A product of N nodes by M nodes produces N × M rows. Use `WHERE` filters or `LIMIT` to keep result sizes manageable.
</Warning>
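
As a rough model of the row counts involved, the Python sketch below builds the cartesian product of two small in-memory lists with `itertools.product` and then applies a filter, mirroring the `MATCH (a:Person), (b:Interest)` examples above. The node names are made up for illustration, and the code only models the semantics; it does not query TuringDB.

```python
from itertools import product

# Hypothetical stand-ins for 6 Person nodes and 5 Interest nodes.
persons = [f"Person{i}" for i in range(1, 7)]
interests = [f"Interest{j}" for j in range(1, 6)]

# Every person is paired with every interest: N x M rows.
rows = list(product(persons, interests))
print(len(rows))  # 30

# A WHERE-style filter keeps only the combinations that satisfy both conditions.
filtered = [(p, i) for p, i in rows if p == "Person1" and i == "Interest3"]
print(filtered)  # [('Person1', 'Interest3')]
```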

### Pattern-based Joins

When two paths converge on a shared node variable, TuringDB performs a **hash join** on that variable. This is useful for finding entities that share a common connection.

```cypher theme={null}
MATCH (a:Person)-->(b:Interest)<--(c:Person)
WHERE a.name <> c.name
RETURN a.name, b.name, c.name
```

This finds pairs of different persons who share a common interest. The variable `b` acts as the join point.
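
The general hash-join technique can be sketched in Python as a build phase followed by a probe phase. This is a conceptual model on a hypothetical list of (person, interest) edges, not TuringDB's actual implementation; the function name `shared_interest_pairs` is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical (person, interest) edges.
edges = [
    ("Alice", "Chess"),
    ("Bob", "Chess"),
    ("Carol", "Hiking"),
    ("Alice", "Hiking"),
]


def shared_interest_pairs(edges):
    """Return (a, b, c) rows where persons a and c both point to interest b."""
    # Build phase: hash every edge by the join key (the shared interest).
    persons_by_interest = defaultdict(list)
    for person, interest in edges:
        persons_by_interest[interest].append(person)

    # Probe phase: look up each edge's interest and pair it with the other persons.
    rows = []
    for a, interest in edges:
        for c in persons_by_interest[interest]:
            if a != c:  # same role as WHERE a.name <> c.name
                rows.append((a, interest, c))
    return rows


for row in shared_interest_pairs(edges):
    print(row)
```

Each pair appears twice, once in each order, because the inequality filter removes only self-pairs.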

Multi-hop joins are also supported:

```cypher theme={null}
MATCH (a:Person)-->(i:Interest)-->(c:Category)
RETURN a.name, i.name, c.name
```

You can specify edge types explicitly:

```cypher theme={null}
MATCH (a:Person)-[:INTERESTED_IN]->(b:Interest)<-[:INTERESTED_IN]-(c:Person)
WHERE a.name <> c.name
RETURN a.name, b.name, c.name
```

### Mixing Paths and Cartesian Products

Connected paths and independent patterns can be combined in the same query. Shared variables create joins, while unrelated patterns create cartesian products.

```cypher theme={null}
MATCH (a:Person)-->(i:Interest), (c:Category)
WHERE c.name = 'Cat1'
RETURN a.name, i.name, c.name
```

This returns every Person→Interest connection combined with the Cat1 category node.

Two independent paths can also be combined:

```cypher theme={null}
MATCH (a:Person)-->(i1:Interest), (b:Person)-->(i2:Interest)
WHERE a.name = 'Alice' AND b.name = 'Bob' AND i1.name <> i2.name
RETURN a.name, i1.name, b.name, i2.name
```

This finds all combinations of Alice's interests with Bob's interests, excluding pairs where both interests are the same.

## `LIMIT` keyword

**LIMIT** restricts the number of returned results. The following query returns only the first 10 results:

```cypher theme={null}
MATCH (n)
RETURN n
LIMIT 10
```

## `SKIP` keyword

**SKIP** skips the first N results before returning the rest. The following query skips the first 10 results:

```cypher theme={null}
MATCH (n)
RETURN n
SKIP 10
```

`SKIP` and `LIMIT` can be combined for pagination. The following query will return the 11th to 20th results:

```cypher theme={null}
MATCH (n)
RETURN n
SKIP 10
LIMIT 10
```
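
The arithmetic behind pagination can be sketched in Python. The helper names `paginate` and `page_clause` are hypothetical: the first models `SKIP`/`LIMIT` as a list slice, and the second computes the clause for a 1-based page number.

```python
def paginate(rows, skip=0, limit=None):
    """Model SKIP/LIMIT on an already-computed list of result rows."""
    if skip < 0 or (limit is not None and limit < 0):
        raise ValueError("skip and limit must be non-negative")
    end = None if limit is None else skip + limit
    return rows[skip:end]


def page_clause(page, page_size):
    """Return the SKIP/LIMIT suffix for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return f"SKIP {(page - 1) * page_size} LIMIT {page_size}"


results = list(range(1, 31))  # stand-in for 30 result rows, numbered from 1
print(paginate(results, skip=10, limit=10))  # rows 11 to 20
print(page_clause(2, 10))                    # SKIP 10 LIMIT 10
```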

## Sorting with `ORDER BY`

**ORDER BY** sorts the results by one or more properties. By default, results are sorted in ascending order.

```cypher theme={null}
MATCH (n:Person)
RETURN n.name, n.age
ORDER BY n.age
```

Use `DESC` to sort in descending order:

```cypher theme={null}
MATCH (n:Person)
RETURN n.name, n.age
ORDER BY n.age DESC
```

You can sort by multiple properties. Results are sorted by the first property, then ties are broken by subsequent properties:

```cypher theme={null}
MATCH (n:Person)
RETURN n.name, n.age, n.city
ORDER BY n.city, n.age DESC
```
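
The tie-breaking rule can be modelled in Python on a hypothetical list of rows. Because Python's sort is stable, sorting by the lowest-priority key first and the highest-priority key last gives the same ordering as `ORDER BY n.city, n.age DESC`. This only illustrates the semantics and does not call TuringDB.

```python
# Hypothetical result rows.
people = [
    {"name": "Ann", "age": 30, "city": "Paris"},
    {"name": "Bo", "age": 25, "city": "London"},
    {"name": "Cy", "age": 41, "city": "Paris"},
    {"name": "Di", "age": 25, "city": "Paris"},
]


def order_by_city_then_age_desc(rows):
    """Sort by city ascending, breaking ties by age descending."""
    # Pass 1: secondary key (age, descending).
    by_age = sorted(rows, key=lambda r: r["age"], reverse=True)
    # Pass 2: primary key (city, ascending); ties keep the order from pass 1.
    return sorted(by_age, key=lambda r: r["city"])


for row in order_by_city_then_age_desc(people):
    print(row["city"], row["age"], row["name"])
```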

`ORDER BY` can be combined with `SKIP` and `LIMIT` for sorted pagination:

```cypher theme={null}
MATCH (n:Person)
RETURN n.name, n.age
ORDER BY n.age DESC
SKIP 10
LIMIT 10
```

# Data Types

TuringDB offers the following data types for node and edge properties:

* String
* Boolean
* Integer (signed)
* Double (decimal)
* Embedding (vector of floats, e.g. `(1.2, 2.0, 0.0)`)

String properties can be enclosed using double quotes (`"`), single quotes (`'`), or backticks (`` ` ``).

# Operators

## Boolean operators

The `OR` and `AND` operators are used to filter on multiple conditions:

```jsx theme={null}
MATCH (n:Person)
WHERE n.medication = "Aspirin"
OR n.medication = "Ibuprofen"
RETURN n.name
```

```jsx theme={null}
MATCH (n)
WHERE n.name = 'Matt'
AND n.age = 20
AND n.hasPhD = false
RETURN n
```

## Comparison operators

TuringDB lets you match node and edge properties exactly by writing `key: value` pairs inside the curly braces of a pattern:

```jsx theme={null}
MATCH (n {name: 'Matt', age: 20, hasPhD: false}) RETURN n
```

The equivalent query can also be written with a `WHERE` clause:

```text theme={null}
MATCH (n)
WHERE n.name = 'Matt'
AND n.age = 20
AND n.hasPhD = false
RETURN n
```
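
The equivalence of the two forms can be modelled in Python: an inline property map behaves like a conjunction of equality tests. The sketch below runs both styles over a hypothetical in-memory list of nodes; the function names are assumptions for illustration, not TuringDB APIs.

```python
# Hypothetical nodes, each represented by its property map.
nodes = [
    {"name": "Matt", "age": 20, "hasPhD": False},
    {"name": "Matt", "age": 35, "hasPhD": True},
    {"name": "Lea", "age": 20, "hasPhD": False},
    {"name": "Matt", "age": 20},  # no hasPhD property at all
]


def match_inline(nodes, constraints):
    """Model of MATCH (n {key: value, ...}): every listed pair must match."""
    return [
        n for n in nodes
        if all(key in n and n[key] == value for key, value in constraints.items())
    ]


def match_where(nodes):
    """Model of WHERE n.name = 'Matt' AND n.age = 20 AND n.hasPhD = false."""
    return [
        n for n in nodes
        if n.get("name") == "Matt" and n.get("age") == 20 and n.get("hasPhD") is False
    ]


inline_rows = match_inline(nodes, {"name": "Matt", "age": 20, "hasPhD": False})
where_rows = match_where(nodes)
print(inline_rows == where_rows)  # True
```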

Implemented comparison operators:

* Equal: `=`

  ```text theme={null}
  // Find antibodies targeting proteins in Human
  MATCH (ab:Antibody)-->(prot:Protein)
  WHERE prot.host = 'Human'
  RETURN ab.name, prot.name
  ```
* Not equal: `<>`

  ```jsx theme={null}
  // Find pairs of different antibodies that target the same protein
  // (two patterns joined on the shared variable prot)
  MATCH (ab1:Antibody)-->(prot:Protein), (ab2:Antibody)-->(prot:Protein)
  WHERE ab1.name <> ab2.name
  RETURN ab1.name, ab2.name, prot.name, prot.gene_name
  ```
* Less than: `<`
* Less than or equal to: `<=`
* Greater than: `>`
* Greater than or equal to: `>=`

  ```jsx theme={null}
  // Publications published in 2020 or later
  MATCH (pub:Publication)
  WHERE pub.published_year >= 2020
  RETURN pub.displayName, pub.pubmedid, pub.published_year, pub.country
  ```
* Is null: `IS NULL`
* Is not null: `IS NOT NULL`

  ```jsx theme={null}
  // Publications from the United States with a known institution
  MATCH (pub:Publication)
  WHERE pub.country = 'United States'
  AND pub.institution IS NOT NULL
  RETURN pub.displayName, pub.institution, pub.published_year
  ```

# Expression Evaluation in RETURN

Since v1.22.0, TuringDB supports arithmetic expressions directly in `RETURN` projections.

**Supported operators:** `+`, `-`, `*`, `/`

```cypher theme={null}
// Single property with constant
MATCH (n)
WHERE n.type = 'Patient'
RETURN n.age * 12

// Two properties combined
MATCH ()-[r]->()
WHERE r.value_satoshi > 0
AND r.value_usd > 0
RETURN r.value_usd / r.value_satoshi * 100000000.0
```
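
To make the second expression concrete, the Python sketch below evaluates the same arithmetic on hypothetical edge records. It assumes that `/` and `*` are applied left to right, as in most languages, and it reads the result as an implied USD price per bitcoin (1 BTC is 100,000,000 satoshi). The records and the function name `usd_per_btc` are illustrative only.

```python
SATOSHI_PER_BTC = 100_000_000.0

# Hypothetical edge records carrying the two properties used in the query.
transfers = [
    {"value_usd": 30.0, "value_satoshi": 100_000},
    {"value_usd": 0.0, "value_satoshi": 0},  # dropped by the filter
    {"value_usd": 12.5, "value_satoshi": 50_000},
]


def usd_per_btc(transfers):
    """Mirror the WHERE filter and the RETURN expression of the query above."""
    rows = []
    for r in transfers:
        # WHERE r.value_satoshi > 0 AND r.value_usd > 0 (also avoids dividing by zero)
        if r["value_satoshi"] > 0 and r["value_usd"] > 0:
            # r.value_usd / r.value_satoshi * 100000000.0, evaluated left to right
            rows.append(r["value_usd"] / r["value_satoshi"] * SATOSHI_PER_BTC)
    return rows


prices = usd_per_btc(transfers)
print(prices)
```

The positive-value filter matters here: without it, a zero `value_satoshi` would lead to a division by zero in this model.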

# Type Conversion & Introspection Functions

TuringDB supports a set of built-in functions for type conversion and node introspection.

## `labels()`

Returns the label of a node as a string. Can only be used in `RETURN` — not in `WHERE`.

```jsx theme={null}
MATCH (n) RETURN labels(n), n.name
```

## `toInteger()`

Parses a string literal as an integer. Useful for type-safe comparisons and arithmetic.

```jsx theme={null}
MATCH (n)
WHERE n.year > toInteger("2020")
RETURN n.name, n.year
```

## `toFloat()`

Parses a string literal as a float. Useful for arithmetic on string-encoded numeric properties.

```jsx theme={null}
MATCH (n)
RETURN n.name, n.price * toFloat("1.07")
```

# Procedures

On top of supporting queries which return or alter information *in the graph*, TuringDB supports a number of *procedures* which return information, or metadata, *about the graph*.

These queries follow the `CALL` syntax, and the following variants are supported:

1. `CALL db.propertyTypes()` - returns a column of all the different node and edge properties and their types in the database
2. `CALL db.labels()` - returns a column of all the different node labels
3. `CALL db.edgeTypes()` - returns a column of all the different edge types (edge equivalent of node labels)
4. `CALL db.history()` - returns a dataframe containing commit history

These procedures are useful for exploring the data which is available in the graph, and using this to plan `MATCH` queries.

# Commands

Outside of Cypher, there are a number of commands which you can use to interact with the TuringDB engine.

| Command                                | Explanation                                                                                                                                                                            |
| -------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `CREATE GRAPH <graph name>`            | Create a graph with the specified name                                                                                                                                                 |
| `LOAD GRAPH <graph name>`              | Load the specified TuringDB graph. Requires the graph files to be accessible in TuringDB directory (`--turing-dir`, `graphs` subdir)                                                   |
| `LOAD GML 'mygraph.gml' AS my_graph`   | Load the specified GML as TuringDB graph. Requires the GML to be accessible in TuringDB directory (`--turing-dir`, `data` subdir)                                                      |
| `LOAD JSONL 'mygraph.jsonl' AS my_graph` | Load the specified JSONL as TuringDB graph. Requires the JSONL to be accessible in TuringDB directory (`--turing-dir`, `data` subdir)                                                |
| `CHANGE NEW`                           | Creates a new change, returning a column with the ID of the created change                                                                                                             |
| `CHANGE SUBMIT`                        | When checked out on a specific change, submits all changes made to the “master branch”                                                                                                 |
| `CHANGE DELETE`                        | Deletes the currently checked out change                                                                                                                                               |
| `CHANGE LIST`                          | Lists the currently active (uncommitted) changes                                                                                                                                       |
| `LOAD COMMIT '<hash>'`                 | Load a past commit into memory for querying. Required when sending queries directly to the REST API for non-HEAD commits. The CLI `checkout` and Python SDK handle this automatically. |
| `LIST GRAPH`                           | Lists the available graphs                                                                                                                                                             |

# Roadmap

TuringDB can already parse 100% of Cypher and executes most of it; execution support for some rarer Cypher query types is still in progress. TuringDB also provides `CALL` procedures, which will include more and more algorithms over time.
