JOIN
clause
Join is a fundamental relational operation used to horizontally join two tables or relations. Relationships are called the left and right sides of the join depending on how they are written in the JOIN
clause. Each row of the result contains columns from both relations.
The join uses a rule to match pairs of rows from each relationship. Often this is a predicate, but other rules may be specified.
Syntax
The following syntax variants may be used:
SELECT ...
FROM <object_ref1> [
{
INNER
| { LEFT | RIGHT | FULL } [ OUTER ]
}
]
JOIN <object_ref2>
[ ON <condition> ]
[ ... ]
SELECT *
FROM <object_ref1> [
{
INNER
| { LEFT | RIGHT | FULL } [ OUTER ]
}
]
JOIN <object_ref2>
[ USING( <column_list> ) ]
[ ... ]
SELECT ...
FROM <object_ref1> [
{
| NATURAL [ { LEFT | RIGHT | FULL } [ OUTER ] ]
| CROSS
}
]
JOIN <object_ref2>
[ ... ]
Parameters
-
<object_ref1>
and<object_ref2>
+. Each<object_ref>
represents a table or table-like data source.
-
JOIN
AJOIN
keyword indicating that tables should be joined.JOIN
is combined with other keywords (such asINNER
orOUTER
) to specify the type of join.
-
ON <condition>
A boolean expression that specifies the conditions of the connection.
For a detailed description, see Conditional connection.
-
USING <column_list>
A list of columns by which the join is performed. The columns must be of the same name in the tables being joined.
For a detailed description, see Conditional connection.
OUTER JOIN
— external connection
Strings that have no matches can be returned if an OUTER
join is specified. External joins can have one of the types:
-
LEFT
(All rows from the left relationship occur at least once) -
RIGHT
(All rows from the right relationship occur at least once) -
FULL
(All strings from both relations occur at least once)
A connection that is not outer (OUTER
) is inner (INNER
) — only those strings that satisfy the condition are returned.
If an unpaired row is returned, the attributes of the other table are set to NULL
.
CROSS JOIN
— cross-join (Cartesian product)
The simplest type of join is the CROSS JOIN
. There are no conditions for this join type and it simply returns all possible pairs.
Let’s return all pairs of strings:
SELECT a.*, b.*
FROM a
CROSS JOIN b;
This is equivalent to a query that simply omits the JOIN
condition:
SELECT a.*, b.*
FROM a, b;
Conditional connection
Most conjunctions are specified by a predicate that connects attributes on one side to attributes on the other side. Conditions can be explicitly specified using the ON
and WHERE
operators.
Let’s show the use of conditional joins on the example of two tables — capitals
with countries and their capitals and population
with countries and their population (in millions).
CREATE TABLE capitals (cap_country VARCHAR, capital VARCHAR);
CREATE TABLE population (pop_country VARCHAR, population_mil BIGINT);
INSERT INTO capitals VALUES
('Russia', 'Moscow'),
('Italy', 'Rome'),
('Spain', 'Madrid'),
('France', 'Paris');
INSERT INTO population VALUES
('Russia', 143),
('Spain', 48),
('Brazil', 211);
Let’s output the countries with their capitals and populations using JOIN
with the ON
operator:
SELECT *
FROM capitals
JOIN population ON (cap_country = pop_country);
+-------------+---------+-------------+----------------+
| cap_country | capital | pop_country | population_mil |
+-------------+---------+-------------+----------------+
| Russia | Moscow | Russia | 143 |
+-------------+---------+-------------+----------------+
| Spain | Madrid | Spain | 48 |
+-------------+---------+-------------+----------------+
Now let’s output the same data using JOIN
with the WHERE
operator:
SELECT t1.*, t2.*
FROM capitals t1, population t2
WHERE t1.cap_country = t2.pop_country
+-------------+---------+-------------+----------------+
| cap_country | capital | pop_country | population_mil |
+-------------+---------+-------------+----------------+
| Russia | Moscow | Russia | 143 |
+-------------+---------+-------------+----------------+
| Spain | Madrid | Spain | 48 |
+-------------+---------+-------------+----------------+
Not only equality but also other predicates can be used with the WHERE
operator.
If the names of columns in two tables are the same and the values in them must be equal, you can use a simpler syntax USING
.
Let’s define the same tables, but with the same name columns country
:
CREATE TABLE capitals (country VARCHAR, capital VARCHAR);
CREATE TABLE population (country VARCHAR, population_mil BIGINT);
INSERT INTO capitals VALUES
('Russia', 'Moscow'),
('Italy', 'Rome'),
('Spain', 'Madrid'),
('France', 'Paris');
INSERT INTO population VALUES
('Russia', 143),
('Spain', 48),
('Brazil', 211);
Now output the countries with their capital and population through a simpler query with USING
:
SELECT *
FROM capitals
JOIN population USING (country);
+---------+---------+----------------+
| country | capital | population_mil |
+---------+---------+----------------+
| Russia | Moscow | 143 |
+---------+---------+----------------+
| Spain | Madrid | 48 |
+---------+---------+----------------+
NATURAL JOIN
— natural connection
Natural joins join two tables based on columns with the same names.
Let’s show the use of this type of join on the example of the same pair of tables with the same name columns country
.
To join tables based on their common column, execute the command:
SELECT *
FROM capitals
NATURAL JOIN population;
+---------+---------+---------+----------------+
| country | capital | country | population_mil |
+---------+---------+---------+----------------+
| Russia | Moscow | Russia | 143 |
+---------+---------+---------+----------------+
| Spain | Madrid | Spain | 48 |
+---------+---------+---------+----------------+
Note that only rows that had the same country
attribute in both tables were included in the result.
We can perform a similar query using the JOIN
clause with the USING
keyword:
SELECT *
FROM capitals
JOIN population
USING (country);
+---------+---------+----------------+
| country | capital | population_mil |
+---------+---------+----------------+
| Russia | Moscow | 143 |
+---------+---------+----------------+
| Spain | Madrid | 48 |
+---------+---------+----------------+
Note that in this case the same-named columns from different tables will "collapse" into one.
SEMI JOIN
and ANTI JOIN
— semi-joins and anti-joins
Semi joins return rows from the left table that have at least one match in the right table.
Anti-joins return rows from the left table that have no matches in the right table.
When using semi-joins or anti-joins, the result will never contain more rows than the left table. Semi-joins provide the same logic as the IN
operator. Anti-joins provide the same logic as the NOT IN
operator, except that anti-joins ignore NULL
values from the right table.
Example of a semi-join
Let’s derive a list of such country-capital pairs from the capitals
table for which the country name is present in the population
table:
SELECT *
FROM capitals
SEMI JOIN population
USING (country);
+---------+---------+
| country | capital |
+---------+---------+
| Russia | Moscow |
+---------+---------+
| Spain | Madrid |
+---------+---------+
This request is equivalent to the following:
SELECT *
FROM capitals
WHERE country IN (SELECT country FROM population);
+---------+---------+
| country | capital |
+---------+---------+
| Russia | Moscow |
+---------+---------+
| Spain | Madrid |
+---------+---------+
Example of an anti-coupling
Let’s output a list of such country-capital pairs from the capitals
table for which the country name is not in the population
table:
SELECT *
FROM capitals
ANTI JOIN population
USING (country);
+---------+---------+
| country | capital |
+---------+---------+
| Italy | Rome |
+---------+---------+
| France | Paris |
+---------+---------+
This request is equivalent to the following:
SELECT *
FROM capitals
WHERE country NOT IN
(SELECT country FROM population WHERE country IS NOT NULL);
+---------+---------+
| country | capital |
+---------+---------+
| Italy | Rome |
+---------+---------+
| France | Paris |
+---------+---------+
Closed connection (Self-Join)
Tengri allows you to use a closed join (joining a table to itself) for all types of joins. Note that tables must be specified via aliases, using the same table name without aliases will cause an error:
CREATE TABLE t (x INTEGER);
SELECT * FROM t JOIN t USING(x);
Binder Error:
Duplicate alias "t" in query!
Adding aliases allows the request to be processed successfully:
CREATE TABLE t (num INTEGER);
SELECT * FROM t t1 JOIN t t2 USING(num);
+-----+
| num |
+-----+