How To Install PostgreSQL on Ubuntu 20.04 [Quickstart] | DigitalOcean

Databases are a key component of many websites and applications, and are at the core of how data is stored and exchanged over the Internet. One of the most important aspects of database administration is the practice of retrieving data from a database, either on an ad hoc basis or as part of a process that has been coded into an application. There are several ways to retrieve information from a database, but one of the most commonly used methods is by sending queries through the command line.

In relational database management systems, a query is any command used to retrieve data from a table. In Structured Query Language (SQL), queries are almost always performed using the SELECT statement.

In this guide, we’ll discuss the basic syntax of SQL queries, as well as some of the most commonly employed functions and operators. We will also practice performing SQL queries using some sample data in a PostgreSQL database.

PostgreSQL, often abbreviated as “Postgres”, is a relational database management system with an object-oriented approach, meaning that information can be represented as objects or classes in PostgreSQL schemas. PostgreSQL aligns closely with standard SQL, although it also includes some features not found in other relational database systems.

In general, the commands and concepts presented in this guide can be used on any Linux-based operating system running any SQL database software. However, it was specifically written with an Ubuntu 18.04 server running PostgreSQL in mind. To set this up, you’ll need the following:

  • An Ubuntu 18.04 machine with a non-root user with sudo privileges. This can be configured using our initial server setup guide for Ubuntu 18.04.
  • PostgreSQL installed on the machine. For help with setting this up, follow the “Installing PostgreSQL” section of our guide on how to install and use PostgreSQL on Ubuntu 18.04.

With this setup in place, we can begin the tutorial.

Create a sample database

Before we can start querying in SQL, we’ll first create a database and a couple of tables, then populate these tables with some sample data. This will give you some hands-on experience when you start making queries later.

For the sample database we’ll use throughout this guide, imagine the following scenario:

You and several of your friends celebrate each other’s birthdays. Each time, the group members head to the local bowling alley, participate in a friendly tournament, and then everyone heads to your place, where you prepare the birthday person’s favorite meal.

Now that this tradition has been going on for a while, you’ve decided to start tracking the records of these tournaments. In addition, to make dinner planning easier, you decide to create a record of your friends’ birthdays and their favorite entrees, sides, and desserts. Instead of keeping this information in a physical ledger, you decide to exercise your database skills by recording it in a PostgreSQL database.

To get started, open a PostgreSQL prompt as your postgres superuser:

    sudo -u postgres psql

Next, create the database by running:

    CREATE DATABASE birthday;

Then select this database by typing:

    \c birthday

Next, create two tables within this database. We will use the first table to track your friends’ records at the bowling alley. The following command will create a table called tourneys with columns for the name of each of your friends, the number of tournaments they have won (wins), their all-time best score (best), and what size bowling shoes they wear (size):

    CREATE TABLE tourneys (
        name varchar(30),
        wins real,
        best real,
        size real
    );

Once you run the CREATE TABLE command and populate it with column headings, you will receive the following output:

Output
    CREATE TABLE

Populate the tourneys table with some sample data:

    INSERT INTO tourneys (name, wins, best, size)
    VALUES ('Dolly', '7', '245', '8.5'),
    ('Etta', '4', '283', '9'),
    ('Irma', '9', '266', '7'),
    ('Barbara', '2', '197', '7.5'),
    ('Gladys', '13', '273', '8');

You will receive the following output:

Output
    INSERT 0 5

After this, create another table within the same database that we will use to store information about your friends’ favorite birthday meals. The following command creates a table named dinners with columns for the name of each of your friends, their birthdate, their favorite entree, their favorite side dish, and their favorite dessert:

    CREATE TABLE dinners (
        name varchar(30),
        birthdate date,
        entree varchar(30),
        side varchar(30),
        dessert varchar(30)
    );

Similarly, for this table, you will receive feedback verifying that the table was created:

Output
    CREATE TABLE

Populate this table with some sample data as well:

    INSERT INTO dinners (name, birthdate, entree, side, dessert)
    VALUES ('Dolly', '1946-01-19', 'steak', 'salad', 'cake'),
    ('Etta', '1938-01-25', 'chicken', 'fries', 'ice cream'),
    ('Irma', '1941-02-18', 'tofu', 'fries', 'cake'),
    ('Barbara', '1948-12-25', 'tofu', 'salad', 'ice cream'),
    ('Gladys', '1944-05-28', 'steak', 'fries', 'ice cream');

Output
    INSERT 0 5

Once that command completes successfully, you are done configuring your database. Next, we’ll review the basic command structure of SELECT queries.

Understanding SELECT statements

As mentioned in the introduction, SQL queries almost always begin with the SELECT statement. SELECT is used in queries to specify which columns in a table should be returned in the result set. Queries also almost always include FROM, which is used to specify which table the statement will query.

Generally, SQL queries follow this syntax:

    SELECT column_to_select FROM table_to_select WHERE certain_conditions_apply;

As an example, the following statement will return the entire name column from the dinners table:

    SELECT name FROM dinners;

Output
      name
    ---------
     Dolly
     Etta
     Irma
     Barbara
     Gladys
    (5 rows)

You can select multiple columns from the same table by separating their names with a comma, like this:

    SELECT name, birthdate FROM dinners;

Output
      name   | birthdate
    ---------+------------
     Dolly   | 1946-01-19
     Etta    | 1938-01-25
     Irma    | 1941-02-18
     Barbara | 1948-12-25
     Gladys  | 1944-05-28
    (5 rows)

Instead of naming a specific column or set of columns, you can follow the SELECT operator with an asterisk (*), which serves as a placeholder representing all the columns in a table. The following command returns every column from the tourneys table:

    SELECT * FROM tourneys;

Output
      name   | wins | best | size
    ---------+------+------+------
     Dolly   |    7 |  245 |  8.5
     Etta    |    4 |  283 |    9
     Irma    |    9 |  266 |    7
     Barbara |    2 |  197 |  7.5
     Gladys  |   13 |  273 |    8
    (5 rows)

WHERE is used in queries to filter records that meet a specified condition, and rows that do not meet that condition are removed from the result. A WHERE clause typically follows this syntax:

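    . . . WHERE column_name comparison_operator value

The comparison operator in a WHERE clause defines how the specified column should be compared against the value. Here are some common SQL comparison operators:

  • = tests for equality
  • != tests for inequality
  • < tests for less-than
  • > tests for greater-than
  • <= tests for less-than or equal-to
  • >= tests for greater-than or equal-to
  • BETWEEN tests whether a value lies within a given range
  • IN tests whether a value is contained in a set of specified values
  • LIKE tests whether a value matches a specified pattern
  • IS NULL tests for NULL values
  • IS NOT NULL tests for all values other than NULL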

For example, if you want to find Irma’s shoe size, you can use the following query:

    SELECT size FROM tourneys WHERE name = 'Irma';

Output
     size
    ------
        7
    (1 row)
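Other comparison operators follow the same pattern. As a small sketch against the sample data (assuming the bowling table is named tourneys and has the wins and best columns created earlier), the first query below uses BETWEEN to find friends with four to nine wins, inclusive, and the second uses >= on the best column:

```sql
-- Friends whose win count falls in the inclusive range 4 to 9.
SELECT name, wins FROM tourneys WHERE wins BETWEEN 4 AND 9;

-- Friends whose best score is at least 270.
SELECT name, best FROM tourneys WHERE best >= 270;
```

With the five sample rows inserted so far, the first query should return Dolly, Etta, and Irma, and the second should return Etta and Gladys.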

SQL allows the use of wildcard characters, and these are especially useful when used in WHERE clauses. Percent signs (%) represent zero or more unknown characters, and underscores (_) represent a single unknown character. These are useful if you’re trying to find a specific entry in a table, but aren’t sure what exactly that entry is. To illustrate, let’s say you’ve forgotten the favorite entree of a few of your friends, but you’re sure this particular entree starts with a “t.” You can find its name by running the following query:

    SELECT entree FROM dinners WHERE entree LIKE 't%';

Output
     entree
    --------
     tofu
     tofu
    (2 rows)

Based on the above output, we see that the entree we have forgotten is tofu.
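The underscore wildcard works the same way. As a sketch, assuming the dinners table from earlier, the first query below matches only four-character names ending in “tta”. The second uses ILIKE, a PostgreSQL-specific variant of LIKE that ignores letter case (in PostgreSQL, LIKE itself is case-sensitive):

```sql
-- "_" stands for exactly one character, so '_tta' matches 'Etta'
-- but would not match a longer name such as 'Loretta'.
SELECT name FROM dinners WHERE name LIKE '_tta';

-- ILIKE is PostgreSQL-specific; 'd%' matches 'Dolly' despite the lowercase d.
SELECT name FROM dinners WHERE name ILIKE 'd%';
```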

There may be times when you work with databases that have columns or tables with relatively long or hard-to-read names. In these cases, you can make these names more readable by creating an alias with the AS keyword. Aliases created with AS are temporary and exist only for the duration of the query for which they are created:

    SELECT name AS n, birthdate AS b, dessert AS d FROM dinners;

Output
        n    |     b      |     d
    ---------+------------+-----------
     Dolly   | 1946-01-19 | cake
     Etta    | 1938-01-25 | ice cream
     Irma    | 1941-02-18 | cake
     Barbara | 1948-12-25 | ice cream
     Gladys  | 1944-05-28 | ice cream
    (5 rows)

Here, we’ve told SQL to display the name column as n, the birthdate column as b, and the dessert column as d.

The examples we’ve seen up to this point include some of the most commonly used keywords and clauses in SQL queries. These are useful for basic queries, but they’re not useful if you’re trying to perform a calculation or derive a scalar value (a single value, rather than a set of several different values) based on your data. This is where aggregate functions come into play.

Aggregate functions

Often, when working with data, you don’t necessarily want to see the data itself. Rather, you want insights into the data. SQL syntax includes a number of functions that allow you to interpret or execute calculations on your data by simply issuing a SELECT query. These are known as aggregate functions.

The COUNT function counts and returns the number of rows that match a certain criterion. For example, if you want to know how many of your friends prefer tofu for their birthday entree, you can issue this query:

    SELECT COUNT(entree) FROM dinners WHERE entree = 'tofu';

Output
     count
    -------
         2
    (1 row)

The AVG function returns the average (mean) value of a column. Using our example table, you can find the average best score among your friends with this query:

    SELECT AVG(best) FROM tourneys;

Output
      avg
    -------
     252.8
    (1 row)

SUM is used to find the total sum of a given column. For example, if you want to see how many tournament wins your friends have accumulated over the years, you can run this query:

    SELECT SUM(wins) FROM tourneys;

Output
     sum
    -----
      35
    (1 row)

Note that the AVG and SUM functions will only work correctly when used with numeric data. If you try to use them on non-numeric data, the result will be either an error or just 0, depending on which RDBMS you are using:

    SELECT SUM(entree) FROM dinners;

Output
    ERROR:  function sum(character varying) does not exist
    LINE 1: SELECT SUM(entree) FROM dinners;
                   ^
    HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

MIN is used to find the smallest value within a specified column. You can use this query to see what the worst overall bowling record is so far (in terms of number of wins):

    SELECT MIN(wins) FROM tourneys;

Output
     min
    -----
       2
    (1 row)

Similarly, MAX is used to find the largest numeric value in a given column. The following query will show the best overall bowling record:

    SELECT MAX(wins) FROM tourneys;

Output
     max
    -----
      13
    (1 row)

Unlike SUM and AVG, the MIN and MAX functions can be used for both numeric and alphabetic data types. When run on a column containing string values, the MIN function will show the first value alphabetically:

    SELECT MIN(name) FROM dinners;

Output
       min
    ---------
     Barbara
    (1 row)

Similarly, when run on a column containing string values, the MAX function will show the last value alphabetically:

    SELECT MAX(name) FROM dinners;

Output
     max
    ------
     Irma
    (1 row)

Aggregate functions have many uses beyond what is described in this section. They are particularly useful when used with the GROUP BY clause, which is covered in the next section along with several other query clauses that affect how result sets are sorted.
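Several aggregate functions can also appear in a single SELECT list. The following is a minimal sketch, assuming the bowling table is named tourneys as in the queries above; the column aliases (bowlers, lowest_best, and so on) are illustrative names, not part of the sample schema:

```sql
-- One summary row for the whole table: row count, lowest and highest
-- best score, and the total number of wins.
SELECT COUNT(*) AS bowlers,
       MIN(best) AS lowest_best,
       MAX(best) AS highest_best,
       SUM(wins) AS total_wins
FROM tourneys;
```

Because no GROUP BY clause is present, the whole table is treated as one group and the query returns exactly one row.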

Manipulating query results

In addition to the FROM and WHERE clauses, there are several other clauses that are used to manipulate the results of a SELECT query. In this section, we will explain and provide examples for some of the most commonly used query clauses.

One of the most frequently used query clauses, aside from FROM and WHERE, is the GROUP BY clause. It is typically used when you’re performing an aggregate function on one column, but in relation to matching values in another.

For example, let’s say you want to know how many of your friends prefer each of the three entrees you make. You can find this information with the following query:

    SELECT COUNT(name), entree FROM dinners GROUP BY entree;

Output
     count | entree
    -------+---------
         1 | chicken
         2 | steak
         2 | tofu
    (3 rows)

The ORDER BY clause is used to sort the query results. By default, numeric values are sorted in ascending order, and text values are sorted in alphabetical order. To illustrate, the following query lists the name and date of birth columns, but sorts the results by date of birth:

    SELECT name, birthdate FROM dinners ORDER BY birthdate;

Output
      name   | birthdate
    ---------+------------
     Etta    | 1938-01-25
     Irma    | 1941-02-18
     Gladys  | 1944-05-28
     Dolly   | 1946-01-19
     Barbara | 1948-12-25
    (5 rows)

Note that the default behavior of ORDER BY is to sort the result set in ascending order. To reverse this and have the result set sorted in descending order, close the query with DESC:

    SELECT name, birthdate FROM dinners ORDER BY birthdate DESC;

Output
      name   | birthdate
    ---------+------------
     Barbara | 1948-12-25
     Dolly   | 1946-01-19
     Gladys  | 1944-05-28
     Irma    | 1941-02-18
     Etta    | 1938-01-25
    (5 rows)
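ORDER BY can also sort on an aggregate and on more than one column. The following sketch assumes the favorite-entree column in dinners is named entree (spelled without the accent); the alias fans is only an illustrative name. It counts the friends who prefer each entree and lists the most popular entrees first, breaking ties alphabetically:

```sql
-- Sort by the count (largest first), then by entree name for ties.
SELECT entree, COUNT(name) AS fans
FROM dinners
GROUP BY entree
ORDER BY fans DESC, entree;
```

With the five sample rows, this should list steak and tofu (two friends each) ahead of chicken.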

As mentioned earlier, the WHERE clause is used to filter results based on specific conditions. However, if you use the WHERE clause with an aggregate function, it will return an error, as is the case with the following attempt to find which sides are the favorite of at least three of your friends:

    SELECT COUNT(name), side FROM dinners WHERE COUNT(name) >= 3;

Output
    ERROR:  aggregate functions are not allowed in WHERE
    LINE 1: SELECT COUNT(name), side FROM dinners WHERE COUNT(name) >= 3…

The HAVING clause was added to SQL to provide functionality similar to that of the WHERE clause while also being compatible with aggregate functions. It is helpful to think of the difference between these two clauses this way: WHERE applies to individual records, while HAVING applies to groups of records. For this reason, HAVING is normally used together with a GROUP BY clause.

The following example is another attempt to find which side dishes are the favorite of at least three of your friends, although this one will return a result without error:

    SELECT COUNT(name), side FROM dinners GROUP BY side HAVING COUNT(name) >= 3;

Output
     count | side
    -------+-------
         3 | fries
    (1 row)
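WHERE and HAVING can appear in the same query: WHERE filters individual rows before they are grouped, and HAVING then filters the resulting groups. As a sketch, assuming the dinners table and its birthdate and side columns as created earlier, the following query considers only friends born before 1945 and keeps the sides preferred by at least two of them:

```sql
SELECT side, COUNT(name)
FROM dinners
WHERE birthdate < '1945-01-01'   -- row filter, applied before grouping
GROUP BY side
HAVING COUNT(name) >= 2;         -- group filter, applied after grouping
```

In the sample data, the three friends born before 1945 (Etta, Irma, and Gladys) all prefer fries, so this should return a single row for fries with a count of 3.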

Aggregate functions are useful for summarizing the results of a given column in a given table. However, there are many cases where it is necessary to query the contents of more than one table. We’ll go over some ways you can do this in the next section.

Most of the time, a database contains multiple tables, each with different sets of data. SQL provides a few different ways to run a single query on multiple tables.

The JOIN clause can be used to combine rows from two or more tables into a query result. It does this by finding a related column between the tables and sorting the results appropriately in the output.

SELECT statements that include a JOIN clause generally follow this syntax:

    SELECT table1.column1, table2.column2
    FROM table1
    JOIN table2 ON table1.related_column=table2.related_column;

Note that because JOIN clauses compare the contents of more than one table, the previous example specifies from which table to select each column by preceding the column name with the table name and a period. You can specify from which table a column should be selected in this way for any query, although it is not necessary when selecting from a single table, as we have done in the previous sections. Let’s review an example using our sample data.

Imagine that you wanted to buy each of your friends a pair of bowling shoes as a birthday present. Because information about your friends’ birthdates and shoe sizes is kept in separate tables, you could query both tables separately and then compare the results of each. However, with a JOIN clause, you can find all the information you want with a single query:

    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    JOIN dinners ON tourneys.name=dinners.name;

Output
      name   | size | birthdate
    ---------+------+------------
     Dolly   |  8.5 | 1946-01-19
     Etta    |    9 | 1938-01-25
     Irma    |    7 | 1941-02-18
     Barbara |  7.5 | 1948-12-25
     Gladys  |    8 | 1944-05-28
    (5 rows)

The JOIN clause used in this example, without any other arguments, is an inner JOIN clause. This means that it selects all records that have matching values in both tables and prints them to the result set, while any records that do not match are excluded. To illustrate this idea, let’s add a new row to each table that does not have a corresponding entry in the other:

    INSERT INTO tourneys (name, wins, best, size)
    VALUES ('Bettye', '0', '193', '9');

    INSERT INTO dinners (name, birthdate, entree, side, dessert)
    VALUES ('Lesley', '1946-05-02', 'steak', 'salad', 'ice cream');

Then rerun the previous SELECT statement with the JOIN clause:

    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    JOIN dinners ON tourneys.name=dinners.name;

Output
      name   | size | birthdate
    ---------+------+------------
     Dolly   |  8.5 | 1946-01-19
     Etta    |    9 | 1938-01-25
     Irma    |    7 | 1941-02-18
     Barbara |  7.5 | 1948-12-25
     Gladys  |    8 | 1944-05-28
    (5 rows)

Note that because the tourneys table has no entry for Lesley and the dinners table has no entry for Bettye, those records are absent from this output.

However, it is possible to return all records from one of the tables using an outer JOIN clause. Outer JOIN clauses are written as LEFT JOIN, RIGHT JOIN, or FULL JOIN.

A LEFT JOIN clause returns all records in the “left” table and only matching records in the right table. In the context of outer joins, the left table is referenced in the FROM clause, and the right table is any other table referenced after the JOIN statement.

Run the previous query again, but this time use a LEFT JOIN clause:

    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    LEFT JOIN dinners ON tourneys.name=dinners.name;

This command will return every record from the left table (in this case, tourneys) even if it doesn’t have a corresponding record in the right table. Whenever there is no matching record from the right table, it is returned as a blank or NULL value, depending on your RDBMS:

Output
      name   | size | birthdate
    ---------+------+------------
     Dolly   |  8.5 | 1946-01-19
     Etta    |    9 | 1938-01-25
     Irma    |    7 | 1941-02-18
     Barbara |  7.5 | 1948-12-25
     Gladys  |    8 | 1944-05-28
     Bettye  |    9 |
    (6 rows)

Now run the query again, this time with a RIGHT JOIN clause:

    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    RIGHT JOIN dinners ON tourneys.name=dinners.name;

This will return all the records from the right table (dinners). Because Lesley’s birthdate is recorded in the right table, but there is no corresponding row for her in the left table, the name and size columns will be returned as blank values in that row:

Output
      name   | size | birthdate
    ---------+------+------------
     Dolly   |  8.5 | 1946-01-19
     Etta    |    9 | 1938-01-25
     Irma    |    7 | 1941-02-18
     Barbara |  7.5 | 1948-12-25
     Gladys  |    8 | 1944-05-28
             |      | 1946-05-02
    (6 rows)

Note that left and right joins can be written as LEFT OUTER JOIN or RIGHT OUTER JOIN, although the OUTER part of the clause is implied. Likewise, specifying INNER JOIN will produce the same result as just writing JOIN.
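Table names in a join can also be shortened with aliases, using the same AS keyword shown earlier for columns. The following sketch assumes the tourneys and dinners tables from this guide; t and d are arbitrary alias names. It spells out INNER JOIN explicitly and should behave the same as the plain JOIN query above:

```sql
-- t and d are short aliases for the two tables; INNER is optional.
SELECT t.name, t.size, d.birthdate
FROM tourneys AS t
INNER JOIN dinners AS d ON t.name = d.name;
```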

There is a fourth join clause called FULL JOIN available for some RDBMS distributions, including PostgreSQL. A FULL JOIN will return all records in each table, including null values:

    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    FULL JOIN dinners ON tourneys.name=dinners.name;

Output
      name   | size | birthdate
    ---------+------+------------
     Dolly   |  8.5 | 1946-01-19
     Etta    |    9 | 1938-01-25
     Irma    |    7 | 1941-02-18
     Barbara |  7.5 | 1948-12-25
     Gladys  |    8 | 1944-05-28
     Bettye  |    9 |
             |      | 1946-05-02
    (7 rows)
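A FULL JOIN can also be filtered to show only the records that lack a match in the other table. This is a sketch under the same assumptions as above (tables named tourneys and dinners, joined on name); the aliases bowler and diner are illustrative:

```sql
-- Keep only the rows where one side of the join found no partner.
SELECT tourneys.name AS bowler, dinners.name AS diner
FROM tourneys
FULL JOIN dinners ON tourneys.name = dinners.name
WHERE tourneys.name IS NULL OR dinners.name IS NULL;
```

With the sample data after adding Bettye and Lesley, this should return two rows: one with Bettye in the bowler column only, and one with Lesley in the diner column only.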

As an alternative to using FULL JOIN to query all the records from multiple tables, you can use the UNION clause.

The UNION operator works slightly differently than a JOIN clause: instead of printing results from multiple tables as separate columns using a single SELECT statement, UNION combines the results of two SELECT statements into a single column.

As an example, run the following query:

    SELECT name FROM tourneys UNION SELECT name FROM dinners;

This query will remove any duplicate entries, which is the default behavior of the UNION operator:

Output
      name
    ---------
     Irma
     Etta
     Bettye
     Gladys
     Barbara
     Lesley
     Dolly
    (7 rows)

To return all entries (including duplicates), use the UNION ALL operator:

    SELECT name FROM tourneys UNION ALL SELECT name FROM dinners;

Output
      name
    ---------
     Dolly
     Etta
     Irma
     Barbara
     Gladys
     Bettye
     Dolly
     Etta
     Irma
     Barbara
     Gladys
     Lesley
    (12 rows)

The names and number of the columns in the results table reflect the name and number of columns queried by the first SELECT statement. Note that when you use UNION to query multiple columns from more than one table, each SELECT statement must query the same number of columns, the respective columns must have similar data types, and the columns in each SELECT statement must be in the same order. The following example shows what might result if you use a UNION clause on two SELECT statements that query a different number of columns:

    SELECT name FROM dinners UNION SELECT name, wins FROM tourneys;

Output
    ERROR:  each UNION query must have the same number of columns
    LINE 1: SELECT name FROM dinners UNION SELECT name, wins FROM tourne…
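One possible way around this error, sketched here as an illustration rather than as the only fix, is to pad the shorter SELECT with a typed NULL so that both sides return the same number of columns with compatible types. The example assumes the wins column is of type real, as in the table definition earlier:

```sql
-- dinners has no wins column, so a typed NULL fills that position.
SELECT name, CAST(NULL AS real) AS wins FROM dinners
UNION
SELECT name, wins FROM tourneys;
```

Names that appear in both tables will show up twice in the result, once with a NULL in the wins column and once with the actual number of wins, because the two rows are no longer duplicates of each other.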

Another way to query multiple tables is through the use of subqueries. Subqueries (also known as inner or nested queries) are queries enclosed within another query. These are useful in cases where you’re trying to filter the results of a query against the result of a separate aggregate function.

To illustrate this idea, let’s say you want to know which of your friends has won more matches than Barbara. Instead of querying how many matches Barbara has won and then running another query to see who has won more games than that, you can calculate both with a single query:

    SELECT name, wins FROM tourneys
    WHERE wins > (
        SELECT wins FROM tourneys WHERE name = 'Barbara'
    );

Output
      name  | wins
    --------+------
     Dolly  |    7
     Etta   |    4
     Irma   |    9
     Gladys |   13
    (4 rows)

The subquery in this statement was run only once; it only needed to find the value from the wins column in the same row as Barbara in the name column, and the data returned by the subquery and the outer query are independent of one another. There are cases, though, where the outer query must first read every row in a table and compare those values against the data returned by the subquery in order to return the desired data. In this case, the subquery is referred to as a correlated subquery.

The following statement is an example of a correlated subquery. This query seeks to find which of your friends have won more games than the average for those with the same shoe size:

    SELECT name, size FROM tourneys AS t
    WHERE wins > (
        SELECT AVG(wins) FROM tourneys WHERE size = t.size
    );

For the query to complete, it must first collect the name and size columns from the outer query. Then, it compares each row from that result set against the results of the inner query, which determines the average number of wins for individuals with identical shoe sizes. Because you only have two friends with the same shoe size, there can only be one row in the result set:

Output
     name | size
    ------+------
     Etta |    9
    (1 row)

As mentioned earlier, subqueries can be used to query results from multiple tables. To illustrate this with one final example, let’s say you wanted to throw a surprise dinner for the group’s all-time best bowler. You can find which of your friends has the best bowling record and return their favorite meal with the following query:

    SELECT name, entree, side, dessert
    FROM dinners
    WHERE name = (SELECT name FROM tourneys
    WHERE wins = (SELECT MAX(wins) FROM tourneys));

Output
      name  | entree | side  |  dessert
    --------+--------+-------+-----------
     Gladys | steak  | fries | ice cream
    (1 row)

Note that this statement not only includes a subquery, but also contains a subquery within that subquery.
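Many subqueries can also be rewritten as joins. As an illustration only, the correlated subquery shown earlier (friends with more wins than the average for their shoe size) could be expressed by joining tourneys to a grouped subquery in the FROM clause. The aliases t, s, and avg_wins are hypothetical names chosen for this sketch:

```sql
-- The inner query computes one average per shoe size; the join then
-- pairs each bowler with the average for their own size.
SELECT t.name, t.size
FROM tourneys AS t
JOIN (
    SELECT size, AVG(wins) AS avg_wins
    FROM tourneys
    GROUP BY size
) AS s ON s.size = t.size
WHERE t.wins > s.avg_wins;
```

With the sample data, this should return the same single row for Etta as the correlated version. Which form performs better depends on the data and the query planner, so treat this as an alternative way of writing the query rather than an optimization.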

Conclusion

Issuing queries is one of the most common tasks within the field of database administration. There are a number of database management tools, such as phpMyAdmin or pgAdmin, that allow you to query and visualize the results, but issuing SELECT statements from the command line is still a widely practiced workflow that can also provide you with greater control.

If you are new to working with SQL, we recommend that you use our SQL Cheat Sheet as a reference and review the official PostgreSQL documentation. Also, if you want to learn more about SQL and relational databases, the following tutorials may be of interest to you:

  • Understanding SQL and NoSQL Databases and Different Database Models
  • How To Set Up Logical Replication with PostgreSQL 10 on Ubuntu 18.04
  • How To Secure PostgreSQL Against Automated Attacks