DATETIME
datatype.C:\SQLite
entry, OK,
OK.Macs should already have SQLite installed.
sqlite3
followed by
name of database you wish to open.sqlite3
by itself creates a transient in-memory
database..tables
shows all tables in current database.exit
exits SQLitesqlite3 name_of_database.db
Next you need to add tables to the database.
If you have data in a CSV file and no existing table, you can do the
following from the sqlite3
command line:
sqlite> .mode csv
sqlite> .import file.csv table_name
This will import the contents of the CSV file into a new table called “table_name”. This seems to import everything as TEXT.
To edit table schema using GUIs:
You can also define a table and then populate it. Text preceded with
--
below are comments.
Example of creating an empty table:
CREATE TABLE t1
(id INTEGER, -- variable name followed by storage type
y NUMERIC,
x1 INTEGER,
x2 REAL,
x3 TEXT);
Can use .schema
command at SQLite command line to view
and verify table definition. Then you can import a CSV file into the
table using the steps above.
Repeat for multiple tables as needed.
SQLite has 5 data types: INTERGER, TEXT, NUMERIC, REAL, BLOB. See: https://www.sqlite.org/datatype3.html
The RSQLite package provides an interface to SQLite databases. Per the RSQLite vignette: “RSQLite is a DBI-compatible interface which means you primarily use functions defined in the DBI package, so you should always start by loading DBI, not RSQLite.”
Here’s an example of accessing a SQLite database called “homes” that
contains real estate data from Albemarle County in Virginia, USA. We
query the parcels table for records where LotSize is greater than 500
acres. The query is issued as a text string using the
dbGetQuery()
function. The result is saved as
d
, a data frame, which has 56 rows and 32 variables.
library(DBI)
mydb <- dbConnect(RSQLite::SQLite(), "homes.db")
d <- dbGetQuery(mydb, 'select * from parcels where LotSize > 500;')
dim(d)
[1] 56 32
class(d)
[1] "data.frame"
It’s good practice to close the database connection when finished.
“This closes the connection, discards all pending work, and frees
resources (e.g., memory, sockets).” (see ?dbDisconnect
)
dbDisconnect(mydb)
The sqlite3 module provides an interface to SQLite databases.
Below we replicate the example from the R section above using the
pandas function read_sql_query()
. The result is a pandas
data frame with 56 rows and 32 variables.
import pandas as pd
import sqlite3
con = sqlite3.connect("homes.db")
df = pd.read_sql_query("select * from parcels where LotSize > 500;", con)
df.shape
(56, 32)
It’s good practice to close the database connection when finished.
con.close()
These examples come from LinkedIn courses.
chinook <- dbConnect(RSQLite::SQLite(), "courses/intro_to_sqlite/Chinook_Learning_Sqlite.sqlite")
Select 10 records from Invoice table, only show first 10.
SELECT * FROM Invoice LIMIT 10;
InvoiceId | CustomerId | InvoiceDate | BillingAddress | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
---|---|---|---|---|---|---|---|---|
1 | 2 | 2023-01-01 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | NA | Germany | 70174 | 1.98 |
2 | 4 | 2023-01-02 00:00:00 | Ullevålsveien 14 | Oslo | NA | Norway | 0171 | 3.96 |
3 | 8 | 2023-01-03 00:00:00 | Grétrystraat 63 | Brussels | NA | Belgium | 1000 | 5.94 |
4 | 14 | 2023-01-06 00:00:00 | 8210 111 ST NW | Edmonton | AB | Canada | T6G 2C7 | 8.91 |
5 | 23 | 2023-01-11 00:00:00 | 69 Salem Street | Boston | MA | USA | 2113 | 13.86 |
6 | 37 | 2023-01-19 00:00:00 | Berger Straße 10 | Frankfurt | NA | Germany | 60316 | 0.99 |
7 | 38 | 2023-02-01 00:00:00 | Barbarossastraße 19 | Berlin | NA | Germany | 10779 | 1.98 |
8 | 40 | 2023-02-01 00:00:00 | 8, Rue Hanovre | Paris | NA | France | 75002 | 1.98 |
9 | 42 | 2023-02-02 00:00:00 | 9, Place Louis Barthou | Bordeaux | NA | France | 33000 | 3.96 |
10 | 46 | 2023-02-03 00:00:00 | 3 Chatham Street | Dublin | Dublin | Ireland | NA | 5.94 |
To output query result to csv file using sqlite3 command line, run the following:
.mode csv
.header on
.output filename.csv
Output of next query is sent to filename.csv
To restore default output:
.output stdout
.mode list
To see file from sqlite3 command line:
.shell filename.csv
SELECT * FROM Genre WHERE name LIKE 'S%';
GenreId | Name |
---|---|
10 | Soundtrack |
18 | Science Fiction |
20 | Sci Fi & Fantasy |
SELECT * FROM Invoice WHERE Total > 10;
InvoiceId | CustomerId | InvoiceDate | BillingAddress | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
---|---|---|---|---|---|---|---|---|
5 | 23 | 2023-01-11 00:00:00 | 69 Salem Street | Boston | MA | USA | 2113 | 13.86 |
12 | 2 | 2023-02-11 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | NA | Germany | 70174 | 13.86 |
19 | 40 | 2023-03-14 00:00:00 | 8, Rue Hanovre | Paris | NA | France | 75002 | 13.86 |
26 | 19 | 2023-04-14 00:00:00 | 1 Infinite Loop | Cupertino | CA | USA | 95014 | 13.86 |
33 | 57 | 2023-05-15 00:00:00 | Calle Lira, 198 | Santiago | NA | Chile | NA | 13.86 |
40 | 36 | 2023-06-15 00:00:00 | Tauentzienstraße 8 | Berlin | NA | Germany | 10789 | 13.86 |
47 | 15 | 2023-07-16 00:00:00 | 700 W Pender Street | Vancouver | BC | Canada | V6C 1G8 | 13.86 |
54 | 53 | 2023-08-16 00:00:00 | 113 Lupus St | London | NA | United Kingdom | SW1V 3EN | 13.86 |
61 | 32 | 2023-09-16 00:00:00 | 696 Osborne Street | Winnipeg | MB | Canada | R3L 2B9 | 13.86 |
68 | 11 | 2023-10-17 00:00:00 | Av. Paulista, 2022 | São Paulo | SP | Brazil | 01310-200 | 13.86 |
SELECT *
FROM Invoice
WHERE Total > 10
AND InvoiceDate BETWEEN '2024-01-01' AND '2024-03-31';
InvoiceId | CustomerId | InvoiceDate | BillingAddress | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
---|---|---|---|---|---|---|---|---|
250 | 55 | 2024-01-01 00:00:00 | 421 Bourke Street | Sidney | NSW | Australia | 2010 | 13.86 |
257 | 34 | 2024-02-01 00:00:00 | Rua da Assunção 53 | Lisbon | NA | Portugal | NA | 13.86 |
264 | 13 | 2024-03-03 00:00:00 | Qe 7 Bloco G | Brasília | DF | Brazil | 71020-677 | 13.86 |
334 | 39 | 2024-01-07 00:00:00 | 4, Rue Milton | Paris | NA | France | 75009 | 13.86 |
341 | 18 | 2024-02-07 00:00:00 | 627 Broadway | New York | NY | USA | 10012-2612 | 13.86 |
348 | 56 | 2024-03-10 00:00:00 | 307 Macacha Güemes | Buenos Aires | NA | Argentina | 1106 | 13.86 |
SELECT * FROM Genre WHERE LOWER(Name) = 'rock';
GenreId | Name |
---|---|
1 | Rock |
SELECT * FROM Genre WHERE LOWER(Name) LIKE '%rock%';
GenreId | Name |
---|---|
1 | Rock |
5 | Rock And Roll |
See also: https://sqlite.org/lang_aggfunc.html
Sum of Total, returned as OverallTotal.
SELECT SUM(Total) AS OverallTotal
FROM Invoice;
OverallTotal |
---|
2328.6 |
Sum of Total by InvoiceDate.
SELECT InvoiceDate, SUM(Total) AS DailyTotal
FROM Invoice
GROUP BY InvoiceDate;
InvoiceDate | DailyTotal |
---|---|
2023-01-01 00:00:00 | 1.98 |
2023-01-02 00:00:00 | 4.95 |
2023-01-03 00:00:00 | 5.94 |
2023-01-06 00:00:00 | 8.91 |
2023-01-08 00:00:00 | 3.96 |
2023-01-09 00:00:00 | 3.96 |
2023-01-10 00:00:00 | 6.94 |
2023-01-11 00:00:00 | 13.86 |
2023-01-13 00:00:00 | 17.91 |
2023-01-15 00:00:00 | 3.96 |
Sum of Total by InvoiceDate, sorted by InvoiceDate.
SELECT InvoiceDate, SUM(Total) AS DailyTotal
FROM Invoice
GROUP BY InvoiceDate
ORDER BY InvoiceDate;
InvoiceDate | DailyTotal |
---|---|
2023-01-01 00:00:00 | 1.98 |
2023-01-02 00:00:00 | 4.95 |
2023-01-03 00:00:00 | 5.94 |
2023-01-06 00:00:00 | 8.91 |
2023-01-08 00:00:00 | 3.96 |
2023-01-09 00:00:00 | 3.96 |
2023-01-10 00:00:00 | 6.94 |
2023-01-11 00:00:00 | 13.86 |
2023-01-13 00:00:00 | 17.91 |
2023-01-15 00:00:00 | 3.96 |
Sum of Total by InvoiceDate, sorted by InvoiceDate in descending order.
SELECT InvoiceDate, SUM(Total) AS DailyTotal
FROM Invoice
GROUP BY InvoiceDate
ORDER BY SUM(Total) DESC;
InvoiceDate | DailyTotal |
---|---|
2024-11-13 00:00:00 | 25.86 |
2024-08-05 00:00:00 | 23.86 |
2023-04-28 00:00:00 | 21.86 |
2023-02-18 00:00:00 | 21.86 |
2023-05-29 00:00:00 | 18.86 |
2023-01-18 00:00:00 | 18.86 |
2023-01-13 00:00:00 | 17.91 |
2024-10-06 00:00:00 | 16.86 |
2024-09-05 00:00:00 | 16.86 |
2023-06-29 00:00:00 | 15.86 |
Sum of Total by InvoiceDate having total > 10, sorted by InvoiceDate in descending order.
SELECT InvoiceDate, SUM(Total) AS DailyTotal
FROM Invoice
GROUP BY InvoiceDate
HAVING SUM(Total) > 10
ORDER BY SUM(Total) DESC;
InvoiceDate | DailyTotal |
---|---|
2024-11-13 00:00:00 | 25.86 |
2024-08-05 00:00:00 | 23.86 |
2023-04-28 00:00:00 | 21.86 |
2023-02-18 00:00:00 | 21.86 |
2023-05-29 00:00:00 | 18.86 |
2023-01-18 00:00:00 | 18.86 |
2023-01-13 00:00:00 | 17.91 |
2024-10-06 00:00:00 | 16.86 |
2024-09-05 00:00:00 | 16.86 |
2023-06-29 00:00:00 | 15.86 |
Sum of Total and count of Invoices by InvoiceDate and Customer ID
SELECT InvoiceDate,
CustomerId,
SUM(Total) AS DailyTotal,
COUNT(InvoiceId) AS InvoiceCount
FROM Invoice
GROUP BY InvoiceDate, CustomerId;
InvoiceDate | CustomerId | DailyTotal | InvoiceCount |
---|---|---|---|
2023-01-01 00:00:00 | 2 | 1.98 | 1 |
2023-01-02 00:00:00 | 4 | 3.96 | 1 |
2023-01-02 00:00:00 | 26 | 0.99 | 1 |
2023-01-03 00:00:00 | 8 | 5.94 | 1 |
2023-01-06 00:00:00 | 14 | 8.91 | 1 |
2023-01-08 00:00:00 | 43 | 1.98 | 1 |
2023-01-08 00:00:00 | 45 | 1.98 | 1 |
2023-01-09 00:00:00 | 47 | 3.96 | 1 |
2023-01-10 00:00:00 | 51 | 6.94 | 1 |
2023-01-11 00:00:00 | 23 | 13.86 | 1 |
Inner join: Get the count of invoice IDs for every customer who has a company name associated with their record in the customer table. The WHERE company is NOT NULL clause takes care of only selecting customers with company names.
SELECT Company, COUNT(InvoiceID)
FROM Customer
JOIN Invoice ON Invoice.CustomerId = Customer.CustomerId
WHERE Company IS NOT NULL
GROUP BY Company;
Company | COUNT(InvoiceID) |
---|---|
Apple Inc. | 7 |
Banco do Brasil S.A. | 7 |
Embraer - Empresa Brasileira de Aeronáutica S.A. | 7 |
Google Inc. | 7 |
JetBrains s.r.o. | 7 |
Microsoft Corporation | 7 |
Riotur | 7 |
Rogers Canada | 7 |
Telus | 7 |
Woodstock Discos | 7 |
Left Join: Get results even if customer ID values fall outside of the intersection. So in the case of a customer that has no matching invoice records, such as LinkedIn, the invoice columns are represented as NULLS, and so the COUNT invoice ID calculation returns zero.
SELECT Company, COUNT(InvoiceID)
FROM Customer
LEFT OUTER JOIN Invoice ON Invoice.CustomerId = Customer.CustomerId
WHERE Company IS NOT NULL
GROUP BY Company;
Company | COUNT(InvoiceID) |
---|---|
Apple Inc. | 7 |
Banco do Brasil S.A. | 7 |
Embraer - Empresa Brasileira de Aeronáutica S.A. | 7 |
Google Inc. | 7 |
JetBrains s.r.o. | 7 |
0 | |
Microsoft Corporation | 7 |
Riotur | 7 |
Rogers Canada | 7 |
Telus | 7 |
SELECT ROUND(1.2345678, 3);
ROUND(1.2345678, 3) |
---|
1.235 |
SELECT INSTR(BillingAddress, ' ') FROM Invoice LIMIT 5;
INSTR(BillingAddress, ’ ’) |
---|
21 |
14 |
13 |
5 |
3 |
SELECT SUBSTR(BillingAddress, 0, INSTR(BillingAddress, ' '))
FROM Invoice LIMIT 5;
SUBSTR(BillingAddress, 0, INSTR(BillingAddress, ’ ’)) |
---|
Theodor-Heuss-Straße |
Ullevålsveien |
Grétrystraat |
8210 |
69 |
SELECT FORMAT('$%f', SUM(Total)) FROM Invoice;
FORMAT(‘$%f’, SUM(Total)) |
---|
$2328.600000 |
SELECT FORMAT('$%.2f', SUM(Total)) FROM Invoice;
FORMAT(‘$%.2f’, SUM(Total)) |
---|
$2328.60 |
SELECT InvoiceDate, strftime('%Y-%m-%d', InvoiceDate)
FROM Invoice LIMIT 10;
InvoiceDate | strftime(‘%Y-%m-%d’, InvoiceDate) |
---|---|
2023-01-01 00:00:00 | 2023-01-01 |
2023-01-02 00:00:00 | 2023-01-02 |
2023-01-03 00:00:00 | 2023-01-03 |
2023-01-06 00:00:00 | 2023-01-06 |
2023-01-11 00:00:00 | 2023-01-11 |
2023-01-19 00:00:00 | 2023-01-19 |
2023-02-01 00:00:00 | 2023-02-01 |
2023-02-01 00:00:00 | 2023-02-01 |
2023-02-02 00:00:00 | 2023-02-02 |
2023-02-03 00:00:00 | 2023-02-03 |
Count number of rows.
SELECT COUNT(*) FROM Invoice;
COUNT(*) |
---|
412 |
Count number of non Null values for a given column.
SELECT COUNT(BillingPostalCode) FROM Invoice;
COUNT(BillingPostalCode) |
---|
384 |
Number of distinct postal codes in Invoice table
SELECT COUNT(DISTINCT BillingPostalCode) FROM Invoice;
COUNT(DISTINCT BillingPostalCode) |
---|
55 |
Number of customers in each country.
SELECT Country,
COUNT(DISTINCT CustomerId) AS num
FROM Customer
GROUP BY Country;
Country | num |
---|---|
NA | 1 |
Argentina | 1 |
Australia | 1 |
Austria | 1 |
Belgium | 1 |
Brazil | 5 |
Canada | 8 |
Chile | 1 |
Czech Republic | 2 |
Denmark | 1 |
Average of previouq query
SELECT AVG(num)
FROM (SELECT Country,
COUNT(DISTINCT CustomerId) AS num
FROM Customer
GROUP BY Country
);
AVG(num) |
---|
2.4 |
Get min and max as well.
SELECT AVG(num) AS avg_num,
MIN(num) AS min_num,
MAX(num) AS max_num
FROM (SELECT Country,
COUNT(DISTINCT CustomerId) AS num
FROM Customer
GROUP BY Country
);
avg_num | min_num | max_num |
---|---|---|
2.4 | 1 | 13 |
Another handy aggregate function is group_concat
. You
can use this to produce a comma separated list of aggregate values.
SELECT Country,
GROUP_CONCAT(DISTINCT State) AS States
FROM Customer
GROUP BY Country;
Country | States |
---|---|
NA | NA |
Argentina | NA |
Australia | NSW |
Austria | NA |
Belgium | NA |
Brazil | SP,RJ,DF |
Canada | QC,AB,BC,ON,NS,MB,NT |
Chile | NA |
Czech Republic | NA |
Denmark | NA |
quizdata <- dbConnect(RSQLite::SQLite(), "courses/learning_sql_programming/quizdata.db")
Let’s join up our quiz participants’ names and state abbreviations with the full state name from the states table.
SELECT people.first_name, people.last_name, people.state_code, states.state_name
FROM people
JOIN states
ON people.state_code=states.state_abbrev;
first_name | last_name | state_code | state_name |
---|---|---|---|
Janice | Howell | CA | California |
Wanda | Alvarez | CA | California |
Laura | Olson | CA | California |
Jack | Garcia | NY | New York |
Ryan | Rice | DE | Delaware |
Christine | Wood | MI | Michigan |
Dennis | Banks | NY | New York |
Paula | Montgomery | AL | Alabama |
Jerry | Ferguson | IL | Illinois |
Darren | Black | FL | Florida |
We can switch up the order that we’re joining the tables too, and we’ll see that we get a different order of records. I’ll switch the states and people tables here around the join keyword.
SELECT people.first_name, people.last_name, people.state_code, states.state_name
FROM states
JOIN people
ON people.state_code=states.state_abbrev;
first_name | last_name | state_code | state_name |
---|---|---|---|
Alice | Roberts | AL | Alabama |
Ann | Holmes | AL | Alabama |
Anna | Foster | AL | Alabama |
Arthur | Carroll | AL | Alabama |
Brenda | Green | AL | Alabama |
Brian | Berry | AL | Alabama |
Charles | Diaz | AL | Alabama |
Charles | Torres | AL | Alabama |
Earl | Snyder | AL | Alabama |
Edward | Hart | AL | Alabama |
Now that the states table is on the left and the people table is on the right, let’s switch this to a left join so we get all of the records from the states table instead of just the ones that match. (No quiz participants from WY.)
SELECT people.first_name, people.last_name, people.state_code, states.state_name
FROM states
LEFT JOIN people
ON people.state_code=states.state_abbrev;
first_name | last_name | state_code | state_name |
---|---|---|---|
Alice | Roberts | AL | Alabama |
Ann | Holmes | AL | Alabama |
Anna | Foster | AL | Alabama |
Arthur | Carroll | AL | Alabama |
Brenda | Green | AL | Alabama |
Brian | Berry | AL | Alabama |
Charles | Diaz | AL | Alabama |
Charles | Torres | AL | Alabama |
Earl | Snyder | AL | Alabama |
Edward | Hart | AL | Alabama |
Let’s write a query to match up the state abbreviations and the people table with those in the states table so we can see what else is missing. I don’t need all these rows, so I’ll use the distinct function to throw out duplicates and reduce the size of our result.
SELECT DISTINCT(people.state_code), states.state_name
FROM states
LEFT JOIN people ON people.state_code=states.state_abbrev
ORDER BY people.state_code;
state_code | state_name |
---|---|
NA | Maine |
NA | Rhode Island |
NA | Wyoming |
AK | Alaska |
AL | Alabama |
AR | Arkansas |
AZ | Arizona |
CA | California |
CO | Colorado |
CT | Connecticut |
How many times each first name appears in the table.
SELECT first_name, COUNT(first_name)
FROM people
GROUP BY first_name;
first_name | COUNT(first_name) |
---|---|
Aaron | 3 |
Adam | 6 |
Alan | 2 |
Albert | 1 |
Alice | 5 |
Amy | 7 |
Andrea | 4 |
Andrew | 8 |
Angela | 6 |
Ann | 6 |
How many participants come from each state.
SELECT state_code, COUNT(state_code)
FROM people
GROUP BY state_code;
state_code | COUNT(state_code) |
---|---|
AK | 1 |
AL | 24 |
AR | 6 |
AZ | 19 |
CA | 127 |
CO | 28 |
CT | 14 |
DE | 2 |
FL | 93 |
GA | 27 |
How many people each state got each score.
SELECT state_code, quiz_points, COUNT(quiz_points)
FROM people
GROUP BY state_code, quiz_points
state_code | quiz_points | COUNT(quiz_points) |
---|---|---|
AK | 93 | 1 |
AL | 74 | 1 |
AL | 76 | 2 |
AL | 79 | 2 |
AL | 81 | 2 |
AL | 82 | 1 |
AL | 83 | 2 |
AL | 85 | 3 |
AL | 86 | 2 |
AL | 87 | 1 |
Sometimes we need to use more than one select statement to get the information we want. It’s often the case that we want to set up a query that relies on the result of another query. Using a secondary select statement inside of another query is called a sub-query or sub-select, and it helps us to narrow down or zero in on a specific set of data to use in that larger query.
Get a listing of all the people in our database who achieved the highest score on our quiz.
SELECT first_name, last_name, quiz_points
FROM people
WHERE quiz_points=(SELECT MAX(quiz_points) FROM people);
first_name | last_name | quiz_points |
---|---|---|
Scott | Thompson | 100 |
Catherine | Rodriguez | 100 |
Kathryn | Welch | 100 |
Terry | Fernandez | 100 |
Earl | Green | 100 |
Julia | Lee | 100 |
Find all people from Minnesota without knowing state abbreviation.
SELECT * FROM people
WHERE state_code=(SELECT state_abbrev FROM states WHERE state_name='Minnesota');
id_number | first_name | last_name | city | state_code | shirt_or_hat | quiz_points | team | signup | age | company |
---|---|---|---|---|---|---|---|---|---|---|
78 | Roy | Rivera | Minneapolis | MN | shirt | 81 | Baffled Badgers | 2021-01-03 | 47 | Hayes, Greenfelder and Schulist |
150 | Joan | Watkins | Minneapolis | MN | hat | 85 | Baffled Badgers | 2021-01-05 | 23 | Morissette Ltd. |
253 | Darren | Phillips | Minneapolis | MN | hat | 85 | Cosmic Cobras | 2021-01-08 | 53 | NA |
288 | Antonio | Hart | Young America | MN | shirt | 85 | Baffled Badgers | 2021-01-09 | 30 | Feest Inc |
415 | Louis | Cox | Young America | MN | hat | 90 | Cosmic Cobras | 2021-01-13 | 45 | Mitchell-Nolan |
452 | Chris | Castillo | Duluth | MN | shirt | 76 | Angry Ants | 2021-01-14 | 25 | NA |
458 | Ashley | Fisher | Saint Paul | MN | shirt | 79 | Angry Ants | 2021-01-15 | 38 | Keeling Ltd. |
543 | Harry | Welch | Rochester | MN | hat | 92 | Baffled Badgers | 2021-01-17 | 27 | Wehner Group |
570 | Martin | Richardson | Minneapolis | MN | hat | 87 | Baffled Badgers | 2021-01-18 | 24 | Ortiz, Lubowitz and Brekke |
576 | Cynthia | Carroll | Monticello | MN | hat | 82 | Baffled Badgers | 2021-01-18 | 24 | Stracke Ltd. |
This course uses three databases:
album <- dbConnect(RSQLite::SQLite(),
"courses/sql_essential_training/album.db")
test <- dbConnect(RSQLite::SQLite(),
"courses/sql_essential_training/test.db")
world <- dbConnect(RSQLite::SQLite(),
"courses/sql_essential_training/world.db")
Quick SELECT
example.
SELECT Name, LifeExpectancy AS "Life Expectancy"
FROM Country ORDER BY Name;
Name | Life Expectancy |
---|---|
Afghanistan | 45.9 |
Albania | 71.6 |
Algeria | 69.7 |
American Samoa | 75.1 |
Andorra | 83.5 |
Angola | 38.3 |
Anguilla | 76.1 |
Antarctica | NA |
Antigua and Barbuda | 70.5 |
Argentina | 75.1 |
Get second five after first five
SELECT Name, Continent, Region
FROM Country
WHERE Continent = 'Europe' -- which rows
ORDER BY Name LIMIT 5 OFFSET 5; -- only 5, second 5
Name | Continent | Region |
---|---|---|
Bosnia and Herzegovina | Europe | Southern Europe |
Bulgaria | Europe | Eastern Europe |
Croatia | Europe | Southern Europe |
Czech Republic | Europe | Eastern Europe |
Denmark | Europe | Nordic Countries |
Counting Rows
COUNT(*)
means count all the rows. Count number of rows
where pop > 100,000,000 and Continent = ‘Europe’
SELECT COUNT(*) FROM Country WHERE Population > 100000000 AND Continent = 'Europe' ;
COUNT(*) |
---|
1 |
COUNT(LifeExpectancy)
means count rows where the column
has data.
SELECT COUNT(LifeExpectancy) FROM Country;
COUNT(LifeExpectancy) |
---|
222 |
Countries with pop < 100000 or pop is NULL, listed in descending order by population.
SELECT Name, Continent, Population FROM Country
WHERE Population < 100000 OR Population IS NULL ORDER BY Population DESC;
Name | Continent | Population |
---|---|---|
Tonga | Oceania | 99000 |
Grenada | North America | 94000 |
Virgin Islands, U.S. | North America | 93000 |
Kiribati | Oceania | 83000 |
Andorra | Europe | 78000 |
Northern Mariana Islands | Oceania | 78000 |
Seychelles | Africa | 77000 |
Dominica | North America | 71000 |
American Samoa | Oceania | 68000 |
Antigua and Barbuda | North America | 68000 |
Countries with pop < 100000 and Continent = ‘Oceania’, listed in descending order by population.
SELECT Name, Continent, Population FROM Country
WHERE Population < 100000 AND Continent = 'Oceania' ORDER BY Population DESC;
Name | Continent | Population |
---|---|---|
Tonga | Oceania | 99000 |
Kiribati | Oceania | 83000 |
Northern Mariana Islands | Oceania | 78000 |
American Samoa | Oceania | 68000 |
Marshall Islands | Oceania | 64000 |
Cook Islands | Oceania | 20000 |
Palau | Oceania | 19000 |
Wallis and Futuna | Oceania | 15000 |
Nauru | Oceania | 12000 |
Tuvalu | Oceania | 12000 |
LIKE
operator: percent sign is a wildcard; find Names
that have anything anything before or after ‘island’
SELECT Name, Continent, Population FROM Country
WHERE Name LIKE '%island%' ORDER BY Name;
Name | Continent | Population |
---|---|---|
Bouvet Island | Antarctica | NA |
Cayman Islands | North America | 38000 |
Christmas Island | Oceania | 2500 |
Cocos (Keeling) Islands | Oceania | 600 |
Cook Islands | Oceania | 20000 |
Falkland Islands | South America | 2000 |
Faroe Islands | Europe | 43000 |
Fiji Islands | Oceania | 817000 |
Heard Island and McDonald Islands | Antarctica | NA |
Marshall Islands | Oceania | 64000 |
LIKE
operator: all countries with ‘a’ as second
letter
SELECT Name, Continent, Population FROM Country
WHERE Name LIKE '_a%' ORDER BY Name;
Name | Continent | Population |
---|---|---|
Bahamas | North America | 307000 |
Bahrain | Asia | 617000 |
Bangladesh | Asia | 129155000 |
Barbados | North America | 270000 |
Cambodia | Asia | 11168000 |
Cameroon | Africa | 15085000 |
Canada | North America | 31147000 |
Cape Verde | Africa | 428000 |
Cayman Islands | North America | 38000 |
East Timor | Asia | 885000 |
IN
operator: all countries in Asia or Europe
SELECT Name, Continent, Population FROM Country
WHERE Continent IN ( 'Europe', 'Asia' ) ORDER BY Name;
Name | Continent | Population |
---|---|---|
Afghanistan | Asia | 22720000 |
Albania | Europe | 3401200 |
Andorra | Europe | 78000 |
Armenia | Asia | 3520000 |
Austria | Europe | 8091800 |
Azerbaijan | Asia | 7734000 |
Bahrain | Asia | 617000 |
Bangladesh | Asia | 129155000 |
Belarus | Europe | 10236000 |
Belgium | Europe | 10239000 |
SELECT DISTINCT
: Using the SELECT DISTINCT
statement, you will get only unique results.
SELECT DISTINCT Continent FROM Country;
Continent |
---|
Asia |
Europe |
North America |
Africa |
Oceania |
South America |
Antarctica |
Ordering output. Order by Continent in descending order, then within Continent show Region in ascending order, and within Region show Name in ascending order.
SELECT Name, Continent, Region
FROM Country
ORDER BY Continent DESC, Region, Name;
Name | Continent | Region |
---|---|---|
Argentina | South America | South America |
Bolivia | South America | South America |
Brazil | South America | South America |
Chile | South America | South America |
Colombia | South America | South America |
Ecuador | South America | South America |
Falkland Islands | South America | South America |
French Guiana | South America | South America |
Guyana | South America | South America |
Paraguay | South America | South America |
Example of a junction table.
SELECT c.name AS Cust, c.zip, i.name AS Item, i.description, s.quantity AS Quan, s.price AS Price
FROM sale AS s
JOIN item AS i ON s.item_id = i.id
JOIN customer AS c ON s.customer_id = c.id
ORDER BY Cust, Item
;
Cust | zip | Item | description | Quan | Price |
---|---|---|---|---|---|
Bill Smith | 98765 | Box of 64 Pixels | 64 RGB pixels in a decorative box | 1 | 2995 |
Bob Smith | 98765 | Bar Code | Unused. In original packaging. | 2 | 999 |
Mary Smith | 98765 | Box of 64 Pixels | 64 RGB pixels in a decorative box | 3 | 2995 |
Mary Smith | 98765 | Box of 64 Pixels | 64 RGB pixels in a decorative box | 1 | 2995 |
Mary Smith | 98765 | Sense of Humor | Especially dry. Imported from England. | 1 | 1995 |
Example of aggregate functions. Count of rows by Region.
SELECT Region, COUNT(*)
FROM Country
GROUP BY Region
;
Region | COUNT(*) |
---|---|
Antarctica | 5 |
Australia and New Zealand | 5 |
Baltic Countries | 3 |
British Islands | 2 |
Caribbean | 24 |
Central Africa | 9 |
Central America | 8 |
Eastern Africa | 20 |
Eastern Asia | 8 |
Eastern Europe | 10 |
SELECT Region, COUNT(*) AS Count -- with alias
FROM Country
GROUP BY Region
ORDER BY Count DESC, Region -- count in descending order
;
Region | Count |
---|---|
Caribbean | 24 |
Eastern Africa | 20 |
Middle East | 18 |
Western Africa | 17 |
Southern Europe | 15 |
South America | 14 |
Southern and Central Asia | 14 |
Southeast Asia | 11 |
Eastern Europe | 10 |
Polynesia | 10 |
-- number of tracks for each album
SELECT a.title AS Album, COUNT(t.track_number) as Tracks
FROM track AS t
JOIN album AS a
ON a.id = t.album_id
GROUP BY a.id
ORDER BY Tracks DESC, Album
;
Album | Tracks |
---|---|
Rubber Soul | 14 |
Birds of Fire | 10 |
Two Men with the Blues | 10 |
Apostrophe | 9 |
Hendrix in the West | 9 |
Live And | 6 |
Kind of Blue | 5 |
Just albums with 10 more tracks.
-- HAVING is for aggregate data
SELECT a.title AS Album, COUNT(t.track_number) as Tracks
FROM track AS t
JOIN album AS a
ON a.id = t.album_id
GROUP BY a.id
HAVING Tracks >= 10
ORDER BY Tracks DESC, Album
;
Album | Tracks |
---|---|
Rubber Soul | 14 |
Birds of Fire | 10 |
Two Men with the Blues | 10 |
Albums with 10 or more tracks by the Beatles.
-- WHERE is for non-aggregate data
SELECT a.title AS Album, COUNT(t.track_number) as Tracks
FROM track AS t
JOIN album AS a
ON a.id = t.album_id
WHERE a.artist = 'The Beatles'
GROUP BY a.id
HAVING Tracks >= 10
ORDER BY Tracks DESC, Album
;
Album | Tracks |
---|---|
Rubber Soul | 14 |
More aggregate functions.
SELECT Region, AVG(Population)
FROM Country
GROUP BY Region;
Region | AVG(Population) |
---|---|
Antarctica | NA |
Australia and New Zealand | 4550620 |
Baltic Countries | 2520633 |
British Islands | 31699250 |
Caribbean | 1589167 |
Central Africa | 10628000 |
Central America | 16902625 |
Eastern Africa | 12999947 |
Eastern Asia | 188416000 |
Eastern Europe | 30702600 |
SELECT Region, MIN(Population), MAX(Population)
FROM Country
GROUP BY Region;
Region | MIN(Population) | MAX(Population) |
---|---|---|
Antarctica | NA | NA |
Australia and New Zealand | 600 | 18886000 |
Baltic Countries | 1439200 | 3698500 |
British Islands | 3775100 | 59623400 |
Caribbean | 8000 | 11201000 |
Central Africa | 147000 | 51654000 |
Central America | 241000 | 98881000 |
Eastern Africa | 77000 | 62565000 |
Eastern Asia | 473000 | 1277558000 |
Eastern Europe | 4380000 | 146934000 |
Count of distinct heads of states.
SELECT COUNT(DISTINCT HeadOfState) FROM Country;
COUNT(DISTINCT HeadOfState) |
---|
179 |
Example of a subselect.
Which albums have tracks that are less than 90 seconds? Only shows album IDs.
SELECT DISTINCT album_id FROM track WHERE duration <= 90;
album_id |
---|
11 |
13 |
Use above select statement as a subselect. Note the use of parentheses and the IN operator.
SELECT * FROM album
WHERE id IN (SELECT DISTINCT album_id FROM track WHERE duration <= 90)
;
id | title | artist | label | released |
---|---|---|---|---|
11 | Hendrix in the West | Jimi Hendrix | Polydor | 1972-01-00 |
13 | Birds of Fire | Mahavishnu Orchestra | Columbia | 1973-03-00 |
SELECT a.title AS album, a.artist, t.track_number AS seq, t.title, t.duration AS secs
FROM album AS a
JOIN (
SELECT DISTINCT album_id, track_number, duration, title
FROM track
WHERE duration <= 90
) AS t
ON t.album_id = a.id
ORDER BY a.title, t.track_number
;
album | artist | seq | title | secs |
---|---|---|---|---|
Birds of Fire | Mahavishnu Orchestra | 4 | Sapphire Bullets of Pure Love | 24 |
Hendrix in the West | Jimi Hendrix | 6 | Sgt. Pepper’s Lonely Hearts Club Band | 76 |