Relational databases
Relational databases are storing data in tables and columns. Tables are linked together using (foreign) keys. You can view a relational database as an Excel table.
The Structured Query Language (SQL) is the most used language to query and manage relational databases.
Relational databases are managed by a Database Management System (DBMS) or RDBMS for Relational DBMS.
β‘οΈ You may try a few of them online at sqliteonline. When installing a WAMP Server, you will install a database (MariaDB or MySQL).
Terminology
A record/tuple/row π΅ is an entry in a table, such as (1, Luna, 18, 'Woman')
or (2, Henry, 24, 'Not specified')
.
An attribute/variable π» is a column in a table, such as id
, name
, age
... Attributes have a type, and may have constraints (such as Not Null...).
A key π is a set of one or more attributes used to identify a row. Keys must be unique, and every row must have a key.
- Primary key (set of attributes, must be unique and not null)
- Foreign key (a key taking values from another key in another table)
- Artificial key (incrementing (+=1) each time we are adding a row)
The attribute id
is an artificial key. In another table, we can use this number to reference the associated customer.
β‘οΈ It's common to pre- or post-fix columns such as id
(ex: c_id
) if there are many columns with the same name, or for other reasons.
Modeling
Creating a database from a specification is called Modeling π§βπ¨. It's a process usually involving 3 steps:
- π From the specification (a text), create a database diagram
- βοΈ From the diagram, write down the relational representation
- β From the relational representation, create the database
Step 1 β Get Started
Before creating your database diagrams, you need to identify some information within the specification.
- highlight every concrete piece of information
- Names, Locations
- Values (prices, distances...)
- ...
- categorize each information
- "John Doe" (piece of information) is a "name" (category)
- "Toronto" (piece of information) is a "city" (category)
- "Berlin" (piece of information) is a "city" (category)
- ...
- group categories by entities
- "name" belongs to "person"
- "address" belongs to "person"
- "city" belongs to "address"
- ...
Each group corresponds to a UML class
and each of their categories are their UML attributes
. Complete the diagram with identifiers
, associations
, multiplicity
, and generalizations
.
βοΈ If there is no identifier, you may create one.
Step 2 β Relational Representation
We represent the database diagram using the relational language. It provides a textual representation of your database structure.
A relation is a set of attributes that are mapped to their type.
// Overview of the syntax
R1(attribute=>type, #key=>type, ...)
R2(#foreign_key=>R1.key, ...)
R3(#(r1_key=>R1.key, r2_key=>R2.key), ...)
// Example
Person(#id=>Integer, name => String, address => Address.id)
Address(#id=>Integer, city => String)
π‘ The representation is usually including notes (ex: xxx is unique...).
π See the list of rules to map a database diagram here.
Step 3 β Implementation
For SQL, this is how you do it:
- Each
relation
is mapped to a SQLtable
- Each
relation
attributes are its SQLtable
attributes - Add all constraints (either on attributes or on the table)
- Ex: keys (primary, foreign)
- Ex: number of records
- Ex: interval, range, null and not null
- ...
π» To-do π»
Stuff that I found, but never read/used yet.
- CRUD/ACID
- Data lake
- Data warehouse
- ETL tools
- duckdb