Term
What is flexing defined as? What 3 design modfications can you do with flexing? |
|
Definition
Flexing is the process of introducing controlled redundancy for a specific design purpose, in order to increase performance.
Flexing can be done by:
- by Table Elimination
- by Table Splitting
- by introducing derivable attributes
|
|
|
Term
How would you flex this scenario, using table elimination? What is the issue with doing this?
[image]
20% of employees work on a machine; about 95% of machines have an employee assigned to them. |
|
Definition
Flexed tables by elimination:
employee(emp#, emp_name...)
machine(machine#, machine_location, emp#)
This saves storage space and increases performance as only one join is needed when querying employee and machine. However, there will be come NULL values in the machine table - but only in 5% of cases. |
|
|
Term
How would you flex this table using table splitting?
What are the advantages and disadvanges of this?
Customer(cust#, cust_name, cust_addr, gender, age_group, marital_status, no_of_children, employment_status, account#, ...)
Note: One of the key transcations that must be done quickly is to print mailing labels. |
|
Definition
Can split like so:
CustomerAddress(cust#, name, address)
CustomerProfile(cust#, gender, age_group, marital_status, no_of_children, employment_status, account#, ...)
The query to select address data means that a lot less data must enter RAM, making printing labels more efficient. However, when all customer details are needed a join is required. If the address details are needed rarely apart for labels, this is acceptable. |
|
|
Term
How would you flex this scenario using derivable attributes? When are derivable attributes worth implementing?
Order(order#, order_date, cust#, ...)
Orderline(order#, item#, qty_ordered, ...)
Item(item#, item_desc, unit_price, ...)
The value of each orderline is derived by multiplying unit price by quantity ordered.
The total value of the order is derived by summing the total of each orderline.
|
|
Definition
We could put attributes with the totals pre-calculated in the order and orderline. This means that every read doesn't require the totals to be found by a series of joins and SUM() functions which are processing heavy.
However, it does mean that every time the order is changed, it the derived attributes will need to be recalculated. So it's only worth it if the data is read frequently but rarely modifed. |
|
|
Term
How many clauses can a SELECT statement have, and what is their syntax? |
|
Definition
SELECT [ALL|DISTINCT] <item>
FROM <table>
[WHERE <expression>]
[GROUP BY <column name>]
[HAVING <expression>]
[ORDER BY <column name>] |
|
|
Term
What aggregate functions are available in SQL? |
|
Definition
COUNT (number of values)
AVG (average value)
MIN (smallest value)
MAX (largest value)
SUM (total value) |
|
|
Term
Given these tables...
Borrower(bor_no, bor_name, bor_state, addr1, addr2, town, postcode)
Reservation(bor_no, ISBN, reserve_date)
Loan(access_no, bor_no, loan_date)
Find the number of product types that are stocked by each branch. |
|
Definition
SELECT branch_code, COUNT(product_code) AS
product_types
FROM stock
GROUP BY branch_code; |
|
|
Term
Given these tables...
Borrower(bor_no, bor_name, bor_state, addr1, addr2, town, postcode)
Reservation(bor_no, ISBN, reserve_date)
Loan(access_no, bor_no, loan_date)
Find the number of produt types in each branch where there is more than one product. |
|
Definition
SELECT branch_code, COUNT(product_code) AS Product_types
FROM stock
GROUP BY branch_code
HAVING COUNT(product_code) > 1; |
|
|
Term
Rewrite this query as a subquery:
SELECT bor_name
FROM borrower, reservation
WHERE borrower.bor_no = reservation.bor_no; |
|
Definition
SELECT bor_name
FROM borrower
WHERE bor_no IN
(SELECT bor_no
FROM reservation);
|
|
|
Term
Given the following table:
Book(isbn, title, now_price)
Find the books whose price is higher than the average book price. |
|
Definition
SELECT isbn, title, now_price
FROM book
WHERE now_price >
(SELECT AVG(now_price)
FROM book); |
|
|
Term
What does the EXISTS operator do in SQL? |
|
Definition
EXISTS can be used with correleated subqueries (where the inner query references the outer query). EXISTS tests whether the inner query has returned any rows, and returns either true or false, so it can be used with WHERE.
For example:
SELECT bor_name
FROM borrower b
WHERE EXISTS
(SELECT *
FROM reservation r
WHERE b.bor_no = r.bor_no) |
|
|
Term
How can you ensure entity integrity when creating a table? |
|
Definition
By ensuring you include a constraint on a CREATE TABLE query that creates a primary key and ensures the PK attribute is set to NOT NULL.
e.g. isbn CHAR(8) NOT NULL,
CONSTRAINT book_pk PRIMARY KEY (isbn) |
|
|
Term
How can you ensure referential integrity when creating a table? |
|
Definition
By ensuring that foreign keys reference primary keys that actually exist, or are set to NULL.
In SQL:
CONSTRIANT book_fk FOREIGN KEY (pub_code) REFERENCES PUBLISHER(pub_code) |
|
|
Term
What are the the advantages of using a DBMS compared to other methods? |
|
Definition
- Sharing of data between applications
- Control of redundancy
- Improved data integrity
- Better data accessiblity
- Representation of complex relationships within data
- Enforcing integrity constraints
- Improved security
- Allowance of concurrency
- Backup and Recovery Procedures
|
|
|
Term
What are the disadvantages of using a DBMS compared to other methods of storing data? |
|
Definition
- Size and complexity of usage
- High cost
- Additional hardware costs
- Requirement of skilled staff
- Cost of conversion
- Slower processing for some applications
|
|
|
Term
What is a 'view' in SQL?
What is it most commonly used for?
What is the syntax for a view? |
|
Definition
A view is a virtual table - its created from base tables as consists of a subset of them.
It is commonly used as a security mechanism, controlling user access to data in the database - showing insensitive columns whilst not including more sensitive ones such as credit card details.
The basic syntax is:
CREATE VIEW <view name> AS
SELECT <columns>
FROM <table>; |
|
|
Term
Define a 'transaction', in terms of a DBMS. |
|
Definition
It is a logical unit of work, regarding reading or updating the contents of a database. |
|
|
Term
What are the 4 properites of transactions? Briefly describe each one. |
|
Definition
- Atomicity: it must be executed in its entirety or not at all.
- Consistency: it must transform the database from one consistent state to another consistent state.
- Isolation: it must execute independenty of other transactions, so it's actions are not seen by other transactions until committed.
- Durability: the effects of a committed transaction are permamently recorded and not lost.
|
|
|
Term
What three SQL delimiters allow the user to define transaction boundaries? |
|
Definition
BEGIN TRANSACTION
COMMIT
ROLLBACK |
|
|
Term
Draw a diagram to show all the possibilities of a transaction, showing all the states and transitions. |
|
Definition
|
|
Term
What is the LOG and what does it do? |
|
Definition
Log files keep track of database transactions, and contain information on all changes to the database.
They hold the before image of the data (for the ROLLBACK) and the after image of the data (to COMMIT). |
|
|
Term
What does concurrency in a DBMS allow? |
|
Definition
It allows shared read and write access to a data store at the same time, by interleaving transactions. |
|
|
Term
What are the titles of the three concurrency problems? |
|
Definition
- The lost update problem
- The uncommited dependency problem
- The inconsistent analysis problem
|
|
|
Term
What is the lost update problem? |
|
Definition
Allowing two processes to update the same data simultaneously.
[image]
|
|
|
Term
What is the Uncommited Dependency problem? |
|
Definition
Allowing one transaction to update, then another transcation reading the other update, before the first transaction fails and rolls back it's update. The second transcation has incorrect data.
[image] |
|
|
Term
What is the Inconsistent Analysis problem? |
|
Definition
Where a transcation reads in multiple values, but before it's finished reading all relevant values, another transcation updates and commits chances to the relevant values.
[image] |
|
|
Term
How is concurrent access achieved without running into the three concurrency problems? |
|
Definition
Locking, as it enforces the serialization principle, which is "when two or more transactions execute in a parallel interleaved manner then their effects should be the same as if they executed in a purely serial manner". |
|
|
Term
What are the two types of transaction lock avaialble, and what do they entail? |
|
Definition
- Shared read lock (S-lock): shared reading but not writing
- Exclusive write lock (X-lock): exclusive access
These locks mean that a transaction must wait for a lock before is released before it can place its own lock (and therefore read/modify data). |
|
|
Term
What is deadlock in terms of concurrency and how is it solved? |
|
Definition
Deadlock arises when two or more transactions are in a simulaneous wait state, each waiting for locks held by each other to be released.
The DBMS can either use a timeout and roll back the transaction, or give priority to one transaction and roll back the other. |
|
|
Term
What is database recovery? |
|
Definition
The process of resoring the database to a correct state in the event of a failure. |
|
|
Term
What are the primary causes of a failure? |
|
Definition
- Concurrency control enforcement
- Error detected by a transaction (e.g. attempting to withdraw money from an account with insufficient funds)
- Software error
- System crash
- Media failure
|
|
|
Term
When is a transaction considered permament? |
|
Definition
When the buffers have been cleared and the changes are committed to secondary storage (typically a disk). It is called a checkpoint. |
|
|