Term
What are characteristics of sequential files? |
|
Definition
- Tuples of a relation are sorted by their primary key
- Tuplues are then distributed among blocks in that order
- Common to leave free space in each block to allow for later insertions
|
|
|
Term
What are the tradeoffs to maintaining an index? |
|
Definition
- Maintaining one costs processor time
- When entries are added, index must be updated
- Index must be maintained to make good use of resources
- There is a tradeoff between
- Rapida access when retreiving data
- Speed of updating the database
|
|
|
Term
|
Definition
- Sequence of blocks holding only keys and pointers to records
- Entity for every record in data file
- Blocks of index are in same order as those in the data file
- Key-pointer pair much smaller than record
- Advantages
- Fewer blocks than data file, fewer disk accesses
- Keys are sorted, so we can use binary search
- Can keep in main memory if small neough (no disk accesses)
|
|
|
Term
|
Definition
- One key-pointer pair per block of data file
- Can only be used if data file is sorted by search key
- Uses less space than dense index
- Takes longer to find key than dense index
|
|
|
Term
What is a multi-level index? |
|
Definition
- Index file may cover many blocks
- May still need many disk acceses
- Use spare index over the first index
- Can't be a dense index (would use the same nuber of blocks as the index being indexed)
- Can create a third level index, but in general prefer B-trees
|
|
|
Term
What are some notes on pointers? |
|
Definition
- Block pointers (as used in sparse indexes) can be smaller than record pointers (used in dense indexes)
- Physical record pointers consist of a block pointer and an offset
- If file is contiguous, then we can omit pointers
- Compute offset from block size and key position
|
|
|
Term
What is the tradeoff of sparse vs dense indexing? |
|
Definition
- Sparse
- Less index space per recor can keep more of index memory
- Better for insertions
- Dense:
- Can tell if a record exists without accessing file
- Needed for secondary indexes
|
|
|
Term
Show an example of an insertion in a Sparse index where there's no space between the blocks, where it should be inserted |
|
Definition
|
|
Term
What are some characteristics of secondary indexes? |
|
Definition
- Unlike a primary index, does not determine placement of records in data file
- Location (order) of records may have been decided by a primary index or another field
- Secondary indexes are always dense
- Pointers are record pointers not block pointers
- May have higher levels of sparse indexes above the dense index
|
|
|
Term
Show the ways Dense index can approach duplicate keys |
|
Definition
|
|
Term
How can sparse indexes deal with duplicate keys? |
|
Definition
These would just point to the first instance of each value |
|
|
Term
How can secondary indexes deal with duplicate keys? |
|
Definition
There can be several solutions:
- Repeated entries
- Problems:
- excess disk space
- Search time is an overhead
- Drop repeated keys
- Problem with variable size records in index
- Chain records with same key (linked list)
- Problems:
- Need to ad fields to records, and need to follow a chain
- Indirection via buckets of pointers
- Advantages:
- If we have multiple secondary indexes on a relation, we can calculate conjunctions by taking intersections of buckets
- Don't need to examine data file
|
|
|
Term
What are the advantages and disadvantages of conventional indexes? |
|
Definition
- Advantages:
- Simple
- Index is sequential file and good for scans
- Disadvantages:
- Inserts expensive
- May lose sequentiality & balance
|
|
|