Thursday, 29 November 2012

INDEXING

What is an Index?

index

A structure containing a set of entries, each consisting of a key field and a reference field, which is used to locate records in a data file.

key field

The part of an index which contains keys.

reference field

The part of an index which contains information to locate records.

  • An index imposes order on a file without rearranging the file.
  • Indexing works by indirection.

A Simple Index for Entry-Sequenced Files

simple index

An index in which the entries are a key ordered linear list.

  • Simple indexing can be useful when the entire index can be held in memory.
  • Changes (additions and deletions) require both the index and the data file to be changed.
  • Updates affect the index if the key field is changed, or if the record is moved.
  • An update which moves a record can be handled as a deletion followed by an addition.

Object Oriented Support for Indexed, Entry Sequenced Files

T E R M S
entry-sequenced file
A file in which the record order is determined by the order in which they are entered.

  • The physical order of records in the file may not be the same as order of entry, because of record deletions and space reuse.
  • The index should be read into memory when the data file is opened.

  Indexes That are too Large to Hold in Memory

  • Searching of a simple index on disk takes too much time.
  • Maintaining a simple index on disk in sorted order takes too much time.
  • Tree structured indexes such as B-trees are a scalable alternative to simple indexes.
  • Hashed organization is an alternative to indexing when only a primary index is needed.

 Indexing to Provide Access by Multiple Keys

secondary key

A search key other than the primary key.

secondary index

An index built on a secondary key.

  • Secondary indexes can be built on any field of the data file, or on combinations of fields.
  • Secondary indexes will typically have multiple locations for a single key.
  • Changes to the data may now affect multiple indexes.
  • The reference field of a secondary index can be a direct reference to the location of the entry in the data file.
  • The reference field of a secondary index can also be an indirect reference to the location of the entry in the data file, through the primary key.
  • Indirect secondary key references simplify updating of the file set.
  • Indirect secondary key references increase access time.

   Retrieval Using Combinations of Secondary Keys

  • The search for records by multiple keys can be done on multiple index, with the combination of index entries defining the records matching the key combination.
  • If two keys are to be combined, a list of entries from each key index is retrieved.
  • For an "or" combination of keys, the lists are merged.
  • I.e., any entry found in either list matches the search.
  • For an "and" combination of keys, the lists are matched.
  • I.e., only entries found in both lists match the search.

   Improving the Secondary Index Structure: Inverted Lists

     inverted list

  An index in which the reference field is the head pointer of a linked      list of reference items.

  Selective Indexes

   selective index

 An index which contains keys for only part of the records in a data file.

 Binding

      binding

     The association of a symbol with a value.

      locality

       A condition in which items accessed temporally close are also   physically close.

2 comments: