数据库概念 序言/Introduction 第一课笔记

2018年3月1日
12:59

写在前面

本笔记仅仅是本人在上课时的一些随手记录,并不完整也不完全正确。
如有错误,请在评论中或直接联系我指正,谢谢!

原始文件下载:(mht)(pdf)

Database management system (DBMS) Contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data(增删改查)
An environment that is both convenient and efficient to use

教材:University Database Example 第六版

Drawbacks of using file systems to store data

  • Data redundancy and inconsistency一致性
    • Multiple file formats, duplications of information in different files
  • Difficulty in accessing data
    • Need to write a new program to carry out each new task
  • Data isolation - multiple files and formats
  • Integrity problems 数据完整性
    • Integrity constraints (e.g., grade > 0) become “buried” in program code
      rather than being stated explicitly 数据合法性等问题
    • Hard to add new constraints or change existing ones
  • Atomicity of update 修改原子性
    • Failures may leave database in an inconsistent state with partial
      updates carried out
    • e.g.: Transfer of funds from one account to another should either
      complete or not happen at all
    • Rollback 回滚
  • Concurrent access by multiple users 并发访问
    • Concurrent access needed for performance
    • Uncontrolled concurrent accesses can lead to inconsistencies
    • 锁机制
  • Security problems
    • Hard to provide user access to some, but not all, data

View of Data 数据的视图

A major purpose of a database system is to provide uses with an abstract view
of a data. The database system hides certain details of how the data are stored
and maintained. P6
view I Figure 1.1 view level view 2 logical I-ewe I physical levee I view n The three levels Of data abstraction.

Physical level

The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail.

Logical level

The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data.

View level

The highest level of abstraction describes only part of the entire database.
(Application programs)

Instances and Schemas

Similar to types and variables in programming languages

Schema

模式,指的是数据库的结构(表格式)

  • Physical schema: database design at the physical level
  • Logical schema: logical level

    Instance

    实例
    给定schema,在当前数据库中的数据
    The actual content of the database at a particular point in time
  • Analogous to the value of a variable

    Physical Data Independence

    物理数据独立性
    The ability to modify the physical schema without changing the logical schema
  • Applications depend on the logical schema
  • In general, the interfaces between the various levels and components should
    be well defined so that changes in some parts do not seriously influence
    others

Data Models

A way to describe the design of a database at the physical, logical and view
level.

  • A collection of tools for describing
    • Data
    • Data relationships
    • Data semantics
    • Data constraints
  • Relational model关系模型
  • Entity-Relationship data model (mainly for database design)
    (E-R)实体联系模型
  • Object-based data models (Object-oriented and Object-relational)
  • Semi-structured data model (XML) 半结构化数据模型 json *自学
  • Other older models
    • Network model
    • Hierarchical model (银行)

NoSQL 数据库数据模型

  • Key-Value
  • Document-oriented
  • Graph
  • Column-family 列族

Database Language

  • DDL data definition language: specify the database schema
  • DML data manipulation language: express database queries and update
  • They are not two separate languages, instead they are parts of a single
    database language, such as SQL

Relational Model

Example of tabular data in the relational model
22222 12121 32343 45565 98345 76766 10101 58583 83821 15151 33456 76543 Einstein wu El Said Katz Kim Crick Sri n i vasan Califieri Brandt Mozart Gold Sin dep f Physics Finance History Comp. Sci. Elec. Eng. Biology Comp. Sci. History Comp. Sci. Music Physics Finance salary 75000 80000 72000 65000 62000 92000 87000 (a) The instructor table
关系表
列:属性 首行决定schema 下面即为instance

DML

增删改查
两种类

  • Procedural DMLs require a user to specify what data are needed and how to
    get those data
  • Declarative DMLs (also referred to as nonprocedural DMLs) require a user to
    specify what data are needed without specifying how to get those data
    Query language
    SQL is the best.

e.g.

1
2
3
select instructor.name
from instructor
where instructor.dept name = ’History’;

DDL

  • Data storage and definition language define the implementation details of
    the database schemas(hidden from the users)
  • Must satisfy certain consistency constraints
    • Domain constraints
    • Referential integrity
    • Assertions
    • Authorization

SQL provides a rich DDL that allows one to define tables, integrity constraints,
assertions, etc.
e.g.:

1
2
3
4
create table department
(dept_name char (20),
building char (15),
budget numeric (12,2));

Application programs generally access databases through one of

  • Embedded/ language extension

Database design

The process of designing the general structure of the database

  • Logical Design - Deciding on the database schema
    • Business decision
    • CS decision
  • Physical Design - Deciding on the physical layout of the database

Design approaches

Normalization theory C8

  • Formalize what designs are bad, and test for them
    E-R Model C7 (*Test)

The Entity-Relationship Model

Models an enterprise as a collection of entities and relationships

  • Described by a set of attributes
    Represented diagrammatically by an entity-relationship diagram
    instructor ID salary department dept name building Figure 1.3 A sample E -R diagram.
  • Entity sets are represented by a rectangular box with the entity set name in
    the header and the attributes listed below it.
  • Relationship sets are represented by a diamond connecting a pair of related
    entity sets. The name of the relationship is placed inside the diamond.

Storage Management (Physical Level)

The storage manager is the component of a database system that provides the
interface between the low-level data stored in the database and the application
programs and queries submitted to the system.

Query Processing

  1. Parsing and translation
  2. Optimization
  3. Evaluation

Transaction: An Execution of a DB Program

  • A transaction is a collection of operations that performs a single logical
    function in a database application
    • An atomic sequence of database actions (R/W)
    • Each transaction, executed completely, must leave the DB in a consistent
      state if DB is consistent
  • Recovery manager, failure recovery
  • Concurrency-control manager