2018年3月1日
12:59
写在前面
本笔记仅仅是本人在上课时的一些随手记录,并不完整也不完全正确。
如有错误,请在评论中或直接联系我指正,谢谢!
Database management system (DBMS) | Contains information about a particular enterprise |
---|---|
Collection of interrelated data | |
Set of programs to access the data(增删改查) | |
An environment that is both convenient and efficient to use |
教材:University Database Example 第六版
Drawbacks of using file systems to store data
- Data redundancy and inconsistency一致性
- Multiple file formats, duplications of information in different files
- Difficulty in accessing data
- Need to write a new program to carry out each new task
- Data isolation - multiple files and formats
- Integrity problems 数据完整性
- Integrity constraints (e.g., grade > 0) become “buried” in program code
rather than being stated explicitly 数据合法性等问题 - Hard to add new constraints or change existing ones
- Integrity constraints (e.g., grade > 0) become “buried” in program code
- Atomicity of update 修改原子性
- Failures may leave database in an inconsistent state with partial
updates carried out - e.g.: Transfer of funds from one account to another should either
complete or not happen at all - Rollback 回滚
- Failures may leave database in an inconsistent state with partial
- Concurrent access by multiple users 并发访问
- Concurrent access needed for performance
- Uncontrolled concurrent accesses can lead to inconsistencies
- 锁机制
- Security problems
- Hard to provide user access to some, but not all, data
View of Data 数据的视图
A major purpose of a database system is to provide uses with an abstract view
of a data. The database system hides certain details of how the data are stored
and maintained. P6
Physical level
The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail.
Logical level
The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data.
View level
The highest level of abstraction describes only part of the entire database.
(Application programs)
Instances and Schemas
Similar to types and variables in programming languages
Schema
模式,指的是数据库的结构(表格式)
- Physical schema: database design at the physical level
- Logical schema: logical level
Instance
实例
给定schema,在当前数据库中的数据
The actual content of the database at a particular point in time - Analogous to the value of a variable
Physical Data Independence
物理数据独立性
The ability to modify the physical schema without changing the logical schema - Applications depend on the logical schema
- In general, the interfaces between the various levels and components should
be well defined so that changes in some parts do not seriously influence
others
Data Models
A way to describe the design of a database at the physical, logical and view
level.
- A collection of tools for describing
- Data
- Data relationships
- Data semantics
- Data constraints
- Relational model关系模型
- Entity-Relationship data model (mainly for database design)
(E-R)实体联系模型 - Object-based data models (Object-oriented and Object-relational)
- Semi-structured data model (XML) 半结构化数据模型 json *自学
- Other older models
- Network model
- Hierarchical model (银行)
NoSQL 数据库数据模型
- Key-Value
- Document-oriented
- Graph
- Column-family 列族
Database Language
- DDL data definition language: specify the database schema
- DML data manipulation language: express database queries and update
- They are not two separate languages, instead they are parts of a single
database language, such as SQL
Relational Model
Example of tabular data in the relational model
关系表
列:属性 首行决定schema 下面即为instance
DML
增删改查
两种类
- Procedural DMLs require a user to specify what data are needed and how to
get those data - Declarative DMLs (also referred to as nonprocedural DMLs) require a user to
specify what data are needed without specifying how to get those data
Query language
SQL is the best.
e.g.1
2
3select instructor.name
from instructor
where instructor.dept name = ’History’;
DDL
- Data storage and definition language define the implementation details of
the database schemas(hidden from the users) - Must satisfy certain consistency constraints
- Domain constraints
- Referential integrity
- Assertions
- Authorization
SQL provides a rich DDL that allows one to define tables, integrity constraints,
assertions, etc.
e.g.:1
2
3
4create table department
(dept_name char (20),
building char (15),
budget numeric (12,2));
Application programs generally access databases through one of
- Embedded/ language extension
Database design
The process of designing the general structure of the database
- Logical Design - Deciding on the database schema
- Business decision
- CS decision
- Physical Design - Deciding on the physical layout of the database
Design approaches
Normalization theory C8
- Formalize what designs are bad, and test for them
E-R Model C7 (*Test)
The Entity-Relationship Model
Models an enterprise as a collection of entities and relationships
- Described by a set of attributes
Represented diagrammatically by an entity-relationship diagram - Entity sets are represented by a rectangular box with the entity set name in
the header and the attributes listed below it. - Relationship sets are represented by a diamond connecting a pair of related
entity sets. The name of the relationship is placed inside the diamond.
Storage Management (Physical Level)
The storage manager is the component of a database system that provides the
interface between the low-level data stored in the database and the application
programs and queries submitted to the system.
Query Processing
- Parsing and translation
- Optimization
- Evaluation
Transaction: An Execution of a DB Program
- A transaction is a collection of operations that performs a single logical
function in a database application- An atomic sequence of database actions (R/W)
- Each transaction, executed completely, must leave the DB in a consistent
state if DB is consistent
- Recovery manager, failure recovery
- Concurrency-control manager