Designing and Implementing Databases with
Microsoft SQL Server 2000 Enterprise Edition
This exam tests your ability to design and implement
database solutions by using Microsoft SQL Server 2000
Enterprise Edition. The expected environment has the
following characteristics:
- Heterogeneous databases
- SQL Server security integrated with Windows Authentication
- Client/server configurations of 50 to 5,000 or more users
- Web configurations that use Internet Information Services (IIS) or COM+
- Databases as large as 2 TB
- Multiple installations of SQL Server 2000
When you pass this exam, you achieve Microsoft Certified
Professional status. You also earn credit toward the
following certifications:
- Core credit toward Microsoft Certified Database Administrator on Microsoft SQL Server 2000 certification
- Elective credit toward Microsoft Certified Systems Engineer on Microsoft Windows 2000 certification
- Elective credit toward Microsoft Certified Systems Engineer on Microsoft Windows NT 4.0 certification
- Elective credit toward Microsoft Certified Solution Developer for Microsoft .NET certification
- Elective credit toward Microsoft Certified Application Developer for Microsoft .NET certification
HANDS-ON experience with SQL is extremely important for passing this exam. You should set up your own test lab and work through the SQL language thoroughly. You can use the sample databases that come with SQL Server for experimentation.
Key Concepts of Database Design
Normalization:
- a process of organizing data
- focuses on minimizing redundancy
- involves dividing a database into multiple tables and defining relationships between them
- isolates the data so that all additions, deletions, and modifications of a field can be made in one table and then propagated to the other relevant tables via the defined relationships
- makes databases more efficient to maintain and update
- complicates the database structure
Normal forms:
- 1st Normal Form (1NF)
  - Each field holds a single, atomic value; no repeating groups or columns.
- 2nd Normal Form (2NF)
  - Meets 1NF, and every non-key field depends on the entire primary key (no partial dependencies).
- 3rd Normal Form (3NF)
  - Meets 2NF, and no non-key field depends on another non-key field (no transitive dependencies).
- 4th Normal Form (4NF) and 5th Normal Form (5NF)
  - These levels are the "extreme ends" and are not tested.
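As a minimal sketch of normalization in Transact-SQL (all table and column names here are invented for illustration), a flat order list might be split into related tables like this:

```sql
-- Unnormalized: customer and product details repeat on every row.
-- CREATE TABLE OrdersFlat (OrderID int, CustomerName varchar(50),
--     CustomerCity varchar(50), ProductName varchar(50), Qty int)

-- Normalized: each fact is stored once and linked by keys.
CREATE TABLE Customers (
    CustomerID int IDENTITY PRIMARY KEY,
    CustomerName varchar(50) NOT NULL,
    CustomerCity varchar(50) NOT NULL
)

CREATE TABLE Products (
    ProductID int IDENTITY PRIMARY KEY,
    ProductName varchar(50) NOT NULL
)

CREATE TABLE Orders (
    OrderID int IDENTITY PRIMARY KEY,
    CustomerID int NOT NULL REFERENCES Customers(CustomerID)
)

CREATE TABLE OrderItems (
    OrderID int NOT NULL REFERENCES Orders(OrderID),
    ProductID int NOT NULL REFERENCES Products(ProductID),
    Qty int NOT NULL,
    PRIMARY KEY (OrderID, ProductID)
)
```

A change to a customer's city is now made in one row of Customers and is reflected everywhere through the defined relationships.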
OLAP:
- Online Analytical Processing
- Provides analysis of data stored in a database
- Allows users to analyze different dimensions of multidimensional data
- Too much normalization is counterproductive for OLAP.
- A decision-support system (DSS) can be described as a form of OLAP.
- Design considerations:
  - Heavy indexing - improves query performance.
  - Denormalization - use preaggregated or summarized data.
  - Schema - use a star or snowflake schema to organize the data.
Data Warehouse:
- A collection of data that supports management decision making.
- Presents a coherent picture of management scenarios at a point in time.
- Usually combines multiple different databases across an enterprise.
OLTP:
- A type of processing in which the DBMS responds immediately to user requests.
- Each user request is considered to be a transaction:
  - explicit transaction - you explicitly define both the start and the end of the transaction.
  - autocommit transaction - the default transaction management mode of SQL Server, where every Transact-SQL statement is committed or rolled back when it completes.
  - implicit transaction - generates a continuous chain of transactions.
- Needs efficient updates
- Normalization is a MUST
- 24x7 availability is critical
- Design considerations:
  - Data placement - place frequently accessed data together using filegroups and RAID.
  - Short transactions - minimize long-term locks and improve concurrency.
  - User interaction - avoid user interaction during transactions.
  - Online backup - schedule the backup process to occur during times of low activity.
  - High normalization of the database - reduce redundant information as much as possible.
  - Little or no historical or aggregated data - keeps tables as small as possible.
  - Careful use of indexes - keep indexes narrow. Ask the Index Tuning Wizard for help.
  - Optimum hardware configuration - provide the needed quick response times.
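The three transaction modes listed above can be illustrated in Transact-SQL (a minimal sketch; the Accounts table is invented):

```sql
-- Explicit transaction: start and end are stated explicitly.
BEGIN TRANSACTION
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2
COMMIT TRANSACTION

-- Autocommit (the default): each statement is its own transaction,
-- committed or rolled back as soon as it completes.
UPDATE Accounts SET Balance = Balance + 10 WHERE AccountID = 1

-- Implicit transactions: each statement opens a transaction that
-- stays open until COMMIT or ROLLBACK; the next statement then
-- starts a new one, producing a continuous chain.
SET IMPLICIT_TRANSACTIONS ON
UPDATE Accounts SET Balance = Balance - 10 WHERE AccountID = 1
COMMIT
SET IMPLICIT_TRANSACTIONS OFF
```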
SQL: Webopedia.com has a very good description of SQL:
"Abbreviation of structured query language, and pronounced
either see-kwell or as separate letters. SQL is a
standardized query language for requesting information from
a database. The original version called SEQUEL (structured
English query language) was designed by an IBM research
center in 1974 and 1975. SQL was first introduced as a
commercial database system in 1979 by Oracle Corporation.
Historically, SQL has been the favorite query language for
database management systems running on minicomputers and
mainframes. Increasingly, however, SQL is being supported by
PC database systems because it supports distributed
databases (databases that are spread out over several
computer systems). This enables several users on a
local-area network to access the same database
simultaneously.
Although
there are different dialects of SQL, it is nevertheless the
closest thing to a standard query language that currently
exists. In 1986, ANSI approved a rudimentary version of SQL
as the official standard, but most versions of SQL since
then have included many extensions to the ANSI standard. In
1991, ANSI updated the standard. The new standard is known
as SAG SQL."
http://www.webopedia.com/TERM/S/SQL.html
T-SQL:
According to the Microsoft Press Training Kit for Exam 70-229,
"Transact-SQL is a language that contains the commands used
to administer instances of SQL Server; to create and manage
all objects in an instance of SQL Server; and to insert,
retrieve, modify, and delete data in SQL Server tables.
Transact-SQL is an extension of the language defined in the
SQL standards published by ISO and ANSI."
A "good" database design:
- A table should have an identifier.
- A table should store only data for a single type of entity.
- A table should avoid nullable columns.
- A table should not have repeating values or columns.
Good maintenance design guidelines:
- Design the database to be as small as possible and to exclude redundant information via normalization.
- Design partitioned tables rather than a single table for a table that will contain a large number of rows.
Steps of database design planning:
- Gather information.
- Identify the objects.
- Model the objects.
- Identify the types of information for each object.
- Identify the relationships between objects.
Database Integrity
Entity Integrity:
- defines a row as a unique entity for a particular table
- enforces the integrity of the identifier column(s)
- enforces the integrity of the primary key
Domain Integrity:
- validity of entries for a given column
- can be enforced by:
  - restricting the type through data types
  - restricting the format through CHECK constraints and rules
  - restricting the range of possible values through FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules
Referential Integrity:
- ensures that key values are consistent across tables
- preserves the defined relationships between tables when records are entered or deleted; based on relationships between foreign keys and primary keys, or between foreign keys and unique keys, through FOREIGN KEY and CHECK constraints
Maintaining integrity:
- CHECK constraints - ensure that column values are valid.
- DEFAULT and NOT NULL constraints - avoid the complexities caused by missing column values.
- PRIMARY KEY and UNIQUE constraints - enforce row uniqueness.
- FOREIGN KEY constraints - ensure that rows in dependent tables always have a matching master record.
- IDENTITY columns - generate unique row identifiers.
- Timestamp columns - ensure concurrency checking between multiple-user updates.
- User-defined data types - ensure consistency of column definitions.
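Most of these mechanisms can appear in a single CREATE TABLE statement. A minimal sketch (the table and column names are invented, and the referenced Departments table is assumed to exist):

```sql
CREATE TABLE Employees (
    EmployeeID int IDENTITY PRIMARY KEY,       -- entity integrity
    DeptID int NOT NULL
        REFERENCES Departments(DeptID),        -- referential integrity
    Email varchar(100) NOT NULL UNIQUE,        -- row uniqueness
    Salary money NOT NULL DEFAULT 0
        CHECK (Salary >= 0),                   -- domain integrity
    RowVersion timestamp                       -- concurrency checking
)
```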
Partitioning
Hardware Partitioning:
- designs the database to take advantage of the available hardware architecture
- multiprocessors allow multiple threads of execution
- RAID allows data to be striped across multiple disk drives for faster data access
Horizontal Partitioning:
- segments a table into multiple tables
- each table contains the same number of columns
- each table contains fewer rows
- goal: queries reference as few tables as possible
Vertical Partitioning:
- segments a table into multiple tables
- each table has fewer columns
- two types of vertical partitioning: normalization and row splitting
- row splitting divides the original table vertically into tables with fewer columns
- goal: allows queries to scan less data for better query performance
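In SQL Server 2000, horizontal partitioning is commonly exposed through a partitioned view built with UNION ALL; the CHECK constraints let the optimizer skip member tables a query cannot touch. A sketch with invented names:

```sql
-- Member tables: same columns, CHECK constraints define the ranges.
CREATE TABLE Sales1999 (
    SaleID int NOT NULL,
    SaleYear int NOT NULL CHECK (SaleYear = 1999),
    Amount money NOT NULL,
    PRIMARY KEY (SaleID, SaleYear)
)

CREATE TABLE Sales2000 (
    SaleID int NOT NULL,
    SaleYear int NOT NULL CHECK (SaleYear = 2000),
    Amount money NOT NULL,
    PRIMARY KEY (SaleID, SaleYear)
)
GO

-- The view presents the member tables as one logical table.
CREATE VIEW AllSales AS
SELECT SaleID, SaleYear, Amount FROM Sales1999
UNION ALL
SELECT SaleID, SaleYear, Amount FROM Sales2000
```

A query such as `SELECT * FROM AllSales WHERE SaleYear = 2000` can then read only the Sales2000 member table.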
Limitations of raw partitions:
- Only one database file can be created on each raw partition.
- No support for standard file-system operations such as copy, move, and delete.
- No support for database file backup using the NT Backup utility.
- No support for automatic expansion.
- Numbered devices cannot be used.
- No support for file-system services such as bad-block replacement.
Database Architecture and Design
Table relationships:
- one-to-one relationship
  - a row in table A can have no more than one matching row in table B, and vice versa
- one-to-many relationship
  - a row in table A can have many matching rows in table B
  - a row in table B can have only one matching row in table A
- many-to-many relationship
  - a row in table A can have many matching rows in table B, and vice versa
Common types of column information:
- Raw data columns - store tangible information determined by a source external to the database.
- Categorical columns - group and store a limited selection of data.
- Identifier columns - identify each item stored in the table.
- Relational/referential columns - establish a link between information in one table and related information in another table.
Table size estimation - the steps:
1. Specify the number of rows present in the table.
2. Calculate the row size.
3. Calculate the number of rows per page (8,096 free bytes per page).
4. Calculate the number of pages required to store all the rows.
5. Calculate the amount of space required to store the data in the table.
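The arithmetic behind these steps can be sketched in Transact-SQL. Per the Books Online sizing formula, a page holds 8,096 free bytes and each row costs 2 extra bytes in the row-offset table; the row count and row size below are assumed figures, not taken from the exam:

```sql
DECLARE @NumRows int, @RowSize int
DECLARE @RowsPerPage int, @NumPages int

SET @NumRows = 1000000          -- step 1: estimated row count
SET @RowSize = 100              -- step 2: bytes per row

-- step 3: rows per 8-KB page (integer division rounds down;
-- the +2 covers the per-row entry in the row-offset table)
SET @RowsPerPage = 8096 / (@RowSize + 2)

-- step 4: pages needed, rounded up
SET @NumPages = CEILING(1.0 * @NumRows / @RowsPerPage)

-- step 5: total space in bytes (each page is 8,192 bytes on disk)
SELECT @NumPages * 8192 AS EstimatedBytes
```

With these assumed figures the estimate works out to 12,659 pages, roughly 99 MB.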
Estimating the amount of space required to store the data and any additional nonclustered indexes on a table that does not have a clustered index:
1. Calculate the space used to store data.
2. Calculate the space used to store each additional nonclustered index.
3. Sum the values calculated.
Estimating the amount of space required to store the data and any additional nonclustered indexes on a table that has a clustered index:
1. Calculate the space used to store data.
2. Calculate the space used to store the clustered index.
3. Calculate the space used to store each additional nonclustered index.
4. Sum the values calculated.
What to do with views?
- Restrict a user to specific rows in a table
- Restrict a user to specific columns
- Join columns from multiple tables so that they look like a single table
- Aggregate information instead of supplying details
Why should we use views?
- To focus on specific data
- To simplify data manipulation
- To customize data
- To export and import data
- To combine partitioned data
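A brief sketch of the first few uses (table and view names are invented):

```sql
-- Restrict rows and columns, and join two tables into one shape.
CREATE VIEW LondonCustomerOrders AS
SELECT c.CustomerName, o.OrderID, o.OrderDate
FROM Customers c
JOIN Orders o ON o.CustomerID = c.CustomerID
WHERE c.City = 'London'
GO

-- Aggregate instead of supplying details.
CREATE VIEW OrdersPerCustomer AS
SELECT CustomerID, COUNT(*) AS OrderCount
FROM Orders
GROUP BY CustomerID
```

Users can be granted SELECT on the views without being granted any permission on the underlying tables.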
Restrictions of views:
- Can create views only in the current database.
- Cannot associate rules or DEFAULT definitions with views.
- Cannot associate AFTER triggers with views.
- Cannot define full-text index definitions for views.
- Cannot create temporary views.
- Cannot create views on temporary tables.
- Cannot issue full-text queries against a view.
- View names must follow the rules for identifiers.
- View names must be unique for each user.
- Views or tables that participate in a view created with the SCHEMABINDING clause cannot be dropped unless the view is dropped.
- Queries that define the view cannot include the ORDER BY, COMPUTE, or COMPUTE BY clauses or the INTO keyword.
Conditions for modifying data through a view:
- The view must contain at least one table in the FROM clause of the view definition.
- The view cannot be based solely on an expression.
- The view must not use aggregate functions such as AVG, COUNT, SUM, MIN, MAX, GROUPING, STDEV, STDEVP, VAR, or VARP in the select list.
- The view must not use GROUP BY, UNION, DISTINCT, or TOP clauses in the select list.
- The view must not use derived columns in the select list.
Checkpoints are created:
- When a CHECKPOINT statement is executed
- When an ALTER DATABASE statement changes a database option
- When an instance of SQL Server is stopped
- Periodically, when an instance of SQL Server generates automatic checkpoints in each database
Index Tuning
Indexes are useful when a query:
- searches for an exact match
- searches for a match in a range
- searches for rows in a table T1 that match, based on a join predicate, a row in another table T2
- produces sorted query output without an explicit sort operation
- scans rows in a sorted order to permit an order-based operation
- scans all rows in a table with better performance than a table scan
- searches for duplicates of new search key values in insert and update operations
- searches for matching rows between two tables for which a FOREIGN KEY constraint is defined
Recommendations for creating indexes:
- Write queries that update as many rows as possible in a single statement.
- Use the Index Tuning Wizard to analyze queries.
- Use integer keys for clustered indexes.
- Create nonclustered indexes on all columns frequently used in queries.
- Examine and ensure column uniqueness.
- Examine data distribution in indexed columns.
- Avoid long-running queries by indexing a column with more unique values.
Instructing SQL Server not to maintain statistics for a given column or index:
- Use the sp_autostats system stored procedure.
- Use the STATISTICS_NORECOMPUTE clause of the CREATE INDEX statement.
- Use the NORECOMPUTE clause of the UPDATE STATISTICS statement.
- Use the NORECOMPUTE clause of the CREATE STATISTICS statement.
- Set the AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS database options to OFF via the ALTER DATABASE statement.
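These options look like this in Transact-SQL (the table, index, and database names are invented):

```sql
-- Disable automatic recompute for one index's statistics.
EXEC sp_autostats 'Orders', 'OFF', 'IX_Orders_CustomerID'

-- Create an index whose statistics are never auto-recomputed.
CREATE INDEX IX_Orders_OrderDate
ON Orders(OrderDate)
WITH STATISTICS_NORECOMPUTE

-- Refresh statistics once, then stop maintaining them.
UPDATE STATISTICS Orders IX_Orders_OrderDate WITH NORECOMPUTE

-- Turn the automatic statistics options off database-wide.
ALTER DATABASE Sales SET AUTO_CREATE_STATISTICS OFF
ALTER DATABASE Sales SET AUTO_UPDATE_STATISTICS OFF
```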
Other points to remember:
- A unique index can be defined for either a clustered or a nonclustered index.
- A composite index key is NOT part of a covering index.
- A covering index could contain a single column.
- An index is considered covering when the result set of a query is provided entirely by the index.
- A global fill factor affects an index when it is created without the FILLFACTOR clause.
- A fill factor is set when the index is created, and the index fills from that point.
- When changing the global default fill factor from zero to another value, existing indexes are unaffected.
- To avoid or reduce the frequency of page splits, you can re-create the index with the original fill factor by using CREATE INDEX with the DROP_EXISTING and FILLFACTOR clauses.
- You can use DBCC DBREINDEX to rebuild the index.
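The last two points in Transact-SQL (the table and index names are invented):

```sql
-- Re-create an existing index with a fresh fill factor,
-- leaving 20% free space per page to absorb future inserts.
CREATE INDEX IX_Orders_CustomerID
ON Orders(CustomerID)
WITH DROP_EXISTING, FILLFACTOR = 80

-- Or rebuild all of a table's indexes at a given fill factor.
DBCC DBREINDEX ('Orders', '', 80)
```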
Index Tuning Wizard can:
- recommend the best mix of indexes for a database given a workload
- analyze the effects of the proposed database changes
- recommend ways to tune the database for a small set of problem queries
- consume significant CPU and memory resources during analysis
Index Tuning Wizard canNOT:
- recommend indexes on tables referenced by cross-database queries that do not exist in the currently selected database
- recommend indexes on system tables
- recommend indexes on PRIMARY KEY constraints and unique indexes
- select or create indexes and statistics in databases on SQL Server version 6.5 or earlier
- give an error when saving an SQL script to a disk with insufficient available space
- make index suggestions if there is not enough data in the tables being sampled
- make index suggestions if the suggested indexes do not offer enough projected improvement in query performance over existing indexes
- consider more than 32,767 tunable queries in a workload
Performance Tuning in other areas
Transaction log performance recommendations:
- Create the transaction log on a physically separate disk or RAID device.
- Set the original size of the transaction log file to a reasonable value - prevents the file from automatically expanding.
- Set the file growth increment percentage to a reasonable size - prevents the file from growing by small increments.
- Manually shrink the transaction log files periodically.
Optimizing tempdb:
- Allow tempdb to automatically expand as needed.
- Set the original size of tempdb to a reasonable value - avoids frequent automatic expansion.
- Set the file growth increment percentage to a reasonable size - avoids tempdb growing by small increments.
- Place tempdb on a fast I/O subsystem.
Causes of poor query performance:
- Slow network communication
- Inadequate memory in the server computer
- Lack of useful statistics
- Out-of-date statistics
- Lack of useful indexes
- Lack of useful data striping
Determining the performance of a database query:
- Use the SET statement to enable the SHOWPLAN, STATISTICS IO, STATISTICS TIME, and STATISTICS PROFILE options.
  - SHOWPLAN - describes the method chosen by the SQL Server query optimizer to retrieve data.
  - STATISTICS IO - reports the number of scans, logical reads (pages accessed in cache), and physical reads (number of times the disk was accessed) for each table referenced in the SQL statement.
  - STATISTICS TIME - shows the amount of time required to parse, compile, and execute a query.
  - STATISTICS PROFILE - displays a result set after each executed query, representing a profile of the execution of the query.
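For example (the query itself is illustrative; SQL Server 2000 exposes the showplan output through SET SHOWPLAN_TEXT):

```sql
-- Show the plan as text instead of executing the statement.
SET SHOWPLAN_TEXT ON
GO
SELECT * FROM Orders WHERE CustomerID = 42
GO
SET SHOWPLAN_TEXT OFF
GO

-- Execute the statement and report I/O and timing figures.
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT * FROM Orders WHERE CustomerID = 42
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```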
Recommendations for tuning queries:
- Add more server memory.
- Run SQL Server on a computer with multiple processors.
- Rewrite the query.
- Move any processing loop inside the query where possible.
- Create a single, more complex query using a temporary table to reduce network round trips.
- Do not use multiple aliases for a single table in the same query.
- Use query hints only if necessary.
- Use the query governor configuration option and setting to prevent long-running queries from executing.
Reducing transaction duration:
- Commit transactional changes as soon as possible within the requirements of the application.
- Use statement batches to send multiple Transact-SQL statements from the client to SQL Server at one time, reducing the number of network round trips.
- Use parameter arrays for repeated operations.
Common causes of blocking:
- Submitting queries with long execution times.
- Canceling queries that were not committed or rolled back.
- Applications not processing all results to completion.
- Distributed client/server deadlocks.
Guidelines for avoiding blocking:
- Do not use applications that allow users to fill in edit boxes that generate long-running queries.
- Do not use applications that allow user input within a transaction.
- Allow for query cancellation.
- Use a query or lock timeout to prevent runaway queries and distributed deadlocks.
- Immediately fetch all result rows to completion.
- Keep transactions as short as possible.
- Explicitly control connection management.
- Stress-test the application before production.
Optimizing backup and restore operations:
- Use multiple backup devices for parallel backup/restore.
- Use a combination of full database, differential database, and transaction log backups to minimize recovery time.
- Use minimally logged bulk copy operations.
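A sketch of the first two points in Transact-SQL (the database name and file paths are invented):

```sql
-- Full backup striped across two devices for parallel I/O.
BACKUP DATABASE Sales
TO DISK = 'D:\Backup\Sales_1.bak', DISK = 'E:\Backup\Sales_2.bak'

-- Differential backup: only pages changed since the last full backup.
BACKUP DATABASE Sales
TO DISK = 'D:\Backup\Sales_Diff.bak'
WITH DIFFERENTIAL

-- Transaction log backup, enabling point-in-time recovery.
BACKUP LOG Sales
TO DISK = 'D:\Backup\Sales_Log.trn'
```

At restore time, you apply the last full backup, then the last differential, then the subsequent log backups, which is what keeps recovery time down.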
Server Tuning:
- min server memory - ensures that SQL Server starts with at least the minimum amount of allocated memory and does not release memory below this value.
- max server memory - specifies the maximum amount of memory SQL Server can allocate when it starts and while it runs.
  - Do not set the min server memory and max server memory server configuration options to the same value.
  - Dynamic memory allocation is always preferred.
- max worker threads - the default setting of 255 may not be appropriate; the right setting depends on the number of concurrent users.
- index create memory - increasing this number can improve the performance of index creation.
- min memory per query - a lower number means an index creation job will start even if all the requested memory is not available.
- recovery interval - controls when SQL Server issues a checkpoint in each database. Changing this parameter so that the checkpoint process occurs less often can improve overall performance when disk utilization is high.
- Maximizing throughput - a MUST when the Full-Text Search feature is installed.
- Server tasking - when you need to support a connection from a client running on the same computer as the server, improve processing time by setting the server to run foreground and background applications with equal priority.
- Virtual memory - the recommended setting is 1.5 times the physical memory installed in the computer. If you have also installed the Full-Text Search feature and plan to run the Microsoft Search service, the virtual memory size should be at least 3 times the physical memory installed in the computer.
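Most of the server-level options above are set through sp_configure; the values below are placeholders for illustration, not recommendations:

```sql
-- Advanced options must be visible before the memory settings change.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE

EXEC sp_configure 'min server memory', 128   -- MB
EXEC sp_configure 'max server memory', 512   -- MB
EXEC sp_configure 'max worker threads', 255
EXEC sp_configure 'recovery interval', 5     -- minutes
RECONFIGURE
```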
SQL Server-specific security features:
- Pass-through authentication / Windows Authentication
- SQL Server Authentication
- Object and statement permissions
- Security accounts
- Database object encryption
- Internal auditing
SQL Profiler
Activities:
- Stepping through problematic queries to find the cause of the problem.
- Diagnosing slow-running queries.
- Capturing the series of problematic SQL statements via a saved trace.
- Monitoring the performance of SQL Server for workload tuning.
Categories of events that can be audited:
- End user activity
- DBA activity
- Security events
- Utility events
- Server events
- Audit events
Aspects of SQL Server commonly audited through SQL Profiler:
- Date and time of the event
- User who caused the event to occur
- Type of event
- Success or failure of the event
- The origin of the request
- The name of the object accessed
- Text of the SQL statement
- Who is changing someone else's password
Reference Books
MCSE SQL Server 2000 Database Design Exam Cram (Exam 70-229), by Sean Chase and Richard McMahon; paperback
MCSE Training Kit: Microsoft SQL Server 2000 Database Design and Implementation (Exam 70-229); hardcover
MCSE: SQL Server 2000 Design Study Guide (Exam 70-229), by Marc Israel et al.; hardcover