Reasons for not keeping several search indices include:
In a generalization–specialization hierarchy, it must be possible to decide which entities are members of which lower-level entity sets. In a condition-defined design constraint, membership in the lower-level entity sets is evaluated on the basis of whether or not an entity satisfies an explicit condition or predicate. User-defined lower-level entity sets are not constrained by a membership condition; rather, entities are assigned to a given entity set by the database user.
Condition-defined constraints alone can be automatically handled by the system. Whenever a tuple is inserted into the database, its membership in the various lower-level entity sets can be decided automatically by evaluating the respective membership predicates. Similarly, when a tuple is updated, its membership in the various entity sets can be re-evaluated automatically.
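The automatic evaluation just described can be sketched in a few lines of Python. The entity-set names, attribute names, and predicates below are invented purely for illustration:

```python
# Each condition-defined lower-level entity set is paired with its
# membership predicate; on insert or update, the system re-evaluates
# every predicate against the tuple. (All names here are hypothetical.)
lower_level_sets = {
    "savings_account": lambda acct: acct["type"] == "savings",
    "checking_account": lambda acct: acct["type"] == "checking",
    "premium_account": lambda acct: acct["balance"] >= 100_000,
}

def classify(entity):
    """Return the lower-level entity sets whose predicate the entity satisfies."""
    return {name for name, pred in lower_level_sets.items() if pred(entity)}

# Membership is decided entirely by the predicates, never by the user:
memberships = classify({"type": "savings", "balance": 150_000})
```

An update is handled the same way: after the tuple's attributes change, `classify` is simply run again.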
A general-purpose database manager (DBM) has five responsibilities:
Increasing contention for shared resources prevents linear scale-up with increasing parallelism. In a shared memory system, contention for memory (which implies bus contention) will result in falling scale-up with increasing parallelism. In a shared disk system, it is contention for disk and bus access which affects scale-up. In a shared-nothing system, inter-process communication overheads will be the main impeding factor. Since there is no shared memory, acquiring locks, and other activities requiring message passing between processes will take more time with increased parallelism.
It is preferable to use a dense index instead of a sparse index when the file is not sorted on the indexed field (such as when the index is a secondary index) or when the index file is small compared to the size of memory.
4NF is more desirable than BCNF because it reduces the repetition of information. If we consider a BCNF schema not in 4NF (see Exercise 7.28), we observe that decomposition into 4NF does not lose information provided that a lossless-join decomposition is used, yet redundancy is reduced.
A physical OID needs to have a unique identifier in addition to a pointer to a physical storage location. This is required to prevent dereferences of dangling pointers.
If an object gets forwarded multiple times, retrieval slows down, because accessing the object requires following the chain of locations through which it has been successively forwarded to its current location. These multiple accesses can be avoided by always keeping the latest address of the object in its oldest location: while forwarding, check whether the object has already been forwarded, and if so, update the forwarding address at the oldest location. Thus, at most two accesses will be required.
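The scheme above can be sketched as follows; the storage map and function names are invented for illustration, not taken from any particular system:

```python
# Sketch of physical-OID forwarding where the oldest location always
# holds the latest address, so a lookup needs at most two accesses.
store = {}  # location -> ("object", payload) or ("forward", new_location)

def put(loc, payload):
    store[loc] = ("object", payload)

def move(oid_loc, new_loc):
    """Move the object whose OID points at oid_loc, updating the
    forwarding entry at the oldest location instead of chaining."""
    kind, val = store[oid_loc]
    if kind == "object":
        store[new_loc] = ("object", val)
    else:  # already forwarded once: val is the current location
        store[new_loc] = store[val]
        del store[val]
    store[oid_loc] = ("forward", new_loc)  # oldest location updated in place

def get(oid_loc):
    """At most two accesses: the original location, then one forward."""
    kind, val = store[oid_loc]
    if kind == "forward":
        kind, val = store[val]
    return val

put("A", "tuple-1")
move("A", "B")   # first move: A forwards to B
move("A", "C")   # second move: A now forwards directly to C, not via B
```

After the second move, `get("A")` still needs only two accesses, because the entry at `A` was rewritten rather than left pointing at `B`.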
Usually, a well-designed view and security mechanism can avoid conflicts between ease of access and security. However, as the following example shows, the two purposes do conflict if the mechanisms are not designed carefully.
Suppose we have a database of employee data and a user whose view involves employee data for employees earning less than $10,000. If this user inserts employee Jones, whose salary is $9,000, but accidentally enters $90,000, several existing database systems will accept this update as a valid update through a view. However, the security mechanism will then deny the user access to delete this erroneous tuple.
When a transaction explicitly locks a node in shared or exclusive mode, it implicitly locks all the descendants of that node in the same mode. The transaction need not explicitly lock the descendant nodes. There is no difference in the functionality of these locks; the only difference is in the way they are acquired and their presence tested.
The ACID properties, and the need for each of them, are:
Let tgrid be a two-dimensional integer array of size n × m.
In a total design constraint, each higher-level entity must belong to a lower-level entity set. The same need not be true in a partial design constraint. For instance, some employees may belong to no work-team.
Deadlock avoidance is preferable if the consequences of abort are serious (as in interactive transactions), and if there is high contention and a resulting high probability of deadlock.
Open hashing may place keys with the same hash function value in different buckets. Closed hashing always places such keys together in the same bucket. Thus, in closed hashing, different buckets can be of different sizes, though the implementation may link together fixed-size buckets using overflow chains. Deletion is difficult with open hashing, as all the buckets may have to be inspected before we can ascertain that a key value has been deleted, whereas in closed hashing only the bucket whose address is obtained by hashing the key value need be inspected. Deletions are more common in databases, and hence closed hashing is more appropriate for them. For a small, static set of data, lookups may be more efficient using open hashing. The symbol table of a compiler would be a good example.
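A toy contrast of the two insertion behaviors, using the terminology of this answer (the table size and keys are arbitrary):

```python
SIZE = 8

def open_insert(table, key):
    """Open hashing (linear probing): if the home bucket is occupied,
    probe forward, so equal-hash keys can land in different buckets."""
    i = key % SIZE
    while table[i] is not None:
        i = (i + 1) % SIZE
    table[i] = key
    return i

def closed_insert(buckets, key):
    """Closed hashing (overflow chaining): equal-hash keys always share
    the same bucket, growing its chain."""
    b = key % SIZE
    buckets[b].append(key)
    return b

open_table = [None] * SIZE
chained = [[] for _ in range(SIZE)]
# Keys 3 and 11 collide (both hash to bucket 3 mod 8):
slots = [open_insert(open_table, k) for k in (3, 11)]
cells = [closed_insert(chained, k) for k in (3, 11)]
```

With open hashing the colliding keys end up in slots 3 and 4, which is why a deletion may force a scan of several buckets; with closed hashing both stay in bucket 3's chain.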
A schedule in which all the instructions belonging to a single transaction appear together is called a serial schedule. A serializable schedule has the weaker restriction that it must be equivalent to some serial schedule. There are two definitions of schedule equivalence: conflict equivalence and view equivalence. Both are described in the chapter.
With the central server, each site does not have to remember which site to contact when a particular data item is to be requested. The central server alone needs to remember this, so data items can be moved around easily, depending on which sites access which items most frequently. Other housekeeping tasks are also centralized rather than distributed, making the system easier to develop and maintain. Of course, there is the disadvantage of a total shutdown in case the server becomes unavailable. Even when it is running, it may become a bottleneck, because every request has to be routed via it.
Your answer will be based on the computers and storage media that you use. Typical examples would be hard disks, floppy disks, and CD-ROM drives.
Dangling tuples can arise when one tuple is inserted into a decomposed relation but no corresponding tuple is inserted into the other relations in the decomposition. They can cause incorrect values to be returned by queries which form the join of a decomposed relation, since the dangling tuple might not be included. Dangling tuples can be avoided by the specification of referential integrity constraints.
In the concurrency control scheme, choosing Start(Ti) as the timestamp of Ti gives a subset of the schedules allowed by choosing Validation(Ti) as the timestamp. Using Start(Ti) means that whoever started first must finish first. Clearly, transactions could enter the validation phase in the same order in which they began executing, but this is overly restrictive. Since choosing Validation(Ti) causes fewer non-conflicting transactions to restart, it gives better response times.
The phantom phenomenon arises when, due to an insertion or deletion, two transactions logically conflict despite not locking any data items in common. The insertion case is described in the book. Deletion can also lead to this phenomenon. Suppose Ti deletes a tuple from a relation while Tj scans the relation. If Ti deletes the tuple and then Tj reads the relation, Ti should be serialized before Tj. Yet there is no tuple that both Ti and Tj conflict on.
An interpretation of 2PL as locking just the accessed tuples in a relation is incorrect. There is also index or relation metadata that holds information about the tuples in the relation. This information is read by any transaction that scans the relation, and modified by transactions that update, insert into, or delete from the relation. Hence locking must also be performed on the index or relation metadata, and this will avoid the phantom phenomenon.
Consider the balance in an account, replicated at N sites. Let the current balance be $100, consistent across all sites. Consider two transactions T1 and T2, each depositing $10 in the account. Thus the balance would be $120 after both these transactions are executed. Let the transactions execute in sequence: T1 first and then T2. Suppose the copy of the balance at one of the sites, say s, is not consistent with the primary copy after transaction T1 is executed (due to the lazy replication strategy), and let transaction T2 read this copy of the balance. One can see that the balance at the primary site would then be $110 at the end, so T1's deposit is lost.
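The lost update can be traced step by step in a small simulation (the site names and dictionary layout are illustrative only):

```python
# Simulation of the lost-update anomaly under lazy replication.
primary = {"balance": 100}
replica_s = {"balance": 100}   # stale copy at site s

def deposit(copy, amount):
    copy["balance"] += amount

deposit(primary, 10)           # T1 deposits $10 at the primary copy.
# Lazy propagation: site s has NOT yet received T1's update.
deposit(replica_s, 10)         # T2 reads the stale copy at s and deposits $10.
# When s's value reaches the primary, it overwrites T1's deposit:
primary["balance"] = replica_s["balance"]
```

The final primary balance is $110 rather than the correct $120: T1's deposit has been silently lost because T2 read a copy that never reflected it.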
An entity is simply a collection of variables or data items. An object is an encapsulation of data as well as the methods (code) to operate on the data. The data members of an object are directly visible only to its methods. The outside world can gain access to the object’s data only by passing pre-defined messages to it, and these messages are implemented by the methods.
RAID level 1 (mirroring) is the one which facilitates rebuilding of a failed disk with minimum interference with the on-going disk accesses. This is because rebuilding in this case involves copying data from just the failed disk’s mirror. In the other RAID levels, rebuilding involves reading the entire contents of all the other disks.
There are several steps in the creation of a file: a storage area is assigned to the file in the file system, a unique i-number is given to the file, and an i-node entry is inserted into the i-list. Deletion of a file involves exactly the opposite steps.
For the file system user in UNIX, durability is important for obvious reasons, but atomicity is not generally relevant, since the file system does not support transactions. To the file system implementor, though, many of the internal file system actions need to have transaction semantics. All the steps involved in the creation or deletion of a file must be atomic; otherwise there will be unreferenceable files or unusable areas in the file system.
The possible sequences of states are:
If the type of an attribute is x, then in each tuple of the table, corresponding to that attribute, there is an actual object of type x. If its type is ref(x), then in each tuple, corresponding to that attribute, there is a reference to some object of type x. We choose a reference type for an attribute if that attribute's intended purpose is to refer to an independent object.
Autonomy is the amount of control a single site has over the local database. It is important because users at that site want quick and correct access to local data items. This is especially true when one considers that local data will be the most frequently accessed in a database. Transparency hides the distributed nature of the database. This is important because users should not be required to know about the location, replication, fragmentation, or other implementation aspects of the database.
If a transaction is very long, or fetches data from a slow disk, it takes a long time to complete. In the absence of concurrency, other transactions will have to wait a longer period of time, and average response time will increase. Also, while a transaction is reading data from disk, the CPU is idle, so resources are not properly utilized. Hence concurrent execution becomes important in this case. However, when the transactions are short or the data is available in memory, these problems do not occur.
The assertion name is arbitrary; we have chosen the name perry. Note that since the assertion applies only to the Perryridge branch, we must restrict attention to the Perryridge tuple of the branch relation rather than writing a constraint on the entire relation.
create assertion perry check
    (not exists (select *
                 from branch
                 where branch-name = 'Perryridge'
                 and assets <> (select sum (amount)
                                from loan
                                where branch-name = 'Perryridge')))
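SQL assertions are rarely implemented by real systems, but the condition this assertion is meant to enforce (the Perryridge assets equal the total of its loan amounts) can be checked directly as a query. A sketch in SQLite, with hyphens in names replaced by underscores as SQL identifiers require, and with invented sample data:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE branch (branch_name TEXT, assets INTEGER);
    CREATE TABLE loan (branch_name TEXT, amount INTEGER);
    INSERT INTO branch VALUES ('Perryridge', 300);
    INSERT INTO loan VALUES ('Perryridge', 100), ('Perryridge', 200);
""")
# The assertion holds exactly when this query returns no rows:
violations = db.execute("""
    SELECT * FROM branch
    WHERE branch_name = 'Perryridge'
      AND assets <> (SELECT SUM(amount) FROM loan
                     WHERE branch_name = 'Perryridge')
""").fetchall()
```

Here the assets (300) equal the loan total (100 + 200), so `violations` is empty; changing either side would make the row appear.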
Note that indices must operate on the encrypted data, or someone could gain access to the index to interpret the data. Otherwise, the index would have to be restricted so that only certain users could access it. To keep the data in sorted order, the index scheme would have to decrypt the data at each level in a tree. Note that hash systems would not be affected.
The primary key of a weak entity set can be inferred from its relationship with the strong entity set. If we add primary key attributes to the weak entity set, they will be present in both the entity set and the relationship set and they have to be the same. Hence there will be redundancy.
BCNF is not always dependency preserving. Therefore, we may want to choose another normal form (specifically, 3NF) in order to make checking dependencies easier during updates. This would avoid joins to check dependencies and increase system performance.
We have weak entities for several reasons:
Most of the concurrency control protocols (protocols for ensuring that only serializable schedules are generated) used in practice are based on conflict serializability—they actually permit only a subset of conflict serializable schedules. The general form of view serializability is very expensive to test, and only a very restricted form of it is used for concurrency control.
Let us consider a two-dimensional grid array. When a bucket overflows, we can split the ranges corresponding to that row and column into two, in both the linear scales. Thus the linear scales will get one additional entry each, and the bucket is split into four buckets. The ranges should be split in such a way as to ensure that the four resultant buckets have nearly the same number of values.
There can be several other heuristics for deciding how to reorganize the ranges, and hence the linear scales and grid array.
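One such heuristic, splitting each range near its median, can be sketched for a single overflowing bucket (the scales, points, and bucket layout here are illustrative, and the sketch ignores the re-use of scale entries by other cells in a full grid file):

```python
import bisect

x_scale = [0]          # partition boundaries of the two linear scales
y_scale = [0]
grid = {(0, 0): [(10, 20), (60, 70), (30, 80), (90, 40), (50, 50)]}

def cell_of(p):
    """Locate a point's grid cell by searching both linear scales."""
    i = bisect.bisect_right(x_scale, p[0]) - 1
    j = bisect.bisect_right(y_scale, p[1]) - 1
    return (i, j)

def split(cell):
    """On overflow, add one boundary to each linear scale (near the median
    of the bucket's values) and redistribute the points into four buckets."""
    points = grid.pop(cell)
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    x_scale.insert(cell[0] + 1, xs[len(xs) // 2])
    y_scale.insert(cell[1] + 1, ys[len(ys) // 2])
    for p in points:
        grid.setdefault(cell_of(p), []).append(p)

split((0, 0))
```

Each linear scale gains exactly one entry, the single bucket becomes four, and the median choice keeps the resulting buckets roughly balanced, as described above.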
Some main differences between a database management system and a file-processing system are:
An edge from class A to class B in the DAG representing inheritance means that an object of class B is also an object of class A. It has all the properties that objects of class A have, plus additional ones of its own. In particular, it inherits all the variables and methods of class A. It can of course provide its own implementations for the inherited methods. An edge from class A to class B in the object containment DAG means that an object of class A contains an object of class B. There need not be any similarities in the properties of A and B. Neither B nor A inherits anything from the other. They function as independent types, to the extent that an object of class A can access the variables of the B object contained in it only via the B object's methods.
Certain functional dependencies are called trivial functional dependencies because they are satisfied by all relations.
create view salinfo as
    select manager-name, avg(salary)
    from manages m, works w
    where m.employee-name = w.employee-name
    group by manager-name
Updates should not be allowed in this view because there is no way to determine how to change the underlying data. For example, suppose the request is "change the average salary of employees working for Smith to $200". Should everybody who works for Smith have their salary changed to $200? Or should the first (or more, if necessary) employee found who works for Smith have their salary adjusted so that the average is $200? Neither approach really makes sense.
We define a trigger for each relation whose primary key is referred to by the foreign key of some other relation. The trigger is activated whenever a tuple is deleted from the referred-to relation. The action performed by the trigger is to visit all the referring relations and delete all the tuples in them whose foreign-key attribute value equals the primary-key attribute value of the deleted tuple in the referred-to relation. This set of triggers takes care of the on delete cascade operation.
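A concrete sketch of one such trigger, using SQLite; the table and column names are invented for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE branch (branch_name TEXT PRIMARY KEY);
    CREATE TABLE loan (loan_number TEXT PRIMARY KEY, branch_name TEXT);
    -- Trigger on the referred-to relation: delete referring tuples whose
    -- foreign-key value matches the deleted primary-key value.
    CREATE TRIGGER branch_delete_cascade
    AFTER DELETE ON branch
    BEGIN
        DELETE FROM loan WHERE branch_name = OLD.branch_name;
    END;
""")
db.execute("INSERT INTO branch VALUES ('Perryridge')")
db.execute("INSERT INTO loan VALUES ('L-17', 'Perryridge')")
db.execute("DELETE FROM branch WHERE branch_name = 'Perryridge'")
remaining = db.execute("SELECT COUNT(*) FROM loan").fetchone()[0]
```

Deleting the Perryridge branch fires the trigger, which in turn removes the loan tuple that referred to it, so `remaining` is 0; a relation referred to by several others would need one such `DELETE` per referring relation in the trigger body.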
The primary index is on the field which specifies the sequential order of the file. There can be only one primary index while there can be many secondary indices.
If a transaction needs to access a large set of items, multiple granularity locking requires fewer locks, whereas if only one item needs to be accessed, the single lock granularity system allows this with just one lock. Because all the desired data items are locked and unlocked together in the multiple granularity scheme, the locking overhead is low, but concurrency is also reduced.
Index and resource authorization should be special categories to allow certain users to create relations (and the indices to operate on them) while preventing these time-consuming and schema-changing operations from being available to many users. Separating index and resource authorization allows a user to build an index on existing relations, say, for optimization purposes, but allows us to deny that user the right to create new relations.
Even in this case, the recovery manager is needed to perform rollback of aborted transactions.