Top 50 Teradata Interview Questions You Must Prepare 27.Jul.2024

Q1. What Is A Sparse Index?

Sparse Join Indexes are a type of Join Index which contains a WHERE clause that reduces the number of rows which would otherwise be included in the index. All types of join indexes, including single table, multitable, simple or aggregate can be sparse.

Q2. How To View Every Column And The Columns Contained In Indexes In Teradata?

Following query describes each column in the Teradata RDBMS

  SELECT * FROM DBC.TVFields;

Following query describes columns contained in indexes in the Teradata RDBMS

  SELECT * FROM DBC.Indexes;

Q3. Can You Load Multiple Data Files For Same Target Table Using Fastload?

Yes, we can Load a table using multiple datafiles in Fastload. Before giving "end loading" statement user can define file path and use insert sql for multiple source files and give "end loading" statement at the end.

Q4. What Are The Multiload Utility Limitations?

MultiLoad is a very powerful utility; it has following limitations:

  • MultiLoad Utility doesn’t support SELECT statement.
  •  Concatenation of multiple input data files is not allowed.
  •  MultiLoad doesn’t support Arithmatic Functions i.e. ABS, LOG etc. in Mload Script.
  •  MultiLoad doesn’t support Exponentiation and Aggregator Operators i.e. AVG, SUM etc. in Mload Script.

     

Q5. What Are The 5 Phases In A Multiload Utility?

  • Preliminary Phase – Basic Setup
  •  DML Phase – Get DML steps down on AMPs
  •  Acquisition Phase – Send the input data to the AMPs and sort it
  •  Application Phase – Apply the input data to the appropriate Target Tables
  •  End Phase – Basic Cleanup

     

Q6. Steps To Create A Data Model?

  •  Get business requirements.
  •  Create High Level Conceptual Data Model.
  •  Create Logical Data Model.
  •  Select target DBMS where data-modeling tool creates the physical schema.
  •  Create standard abbreviation document according to business standard.

Q7. What Is Node? How Many Nodes And Amps Used In Your Previous Project?

Node is a database running in a server. We used 318 nodes and each node has 2 to 4 AMPS.

Q8. What Are Types Of Partition Primary Index (ppi) In Teradata?

@Partition by CASE
CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
Generated by Foxit PDF Creator © Foxit Software
http://www.foxitsoftware.com For evaluation only.
order_date date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY case1 (
order_total < 10000 ,
order_total < 20000 ,
order_total < 30000,
NO CASE OR UNKNOWN ) ;
@Partition by Range - example using date range
CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
order_date date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY range1 (
Order_date BETWEEN date '2010-01-01' AND date '2010-12-01'
EACH interval '1' month ,
NO RANGE OR UNKNOWN);

Q9. How Many Types Of Index Are Present In Teradata?

There are 5 different indices present in Teradata
@Primary Index.
a.Unique primary index.
b. non Unique primary index.
@Secondary Index.
a. Unique Secondary index.
b. non Unique Secondary index.
@Partitioned Primary Index.
a. Case partition (ex. age, salary...).
b. range partition ( ex. date).
@Join index.
a. Single table join index.
b. multiple table join index.
c. Sparse Join index ( constraint applied on join index in where clause).
@Hash index.

Q10. What Do High Confidence, Low Confidence And No Confidence Mean In Explain Plan?

Explain gives the execution strategy means what are the different steps that the query will go through.
HIGH CONFIDENCE: Statistics are collected.
LOW CONFIDENCE: Statistics are not collected. But the where condition is having the condition on indexed column. Then estimations can be based on sampling.
NO CONFIDENCE: Statistics are not collected and the condition is on non indexed column.

Q11. How To Identify Ppi Columns?

  Select databasename , tablename , columnposition ,columnname from dbc.indices   where indextype ='Q'  order by 1 ,2,3 ;

Q12. How Do You Determine The Number Of Sessions?

  • Teradata performance and workload.
  • Client platform type, performance and workload.
  • Channel performance for channel attached systems.
  • Network topology and performance for network attached systems.
  • Volume of data to be processed by the application.

Q13. What Is Basic Teradata Query Language?

@It allows us to write SQL statements along with BTEQ commands. We can use BTEQ for importing, exporting and reporting purposes.
@The commands start with a (.) dot and can be terminated by using (;), it is not mandatory to use (;).
@BTEQ will assume any thing written with out a dot as a SQL statement and requires a (;) to terminate it.

Q14. What Are The Functions Of A Teradata Dba?

Following are the different functions which a DBA can perform:
@User Management – Creation and managing Users, Databases, Roles, Profiles and Accounts.
@Space Allocation – Assigning Permanent Space, Spool Space and Temporary Space.
@Access of Database Objects – Granting and Revoking Access Rights on different database objects.
@Security Control – Handling logon and logoff rules for Users.
@System Maintenance – Specification of system defaults, restart etc.
@System Performance – Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling.
@Resource Monitoring – Database Query Log(DBQL) and Access Logging.
@Data Archives, Restores and Recovery – ARC Utility and Permanent Journals.

Q15. What Is A Clique?

Clique is a group of disk arrays physically cabled to a group of nodes.

Q16. What Is The Command In Bteq To Check For Session Settings ?

The BTEQ .SHOW CONTROL command displays BTEQ settings.

Q17. What Is Data Warehousing?

A data warehouse is a subject oriented, integrated, time variant, non-volatile collection of data in support of management's decision-making process.

Q18. What Are The Enhanced Features In Teradata V2r5 And V2r6?

V2R6 included the feature of replica in it. in which copy of data base are available on another system which means V2R6 provide the additional data protection as comparison to V2R5 while if data from one system has been vanishes.

Q19. How Do You Create A Table With An Existing Structure Of Another Table With Data And With No Data?

Create table Customerdummy as Customer with data / with no data;

Q20. What Is A Checkpoint?

Checkpoints are entries posted to a restart log table at regular intervals during the data transfer operation. If processing stops while a job is running, you can restart the job at the most recent checkpoint.

Q21. Does Sdlc Changes When You Use Teradata Instead Of Oracle?

  1. If the teradata is going to be only a data base means It won’t change the System development life cycle (SDLC).
  2. If you are going to use the teradata utilities then it will change the Architecture or SDLC.
  3. If your schema is going to be in 3NF then there won’t be huge in change.

     

Q22. What Is Collect Statistics?

Collects demographic data for one or more columns of a table, hash index, or join index, computes a statistical profile of the collected data, and stores the synopsis in the data dictionary. The Optimizer uses the synopsis data when it generates its table access and join plans.

Q23. Difference Between Multiload And Tpump?

Tpump provides an alternative to MultiLoad for low volume batch maintenance of large databases under control of a Teradata system. Tpump updates information in real time, acquiring every bit of a data from the client system with low processor utilization. It does this through a continuous feed of data into the data warehouse, rather than the traditional batch updates. Continuous updates results in more accurate, timely data. Tpump uses row hash locks than table level locks. This allows you to run queries while Tpump is running.

Q24. Difference Between Star And Snowflake Schemas?

Star schema is De-normalized and snowflake schema is normalized.

Q25. In A Table Can We Use Primary Key In One Column And In Another Column Both Unique And Not Null Constrains.if Yes How?

Yes, you can have a column for Primary key and have another column which will have no duplicates or null. e.g. A Salary Table will have employee ID as primary key. The table also contains TAX-ID which can not be null or duplicate.

Q26. Teradata Performance Tuning And Optimization?

@collecting statistics.
@Explain Statements.
@Avoid Product Joins when possible.
@select appropriate primary index to avoid skewness in storage.
@Avoid Redistribution when possible.
@Use sub-selects instead of big "IN" lists.
@Use derived tables.
@Use GROUP BY instead of DISTINCT ( GROUP BY sorts the data locally on the VPROC. DISTINCT sorts the data after it is redistributed).
@Use Compression on large tables.

Q27. What Is Ods?

An operational data store (ODS) is primarily a "dump" of relevant information from a very small number of systems (often just one) usually with little or no transformation. The benefits are an ad hoc query database, which does not affect the operation of systems required to run the business. ODS’s usually deal with data "raw" and "current" and can answer a limited set of queries as a result.

Q28. What Is Data Mining?

Analyzing of large volumes of relatively simple data to extract important trends and new, higher level information. For example, a data-mining program might analyze millions of product orders to determine trends among top-spending customers, such as their likelihood to purchase again, or their likelihood to switch to a different vendor.

Q29. What Is Slowly Changing Dimension?

In a slowly changing dimension the attribute for a record varies over time. There are three ways to solve this problem.
Type 1 – Replace an old record with a new record. No historical data available.
Type 2 – Keep the old record and insert a new record. Historical data available but resources intensive.
Type 3 – In the existing record, maintain extra columns for the new values.

Q30. What Is The Use Of Having Index's On Table?

For faster record search.

Q31. Did You Write Stored Procedures In Teradata?

No, because they become a single amp operation and my company didn’t encourage that.

Q32. Can We Collect Statistics On Table Level?

Yes we can collect statistics on table level. The syntax is COLLECT STAT ON TAB_A;

Q33. What's The Difference Between Timestamp (0) And Timestamp (6)?

TIMESTAMP (0) is CHAR (19) and TIMESTAMP (6) is CHAR (26)
Everything is same except that TIMESTAMP (6) has microseconds too.

Q34. How Many Codd's Rules Are Satisfied By Teradata Database?

There are 12 codd's rules applied to the teradata database.

Q35. What Does Sleep Function Does In Fast Load?

The SLEEP command specifies the amount minutes to wait before retrying to logon and establish all sessions. Sleep command can be used with all load utilities not only fastload. This situation can occur if all of the loader slots are used or if the number of requested sessions is not available. The default value is 6 minutes. If tenacity was set to 2 hours and sleep 10 minutes, Sleep command will try to logon for every 10 minutes up to 2 hours duration.

Q36. While Creating Table My Dba Has Fallback Or No Fallback In His Ddl. What Is That?

FALLBACK requests that a second copy of each row inserted into a table be stored on another AMP in the same cluster. This is done when AMP goes down or disk fails.

Q37. How Do You Verify A Complicated Sql?

I use explain statement to check if the query is doing what I wanted it to do.

Q38. What Interface Is Used To Connect To Windows Based Applications?

WinCLI interface.

Q39. What Is A Three-tier Data Warehouse?

The three-tier differs from the two-tier architecture by strictly enforcing a logical separation of the graphical user interface, business logic, and data. The three-tier is widely used for data warehousing today. Organizations that require greater performance and scalability, the three-tier architecture may be more appropriate. In this architecture, data extracted from legacy systems is cleansed, transformed, and stored in high –speed database servers, which are used as the target database for front-end data access.

Q40. What Is Dimensional Modeling?

Dimensional Data Modeling comprises of one or more dimension tables and fact tables. Good examples of dimensions are location, product, time, promotion, organization etc. Dimension tables store records related to that particular dimension and no facts (measures) are stored in these tables.

Q41. What Is Primary Index And Secondary Index?

Primary index is the mechanism for assigning a data row to an AMP and a location on the AMP’s disks. Indexes also used to access rows from a table without having to search the entire table. Secondary indexes enhance set selection by specifying access paths less frequently used than the primary index path. Secondary indexes are also used to facilitate aggregate operations.

If a secondary index covers a query, then the Optimizer determines that it would be less costly to accesses its rows directly rather than using it to access the base table rows it points to. Sometimes multiple secondary indexes with low individual selectivity can be overlapped and bit mapped to provide enhanced.

Q42. What Is Er Diagram?

It is Entity relationship diagram. Describes the relationship among the entities in the database model.

Q43. What Is Logical Data Model?

A Logical data model is the version of a data model that represents the business requirements (entire or part) of an organization and is developed before the physical data model. A sound logical design should streamline the physical design process by clearly defining data structures and the relationships between them. A good data model is created by clearly thinking about the current and future business requirements. Logical data model includes all required entities, attributes, key groups, and relationships that represent business information and define business rules.

Q44. What Is A Common Data Source For The Central Enterprise Data Warehouse?

ODS=>Operational Data Source.

Q45. What Is A Dimension Table?

Dimension table is one that describes the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables. In a relational data modeling, for normalization purposes, country lookup, state lookup, county lookup, and city lookups are not merged as a single table. In a dimensional data modeling (star schema), these tables would be merged as a single table called LOCATION DIMENSION for performance and slicing data requirements. This location dimension helps to compare the sales in one region with another region. We may see good sales profit in one region and loss in another region. If it is a loss, the reasons for that may be a new competitor in that area, or failure of our marketing strategy etc.

Q46. Can We Collect Statistics On Multiple Columns?

Yes we can collect statistics on multiple columns.

Q47. How Many Tables Can You Join In V2r5?

Up to 64 tables.

Q48. Can We Load A Multi Set Table Using Mload?

We can Load SET, MULTISET tables using Mload, But here when loading into MULTISET table using MLOAD duplicate rows will not be rejected, we have to take care of them before loading.But in case of Fload when we are loading into MULTISET duplicate rows are automatically rejected, FLOAD will not load duplicate rows weather table is SET or MULTISET.

Q49. How To Find Duplicates In A Table?

Group by those fields and select id, count(*) from table group by id having count (*) > 1.

Q50. What Is Subject Area?

Subject area means fundamental entities that make up the major components of the business, e.g. customer, product, employee.