Top 19 Data Center Management Interview Questions You Must Prepare 24.Apr.2024

Early physical infrastructure management tools were limited in scope and required considerable human intervention. While they would warn that a particular parameter had been exceeded, the operator would have to determine what equipment was affected by the error. First-gen tools could not make correlations between a physical infrastructure device and a server, nor were they capable of initiating actions to prevent downtime, such as speeding up fans to dissipate a hot spot.

Newer management tools are designed to identify and resolve issues with minimum human intervention. By correlating power, cooling and space resources to individual servers (physical and virtual), DCIM tools today can proactively inform IT management systems of potential physical infrastructure problems and how they might impact specific IT loads. Newer planning software tools illustrate, through a graphical user interface, the current physical state of the data center and simulate the effect of future physical equipment adds, moves, and failures.

Data center architecture is the physical and logical layout of the resources and equipment within a data center facility. 

It serves as a blueprint for designing and deploying a data center facility. It is a layered process which provides architectural guidelines in data center development.

Some features/functions to consider: 

Open source — vendor-neutrality is key, as very few data centers are standardized on a single vendor from top to bottom. Open source DCIM software can integrate data from, for example, uninterruptible power supply (UPS) systems, power distribution units (PDUs), and cooling units from three (or more) different vendors. 

In addition, with open protocols, it is quite easy to add additional software tools and expect them to communicate and work together effectively. 

Functionality: depending on your needs, you’ll want to explore several kinds of functions.

  • Planning functions, such as asset management and cause/effect analysis 
  • Operational functions, such as helping to complete more tasks in less time, reducing human error, and identifying root causes of problems 
  • Analysis functions, such as identifying operational strengths and weaknesses and optimizing energy usage 

A data center is said to be carrier-neutral if a customer can order cross connections or communications services from any existing provider and the data center provider actively tries to court additional carriers into the facility.

Here are a few we see more often than we’d like: 

  • A rack of servers loses power when an IT administrator unintentionally overloads an already maxed-out power strip. 
  • A large data center virtualizes and consolidates its most critical applications on a cluster of servers. Using the failover mechanism of the virtualization platform, they feel protected from hardware failure. Unfortunately, in their planning, they don’t recognize that each of the servers is dependent on the same UPS, which means that if the UPS fails, no UPS-protected servers are available to migrate the affected loads to.
  • An operator is trying to determine whether power capacity that was just exceeded on a rack is only an anomaly or a developing trend. She goes on “gut feel” and leaves it alone. The next time power capacity in that rack is exceeded, a breaker trips and all the servers downstream of that breaker that are running mission critical applications are suddenly shut down. 
  • In a large, mission-critical data center, the provisioning and installation of servers is so complex that only highly paid contract engineers are able to perform the task.
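
The shared-UPS scenario above lends itself to an automated check. The following is a minimal sketch, not a feature of any particular DCIM product: the function name `single_points_of_failure`, the server names, and the UPS labels are all hypothetical, and it assumes you can export which UPS units feed each server in a failover cluster.

```python
def single_points_of_failure(ups_feeds):
    """Return the UPS units whose failure would take down every server.

    ups_feeds maps a server name to the list of UPS units feeding it
    (hypothetical data, e.g. exported from an asset inventory).
    """
    all_ups = {ups for feeds in ups_feeds.values() for ups in feeds}
    risky = []
    for ups in sorted(all_ups):
        # A server survives if it still has at least one other UPS feed.
        survivors = [
            server for server, feeds in ups_feeds.items()
            if set(feeds) - {ups}
        ]
        if not survivors:
            risky.append(ups)
    return risky


# Hypothetical cluster: every host depends on the same UPS.
cluster = {"host-01": ["UPS-A"], "host-02": ["UPS-A"], "host-03": ["UPS-A"]}
print(single_points_of_failure(cluster))
```

If the function returns an empty list, at least one host in the cluster would stay powered after the loss of any single UPS, so the failover mechanism has somewhere to migrate loads.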

No, DCIM tools consist of a collection of software applications (outlined above), data collection tools, and a dashboard. The data collection is generally done by devices like meters, power protection devices, embedded cards, programmable logic controllers (PLCs), and sensors, which gather data and forward it to management software for processing. 

The other component of DCIM is a dashboard. Critical information from the DCIM software and data collection tools needs to be aggregated and presented so IT managers can visualize the data in a way that is meaningful and actionable. Dashboards can be configured for different needs, for instance to focus on the performance of the IT equipment versus the physical infrastructure (cooling, power, security). 

A data center (or datacenter) is a facility composed of networked computers and storage that businesses or other organizations use to organize, process, store and disseminate large amounts of data. A business typically relies heavily upon the applications, services and data contained within a data center, making it a focal point and critical asset for everyday operations.

Monitoring and automation software can do things like:

  • Provide energy use details that enable the linking of operating costs to each business unit user group, which then allows for “charge backs”
  • Monitor and control facility heat, ventilation, and air conditioning (HVAC) systems, as well as fire, water, steam, and gas systems, and facility security
  • Perform auto discovery of new equipment additions, verifying that everything works out of the box 
  • Report real-time, average and peak power usage by rack, which might help you decide where to add a new server or identify and eliminate recurring and possibly dangerous load spikes
  • Measure power usage effectiveness (PUE) on a daily basis and track historical PUE, helping you analyze whether cost cutting and energy saving strategies are actually working. 
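
PUE is the ratio of total facility energy to the energy consumed by IT equipment, so a value closer to 1.0 means less overhead. As a rough sketch of the daily tracking described above (the function names and the sample readings are illustrative assumptions, not output from a real tool):

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def pue_history(daily_readings):
    """Map (day, total_kwh, it_kwh) readings to (day, PUE) pairs."""
    return [
        (day, round(pue(total_kwh, it_kwh), 2))
        for day, total_kwh, it_kwh in daily_readings
    ]


# Hypothetical daily meter readings in kWh.
readings = [
    ("2024-04-01", 12000.0, 7500.0),
    ("2024-04-02", 11400.0, 7600.0),
]
for day, value in pue_history(readings):
    print(day, value)
```

A falling PUE over time suggests that energy-saving measures are reducing overhead relative to the IT load.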

Planning and implementation software can do things like: 

  • Generate inventory reports organized by device type, age, manufacturer, and properties of the device (handy to quickly identify underutilized assets, assets out of warranty, and assets that need to be upgraded) 
  • Generate an audit trail for changes to assets and work orders, including a record of alarms raised and alarms removed, providing factual evidence for post-failure analysis
  • Map out what-if scenarios, such as: if I change the contents of this rack, how will it impact my cooling?
  • Answer questions such as: 
    • What is my data center’s PUE? 
    • What is the optimal place to put my next physical or virtual server? 
    • What will the impact of new equipment be on my redundancy and safety margins?
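
One way to picture the placement question is a simple capacity filter. The sketch below is only an illustration under stated assumptions: the `Rack` fields, the 20% safety margin, and the "most power headroom first" ranking are hypothetical choices, and real planning tools also weigh cooling, redundancy, and network factors.

```python
from dataclasses import dataclass


@dataclass
class Rack:
    name: str
    power_capacity_kw: float
    power_used_kw: float
    free_units: int  # free rack units (U)


def candidate_racks(racks, server_kw, server_units, safety_margin=0.2):
    """Return names of racks that can host the server, best headroom first.

    A rack qualifies if it has enough free rack units and its power draw
    stays within capacity minus the safety margin after the server is added.
    """
    fits = []
    for rack in racks:
        usable_kw = rack.power_capacity_kw * (1 - safety_margin)
        headroom_kw = usable_kw - (rack.power_used_kw + server_kw)
        if headroom_kw >= 0 and rack.free_units >= server_units:
            fits.append((headroom_kw, rack.name))
    return [name for _, name in sorted(fits, reverse=True)]


# Hypothetical inventory.
racks = [
    Rack("R1", 10.0, 9.0, 4),   # too little power headroom
    Rack("R2", 10.0, 4.0, 2),
    Rack("R3", 10.0, 2.0, 1),   # not enough free rack units
    Rack("R4", 10.0, 6.0, 10),
]
print(candidate_racks(racks, server_kw=0.5, server_units=2))
```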

Data center management refers to a small number of employees who have been designated and hired to manage large data sets and hardware systems that are usually part of a large distributed network. The data center is responsible for the management of significant amounts of data and the hardware required to store it and distribute it to users.

Data center management plays a crucial role in protecting data and keeping it secure so as to avoid data security breaches. The hosted computer environment within a data center must be explicitly managed, but most of the management is conducted in an automated fashion, thus saving hiring and energy costs. Data centers can be managed remotely and may not even house actual employees.

Functions of data center management include upgrading hardware and software/operating systems, managing data distribution and storage, backup regimes, emergency planning and some technical support.

There are many DCIM tools and suites of solutions on the market, and as with any acquisition, you need to look at each critically and choose the one that best meets your specific needs.

A cross connection is most often a layer 1 or physical layer connection between two networks. Data center providers typically segment cross connections by type of cabling used to make the connection - copper, coaxial or fiber. Cross connections are usually completed by the data center provider for a non-recurring (NRC) and a monthly recurring (MRC) charge.

Newer DCIM tools measure, monitor, automate, and optimize processes for energy efficiency. They can do things like: 

  • Initiate load shifts: for example, when a monitoring system detects a reduced data center load at night, it might consolidate applications onto rack #1 and turn off rack #2, saving energy. In addition, if the reduced IT load can operate at a higher temperature, variable speed fans in CRACs can be adjusted down, and the reduced cooling load would be reported to the building management system (BMS), which optimizes the chiller by raising the chilled water temperature, saving more energy.
  • Maximize use of existing capacity: DCIM tools help identify excess capacity and pinpoint devices that can either be decommissioned or used elsewhere, saving on energy, capital, maintenance, and manpower costs. DCIM tools also help identify stranded capacity, or unusable capacity caused by an imbalance in power, cooling, and/or rack space.
  • Measure power usage effectiveness (PUE): DCIM tools track daily and historical PUE, helping you analyze whether cost cutting and energy saving strategies are actually working and make adjustments accordingly 
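
Stranded capacity can be made concrete with a little arithmetic. The sketch below is an illustration only: the function name, the per-server profile, and the sample figures are hypothetical, and it assumes each server's cooling load equals its power draw.

```python
def stranded_capacity(free_power_kw, free_cooling_kw, free_units,
                      server_kw, server_units):
    """Estimate what is left unusable once the scarcest resource runs out.

    Assumes every added server draws server_kw of power, needs the same
    amount of cooling, and occupies server_units rack units.
    """
    deployable = int(min(
        free_power_kw // server_kw,
        free_cooling_kw // server_kw,
        free_units // server_units,
    ))
    return {
        "deployable_servers": deployable,
        "stranded_power_kw": free_power_kw - deployable * server_kw,
        "stranded_cooling_kw": free_cooling_kw - deployable * server_kw,
        "stranded_units": free_units - deployable * server_units,
    }


# Hypothetical rack: plenty of power and space, but little cooling left.
print(stranded_capacity(6.0, 2.0, 20, server_kw=0.5, server_units=1))
```

In this example cooling is the limiting resource, so most of the remaining power and rack space cannot be used until cooling is rebalanced.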

"Critical" power or "IT load" often refers to the data center load that is consumed or is dedicated to IT equipment such as servers, storage equipment and communications switches and routers. Power for lighting or cooling the data center is excluded from "critical" power. It's important for an end user to understand their critical load as the data center - whether managed internally or outsourced - will be sized based on the current or expected amount of critical power.

There are two main categories of data center management software tools: monitoring/automation software and planning/implementation software.

The first deals with monitoring and automation of the IT room and facility power, environmental control, and security. It acts upon user-set thresholds by alarming, logging, or even controlling physical devices, and does things like verifying the data center is functioning as designed, and automating activities that optimize availability and efficiency. 
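
To illustrate how user-set thresholds might drive alarms, here is a minimal sketch. The metric names, threshold values, and the two-level warning/critical scheme are assumptions made for the example, not the behavior of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    warning: float
    critical: float


def evaluate(value, threshold):
    """Classify a reading against a user-set threshold."""
    if value >= threshold.critical:
        return "critical"
    if value >= threshold.warning:
        return "warning"
    return "ok"


def check_readings(readings, thresholds):
    """Return (metric, value, level) for every reading that needs attention."""
    alarms = []
    for metric, value in readings.items():
        threshold = thresholds.get(metric)
        if threshold is None:
            continue  # no threshold configured for this metric
        level = evaluate(value, threshold)
        if level != "ok":
            alarms.append((metric, value, level))
    return alarms


# Hypothetical thresholds and sensor readings.
thresholds = {
    "inlet_temp_c": Threshold(warning=27.0, critical=32.0),
    "rack_power_kw": Threshold(warning=8.0, critical=9.5),
}
readings = {"inlet_temp_c": 29.0, "rack_power_kw": 7.0, "humidity_pct": 40.0}
print(check_readings(readings, thresholds))
```

A real system would log each alarm and could trigger a control action, such as raising fan speed, instead of only reporting the condition.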

The second category of software focuses on planning and implementation, where IT managers can typically have the greatest impact on total cost of ownership (TCO). It ensures efficient deployment of new equipment, organizes planning in order to facilitate changes in the data center, tracks assets, and simulates the impact of all kinds of “what-if” scenarios.

The multi-tier data center model is dominated by HTTP-based applications in a multi-tier approach. The multi-tier approach includes web, application, and database tiers of servers. Today, most web-based applications are built as multi-tier applications. The multi-tier model uses software that runs as separate processes on the same machine using interprocess communication (IPC), or on different machines with communications over the network. Typically, the following three tiers are used:

  • Web-server
  • Application
  • Database

Multi-tier server farms built with processes running on separate machines can provide improved resiliency and security. Resiliency is improved because a server can be taken out of service while the same function is still provided by another server belonging to the same application tier. Security is improved because an attacker can compromise a web server without gaining access to the application or database servers. Web and application servers can coexist on a common physical server; the database typically remains separate.

User interfaces — different packages offer different views, so choose those that would be most useful to you. Among those typically available: 

  • Floor layout: provides an accurate representation of your data center in a floor plan and/or elevation diagram 
  • Recommended actions: Provides descriptions of problems and recommended actions
  • Virtual store room: Keeps track of new devices from arrival on site through installation 
  • Rack front view: Provides accurate graphical representation of equipment and its location in the rack 
  • Equipment browser: Locates equipment based on vendor name, model and/or type, and can often export equipment data to Excel format 
  • User rights management: Allows assignment of individual user rights and controls across rooms, locations, reports, alarms, and work orders 
  • Mobile devices: Communicates critical data to specified PDAs

With multiple virtual machines and applications running on any single host, the health and availability of each physical machine becomes that much more critical, and that’s where DCIM tools play a vital role in ensuring adequate power and cooling. The other consideration is the intensive and constantly changing power and cooling requirements of a virtual environment: dynamic loads simply can’t be responded to manually.

Cages and cabinets delineate the type of space that a colocation provider will convey to a customer in a retail colocation model. Cages are moveable walls on top of raised flooring to separate one customer's space from that of another. Cabinets, on the other hand, are typically lockable individual racks to house server, storage or communications equipment. Cages are typically for larger retail colocation customers, while cabinets can come in 1/3, half or full sizes. Both are used in shared room environments (other customers).

For years, IT managers have deployed their servers and IT equipment in hot and cold aisles. In such a scenario, the front sides of two rows of equipment racks face each other and draw cool air into each rack's equipment intake. Likewise, the back sides of two rows face each other and expel hot air into the hot aisle. While this is an efficient concept, it may not go far enough for higher power loads. A solution to make the data center even more efficient is to deploy either hot or cold aisle containment. In cold aisle containment, the cold aisles are enclosed to effectively "trap" cold air in the cold aisle. This allows the data center operator to increase air handling set points and more efficiently cool the intakes of the servers. On the other hand, hot aisle containment is a strategy to isolate the hot air exhaust found in the hot aisle. In both cases, the intent is to restrict mixing of significantly different air temperatures. Both solutions can be effective in lowering PUE.