Top 25 Java Interview Questions You Must Prepare 02.Oct.2022

Having an understanding of assembly language makes one aware of:

  • How programs interface with OS, processor, and BIOS;
  • How data is represented in memory and other external devices;
  • How the processor accesses and executes instructions;
  • How instructions access and process data;
  • How a program accesses external devices.

Other advantages of using assembly language are:

  • A well-written assembly program can require less memory and execution time than equivalent high-level code;
  • It makes hardware-specific, complex jobs easier to carry out;
  • It is suitable for time-critical jobs;
  • It is most suitable for writing interrupt service routines and other memory resident programs.

Data movement instructions move data from one location to another. The source and destination locations are determined by the addressing modes, and can be registers or memory. Some processors have different instructions for loading registers and storing to memory, while other processors have a single instruction with flexible addressing modes.
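
As a minimal sketch of data movement, assuming NASM syntax on 32-bit Linux (the label counter is a hypothetical name chosen for illustration), the x86 MOV instruction covers loads, stores, and register copies through its addressing modes:

```nasm
; Sketch only: NASM syntax, 32-bit Linux assumed. "counter" is an illustrative label.
section .data
    counter dd 5                ; a 32-bit variable in memory

section .text
    global _start
_start:
    mov eax, [counter]          ; load:  memory    -> register (EAX = 5)
    mov ebx, eax                ; copy:  register  -> register (EBX = 5)
    mov dword [counter], 9      ; store: immediate -> memory
    mov [counter], ebx          ; store: register  -> memory (counter = 5 again)

    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status is taken from EBX (5 here)
```

On x86 a single mnemonic handles all four cases; processors with separate load and store instructions would use different mnemonics for the first and last two lines.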

An assembly program can be divided into three sections −

  • The data section,
  • The bss section, and
  • The text section.

Every number system uses positional notation, i.e., each position in which a digit is written has a different positional value. Each position is a power of the base, which is 2 for the binary number system, and these powers begin at 0 and increase by 1.

The value of a binary number is based on the presence of 1 bits and their positional value. For example, the value of the 8-bit binary number 11111111 is:

1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255

which is the same as 2^8 - 1.
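
The arithmetic above can be checked by the assembler itself. The following is a rough sketch, assuming NASM syntax on 32-bit Linux; the constant names ALL_ONES and BIT_SUM are made up for this example:

```nasm
; Sketch only: constant names are illustrative, not part of any standard.
ALL_ONES equ 11111111b                              ; binary literal with eight 1 bits
BIT_SUM  equ 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128     ; sum of the positional values

section .text
    global _start
_start:
    mov ebx, 1                  ; assume a mismatch until proven otherwise
    mov eax, ALL_ONES
    cmp eax, BIT_SUM            ; 11111111b equals the sum of its bit values?
    jne done
    cmp eax, (1 << 8) - 1       ; and equals 2^8 - 1?
    jne done
    xor ebx, ebx                ; both comparisons matched: status 0
done:
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 0, since all three values are 255
```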

A segmented memory model divides the system memory into groups of independent segments referenced by pointers located in the segment registers. Each segment is used to contain a specific type of data. One segment is used to contain instruction codes, another segment stores the data elements, and a third segment keeps the program stack.

In the light of the above discussion, we can specify various memory segments as −

  • Data segment − It is represented by the .data section and the .bss section. The .data section is used to declare the memory region where data elements are stored for the program. This section cannot be expanded after the data elements are declared, and it remains static throughout the program.
    The .bss section is also a static memory section that contains buffers for data to be declared later in the program. This buffer memory is zero-filled.
  • Code segment − It is represented by the .text section. This defines an area in memory that stores the instruction codes. This is also a fixed area.
  • Stack − This segment contains data values passed to functions and procedures within the program.

Processors can broadly be divided into the categories of: CISC, RISC, hybrid, and special purpose.

The main internal hardware of a PC consists of the processor, memory, and registers. Registers are processor components that hold data and addresses. To execute a program, the system copies it from the external device into the internal memory. The processor then executes the program instructions.

The fundamental unit of computer storage is a bit; it could be ON (1) or OFF (0). A group of eight related bits makes a byte. On memory hardware that uses parity checking, a ninth bit is stored alongside each byte for parity. According to the rule of odd parity, the number of bits that are ON (1) in each such nine-bit group should always be odd.

So, the parity bit is used to make the number of ON bits odd. If the count is even, the system assumes that there has been a parity error (though rare), which might have been caused by a hardware fault or an electrical disturbance.
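
To illustrate the odd-parity rule, here is a small sketch (NASM syntax, 32-bit Linux assumed) that computes the parity bit for one data byte. It relies on the x86 parity flag (PF), which is set when the low byte of a result contains an even number of 1 bits; the sample byte is arbitrary.

```nasm
; Sketch only: computes the odd-parity bit for an arbitrary sample byte.
section .text
    global _start
_start:
    mov al, 01100001b           ; data byte with three 1 bits (an odd count)
    xor ebx, ebx                ; EBX = parity bit, start at 0
    test al, al                 ; PF = 1 if AL holds an even number of 1 bits
    jnp done                    ; odd count already: parity bit stays 0
    mov ebx, 1                  ; even count: a parity bit of 1 makes the total odd
done:
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status = parity bit (0 for this byte)
```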

The processor supports the following data sizes −

  • Word: a 2-byte data item
  • Doubleword: a 4-byte (32 bit) data item
  • Quadword: an 8-byte (64 bit) data item
  • Paragraph: a 16-byte (128 bit) area
  • Kilobyte: 1024 bytes
  • Megabyte: 1,048,576 bytes
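
The smaller data sizes map directly onto NASM's define directives. The sketch below assumes NASM syntax on 32-bit Linux; the labels are hypothetical names used only for illustration.

```nasm
; Sketch only: labels are illustrative.
section .data
    byte_item   db 12h                      ; byte:       1 byte
    word_item   dw 1234h                    ; word:       2 bytes
    dword_item  dd 12345678h                ; doubleword: 4 bytes
    qword_item  dq 1122334455667788h        ; quadword:   8 bytes
    total_size  equ $ - byte_item           ; 1 + 2 + 4 + 8 = 15 bytes

section .text
    global _start
_start:
    mov ebx, total_size         ; EBX = 15
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 15
```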

Assemblies are of two types: 

  1. Private Assemblies 
  2. Shared Assemblies

The hexadecimal number system uses base 16. The digits in this system range from 0 to 15. By convention, the letters A through F are used to represent the hexadecimal digits corresponding to decimal values 10 through 15.

Hexadecimal numbers are used in computing for abbreviating lengthy binary representations. Basically, the hexadecimal number system represents binary data by dividing each byte in half and expressing the value of each half-byte as a single hexadecimal digit.
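
The half-byte split can be sketched as follows, assuming NASM syntax on 32-bit Linux. The sample value A7h is arbitrary; its binary form is 1010 0111.

```nasm
; Sketch only: splits an arbitrary byte into its two hexadecimal digits.
section .text
    global _start
_start:
    mov al, 0A7h                ; 1010 0111 in binary
    mov cl, al
    and cl, 0Fh                 ; CL = low half-byte  = 0111b = 7h
    shr al, 4                   ; AL = high half-byte = 1010b = Ah (decimal 10)
    movzx ebx, al               ; EBX = 10
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 10
```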

The three basic modes of addressing are −

  1. Register addressing
  2. Immediate addressing
  3. Memory addressing

Register Addressing

In this addressing mode, a register contains the operand. Depending upon the instruction, the register may be the first operand, the second operand or both.

For example,

MOV DX, TAX_RATE ; Register in first operand
MOV COUNT, CX  ; Register in second operand
MOV EAX, EBX  ; Both the operands are in registers

As processing data between registers does not involve memory, it provides the fastest processing of data.

Immediate Addressing

An immediate operand has a constant value or an expression. When an instruction with two operands uses immediate addressing, the first operand may be a register or memory location, and the second operand is an immediate constant. The first operand defines the length of the data.

For example,

BYTE_VALUE  DB  150 ; A byte value is defined
WORD_VALUE  DW  300 ; A word value is defined
ADD  BYTE_VALUE, 65 ; An immediate operand 65 is added
MOV AX, 45H         ; Immediate constant 45H is transferred to AX

Direct Memory Addressing

When operands are specified in memory addressing mode, direct access to main memory, usually to the data segment, is required. This way of addressing results in slower processing of data. To locate the exact location of data in memory, we need the segment start address, which is typically found in the DS register, and an offset value. This offset value is also called the effective address.

In direct addressing mode, the offset value is specified directly as part of the instruction, usually indicated by the variable name. The assembler calculates the offset value and maintains a symbol table, which stores the offset values of all the variables used in the program.

In direct memory addressing, one of the operands refers to a memory location and the other operand references a register.
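
A minimal sketch of direct memory addressing, assuming NASM syntax on 32-bit Linux. Note that NASM writes a memory operand in square brackets; the variable name count is hypothetical.

```nasm
; Sketch only: "count" is an illustrative variable name.
section .data
    count dd 40                 ; the assembler records the offset of this label

section .text
    global _start
_start:
    mov eax, [count]            ; memory operand is the source, register the destination
    add eax, 2
    mov [count], eax            ; register is the source, memory operand the destination
    mov ebx, [count]            ; EBX = 42
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 42
```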

Condition codes are the list of possible conditions that can be tested during conditional instructions. Typical conditional instructions include: conditional branches, conditional jumps, and conditional subroutine calls. Some processors have a few additional data related conditional instructions, and some processors make every instruction conditional. Not all condition codes available for a processor will be implemented for every conditional instruction.
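
On x86, a condition code is typically tested right after an instruction that sets the flags. The following sketch (NASM syntax, 32-bit Linux assumed, values chosen arbitrarily) uses CMP followed by a conditional jump:

```nasm
; Sketch only: the compared values are arbitrary.
section .text
    global _start
_start:
    mov eax, 7
    mov ebx, 0
    cmp eax, 10                 ; sets the flags from EAX - 10
    jge at_least_ten            ; signed "greater or equal": not taken, since 7 < 10
    mov ebx, 1                  ; reached only when EAX < 10
at_least_ten:
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 1
```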

The data section is used for declaring initialized data or constants. The size of this data does not change at runtime. You can declare various constant values, file names, buffer sizes, etc., in this section.

The syntax for declaring the data section is:

section .data
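
A data section might look like the sketch below, assuming NASM syntax on 32-bit Linux. All labels and values, including the file name, are hypothetical and exist only to show the kinds of items mentioned above.

```nasm
; Sketch only: every label and value here is illustrative.
section .data
    greeting    db 'Hello', 0Ah         ; a string constant (6 bytes)
    greet_len   equ $ - greeting        ; length computed at assembly time: 6
    file_name   db 'log.txt', 0         ; a hypothetical file name, NUL-terminated
    buf_size    dd 512                  ; a buffer size stored as a 32-bit value

section .text
    global _start
_start:
    mov ebx, greet_len          ; EBX = 6
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 6
```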

Assembly language statements are entered one statement per line. Each statement has the following format −

[label]   mnemonic   [operands]   [;comment]

The fields in the square brackets are optional. A basic instruction has two parts: the first is the name of the instruction (the mnemonic) to be executed, and the second is the operands, or parameters, of the command.
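
The statement format can be seen in a short sketch (NASM syntax, 32-bit Linux assumed; the label again is a made-up name). Each line uses a different combination of the optional fields:

```nasm
; Sketch only: shows the [label] mnemonic [operands] [;comment] fields.
section .text
    global _start
_start:                         ; a label on its own line
        mov   ecx, 3            ; mnemonic, two operands, comment
again:  dec   ecx               ; label, mnemonic, one operand, comment
        jnz   again             ; loops until ECX reaches 0
        nop                     ; mnemonic with no operands
        mov   ebx, ecx          ; EBX = 0
        mov   eax, 1            ; sys_exit
        int   80h               ; exit status 0
```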

You can make use of Linux system calls in your assembly programs. You need to take the following steps for using Linux system calls in your program −

  • Put the system call number in the EAX register.
  • Store the arguments to the system call in the registers EBX, ECX, etc.
  • Call the relevant interrupt (80h).
  • The result is usually returned in the EAX register.

There are six registers that store the arguments of the system call used. These are EBX, ECX, EDX, ESI, EDI, and EBP. These registers take the consecutive arguments, starting with the EBX register. If there are more than six arguments, then the memory location of the first argument is stored in the EBX register.
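
The steps above can be sketched with two common calls, assuming NASM syntax and the 32-bit Linux system call numbers 4 (sys_write) and 1 (sys_exit). The labels msg and len are hypothetical names.

```nasm
; Sketch only: 32-bit Linux system call numbers assumed (4 = sys_write, 1 = sys_exit).
section .data
    msg db 'Hello, world!', 0Ah
    len equ $ - msg

section .text
    global _start
_start:
    mov eax, 4                  ; system call number: sys_write
    mov ebx, 1                  ; first argument:  file descriptor 1 (stdout)
    mov ecx, msg                ; second argument: address of the buffer
    mov edx, len                ; third argument:  number of bytes
    int 80h                     ; result in EAX: bytes written, or a negative error code

    mov eax, 1                  ; system call number: sys_exit
    xor ebx, ebx                ; first argument: exit status 0
    int 80h
```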

Most assembly language instructions require operands to be processed. An operand address provides the location, where the data to be processed is stored. Some instructions do not require an operand, whereas some other instructions may require one, two, or three operands.

When an instruction requires two operands, the first operand is generally the destination, which contains data in a register or memory location, and the second operand is the source. The source contains either the data to be delivered (immediate addressing) or the address (in register or memory) of the data. Generally, the source data remains unaltered after the operation.

The EQU directive is used for defining constants. The syntax of the EQU directive is as follows −

CONSTANT_NAME EQU expression

For example,

TOTAL_STUDENTS equ 50

You can then use this constant value in your code, like −

mov  ecx,  TOTAL_STUDENTS 
cmp  eax,  TOTAL_STUDENTS

The operand of an EQU statement can be an expression −

LENGTH equ 20
WIDTH  equ 10
AREA   equ LENGTH * WIDTH

The above code segment defines AREA as 200. Note that the names in the expression must match the case of their definitions.

The bss section is used for declaring uninitialized variables. The syntax for declaring the bss section is:

section .bss
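
Space in the bss section is reserved rather than initialized. A minimal sketch, assuming NASM syntax on 32-bit Linux, with hypothetical labels buffer and total:

```nasm
; Sketch only: labels are illustrative.
section .bss
    buffer  resb 64             ; reserve 64 bytes
    total   resd 1              ; reserve one doubleword (4 bytes)

section .text
    global _start
_start:
    mov byte [buffer], 'A'      ; reserved memory can be written at runtime
    add dword [total], 5        ; total starts zero-filled, so it becomes 5
    mov ebx, [total]            ; EBX = 5
    mov eax, 1                  ; sys_exit
    int 80h                     ; exit status 5
```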

  • BRA Branch; Motorola 680x0, Motorola 68300; short (16 bit) unconditional branch relative to the current program counter
  • JMP Jump; Motorola 680x0, Motorola 68300; unconditional jump (any valid effective addressing mode other than data register)
  • JMP Jump; Intel 80x86; unconditional jump (near [relative displacement from PC] or far; direct or indirect [based on contents of general purpose register, memory location, or indexed])
  • JMP Jump; MIX; unconditional jump to location M; J-register loaded with the address of the instruction which would have been next if the jump had not been taken
  • JSJ Jump, Save J-register; MIX; unconditional jump to location M; J-register unchanged
  • Jcc Jump Conditionally; Intel 80x86; conditional jump (near [relative displacement from PC] or far; direct or indirect [based on contents of general purpose register, memory location, or indexed]) based on a tested condition: JA/JNBE, JAE/JNB, JB/JNAE, JBE/JNA, JC, JE/JZ, JNC, JNE/JNZ, JNP/JPO, JP/JPE, JG/JNLE, JGE/JNL, JL/JNGE, JLE/JNG, JNO, JNS, JO, JS
  • Bcc Branch Conditionally; Motorola 680x0, Motorola 68300; short (16 bit) conditional branch relative to the current program counter based on a tested condition: BCC, BCS, BEQ, BGE, BGT, BHI, BLE, BLS, BLT, BMI, BNE, BPL, BVC, BVS
  • JOV Jump on Overflow; MIX; conditional jump to location M if overflow toggle is on; if jump occurs, J-register loaded with the address of the instruction which would have been next if the jump had not been taken

Assemblies are made up of IL code modules and the metadata that describes them. Although programs may be compiled via an IDE or the command line, they are in fact simply translated into IL, not machine code. The actual machine code is not generated until the function that requires it is called.

Attributes are declarative tags in code that insert additional metadata into an assembly.

If you select "Development Tools" while installing Linux, you may get NASM installed along with the Linux operating system and you do not need to download and install it separately. For checking whether you already have NASM installed, take the following steps −

  • Open a Linux terminal.
  • Type whereis nasm and press ENTER.
  • If it is already installed, a line like nasm: /usr/bin/nasm appears. Otherwise, you will see just nasm: and you need to install NASM.

To install NASM, take the following steps:

  • Check the Netwide Assembler (NASM) website for the latest version.
  • Download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the NASM version number in the archive.
  • Unpack the archive into a directory, which creates a subdirectory nasm-X.XX.
  • cd to nasm-X.XX and type ./configure. This shell script will find the best C compiler to use and set up Makefiles accordingly.
  • Type make to build the nasm and ndisasm binaries.
  • Type make install to install nasm and ndisasm in /usr/local/bin and to install the man pages.

This should install NASM on your system. Alternatively, you can use an RPM distribution for Fedora Linux. This version is simpler to install: just double-click the RPM file.

 

There are ten 32-bit and six 16-bit processor registers in IA-32 architecture. The registers are grouped into three categories −

  • General registers,
  • Control registers, and
  • Segment registers.

The general registers are further divided into the following groups −

  • Data registers,
  • Pointer registers, and
  • Index registers.

Assembly language is dependent upon the instruction set and the architecture of the processor. In this tutorial, we focus on Intel 32-bit (IA-32) processors such as the Pentium. To follow this tutorial, you will need:

  • An IBM PC or any equivalent compatible computer
  • A copy of the Linux operating system
  • A copy of the NASM assembler program

There are many good assembler programs, such as:

  • Microsoft Assembler (MASM)
  • Borland Turbo Assembler (TASM)
  • The GNU assembler (GAS)

We will use the NASM assembler, as it is:

  • Free. You can download it from various web sources.
  • Well documented, with lots of information available on the net.
  • Usable on both Linux and Windows.

Each personal computer has a microprocessor that manages the computer's arithmetical, logical, and control activities.

Each family of processors has its own set of instructions for handling various operations such as getting input from the keyboard, displaying information on the screen, and performing various other jobs. This set of instructions is called 'machine language instructions'.

A processor understands only machine language instructions, which are strings of 1's and 0's. However, machine language is too obscure and complex to use in software development. So, a low-level assembly language is designed for a specific family of processors; it represents the various instructions in symbolic code and a more understandable form.

Assembly language programs consist of three types of statements −

  • Executable instructions or instructions,
  • Assembler directives or pseudo-ops, and
  • Macros.

The executable instructions or simply instructions tell the processor what to do. Each instruction consists of an operation code (opcode). Each executable instruction generates one machine language instruction.

The assembler directives or pseudo-ops tell the assembler about the various aspects of the assembly process. These are non-executable and do not generate machine language instructions.

Macros are basically a text substitution mechanism.
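
As a sketch of text substitution, assuming NASM's %macro syntax on 32-bit Linux, the hypothetical macro exit_with below expands into three instructions wherever it is used:

```nasm
; Sketch only: "exit_with" is a made-up macro name taking one parameter.
%macro exit_with 1
    mov ebx, %1                 ; %1 is replaced by the first macro argument
    mov eax, 1                  ; sys_exit
    int 80h
%endmacro

section .text
    global _start
_start:
    exit_with 3                 ; expands to the three instructions above; exit status 3
```

The assembler performs the expansion before generating code, so the macro itself produces no call or return overhead.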

The text section is used for keeping the actual code. This section must begin with the declaration global _start, which exports the _start symbol so that the linker can mark it as the point where program execution begins.

The syntax for declaring the text section is:

section .text
   global _start
_start:
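
Putting the declaration together with a body gives the smallest useful program. This is a sketch assuming NASM on 32-bit Linux; the file name prog.asm and the build commands in the comments are assumptions about a typical toolchain, not requirements.

```nasm
; Sketch only. Assumed build steps for a hypothetical file prog.asm:
;   nasm -f elf32 prog.asm -o prog.o
;   ld -m elf_i386 prog.o -o prog
section .text
    global _start               ; make the entry symbol visible to the linker
_start:
    mov eax, 1                  ; sys_exit
    mov ebx, 0                  ; exit status 0
    int 80h                     ; hand control back to the kernel
```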