Verilog Fundamentals

Module Declaration and Ports
Every Verilog design starts with a module. Think of it like a black box with inputs and outputs.
module my_module (
input clk, // Single bit input
input rst, // Single bit input
input [7:0] data_in, // 8-bit input bus
output [7:0] data_out, // 8-bit output bus
output valid // Single bit output
);
// Module content goes here
endmodule
Key Points:
input = signals coming INTO your module
output = signals going OUT of your module
[7:0] = 8-bit wide signal (bit 7 down to bit 0)
All ports are implicitly wire type unless specified otherwise
Data Types: wire vs reg
wire
Represents physical connections (like actual wires)
Can't store values on their own
Must be continuously driven by something
Used for combinational logic outputs
wire [7:0] sum; // 8-bit wire
wire carry_out; // 1-bit wire
wire [15:0] product; // 16-bit wire
reg
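Models storage elements (like flip-flops or latches) when assigned in clocked logic
Can hold values between clock edges
Required for any signal assigned inside an always block, sequential or combinational
Note: reg doesn't always mean a physical register! A reg assigned in an always @(*) block synthesizes to plain combinational logic.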
reg [7:0] counter; // 8-bit register
reg state; // 1-bit register
reg [31:0] memory [0:255]; // Array of 32-bit registers
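To make the note about reg concrete, here is a minimal sketch. The module and signal names are hypothetical and not from the original. Both outputs are declared reg, but only the one assigned on a clock edge would be expected to become flip-flops.

```verilog
// Hypothetical example: two regs, only one of which is real storage.
module reg_is_not_always_a_register (
    input  wire       clk,
    input  wire [7:0] a,
    input  wire [7:0] b,
    output reg  [7:0] comb_max,  // reg, but describes pure combinational logic
    output reg  [7:0] held_max   // reg that describes eight flip-flops
);
    // Assigned in a combinational block: no storage is implied because
    // every path through the block assigns comb_max.
    always @(*) begin
        if (a > b) comb_max = a;
        else       comb_max = b;
    end

    // Assigned on a clock edge: the value must be remembered between
    // edges, so this describes a register.
    always @(posedge clk) begin
        held_max <= comb_max;
    end
endmodule
```

The keyword is the same in both cases. What decides the hardware is the kind of always block doing the assignment.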
assign Statement (Combinational Logic)
assign creates continuous assignments - like connecting wires together.
module adder_example (
input [7:0] a, b,
input cin,
output [7:0] sum,
output cout
);
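// Simple assignment: the 9-bit {cout, sum} target keeps the carry bit
assign {cout, sum} = a + b + cin;
// Conditional assignment (multiplexer) computing the same sum
wire [7:0] sum_mux;
assign sum_mux = (cin) ? a + b + 8'd1 : a + b;
// Bitwise operations (one wire per result: two assigns on the same wire would fight each other)
wire [7:0] and_out, or_out, xor_out, not_out, shl_out, shr_out;
assign and_out = a & b; // AND
assign or_out = a | b; // OR
assign xor_out = a ^ b; // XOR
assign not_out = ~a; // NOT
assign shl_out = a << 2; // Shift left by 2
assign shr_out = a >> 1; // Shift right by 1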
endmodule
Key Points:
assign is for combinational logic only
Output changes immediately when inputs change
Think of it as permanent wire connections
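One subtlety of continuous assignments is worth a quick simulation: Verilog sizes an expression from its context, so where the carry of an addition ends up depends on how wide the surrounding expression is. The testbench below is a small sketch of that. The module name and the cout_lost signal are made up for illustration.

```verilog
// Hypothetical testbench: where does the carry of an 8-bit add go?
module carry_width_demo;
    reg  [7:0] a, b;
    reg        cin;
    wire [7:0] sum;
    wire       cout;
    wire       cout_lost;

    // The 9-bit target widens the addition to 9 bits, so the carry survives.
    assign {cout, sum} = a + b + cin;

    // Both sides of the comparison are 8 bits wide, so the addition is
    // done in 8 bits and can never exceed 8'hFF.
    assign cout_lost = (a + b + cin) > 8'hFF;

    initial begin
        a = 8'hFF; b = 8'h01; cin = 1'b0;
        #1 $display("sum=%h cout=%b cout_lost=%b", sum, cout, cout_lost);
        $finish;
    end
endmodule
```

In a standard-compliant simulator this should print sum=00 cout=1 cout_lost=0. That is why the concatenation form is the usual way to capture a carry.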
always Blocks (Sequential and Combinational)
always blocks are where the action happens. They describe behavior.
Sequential Logic (Clocked)
// D Flip-Flop
always @(posedge clk) begin
if (rst) begin
q <= 1'b0; // Reset
end else begin
q <= d; // Normal operation
end
end
// Counter with enable
always @(posedge clk) begin
if (rst) begin
counter <= 8'b0;
end else if (enable) begin
counter <= counter + 1;
end
end
// Shift register
always @(posedge clk) begin
shift_reg <= {shift_reg[6:0], serial_in}; // Shift left, new bit in
end
Combinational Logic (always_comb or always @(*))
// ALU example
always @(*) begin // or always_comb in SystemVerilog
case (operation)
2'b00: result = a + b; // ADD
2'b01: result = a - b; // SUB
2'b10: result = a & b; // AND
2'b11: result = a | b; // OR
default: result = 8'b0;
endcase
end
// Priority encoder
always @(*) begin
if (input_vec[7]) output_code = 3'd7;
else if (input_vec[6]) output_code = 3'd6;
else if (input_vec[5]) output_code = 3'd5;
// ... etc
else output_code = 3'd0;
end
Edge Detection: posedge, negedge
The sensitivity list after @ controls when the always block executes.
always @(posedge clk) begin // Rising edge of clock
// This runs when clk goes from 0 → 1
end
always @(negedge clk) begin // Falling edge of clock
// This runs when clk goes from 1 → 0
end
always @(posedge clk or posedge rst) begin // Multiple edges
if (rst) begin
// Asynchronous reset
end else begin
// Normal clocked operation
end
end
always @(*) begin // Any input change
// Combinational logic - runs whenever ANY input changes
end
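As a sketch of how the sensitivity list changes behavior, the module below puts a synchronous-reset flip-flop next to an asynchronous-reset one. The module and port names are hypothetical. The only difference between the two blocks is whether rst appears in the sensitivity list.

```verilog
// Hypothetical example contrasting the two reset styles.
module reset_styles (
    input  wire clk,
    input  wire rst,
    input  wire d,
    output reg  q_sync,
    output reg  q_async
);
    // Synchronous reset: rst is only looked at on a rising clock edge.
    always @(posedge clk) begin
        if (rst) q_sync <= 1'b0;
        else     q_sync <= d;
    end

    // Asynchronous reset: rst is in the sensitivity list, so q_async
    // clears as soon as rst rises, without waiting for the clock.
    always @(posedge clk or posedge rst) begin
        if (rst) q_async <= 1'b0;
        else     q_async <= d;
    end
endmodule
```

Which style to prefer depends on the target technology and the reset strategy of the design, so treat this as an illustration rather than a recommendation.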
Module Instantiation and Hierarchy
Modules are like LEGO blocks - you build complex designs by connecting simpler modules together.
Basic Module Instantiation
// Define a simple adder module
module adder_4bit (
input [3:0] a, b,
input cin,
output [3:0] sum,
output cout
);
assign {cout, sum} = a + b + cin;
endmodule
// Use the adder in a larger design
module calculator (
input [3:0] x, y, z,
output [3:0] result1, result2,
output overflow1, overflow2
);
// Instantiate two adders
adder_4bit add1 (
.a(x), // Connect x to port a
.b(y), // Connect y to port b
.cin(1'b0), // Tie cin to 0
.sum(result1), // Connect sum to result1
.cout(overflow1) // Connect cout to overflow1
);
adder_4bit add2 (
.a(result1), // Chain the outputs
.b(z),
.cin(1'b0),
.sum(result2),
.cout(overflow2)
);
endmodule
Positional vs Named Port Connections
// POSITIONAL (order matters - error prone!)
adder_4bit add1 (x, y, 1'b0, result1, overflow1);
// NAMED (explicit - much better!)
adder_4bit add1 (
.a(x),
.b(y),
.cin(1'b0),
.sum(result1),
.cout(overflow1)
);
Generate Statements (Arrays of Modules)
module ripple_carry_adder_16bit (
input [15:0] a, b,
input cin,
output [15:0] sum,
output cout
);
wire [16:0] carry; // Internal carry chain
assign carry[0] = cin;
assign cout = carry[16];
// Generate 16 full adders
genvar i;
generate
for (i = 0; i < 16; i = i + 1) begin : adder_stage
full_adder fa (
.a(a[i]),
.b(b[i]),
.cin(carry[i]),
.sum(sum[i]),
.cout(carry[i+1])
);
end
endgenerate
endmodule
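The ripple-carry adder above instantiates a full_adder module that is not defined in this article. A minimal sketch of what it could look like is shown here. The port names are taken from the instantiation, and the body is an assumption based on the standard full-adder equations.

```verilog
// Assumed 1-bit full adder matching the ports used by the generate loop.
module full_adder (
    input  wire a,
    input  wire b,
    input  wire cin,
    output wire sum,
    output wire cout
);
    // Sum is the XOR of all three inputs.
    assign sum  = a ^ b ^ cin;
    // Carry out when both inputs are 1, or when exactly one is 1 and cin is 1.
    assign cout = (a & b) | (cin & (a ^ b));
endmodule
```

With this definition in place, the generate loop creates 16 copies named adder_stage[0].fa through adder_stage[15].fa.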
Control Structures
if-else Statements
// Simple if
always @(*) begin
if (enable) begin
output_data = input_data;
end else begin
output_data = 8'b0;
end
end
// Nested if-else (fixed-priority arbiter: the highest set request wins)
// The input is named request because priority is a reserved word in SystemVerilog
always @(*) begin
if (request[3]) begin
grant = 4'b1000;
end else if (request[2]) begin
grant = 4'b0100;
end else if (request[1]) begin
grant = 4'b0010;
end else if (request[0]) begin
grant = 4'b0001;
end else begin
grant = 4'b0000;
end
end
// if in sequential logic
always @(posedge clk) begin
if (rst) begin
counter <= 0;
end else if (load) begin
counter <= load_value;
end else if (enable) begin
counter <= counter + 1;
end
// else counter keeps its value
end
case Statements
// Basic case (like switch statement)
always @(*) begin
case (opcode)
3'b000: result = a + b; // ADD
3'b001: result = a - b; // SUB
3'b010: result = a & b; // AND
3'b011: result = a | b; // OR
3'b100: result = a ^ b; // XOR
3'b101: result = ~a; // NOT
3'b110: result = a << 1; // SHIFT LEFT
3'b111: result = a >> 1; // SHIFT RIGHT
default: result = 8'b0; // Always include default!
endcase
end
// Case with don't cares (casez): items are checked top to bottom, first match wins
always @(*) begin
casez (instruction[6:0])
7'b0110011: instr_type = R_TYPE; // R-type
7'b0010011: instr_type = I_TYPE; // I-type
7'b11000??: instr_type = B_TYPE; // B-type (? = don't care; the RISC-V branch opcode is 1100011)
default: instr_type = UNKNOWN; // Catch-all for everything else
endcase
end
// State machine with case
always @(posedge clk) begin
if (rst) begin
state <= IDLE;
end else begin
case (state)
IDLE: begin
if (start) state <= FETCH;
end
FETCH: begin
state <= DECODE;
end
DECODE: begin
state <= EXECUTE;
end
EXECUTE: begin
if (done) state <= IDLE;
end
default: state <= IDLE;
endcase
end
end
Loops (for, while, repeat)
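Important: Loops in synthesizable Verilog describe hardware, not software! The synthesis tool unrolls each loop at compile time, so the iteration count must be a constant, and every iteration becomes its own copy of the logic.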
for Loops
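// Unrolled hardware - the eight XORs become one combinational network, not eight clock cycles
integer i; // Declared at module level (Verilog-2001 allows block-local declarations only in named blocks)
always @(*) begin
parity = 1'b0;
for (i = 0; i < 8; i = i + 1) begin
parity = parity ^ data[i]; // XOR all bits together
end
end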
// Array initialization
integer j;
always @(posedge clk) begin
if (rst) begin
for (j = 0; j < 16; j = j + 1) begin
register_file[j] <= 32'b0; // Clear all registers
end
end else if (write_enable) begin
register_file[write_addr] <= write_data;
end
end
// Generate-for (creates actual hardware instances)
genvar k; // Verilog-2001 needs the genvar declared outside the loop header
generate
for (k = 0; k < 8; k = k + 1) begin : byte_lane
assign byte_valid[k] = |data_bus[k*8 +: 8]; // OR all bits in byte
end
endgenerate
while and repeat Loops
// while loop (rare in synthesizable code: the iteration count depends on data,
// so many synthesis tools reject it)
always @(*) begin
temp = input_val;
count = 0;
while (temp != 0) begin
count = count + 1;
temp = temp >> 1; // count ends up as the number of significant bits in input_val
end
end
// repeat loop (fixed number of iterations; shift_amount must be a constant to synthesize)
always @(*) begin
shifted_data = input_data;
repeat (shift_amount) begin
shifted_data = shifted_data << 1;
end
end
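Because the while loop above runs a data-dependent number of times, a common workaround is a for loop with a constant bound and an if inside it. The sketch below computes the same bit-length value that way. The module name and the 8-bit input width are assumptions made for the example.

```verilog
// Hypothetical synthesizable rewrite of the while-loop bit counter.
module bit_length (
    input  wire [7:0] input_val,
    output reg  [3:0] count       // 0 when input_val is 0, otherwise 1..8
);
    integer i;

    always @(*) begin
        count = 4'd0;
        // Constant bound of 8, so the loop can be unrolled.
        for (i = 0; i < 8; i = i + 1) begin
            // The last iteration that sees a 1 wins, which is the
            // position of the highest set bit plus one.
            if (input_val[i]) count = i + 1; // integer result is truncated to 4 bits
        end
    end
endmodule
```

The loop still reads like software, but since the bound is fixed it unrolls into a chain of eight small multiplexers.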
Common Data Structures
Arrays and Memories
// Register array (like register file)
reg [31:0] registers [0:31]; // 32 registers, each 32-bits wide
// 2D array
reg [7:0] memory [0:255][0:3]; // 256 rows × 4 columns of bytes
// Accessing arrays
always @(posedge clk) begin
if (write_enable) begin
registers[write_addr] <= write_data;
end
read_data <= registers[read_addr];
end
// Initialize array
integer i;
initial begin
for (i = 0; i < 32; i = i + 1) begin
registers[i] = 32'b0;
end
end
Packed vs Unpacked Arrays
// PACKED dimension (range before the name) - contiguous bits, can be accessed as one vector
reg [127:0] packed_vec; // 128 contiguous bits
wire [7:0] elem3 = packed_vec[3*8 +: 8]; // Slice out byte 3 of the vector
// UNPACKED dimension (range after the name) - separate storage locations
reg [7:0] unpacked_array [0:15]; // 16 elements of 8 bits each
// Cannot be assigned to a single vector directly; copy it element by element
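When an array with an unpacked dimension does need to be seen as one flat vector, the usual approach is to copy it element by element. A generate loop can do that, as in the sketch below. The module, port, and array names are hypothetical.

```verilog
// Hypothetical example: exposing a 16 x 8-bit array as one 128-bit vector.
module flatten_demo (
    input  wire         clk,
    input  wire         we,
    input  wire [3:0]   addr,
    input  wire [7:0]   din,
    output wire [127:0] flat_view
);
    // Range after the name: 16 separate 8-bit elements.
    reg [7:0] byte_array [0:15];

    always @(posedge clk) begin
        if (we) byte_array[addr] <= din;
    end

    // One continuous assignment per element places byte g in
    // bits [g*8 + 7 : g*8] of the flat vector.
    genvar g;
    generate
        for (g = 0; g < 16; g = g + 1) begin : flatten
            assign flat_view[g*8 +: 8] = byte_array[g];
        end
    endgenerate
endmodule
```

This costs no logic. It is only wiring, written out once per element.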
Bit Selection and Slicing
wire [31:0] data = 32'hDEADBEEF;
// Bit selections
wire msb = data[31]; // Most significant bit
wire lsb = data[0]; // Least significant bit
// Bit slicing
wire [7:0] byte3 = data[31:24]; // Upper byte
wire [7:0] byte0 = data[7:0]; // Lower byte
wire [15:0] upper = data[31:16]; // Upper 16 bits
// Variable bit selection (legal, but synthesizes to a 32-to-1 mux)
wire selected_bit = data[bit_index]; // bit_index can be a signal
// Part-select with variable index
wire [7:0] byte_sel = data[byte_index*8 +: 8]; // +: means "8 bits starting at"
wire [7:0] byte_sel2 = data[byte_index*8 + 7 : byte_index*8]; // Same bits, but legal only if byte_index is a constant (parameter or genvar)
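To see the indexed part-select in action, the small testbench below steps a byte index through all four bytes of the same 32'hDEADBEEF constant. The module name is made up, and the loop uses an integer so the 2-bit index cannot wrap around and loop forever.

```verilog
// Hypothetical testbench for the +: indexed part-select.
module part_select_demo;
    reg  [31:0] data = 32'hDEADBEEF;
    reg  [1:0]  byte_index;
    // 8 bits starting at bit byte_index*8 and counting upward.
    wire [7:0]  byte_sel = data[byte_index*8 +: 8];

    integer i;

    initial begin
        for (i = 0; i < 4; i = i + 1) begin
            byte_index = i;
            #1 $display("byte_index=%0d byte_sel=%h", byte_index, byte_sel);
        end
        $finish;
    end
endmodule
```

A standard-compliant simulator should print ef, be, ad, and de for indexes 0 through 3, which is the lowest byte first.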
Advanced Control Structures
Conditional Operator (Ternary)
// Simple mux
assign mux_out = select ? input1 : input0; // output is a reserved word, so the signal needs another name
// Nested conditionals
assign priority_out = (high_pri) ? high_data :
(med_pri) ? med_data :
(low_pri) ? low_data : default_data;
// Inside always blocks
always @(*) begin
next_state = (current_state == IDLE) ?
(start ? ACTIVE : IDLE) :
(done ? IDLE : ACTIVE);
end
Functions and Tasks
// Function (combinational only, returns value)
function [7:0] count_ones;
input [31:0] data;
integer i;
begin
count_ones = 0;
for (i = 0; i < 32; i = i + 1) begin
count_ones = count_ones + data[i];
end
end
endfunction
// Task (can have delays and multiple outputs; the @(posedge clk) waits make this one simulation-only)
task write_memory;
input [7:0] addr;
input [31:0] data;
begin
@(posedge clk);
mem_addr <= addr;
mem_data <= data;
mem_we <= 1'b1;
@(posedge clk);
mem_we <= 1'b0;
end
endtask
// Using function and task (the task call below works in simulation but will not synthesize)
always @(*) begin
ones_count = count_ones(input_vector);
end
always @(posedge clk) begin
if (write_request) begin
write_memory(address, write_data);
end
end
Parameters and Localparam
// Module parameters (configurable)
module fifo #(
parameter WIDTH = 8,
parameter DEPTH = 16
) (
input clk, rst,
input [WIDTH-1:0] din,
input push, pop,
output [WIDTH-1:0] dout,
output full, empty
);
localparam ADDR_BITS = $clog2(DEPTH); // Local parameter
reg [WIDTH-1:0] memory [0:DEPTH-1];
reg [ADDR_BITS:0] wr_ptr, rd_ptr;
// Implementation...
endmodule
// Instantiate with different parameters
fifo #(.WIDTH(32), .DEPTH(64)) data_fifo (...);
fifo #(.WIDTH(16), .DEPTH(8)) cmd_fifo (...);
Complete Examples
Simple Counter
module counter (
input clk,
input rst,
input enable,
output reg [7:0] count
);
always @(posedge clk) begin
if (rst) begin
count <= 8'b0; // Use <= for sequential
end else if (enable) begin
count <= count + 1;
end
end
endmodule
State Machine
module fsm (
input clk, rst,
input start, done,
output reg busy,
output reg [1:0] state
);
parameter IDLE = 2'b00;
parameter WORKING = 2'b01;
parameter FINISH = 2'b10;
reg [1:0] next_state;
// State register
always @(posedge clk) begin
if (rst) begin
state <= IDLE;
end else begin
state <= next_state;
end
end
// Next state logic
always @(*) begin
case (state)
IDLE: begin
if (start) next_state = WORKING;
else next_state = IDLE;
end
WORKING: begin
if (done) next_state = FINISH;
else next_state = WORKING;
end
FINISH: begin
next_state = IDLE;
end
default: next_state = IDLE;
endcase
end
// Output logic
always @(*) begin
busy = (state == WORKING);
end
endmodule
Memory Module
module simple_memory (
input clk,
input [7:0] addr,
input [31:0] data_in,
input we, // Write enable
output reg [31:0] data_out
);
reg [31:0] memory [0:255]; // 256 words of 32-bit memory
always @(posedge clk) begin
if (we) begin
memory[addr] <= data_in; // Write
end
data_out <= memory[addr]; // Read (always happens; returns the OLD word when writing the same address)
end
endmodule
Real-World Example: UART Transmitter
module uart_tx #(
parameter CLOCK_FREQ = 50_000_000,
parameter BAUD_RATE = 115200
) (
input clk, rst,
input [7:0] data,
input send,
output reg tx,
output reg busy
);
localparam CLKS_PER_BIT = CLOCK_FREQ / BAUD_RATE;
localparam COUNTER_BITS = $clog2(CLKS_PER_BIT);
// State machine states
localparam IDLE = 3'b000;
localparam START_BIT = 3'b001;
localparam DATA_BITS = 3'b010;
localparam STOP_BIT = 3'b011;
reg [2:0] state;
reg [COUNTER_BITS-1:0] clk_counter;
reg [2:0] bit_counter;
reg [7:0] tx_data;
always @(posedge clk) begin
if (rst) begin
state <= IDLE;
tx <= 1'b1;
busy <= 1'b0;
clk_counter <= 0;
bit_counter <= 0;
end else begin
case (state)
IDLE: begin
tx <= 1'b1;
busy <= 1'b0;
clk_counter <= 0;
bit_counter <= 0;
if (send) begin
tx_data <= data;
state <= START_BIT;
busy <= 1'b1;
end
end
START_BIT: begin
tx <= 1'b0; // Start bit is 0
if (clk_counter < CLKS_PER_BIT - 1) begin
clk_counter <= clk_counter + 1;
end else begin
clk_counter <= 0;
state <= DATA_BITS;
end
end
DATA_BITS: begin
tx <= tx_data[bit_counter];
if (clk_counter < CLKS_PER_BIT - 1) begin
clk_counter <= clk_counter + 1;
end else begin
clk_counter <= 0;
if (bit_counter < 7) begin
bit_counter <= bit_counter + 1;
end else begin
bit_counter <= 0;
state <= STOP_BIT;
end
end
end
STOP_BIT: begin
tx <= 1'b1; // Stop bit is 1
if (clk_counter < CLKS_PER_BIT - 1) begin
clk_counter <= clk_counter + 1;
end else begin
state <= IDLE;
end
end
default: state <= IDLE;
endcase
end
end
endmodule
Key Rules and Best Practices
Blocking vs Non-Blocking Assignments
// BLOCKING (=) - happens immediately
always @(*) begin
temp = a + b; // This happens first
result = temp + c; // This uses the NEW value of temp
end
// NON-BLOCKING (<=) - happens at end of time step
always @(posedge clk) begin
temp <= a + b; // Both assignments happen
result <= temp + c; // simultaneously using OLD values
end
Rule of Thumb:
Use <= for sequential logic (flip-flops, registers)
Use = for combinational logic (inside always @(*))
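The difference between the two assignment types is easiest to see by running the same two statements both ways on a single clock edge. The testbench below is a sketch for that purpose. All names and input values are made up, and the blocking assignments sit in a clocked block only for the sake of the comparison, which the rule of thumb above advises against in real designs.

```verilog
// Hypothetical testbench comparing = and <= on one rising clock edge.
module assign_order_demo;
    reg       clk = 1'b0;
    reg [7:0] a = 8'd1, b = 8'd2, c = 8'd3;
    reg [7:0] temp_b, result_b;                   // blocking versions
    reg [7:0] temp_nb = 8'd0, result_nb = 8'd0;   // non-blocking versions

    // Blocking: result_b sees the value temp_b was just given.
    always @(posedge clk) begin
        temp_b   = a + b;
        result_b = temp_b + c;
    end

    // Non-blocking: result_nb sees the value temp_nb had before the edge.
    always @(posedge clk) begin
        temp_nb   <= a + b;
        result_nb <= temp_nb + c;
    end

    initial begin
        #1 clk = 1'b1;  // a single rising edge
        #1 $display("blocking: result=%0d  non-blocking: result=%0d",
                    result_b, result_nb);
        $finish;
    end
endmodule
```

After one edge the blocking version should report 6 (1 + 2 + 3), while the non-blocking version reports 3 (the old temp_nb of 0 plus 3). It would need a second edge to catch up, which is exactly how two back-to-back flip-flops behave.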
Common Patterns
// Reset pattern
always @(posedge clk) begin
if (rst) begin
// Reset all registers to known values
counter <= 0;
state <= IDLE;
valid <= 0;
end else begin
// Normal operation
end
end
// Enable pattern
always @(posedge clk) begin
if (rst) begin
data <= 0;
end else if (enable) begin
data <= new_data; // Only update when enabled
end
// else data keeps its old value
end
// Mux pattern
assign mux_out = (select) ? input1 : input0; // output is a reserved word, so the signal needs another name
// Decoder pattern
always @(*) begin
outputs = 8'b0; // Default all off
outputs[address] = 1'b1; // Turn on selected bit
end
Summary
wire: For connections and combinational outputs
reg: For storage and variables in always blocks
assign: Continuous combinational logic
always @(posedge clk): Sequential logic (flip-flops)
always @(*): Combinational logic
<=: Non-blocking (use for sequential)
=: Blocking (use for combinational)
Modules: Building blocks - connect them to make complex designs
if/case: Control flow - creates multiplexers and decoders in hardware
for loops: Create parallel hardware, not sequential execution
Functions/tasks: Reusable code blocks
Parameters: Make modules configurable
The key is understanding when things happen:
assign and always @(*) react immediately to input changes
always @(posedge clk) only updates on clock edges
Loops unroll into parallel hardware
Each module instance is separate physical hardware
Written by Jyotiprakash Mishra