A Simple HTTP Server in C

Directory Structure and Makefile
Directory Structure
.
├── build
├── include
├── Makefile
├── README.md
└── src
    └── main.c
The directory structure is organized as follows:
build: This directory is intended to hold the compiled output of the project. It typically contains the executable files and any intermediate object files generated during the build process.
include: This folder is used to store header files (.h) that contain declarations for the functions and variables used in the source code. These headers are included in the source files to provide necessary definitions and interfaces.
Makefile: This file contains instructions for the build process. It defines rules and dependencies for compiling the source code into an executable program. The Makefile automates the compilation process, making it easier to build the project.
README.md: This markdown file provides an overview of the project. It usually includes information about the project's purpose, how to set it up, how to build it, and any other relevant details that users or developers might need to know.
src: This directory contains the source code files for the project. It is where the main implementation of the program resides.
Makefile
# Compiler
CC = gcc
# Directories
SRC_DIR = src
INC_DIR = include
BUILD_DIR = build
# Files
SRCS = $(wildcard $(SRC_DIR)/*.c)
OBJS = $(patsubst $(SRC_DIR)/%.c,$(BUILD_DIR)/%.o,$(SRCS))
TARGET = $(BUILD_DIR)/main
# Flags
CFLAGS = -I$(INC_DIR) -Wall -Wextra -g
# Rules
all: $(TARGET)
# Recipe lines below must start with a tab character, not spaces
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.c | $(BUILD_DIR)
	$(CC) $(CFLAGS) -c $< -o $@
$(TARGET): $(OBJS)
	$(CC) $(CFLAGS) $^ -o $@
$(BUILD_DIR):
	mkdir -p $(BUILD_DIR)
run: $(TARGET)
	./$(TARGET)
clean:
	rm -rf $(BUILD_DIR)
.PHONY: all run clean
A Makefile simplifies our workflow. Instead of retyping a lengthy compiler command with all its flags, or searching for it in the shell history with the up arrow, we define the commands once and run them with short targets such as make run and make clean.
Understanding File Descriptors
File descriptors are identifiers the operating system uses to refer to open files and other input/output resources such as pipes, devices, and sockets. A program passes a descriptor to system calls like read and write to operate on the resource. Each open resource in a process is associated with its own file descriptor, which is a small non-negative integer.
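#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
    // Create (or truncate) test.txt for writing, with permissions 0644
    int filefd = open("test.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (filefd < 0) {
        perror("Failed to open file");
        return 1;
    }
    printf("File descriptor: %d\n", filefd);
    close(filefd); // Release the descriptor once we are done with it
    return 0;
}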
In this program, we open a file named test.txt using the flags O_WRONLY | O_CREAT | O_TRUNC. Here's a brief explanation of these flags:
O_WRONLY: Opens the file for writing only.
O_CREAT: Creates the file if it does not exist.
O_TRUNC: Truncates the file to zero length if it already exists.
These flags are defined in the fcntl.h header file and are essentially binary numbers whose set bits do not overlap. We bitwise OR them together and pass the resulting integer to the open system call. The third argument, 0644, sets the permissions of the file if open creates it, using the same format as the chmod command. The leading zero in 0644 signifies that the number is in octal format.
In an editor that supports it, you can use Ctrl + Click on the flags to navigate to the file located at /usr/include/x86_64-linux-gnu/bits/fcntl.h.
#define O_ACCMODE 0003
#define O_RDONLY 00
#define O_WRONLY 01
#define O_RDWR 02
#ifndef O_CREAT
# define O_CREAT 0100 /* Not fcntl. */
#endif
#ifndef O_EXCL
# define O_EXCL 0200 /* Not fcntl. */
#endif
#ifndef O_NOCTTY
# define O_NOCTTY 0400 /* Not fcntl. */
#endif
#ifndef O_TRUNC
# define O_TRUNC 01000 /* Not fcntl. */
#endif
#ifndef O_APPEND
# define O_APPEND 02000
#endif
#ifndef O_NONBLOCK
# define O_NONBLOCK 04000
#endif
#ifndef O_NDELAY
# define O_NDELAY O_NONBLOCK
#endif
#ifndef O_SYNC
# define O_SYNC 04010000
#endif
#define O_FSYNC O_SYNC
#ifndef O_ASYNC
# define O_ASYNC 020000
#endif
#ifndef __O_LARGEFILE
# define __O_LARGEFILE 0100000
#endif
#ifndef __O_DIRECTORY
# define __O_DIRECTORY 0200000
#endif
#ifndef __O_NOFOLLOW
# define __O_NOFOLLOW 0400000
#endif
#ifndef __O_CLOEXEC
# define __O_CLOEXEC 02000000
#endif
#ifndef __O_DIRECT
# define __O_DIRECT 040000
#endif
#ifndef __O_NOATIME
# define __O_NOATIME 01000000
#endif
#ifndef __O_PATH
# define __O_PATH 010000000
#endif
#ifndef __O_DSYNC
# define __O_DSYNC 010000
#endif
#ifndef __O_TMPFILE
# define __O_TMPFILE (020000000 | __O_DIRECTORY)
#endif
Above, you can see that each flag is defined as an octal constant. We combine the flags we need using the | operator and pass the result to the open syscall.
Sockets
Introduction
Sockets are endpoints for communication between two processes, usually running on different machines. They enable data exchange over a network using protocols like TCP or UDP. Sockets are fundamental in network programming, allowing applications to send and receive data.
The primary distinction between normal file descriptors and socket descriptors lies in their binding behavior. When you use the open function in C to access a file, you receive a file descriptor that is directly bound to a specific file or device on the system. In contrast, when you create a socket and receive a file descriptor, it is not automatically bound to any port. You must manually bind the socket descriptor to a port using additional functions.
Socket Creation
In networking, socket creation is the initial step in establishing a connection between two devices. It involves generating a unique endpoint for communication, typically using a combination of IP address and port number. This process is fundamental in both client-server and peer-to-peer communication models.
// Create a socket
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0) {
    perror("Error opening socket");
    exit(1);
}
The syntax for creating a socket is shown above. It accepts three arguments and returns a descriptor.
The initial parameter is referred to as the Protocol family. There are numerous families, including AF_INET, AF_BLUETOOTH, and AF_INET6. This parameter determines the fundamental type of socket being created. For instance, AF_INET is used for IPv4 connections.
The subsequent parameter is known as the Socket Type, with two primary options: SOCK_DGRAM and SOCK_STREAM. SOCK_DGRAM refers to datagrams, not exclusively UDP, as there are various datagram protocols, with UDP being one of them. This parameter defines a protocol class, while the specific protocol is determined by the next parameter. It's important to note that SOCK_STREAM represents an ordered, reliable byte stream, whereas SOCK_DGRAM signifies individual messages that may arrive out of order, be duplicated, or be lost.
The final parameter is named Protocol, and it specifies the particular protocol to be used. For instance, if we select SOCK_DGRAM, we can utilize the UDP protocol, UDPLite, or ICMPv6. It's important to note that you cannot use SOCK_STREAM with a protocol that belongs to the SOCK_DGRAM class, such as UDP or UDPLite, and vice versa. Passing 0, as in the example above, selects the default protocol for the chosen family and type, which is TCP for AF_INET with SOCK_STREAM.
Honestly, I'm aware that recalling protocol numbers can be challenging, and it's easy to forget or mix them up. To simplify this, we can use a function called getprotobyname. This function accepts the protocol name as a parameter, like tcp, and returns a pointer to a protoent structure, or NULL if the name is unknown. From this structure, we can access the protocol number using proto->p_proto.
// getprotobyname() is declared in <netdb.h>
struct protoent *proto = getprotobyname("tcp");
if (proto == NULL) { // The lookup failed
    exit(1);
}
sockfd = socket(AF_INET, SOCK_STREAM, proto->p_proto);
Binding to an Address and Port
Binding to an address and port refers to the process of assigning a specific network address and port number to a network socket. This allows a program to listen for incoming data or send data to a particular destination. It's a fundamental concept in network programming, enabling communication between different devices and applications.
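int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
To use the bind function, we must supply our socket file descriptor, a pointer to an address structure, and the size of that structure. The prototype takes the generic struct sockaddr, but for IPv4 we fill in a struct sockaddr_in and cast its address to struct sockaddr * when calling bind. Let's take a closer look at the sockaddr_in structure.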
struct sockaddr_in {
    sa_family_t sin_family;  // Always AF_INET
    in_port_t sin_port;      // Port number (must use htons())
    struct in_addr sin_addr; // IP address
    char sin_zero[8];        // Padding (unused, just fill with 0s)
};
Let's proceed to construct a sockaddr_in structure.
struct sockaddr_in server_addr;
When we declare it as a local variable, it might contain garbage values, so we need to reset all of its bytes to 0. We can achieve this using the memset function, which takes three arguments: a pointer to the struct, the byte value to write, and the number of bytes to fill.
memset(&server_addr, 0, sizeof(server_addr));
Let's configure the server address to bind to.
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(8080);
server_addr.sin_addr.s_addr = inet_addr("0.0.0.0");
sin_family must be set to AF_INET because we are using sockaddr_in, which is designed for IPv4 communication. No other value is valid for this structure.
sin_port can be set to any free port, but avoid well-known ports such as 22. The value must be passed through htons.
htons stands for Host to Network Short. It converts a 16-bit unsigned short from host byte order to network byte order, so that the port number is interpreted the same way by every system on the network.
To understand why this is necessary, consider that different systems store multi-byte integers in either little-endian or big-endian format, whereas network protocols consistently use big-endian. On a big-endian host htons changes nothing; on a little-endian host, such as x86, it swaps the two bytes.
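Finally, we can set sin_addr and its field s_addr using inet_addr. The inet_addr function takes an IPv4 address string in dotted-decimal notation and returns the address as a 32-bit value in network byte order.
We can also set sin_addr.s_addr to INADDR_ANY (conventionally written as htonl(INADDR_ANY)), which is equivalent to 0.0.0.0 and allows the system to accept connections on any available network interface.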
IANA Port Number Ranges
There are three ranges of port numbers defined by the Internet Assigned Numbers Authority (IANA):
Well-Known Ports (0-1023)
- These ports are reserved for specific protocols such as HTTP, SSH, and DNS.
- Avoid binding your own servers to ports within this range; on Linux, doing so typically requires root privileges.
Registered Ports (1024-49151)
- These ports are available for use in binding and other operations.
Dynamic/Ephemeral Ports (49152-65535)
- These ports are used by the system as ephemeral ports during communication.
Listening on a Socket
The listen() function in network programming is used to mark a socket as a passive socket, indicating that it will be used to accept incoming connection requests. This function is typically called on a server-side socket to enable it to listen for incoming connections from client sockets. The listen() function takes a backlog parameter, which specifies the maximum number of pending connections that can be queued before new connection requests are rejected.
listen(sockfd, 5);
The initial parameter represents the socket file descriptor, while the second parameter denotes the size of the queue.
It returns 0 on success and -1 on failure.
Accepting a Connection
Accepting a connection involves using the accept() function, which is typically called on a socket that is already bound to a specific port and is listening for incoming connections. This function returns a new socket file descriptor that can be used to communicate with the connected client. The accept() function is usually used in conjunction with the bind() and listen() functions to set up the server socket.
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
To handle incoming connections, you need to pass the socket file descriptor to the accept function. For each accepted connection, you must store the client's sockaddr information. To achieve this, you'll need to create a new sockaddr_in structure and pass a pointer to it, along with a pointer to its length, as the next two parameters to the accept function.
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);
int newsockfd = accept(sockfd, (struct sockaddr *)&client_addr, &client_len);
if (newsockfd < 0) {
    perror("accept");
    close(sockfd);
    return 1;
}
You should be aware that the accept function call is a blocking call. Once you invoke accept, it remains blocked until a client connects, and only then does it return.
printf("Accepted connection from %s:%d\n",
       inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port));
const char *response = "HTTP/1.1 200 OK\r\n"
                       "Content-Type: text/plain\r\n"
                       "Content-Length: 13\r\n"
                       "\r\n"
                       "Hello, World!";
write(newsockfd, response, strlen(response));
close(newsockfd);
Upon accepting a connection, the details of the connection, such as the client's IP address and port, are stored in the sockaddr_in structure that we provided. These details can be retrieved from this structure, as the printf call does with inet_ntoa and ntohs.
Next, we'll create an HTTP 1.1 response to send back to the client. For now, we'll manually send a simple "hello world" string. Don't worry about the HTTP protocol details; it's text-based, and you can understand what's happening just by reading the response value.
Currently, we use the write call just as we would with any other file descriptor. We pass the file descriptor, the buffer, and the number of bytes to write.
We then terminate the connection.
curl and Check
Let's now use the curl command to establish a connection with the server.
razor@beast:~$ curl localhost:8080
Hello, World!
Here's what we've got: the "hello world" text displayed in the terminal. We've just created a basic HTTP server using C.
Accepting Connections with a while Loop
You may have noticed that when you attempt to use curl again, the connection fails because the server shut down after handling the first connection. The accept call waits for one connection, the code after it sends the response, and then the program reaches its end and terminates. To prevent this, you can wrap the entire accept connection code in an infinite while loop.
while (1) {
    struct sockaddr_in client_addr;
    socklen_t client_len = sizeof(client_addr);
    int newsockfd = accept(sockfd, (struct sockaddr *)&client_addr, &client_len);
    if (newsockfd < 0) {
        perror("accept");
        close(sockfd);
        return 1;
    }
    printf("Accepted connection from %s:%d\n",
           inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port));
    const char *response = "HTTP/1.1 200 OK\r\n"
                           "Content-Type: text/plain\r\n"
                           "Content-Length: 13\r\n"
                           "\r\n"
                           "Hello, World!";
    write(newsockfd, response, strlen(response));
    close(newsockfd);
}
Each time accept is called, it blocks until a connection request arrives. When one does, the code that follows runs and sends the response. Because of the infinite while loop, accept is then called again, and the process blocks once more, waiting for the next connection.