Socket programming is the practice of writing programs that talk to each other through sockets. It is used in a variety of applications, from chatbots, web applications and file transfer programs to inter-process communication within the operating system. In this article, I’ll focus on network sockets, and we’ll go through making our own client-server interaction in Python3 at the end.
Objectives :
- Understand what sockets are. Not just the ‘what’ but also the ‘why’.
- Code a simple client-server script in Python3, using and understanding the basic functions associated with the socket module.
Pre-requisites :
- Python3 on your system ( Linux, Windows or macOS )
Sockets
Sockets “are” file descriptors in Unix. That’s it.
Great. What are file descriptors though?
To understand that, let’s take a bit of a detour talking about the kernel and something called the POSIX API.
POSIX
POSIX stands for ‘Portable Operating System Interface’ (the ‘X’ is a nod to its Unix heritage). It is a family of standards developed by the IEEE Computer Society in order to unify UNIX and UNIX-like platforms. macOS, Solaris and QNX Neutrino are POSIX certified operating systems. Linux is NOT. It adheres to most of the POSIX standards but is not certified. The POSIX API (Application Programming Interface) is the set of functions that these standards define. More specifically, it is the interface that programs use to ask the operating system for services such as opening files or creating sockets.
Just remember POSIX == standards for now. You’ll see its value in a few moments.
Kernel
Now the kernel is the core program of the operating system. It manages the system’s processes, memory and hardware, and every request that a program makes for a system resource goes through it. Naturally, due to the sensitive nature of the tasks it performs, it runs in a protected region of memory referred to as ‘kernel space’.
Why? What is the need for a special allocation of memory?
Because the kernel keeps a file-descriptor table for each process, and each entry in that table points into a system-wide table called the file-table. The file-table’s entries are what the file descriptors ultimately refer to. So to put it simply, a file descriptor is an index into the process’s descriptor table, and the entry at that index leads to an open file in the file-table.
Figure 1 : Descriptor, File and Inode Tables
The inode table is simply a third table which indexes the location of the file that is being accessed/modified.
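To make the three tables a little more concrete, here is a minimal sketch using Python's os module. The file name demo.txt and the variable names are assumptions made up for illustration. It opens a file, duplicates the descriptor with os.dup(), and checks that the two descriptors lead to the same file-table entry (they share one read position) and the same inode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical demo file, created only for this sketch.
    path = os.path.join(tmp, "demo.txt")
    with open(path, "wb") as f:
        f.write(b"abcdef")

    fd = os.open(path, os.O_RDONLY)   # a new entry in this process's descriptor table
    fd_copy = os.dup(fd)              # a second descriptor for the same open file

    first = os.read(fd, 3)            # reads b"abc" and advances the shared offset
    second = os.read(fd_copy, 3)      # continues where the first read stopped

    # Both descriptors resolve to the same inode.
    same_inode = os.fstat(fd).st_ino == os.fstat(fd_copy).st_ino

    os.close(fd)
    os.close(fd_copy)

print(fd, fd_copy, first, second, same_inode)
```

The two descriptor numbers differ, yet the second read picks up where the first one left off, which is what you would expect if both point at one shared file-table entry.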
So how does this tie into the sockets discussion?
File Descriptors
Well, remember how I talked about sockets “being” file descriptors? There are three special file descriptors that are associated with every process by default.
In Unix and many Unix-like systems these file descriptors are represented by the integer datatype (like in C). File descriptors aren’t just integers, they are non-negative integers.
{ These characteristics are described by the POSIX API. See, that detour was useful! }
The three file descriptors :
- STDIN - this is assigned to the integer ‘0’.
- STDOUT - this is assigned to the integer ‘1’.
- STDERR - this is assigned to the integer ’2’.
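As a quick illustration, the sketch below skips print() and writes to descriptors 1 and 2 directly with os.write(). It assumes the script is run in the usual way, with standard output and standard error open; the message text is made up.

```python
import os

message = b"written straight to descriptor 1 (STDOUT)\n"

# os.write() takes a raw file descriptor number, not a file object.
bytes_written = os.write(1, message)
os.write(2, b"and this line goes to descriptor 2 (STDERR)\n")
```

Both lines show up in the terminal, even though no file object was ever opened by the script.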
Coming back to the sockets discussion, it is helpful to think of sockets as a communication endpoint that is represented with the help of file descriptors. The numbers are a mapping to the open files per process. Hopefully, this helps you understand the relevance of file descriptors and why it was necessary to learn about them.
But, a question that could arise in your mind is : “Why file descriptors? Why not some other object oriented paradigm of data?”
The answer is actually pretty simple. “Everything in Unix is a file.” More specifically, “Everything is a file descriptor”. All the pipes, network sockets, files and other resources are represented with the help of file descriptors in Unix.
But wait! How can you verify what I just said? I certainly wouldn’t believe a stranger on the Internet!
Let’s verify this by taking a quick example. Open up your terminal and use the interactive Python environment by typing in :
$ python3
Next, import the socket module that is available in the standard Python library.
>>> import socket
Create a simple socket by using the following syntax :
>>> s = socket.socket()
Print information about the socket by using :
>>> print(s)
You should get something like this :
<socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 0)>
Let’s break this down. This is where we can use that knowledge about file descriptors. For now, ignore what “family”, “type” and “proto” mean. We’ll expand on that later on.
Remember how we said that STDIN, STDOUT and STDERR are assigned to the numbers 0,1, and 2? And how the file descriptors are non-negative integers? Well what non-negative integer comes after 2? 3!
And that is the exact number that is assigned to the socket ‘s’! That is why we see fd=3!
Also, POSIX standards ( remember this? ) clarify why the number is three :
All functions that open one or more file descriptors shall, unless specified otherwise, automatically allocate the lowest number available (that is not already open in the calling process) to file descriptor at the time of each allocation. [1]
{edited for grammatical mistakes. If you wish to see the original statement please head over to the references section at the end of the article.}
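You can watch the “lowest number available” rule at work with a small sketch like the one below. The variable names are just for illustration, and the exact numbers printed depend on what your process already has open.

```python
import socket

first = socket.socket()
second = socket.socket()

# fileno() returns the file descriptor that backs each socket.
first_fd = first.fileno()
second_fd = second.fileno()
print(first_fd, second_fd)

first.close()
second.close()
```

The first socket takes the lowest free number, so the second one has to settle for a higher one.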
In case you’re wondering why the numbers have to be non-negative integers, there are two reasons :
- Working with integers is easier. When is the last time that you wrote your grocery list like 0.001, 0.002, 0.003? The same thing goes for file systems. It’s easier for the programmers to work with integers than floating types.
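- The integers are non-negative because negative values are reserved for signalling errors. For example, a POSIX call such as open() or socket() returns -1 when it fails and stores the actual error code, such as EACCES ( 13 on Linux ), in errno.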
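In Python you rarely see the raw return value, because a failed call is turned into an OSError that carries the error code. Here is a minimal sketch, assuming nothing beyond the standard library, that triggers such an error on purpose by closing the same descriptor twice.

```python
import errno
import os

read_fd, write_fd = os.pipe()
os.close(read_fd)
os.close(write_fd)

try:
    os.close(write_fd)          # already closed, so the kernel rejects the call
except OSError as exc:
    error_code = exc.errno                   # numeric error code
    error_name = errno.errorcode[exc.errno]  # symbolic name for that code
    print(error_code, error_name)
```

The errno module maps the numeric codes to their symbolic names, which is handy when reading error messages.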
File descriptors also have two additional characteristics :
- Once used, they are recycled.
- There is a limit to how many file descriptors can be used because there is a limit to how many files can be opened at a time.
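The recycling part is easy to see for yourself. The sketch below (variable names are illustrative) closes a socket and immediately opens another one. In a simple single-threaded script like this, the new socket should pick up the number that was just released.

```python
import socket

first = socket.socket()
old_fd = first.fileno()
first.close()
closed_fd = first.fileno()    # a closed socket no longer owns a descriptor

second = socket.socket()
new_fd = second.fileno()      # the released number is handed out again
second.close()

print(old_fd, closed_fd, new_fd)
```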
I hope that this provides you with a “good enough” understanding of sockets.
Alright, by this point you can safely check off that first objective :
- Understand what sockets are. Not just the ‘what’ but also the ‘why’.
It is important to realize that the points above illustrate what sockets are. These characteristics are not specific to only network sockets.
Let’s move on to the next objective now.
Network Interactions
Machines interact with each other over a network in a specific rule-based manner. This is known as a protocol. There are two main protocols that machines use in order to communicate with each other :
TCP
The Transmission Control Protocol ( TCP ) is a transport-layer protocol. Without going into a detailed discussion about the different layers ( and how they vary according to the model ), just know that the transport-layer communicates with other machines without worrying about how the data is “packaged” and “shipped” over the network. The transport layer is more concerned about how to ensure that data is transferred appropriately.
Now, TCP is a “connection-oriented” protocol. This means that the two machines first establish a connection, and TCP then makes sure that the data sent is received successfully and in order, resending it if necessary. There are a lot of applications for this, some of which are :
- File sharing (FTP)
- Web Applications
- Remote login (SSH, Telnet)
- Sending / Receiving Email (SMTP, IMAP, POP3)
UDP
The User Datagram Protocol ( UDP ) is another transport-layer protocol. It is a “connectionless” protocol. What this means is that the data is sent from one machine to another without regard to whether it was successfully received or not.
But we should check for some type of acknowledgement from the other machine right? Not all the time!
Imagine that you are playing an MMORPG like World of Warcraft. Would you want to wait 700ms for each combat ability? To the non-gamers reading this, you should note that anything below 100ms is considered acceptable. The higher your latency, the slower you can react ( and thus the more likely you are to lose ). This is where UDP is used.
Another example is that of video-calling applications. You would rather send the frames as they come rather than wait for frames to be sent, processed, acknowledged and then shown. That would take too long. It would also lead to very slow conversations.
But what’s a quick way to remember this?
As TCP sends acknowledgements for the data received, you can think of TCP as a letter sent with not only the receiver’s address but also with the sender’s address (and stamps!), so that a reply can always find its way back. UDP is just a letter with stamps that carries only the receiver’s address. If it’s lost, it’s gone.
This is a good analogy to remember when people refer to TCP being “expensive”. Just replace the ‘stamps’ with system resources and ‘addresses’ with IP addresses and ports.
Alright, but how does this tie in to network programming?
Well, we will be using sockets that use the TCP protocol to implement a simple scenario : client-server interaction.
As I lack web development skills, we’ll stick to a CLI ( Command Line Interface ).
The first order of business is to figure out what our Python3 scripts will do. Objective : Send data from the server to the client and from the client to the server, printing both on the console.
Alright, fire up your editor of choice and save two files : client.py and server.py.
Now, in client.py let’s type in the following :
import socket
This tells Python that we’ll be using the socket library from the standard Python libraries.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
There’s a lot going on here.
Till now, we talked about sockets in a generalized manner. But when communicating with other end-points, it’s necessary to define what type of communication is going to occur. This is why sockets are defined by three characteristics :
* Domain
* Type
* Transport protocol
If end-to-end nodes have to communicate with each other, then they must do so in a standardized manner ( Can you imagine if every operating system - or worse, every application - communicated in its own way? ). This is the responsibility of the sockets API ( the Berkeley sockets interface, which POSIX standardizes ). Within this API, there are different communication domains defined to help different types of communication to occur.
In this case, AF_INET defines the communication domain within which communication is going to take place. It selects the address family ( hence, the “AF” abbreviation! ) and its format. INET refers to IPv4 protocol addresses.
Next comes ‘type’. It specifies the type of sockets created. SOCK_STREAM refers to two way byte streams that are reliable and connection-oriented.
Finally, the ‘protocol’ defines the protocol used with the socket, such as TCP or UDP. Now, go back up to when we printed out fd = 3. Can you now understand what the rest of the statements mean?
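If you want to check your reading of that output, a socket object exposes the same three characteristics as attributes. The sketch below is only an illustration; the second socket, which names TCP explicitly through socket.IPPROTO_TCP, is an addition that the article's own code does not use.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# family and type are enum members; proto is a plain integer.
summary = (s.family.name, s.type.name, s.proto)
print(summary)
s.close()

# Naming the protocol explicitly instead of relying on the default.
explicit = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
explicit_proto = explicit.proto
print(explicit_proto)
explicit.close()
```

A proto value of 0 asks the system to pick the default protocol for the given family and type, which for AF_INET with SOCK_STREAM is TCP.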
Before moving on to the next code block, remember that there are two “states” of stream sockets : active and passive.
Active sockets are those that start connections : they call connect() to reach another socket. Passive sockets only wait for incoming connections and accept them; they cannot be used to start connection requests. A client has to start the connection to the server, so client sockets are active. A server’s listening socket just waits for clients, so it is passive.
On the next line, type :
s.connect((socket.gethostname(), 1234))
This is very simple. It fetches the hostname from the system and connects the socket to port 1234 on that host. Note the double parentheses : connect() takes the address as a single ( host, port ) tuple. ( Just remember to keep port numbers >= 1024. Port numbers < 1024 are reserved for special purposes. )
raw_server_data_received = s.recv(1024)
server_data_received = raw_server_data_received.decode("UTF-8")
Here, we are asking for at most 1024 bytes from the socket. Basically, we are setting a cap on the amount of data that is received in one call; recv() may return fewer bytes than that. This ensures that we read the stream of bytes ( not packets! ) in bounded chunks. If more bytes have arrived, the rest simply stays queued by the OS for the next recv() call.
‘raw_server_data_received’ is the actual content that is sent by the server, but it is in byte-form. We have to decode it in “UTF-8” format, so that we can read it.
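For messages longer than the cap, a common pattern is to call recv() in a loop until the other side closes the connection. Here is a minimal sketch of that idea. It uses socket.socketpair() (two already-connected sockets in one process) purely so that it runs without a server, and the 3000-character payload is made up for the demonstration.

```python
import socket

# Two connected sockets in the same process, standing in for client and server.
sender, receiver = socket.socketpair()

payload = ("x" * 3000).encode("UTF-8")
sender.sendall(payload)
sender.close()                     # closing signals that no more data will come

chunks = []
while True:
    chunk = receiver.recv(1024)    # returns at most 1024 bytes per call
    if not chunk:                  # b"" means the other side has closed
        break
    chunks.append(chunk)
receiver.close()

# Join the raw bytes first, then decode once at the end.
text = b"".join(chunks).decode("UTF-8")
print(len(chunks), len(text))
```

Decoding only after all the chunks are joined avoids cutting a multi-byte UTF-8 character in half at a chunk boundary.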
Now that we have received the message, we can print it out as :
print(server_data_received)
Note : This won’t work yet, because we have not defined any server to communicate with in the first place. Once the two scripts are defined, we can use this to print the message.
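To see what “won’t work yet” looks like in practice, here is a small sketch that tries to connect to a port where nothing is listening. The loopback address 127.0.0.1 and the throwaway probe socket used to find an unused port are assumptions made for this illustration; they are not part of client.py.

```python
import socket

# Ask the OS for a currently free port, then release it again.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
free_port = probe.getsockname()[1]
probe.close()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    client.connect(("127.0.0.1", free_port))   # nobody is listening here
    refused = False
except ConnectionRefusedError:
    refused = True
finally:
    client.close()

print("connection refused:", refused)
```

ConnectionRefusedError is a subclass of OSError, so a broader except OSError clause would catch it as well.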
s.send(bytes('Message sent from client-side.', "UTF-8"))
This sends data to the server. You might be wondering how this knows which server to send the data to? We haven’t defined an address, right?
We actually have. Look back at the connect() call. As we are using the local machine as both client and server, gethostname() refers to the server itself!
Alright, that looks good for the client side! To recap :
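import socket
# Create a TCP ( SOCK_STREAM ) socket that uses IPv4 ( AF_INET ) addresses.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# connect() takes a single ( host, port ) tuple.
s.connect((socket.gethostname(), 1234))
# Receive up to 1024 bytes from the server and decode them into text.
raw_server_data_received = s.recv(1024)
server_data_received = raw_server_data_received.decode("UTF-8")
print(server_data_received)
# Reply to the server.
s.send(bytes('Message sent from client-side.', "UTF-8"))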
On the server side, we type:
import socket
v = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
v.bind((socket.gethostname(), 1234))
v.listen(5)
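Here, bind() attaches the socket to an address ( host and port ). listen() is the method that makes the socket passive, so that it waits for incoming connections. Its argument ( 5 here ) is the backlog : the number of pending connections that can wait in the queue before the server accepts them.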
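While experimenting, a fixed port such as 1234 may already be in use on your machine. One way around that, sketched below as a suggestion rather than as part of server.py, is to bind to port 0 so that the OS picks a free port, and then ask the socket which address it ended up with. The loopback address 127.0.0.1 is also an assumption made for this sketch.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 : let the OS choose any free port
host, port = server.getsockname()    # the address the socket is actually bound to
server.listen(5)                     # start listening, with a backlog of 5
print(host, port)
server.close()
```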
Then, we type :
clientsocket, address = v.accept()
clientsocket.send(bytes('Message from the server.', "UTF-8"))
The accept() method returns two values : a new socket object for talking to the client, and the address of the client ( its IPv4 address and port ). This is important as we can specify which socket to communicate with ( because the server can handle multiple connections ).
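Here is a minimal sketch of what accept() hands back. To keep it in a single script, the client connects first and simply waits in the listen queue until accept() is called. The names listener and client, the loopback address and the OS-chosen port are all assumptions for this illustration.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
port = listener.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))    # queued by the OS until accept() runs

clientsocket, address = listener.accept()
client_ip, client_port = address       # the address is a ( host, port ) tuple

# The accepted socket's peer is exactly the client's own local address.
same_peer = clientsocket.getpeername() == client.getsockname()
is_new_socket = clientsocket is not listener
print(client_ip, client_port, same_peer, is_new_socket)

for sock in (clientsocket, client, listener):
    sock.close()
```

The listening socket itself never carries data; each accepted connection gets its own socket object.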
Next :
raw_client_message = clientsocket.recv(1024)
client_message = raw_client_message.decode("UTF-8")
print(client_message)
We can use the client socket object to recv bytes from that specific socket and decode it accordingly.
So, to sum up the server side :
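import socket
# Create a TCP ( SOCK_STREAM ) socket that uses IPv4 ( AF_INET ) addresses.
v = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# bind() takes a single ( host, port ) tuple.
v.bind((socket.gethostname(), 1234))
v.listen(5)
# Wait for a client, then greet it and read its reply.
clientsocket, address = v.accept()
clientsocket.send(bytes('Message from the server.', "UTF-8"))
raw_client_message = clientsocket.recv(1024)
client_message = raw_client_message.decode("UTF-8")
print(client_message)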
Now, the only thing left is to execute the scripts.
First, we’ll run the server script ( can you think why? ). In your terminal, type in :
python3 server.py
Then, in another terminal window, type in :
python3 client.py
You’ll get a message in the server terminal as :
Message sent from client-side.
In the client terminal, you should see :
Message from the server.
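If juggling two terminals gets tedious, the same exchange can be squeezed into one script by running the server side in a background thread. This is only a sketch for experimenting, not a replacement for client.py and server.py : the loopback address, the OS-chosen port, the received dictionary and the serve_one_client() helper are all assumptions introduced here.

```python
import socket
import threading

HOST = "127.0.0.1"     # assumption : loopback instead of gethostname()
received = {}          # collects what each side got, for printing at the end

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind((HOST, 0))               # assumption : OS-chosen port instead of 1234
listener.listen(5)
port = listener.getsockname()[1]

def serve_one_client():
    """Server side : greet one client, then read its reply."""
    clientsocket, _address = listener.accept()
    with clientsocket:
        clientsocket.sendall(b"Message from the server.")
        received["server"] = clientsocket.recv(1024).decode("UTF-8")

server_thread = threading.Thread(target=serve_one_client)
server_thread.start()

# Client side : connect, read the greeting, then answer.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect((HOST, port))
    received["client"] = client.recv(1024).decode("UTF-8")
    client.sendall(b"Message sent from client-side.")

server_thread.join()
listener.close()

print("client got :", received["client"])
print("server got :", received["server"])
```

Using the sockets in with blocks closes them automatically, which the two standalone scripts leave to the end of the program.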
Congratulations! You’ve successfully coded a simple client-server interaction. Let’s cross off the second objective now.
- Code a simple client-server script in Python3.
That certainly was a lot of information, but I’m confident that you’ll understand it in no time. Patience really is the key here.
Conclusion
Great job! Now, you should have a foundational understanding of sockets and a working knowledge of network sockets in Python. This isn’t the end by any means, but now you can explore the intricacies of network sockets with confidence.
Thank you for taking the time to read through the article, I hope you’ve learnt something new.
Bonus Questions
If you have the time, I think these are some interesting questions that you can find the answer to :
- Can you find out how many file descriptors can be opened at a given time on a standard Linux machine?
- Why specifically are the non-negative integers 0, 1 and 2 assigned to STDIN, STDOUT and STDERR?
- What socket type is used for UDP connections?
References :
5. Python3 Socket Module Documentation
6. Centex's Guide to Socket Programming
7. RealPython's Guide to Socket Programming
Note : I personally found Beej’s guide to be much more interesting after I developed a basic understanding. The points I have laid out should be enough to help you run through the guide with relative ease. It has a lot of information - I seriously urge you to read through it.