Sockets with Python Intro

Sockets are used in networking. The idea of a socket is to aid in the communication between two entities. When you view a website, you are opening a port and connecting to that website via sockets. In this, you are the client, and the website is the server. Quite literally, you are served data.

What are Ports and what are Sockets?

A natural point of confusion here is the difference between sockets and ports. You can think of a port much like a shipping port, where boats dock at the port and unload goods. Then, you can think of the ship itself as the socket. The ocean is the internet. Much like a ship assigned to a shipping port, a socket (our ship in this metaphor) is bound to a specific port. Docking at a different port is not allowed, for ships or sockets!

Now, let's go ahead and play with ports and sockets in Python! This can be a slightly confusing topic, so I will do my best to document everything. The video should help as well if you are finding yourself confused.

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s)

So, we must import socket to use it. This module is part of the standard library, so it is included with your Python 3 distribution.

Next, "s" here is assigned the socket object returned by socket.socket. AF_INET selects IPv4 addressing, and SOCK_STREAM selects TCP. We then print "s" to show what this looks like.
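If the printed output looks cryptic, it can help to pull the pieces out one at a time. The sketch below uses a separate, hypothetical variable name, demo, so that it does not clobber "s", and reads back the address family, the socket type, and the integer descriptor the operating system uses for the socket.

```python
import socket

demo = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

family = demo.family        # AF_INET: IPv4 addressing
kind = demo.type            # SOCK_STREAM: a TCP, connection-based socket
descriptor = demo.fileno()  # integer handle the operating system uses for this socket

print(family, kind, descriptor)

demo.close()
```

The descriptor number will differ from run to run; it only has meaning to your operating system.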

Generally, we use sockets to communicate between a couple of places, so let's show an example of that. One of the most common transmissions of data is between a "client" and "server," most often in the case of a user visiting a website and being served web-content, much like you are being served this page right now. Sockets did that for you.

server = 'reddit.com'

port = 80

server_ip = socket.gethostbyname(server)
print(server_ip)

Many public websites will have port 80 open, which is for HTTP access. Many servers will also have port 22 open, which is for SSH (secure shell), and some will have 20 and 21 open, which are used for FTP (File Transfer Protocol). If a website uses HTTPS, then port 443 will be open as well. Sometimes HTTPS is required: this website, for example, forces HTTPS, so you can't use regular HTTP.
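As a quick reference, the port numbers mentioned above can be collected in a dictionary. This is only a sketch: WELL_KNOWN_PORTS is a name made up for this example. It also asks your operating system's services database for the same names with socket.getservbyname, and since that lookup can fail on stripped-down systems, it is wrapped in a try/except.

```python
import socket

# Conventional port numbers for the services mentioned above
WELL_KNOWN_PORTS = {
    "ftp-data": 20,
    "ftp": 21,
    "ssh": 22,
    "http": 80,
    "https": 443,
}

for name, expected in WELL_KNOWN_PORTS.items():
    try:
        # Ask the operating system's services database for the TCP port
        found = socket.getservbyname(name, "tcp")
    except OSError:
        found = None  # service name not listed on this system
    print(name, expected, found)
```

These numbers are conventions, not guarantees. A server can run any service on any port it likes.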

Here is some more information on open ports and hacking.

Do open ports mean you are going to be hacked?

It is a common misconception, perpetuated by the media, that an "open port" is all one needs to "hack" something. The truth is, all websites have open ports, but each port is expecting a specific socket (ship in our metaphor from before), and that specific socket's type of payload of data (ship's cargo) is also known and expected beforehand.

Thus, in our metaphor, if we have a ship that is supposed to be bringing 50 crates full of coffee, but has instead brought over 50 crates of swordfish, immediate red flags are thrown. The same is true with sockets and ports. The socket / ship can be denied.

Then how do hackers get in?

The way sockets and ports are abused by hackers is by taking advantage of vulnerabilities in the programs that have opened specific ports. Every program that uses the internet to provide you a service uses ports, and opens them to the world. Take Skype for example. Skype uses ports 80 and 443. You already know what port 80 is for. Port 443 is normally used for HTTPS, the encrypted counterpart to port 80's HTTP connections. Via port 443, Skype is expecting a certain type of data, but maybe its security is not perfect, and people are able to use port 443 maliciously because Skype's protocol is not perfectly secure.

Thus, what hackers tend to do is scan for open ports. From the open ports, many times, they can deduce what programs you are running, and proceed to try various attacks against that program's vulnerabilities, especially the historical ones that are generally made public. This is why it is important to keep your software up to date. Most software updates contain security upgrades, fixes, or patches. Even if not specifically explained, the very act of patching an area of code can alert someone that there was something weak there before.
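To get a feel for what "scanning" a single port means, here is a minimal sketch. The helper name is_port_open is made up for this example. It relies on connect_ex, which returns an error number instead of raising an exception when the connection fails. So that the example only touches your own machine, it first opens a listener on localhost and then checks that port.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error number otherwise
        return probe.connect_ex((host, port)) == 0

# Open a listener of our own so there is a known open port to test against.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

found_open = is_port_open("127.0.0.1", open_port)
print("port", open_port, "open?", found_open)

listener.close()
```

Notice that an open port only tells you something is listening. It says nothing yet about what program is behind it or whether that program has a weakness.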

So, above, we chose reddit.com and port 80, and we were able to determine the server's IP address by using gethostbyname(). Note that this only looks up the address; nothing has been sent to the server yet.

Now, let's make a request, making sure it is in-line with what the port will find acceptable from our socket:

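# HTTP lines end with \r\n; "Connection: close" asks the server to hang up once it has replied
request = "GET / HTTP/1.1\r\nHost: " + server + "\r\nConnection: close\r\n\r\n"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((server, port))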

Above, we defined our request as an HTTP request, where we wanted to "GET" data from the "Host" of reddit.com.
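Since the exact layout of the request matters, here is a small sketch of a helper that assembles one. The name build_get_request is hypothetical. HTTP asks for each line to end with a carriage return plus newline (\r\n), and an empty line marks the end of the headers. The "Connection: close" header is an addition of mine, on the assumption that you want the server to close the connection after it has replied.

```python
def build_get_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as bytes, ready to send on a socket."""
    lines = [
        "GET " + path + " HTTP/1.1",
        "Host: " + host,
        "Connection: close",  # assumption: ask the server to close when finished
        "",                   # the two empty strings produce the blank line
        "",                   # that marks the end of the headers
    ]
    return "\r\n".join(lines).encode()

example = build_get_request("reddit.com")
print(example)
```

Building the request in one place like this makes it harder to forget a line ending, which is an easy mistake to make when concatenating strings by hand.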

Next, we defined our socket in the same manner as we had before.

Finally, we make our connection to reddit.com on port 80. This is just a connection. We have defined our request, but not actually made any request, so let's make the request:

s.send(request.encode())
result = s.recv(4096)

print(result)

First we're sending the request, and encoding it.

Then we're using s.recv to receive the resulting data. The 4096 is the buffer size: the maximum number of bytes that a single call to recv will return, so that you receive the data in manageable chunks rather than all at once. Finally, we're just printing the result. (Though it should be noted this is printing only the first chunk of the response, so the buffer in this case is almost a waste.)
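You can watch the buffer size at work without touching the network. This sketch uses socket.socketpair(), which hands back two sockets that are already connected to each other on your own machine. The names left and right are just for this example.

```python
import socket

# socketpair() returns two already-connected sockets on this machine
left, right = socket.socketpair()

left.sendall(b"0123456789")   # ten bytes go in one end

first = right.recv(4)         # ask for at most 4 bytes
rest = right.recv(4096)       # ask for up to 4096; only 6 remain

print(first, rest)

left.close()
right.close()
```

The number passed to recv is an upper limit, not a promise. You get whatever is available, up to that many bytes.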

With Python 3, one of the major changes from Python 2 was the differing treatment of strings and bytes. If you want to make a request that is a string, you need to encode it. You will also need to decode any return that you wish to treat like a string. You should just get into the habit mentally that everything you send out over the internet needs to be encoded, and anything you receive needs a .decode before you can treat it as text. Python 2 implicitly handled this for us. Python 3 requires us to be explicit, which is more Pythonic anyway.
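Here is that round trip in isolation, as a small sketch with made-up variable names. The last two lines show one thing to watch for: received bytes are not always valid UTF-8, and passing errors="replace" swaps the bad bytes for a placeholder character instead of raising an exception.

```python
text = "GET / HTTP/1.1"

as_bytes = text.encode()           # str -> bytes (UTF-8 by default)
back_to_text = as_bytes.decode()   # bytes -> str

print(type(as_bytes), type(back_to_text))
print(as_bytes)

# Bytes that are not valid UTF-8 can still be decoded leniently
messy = b"caf\xe9"
lenient = messy.decode("utf-8", errors="replace")
print(lenient)
```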

One of the main pillars of Python is that "explicit is better than implicit." If you have not yet, open a console, and do the following import:

import this

Since I said the buffer was almost a waste, I should probably show how to keep reading until the whole response has arrived. Here's how:

Instead of using print(result), comment or delete that, then do:

while (len(result) > 0):
    print(result)
    result = s.recv(4096)
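If you would rather collect the whole response than print it piece by piece, the same loop can be wrapped in a function. This is a sketch, and recv_all is a hypothetical name. It keeps reading until recv returns empty bytes, which is how a closed connection shows up. The demonstration uses a local socketpair instead of a real server, with a deliberately small buffer so that several chunks are needed.

```python
import socket

def recv_all(sock, bufsize=4096):
    """Read from sock until the other side closes; return all bytes received."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # empty bytes means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Local demonstration with a connected pair of sockets
sender, receiver = socket.socketpair()
sender.sendall(b"x" * 3000)
sender.close()                 # closing signals "no more data"

data = recv_all(receiver, bufsize=1024)
print(len(data))
receiver.close()
```

Keep in mind that a loop like this only finishes when the other side closes the connection. A server that keeps the connection open will leave it waiting.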

If you wanted to do this with a website that forces HTTPS, such as PythonProgramming.net, you would instead do something like:

import socket
import ssl

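# create_default_context() turns on certificate verification and hostname checking,
# loads the system's trusted CA certificates, and negotiates a modern TLS version.
# (Pinning ssl.PROTOCOL_TLSv1 is deprecated, and many servers now refuse TLS 1.0.)
context = ssl.create_default_context()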

server = "pythonprogramming.net"
port = 443
server_ip = socket.gethostbyname(server)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s = context.wrap_socket(s, server_hostname=server)

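# HTTP lines end with \r\n; "Connection: close" lets the read loop below finish
request = "GET / HTTP/1.1\r\nHost: " + server + "\r\nConnection: close\r\n\r\n"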

s.connect((server,port))
s.send(request.encode())
result = s.recv(4096)

while (len(result) > 0):
    print(result)
    result = s.recv(4096)
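If you are unsure what an SSL context will actually enforce, you can inspect it before connecting. This sketch uses ssl.create_default_context(), the standard library's helper for building a client-side context, and checks the two settings that matter most for safety. The variable names are made up for this example.

```python
import ssl

context = ssl.create_default_context()

# Will the server's certificate be validated against trusted authorities?
verifies_certs = context.verify_mode == ssl.CERT_REQUIRED

# Will the certificate be checked against the hostname we asked for?
checks_hostname = context.check_hostname

print(verifies_certs, checks_hostname)
```

If either of these comes back False on a context you built by hand, the connection would still be encrypted, but you would have no assurance about who is on the other end.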
