Sockets with Python Intro




Sockets are used in networking. The idea of a socket is to aid in the communication between two entities. When you view a website, you are opening a port and connecting to that website via sockets. In this, you are the client, and the website is the server. Quite literally, you are served data.

What are Ports and what are Sockets?

A natural point of confusion here is the difference between sockets and ports. You can think of a port much like a shipping port, where boats dock at the port and unload goods. Then, you can think of the ship itself as the socket. The ocean is the internet. Much like a ship at a shipping port, a socket (our ship in this metaphor) is bound to a specific port. Docking at a different port is not allowed, for ships or sockets!
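To make the metaphor a little more concrete, here is a minimal sketch that binds a socket to a port on your own machine and then asks which port it ended up with. Binding to port 0 (which lets the operating system pick a free port) and the variable name `bound_port` are choices made for this illustration only.

```python
import socket

# Create a TCP socket (the "ship") and bind it to a port (the "dock").
# Port 0 asks the operating system to pick any free port for us.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(("127.0.0.1", 0))
    host, bound_port = s.getsockname()
    print("Socket is bound to", host, "on port", bound_port)
```

The port number will likely differ on each run, since the operating system hands out whichever port is free.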

Now, let's go ahead and play with ports and sockets in Python! This can be a slightly confusing topic, so I will do my best to document everything. The video should help as well if you are finding yourself confused.

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s)
		

So, we must import socket to use it. This module is included with your Python 3 distribution.

Next, "s" here is assigned the socket object returned by socket.socket. AF_INET selects IPv4 addressing, and SOCK_STREAM selects TCP. We then print "s" to show what this looks like.
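If you would rather inspect the socket than just print it, something like the following should work. This is only a small sketch; it uses the `family`, `type`, and `fileno()` members of the socket object.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The address family and socket type we asked for are stored on the object.
print(s.family == socket.AF_INET)    # IPv4 addressing
print(s.type == socket.SOCK_STREAM)  # TCP
# fileno() is the operating system's descriptor for this socket.
print(s.fileno())

# Close the socket when finished so the descriptor is released.
s.close()
```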

Generally, we use sockets to communicate between a couple of places, so let's show an example of that. One of the most common transmissions of data is between a "client" and "server," most often in the case of a user visiting a website and being served web-content, much like you are being served this page right now. Sockets did that for you.

server = 'reddit.com'

port = 80

server_ip = socket.gethostbyname(server)
print(server_ip)
		

Many public websites will have port 80 open, which is for HTTP access. Many servers also have port 22 open, which is for SSH (secure shell), and some will have 20 and 21 open, which are used for FTP (File Transfer Protocol). If a website uses HTTPS, then port 443 will be open as well. Sometimes HTTPS is required. This website, for example, forces HTTPS, so you can't use regular HTTP.
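As a quick way to look these numbers up from Python, here is a hedged sketch. The helper `port_for` and the `KNOWN_PORTS` table are names made up for this example. `socket.getservbyname` is a real function, but it relies on the operating system's services database, so the table is there as a fallback.

```python
import socket

# Well-known ports mentioned above, used as a fallback table.
KNOWN_PORTS = {"ftp-data": 20, "ftp": 21, "ssh": 22, "http": 80, "https": 443}

def port_for(service):
    """Return the conventional TCP port for a service name."""
    try:
        # Ask the operating system's services database first.
        return socket.getservbyname(service, "tcp")
    except OSError:
        # Not every system ships that database, so fall back to our table.
        return KNOWN_PORTS[service]

for name in ("ftp", "ssh", "http", "https"):
    print(name, port_for(name))
```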

Here is some more information on open ports and hacking:

Do open ports mean you are going to be hacked?

It is a common misconception, perpetuated by the media, that an "open port" is all one needs to "hack" something. The truth is, all websites have open ports, but each port is expecting a specific socket (ship in our metaphor from before), and that specific socket's type of payload of data (ship's cargo) is also known and expected beforehand.

Thus, in our metaphor, if we have a ship that is supposed to be bringing 50 crates full of coffee, but has instead brought over 50 crates of swordfish, immediate red flags are thrown. The same is true with sockets and ports. The socket / ship can be denied.

Then how do hackers get in?

The way sockets and ports are abused by hackers is by taking advantage of vulnerabilities in the programs that have opened specific ports. Every program that uses the internet to provide you a service uses ports, and opens them to the world. Take Skype for example. Skype uses ports 80 and 443. You already know what port 80 is for. Port 443 is conventionally used for HTTPS, that is, HTTP over an encrypted connection, though a program can send other kinds of traffic over it as well. Via port 443, Skype is expecting a certain type of data, but maybe their security is not perfect, and people are able to use port 443 maliciously because Skype's protocol is not perfectly secure.

Thus, what hackers tend to do is scan for open ports. From the open ports, many times, they can deduce what programs you are running, and proceed to try various attacks against that program's vulnerabilities, especially the historical ones that are generally made public. This is why it is important to keep your software up to date. Most software updates contain security upgrades, fixes, or patches. Even if not specifically explained, the very act of patching an area of code can alert someone that there was something weak there before.
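To see what "checking whether a port is open" means in code, here is a minimal sketch that only ever touches your own machine. The function name `is_port_open` and the one-second timeout are assumptions made for this example. Only check hosts that you own or have permission to test.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code on a failed
        # connection, rather than raising an exception the way connect does.
        return s.connect_ex((host, port)) == 0

# Open a listening socket on this machine so there is a port to find.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
open_port = listener.getsockname()[1]

print(is_port_open("127.0.0.1", open_port))
listener.close()
```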

So, above, we looked up the IP address of reddit.com by using gethostbyname(). Note that this is only a DNS lookup. We stored port 80 in a variable, but we have not actually connected to anything yet.
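gethostbyname() returns a single IPv4 address, but a hostname can map to several. If you want to see all of them, a sketch along these lines should work. The helper name `resolve_ipv4` is made up for this example, and "localhost" is used so that the snippet runs without an internet connection.

```python
import socket

def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})

print(resolve_ipv4("localhost"))
```

Swapping in the `server` variable from above instead of "localhost" would list the addresses for that site.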

Now, let's make a request, making sure it is in-line with what the port will find acceptable from our socket:

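# HTTP requires CRLF ("\r\n") line endings; a blank line ends the headers.
# "Connection: close" asks the server to close the socket once it has replied.
request = "GET / HTTP/1.1\r\nHost: " + server + "\r\nConnection: close\r\n\r\n"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((server, port))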
	  

Above, we defined our request as an HTTP request, where we wanted to "GET" data from the "Host" of reddit.com.
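If you find yourself building these request strings often, a small helper can keep the line endings straight. This is only a sketch. The function name `build_get_request` is made up for this example, and the "Connection: close" header is an addition that asks the server to hang up once it has replied.

```python
def build_get_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as bytes, ready for s.send()."""
    lines = [
        "GET " + path + " HTTP/1.1",
        "Host: " + host,
        "Connection: close",  # ask the server to close the socket after replying
        "",                   # blank line marks the end of the headers
        "",
    ]
    # HTTP separates header lines with CRLF ("\r\n").
    return "\r\n".join(lines).encode()

print(build_get_request("reddit.com"))
```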

Next, we defined our socket in the same manner as we had before.

Finally, we make our connection to reddit.com on port 80. This is just a connection. We have defined our request, but not actually made any request, so let's make the request:

s.send(request.encode())
result = s.recv(4096)

print(result)
		

First we're sending the request, encoding it to bytes as we do.

Then we're using s.recv to receive the resulting data. The 4096 is the buffer size, meaning the maximum number of bytes to receive in one call, so that you receive the data in manageable chunks rather than all at once. Finally, we're just printing the result. (It should be noted that this prints only the first chunk of the response, so the buffer in this case is almost a waste.)

With Python 3, one of the major changes from Python 2 was the differing treatment of strings and bytes. If you want to send a request that is a string, you need to encode it to bytes first. You will also need to decode any response that you wish to treat like a string. You should just get into the habit mentally that text you send out over the internet needs to be encoded, and text you receive needs a .decode, every time! Python 2 implicitly handled this for us. Python 3 requires us to be explicit, which is more Pythonic anyway.
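Here is a short illustration of that round trip between strings and bytes. UTF-8 is named explicitly here for clarity. It is also what .encode() and .decode() use by default in Python 3.

```python
text = "GET / HTTP/1.1"

# str -> bytes before sending
data = text.encode("utf-8")
print(type(data), data)

# bytes -> str after receiving
round_trip = data.decode("utf-8")
print(type(round_trip), round_trip)

# Non-ASCII characters take more than one byte in UTF-8.
print(len("café"), len("café".encode("utf-8")))
```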

One of the main pillars of Python is that "explicit is better than implicit." If you have not yet, open a console, and do the following import:

import this

Since I said the buffer was almost a waste, I should probably show how to make the output actually buffer as well. Here's how:

Instead of using print(result), comment or delete that, then do:

while (len(result) > 0):
    print(result)
    result = s.recv(4096)
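If you would rather collect the whole response than print each chunk, the same loop can be wrapped in a function. This is a sketch under a couple of assumptions. The name `recv_all` is made up, and the demonstration uses a local socket pair instead of a real website so it runs without a network. Keep in mind that the loop only ends when the other side closes the connection. An HTTP/1.1 server may keep the connection open for a while unless the request includes a "Connection: close" header.

```python
import socket

def recv_all(sock, bufsize=4096):
    """Read from sock until the other side closes; return all bytes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # b"" means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)

# Demonstrate with a connected pair of local sockets, no network needed.
a, b = socket.socketpair()
a.sendall(b"x" * 5000)
a.close()  # closing the sender is what ends the loop on the receiver
data = recv_all(b, bufsize=1024)
b.close()
print(len(data))
```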
	  

If you wanted to do this with an HTTPS forcing website, such as PythonProgramming.net, you would instead do something like:

import socket
import ssl

# create_default_context() enables certificate verification and hostname
# checking, loads the system's trusted CA certificates, and negotiates a
# modern TLS version (most servers now reject TLS 1.0).
context = ssl.create_default_context()

server = "pythonprogramming.net"
port = 443
server_ip = socket.gethostbyname(server)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s = context.wrap_socket(s, server_hostname=server)

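# HTTP requires CRLF line endings; "Connection: close" lets the loop below end.
request = "GET / HTTP/1.1\r\nHost: " + server + "\r\nConnection: close\r\n\r\n"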

s.connect((server,port))
s.send(request.encode())
result = s.recv(4096)

while (len(result) > 0):
    print(result)
    result = s.recv(4096)
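Once you have the raw bytes back, you will usually want to separate the headers from the body. Here is a minimal sketch of that. The function name `split_response` and the sample response are made up for this example, and it does not handle things like chunked transfer encoding, which a real HTTP library would take care of.

```python
def split_response(raw):
    """Split raw HTTP response bytes into (status_line, headers, body)."""
    # The headers end at the first blank line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

# A made-up response used only for this demonstration.
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status_line, headers, body = split_response(sample)
print(status_line)
print(headers["content-type"])
print(body)
```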
