Sunday, October 31, 2021

[SOLVED] Encoding an HTTP request in Python

Issue

Short version: Is there any easy API for encoding an HTTP request (and decoding the response) without actually transmitting and receiving the encoded bytes as part of the process?

Long version: I'm writing some embedded software which uses paramiko to open an SSH session with a server. I then need to make an HTTP request across an SSH channel opened with transport.open_channel('direct-tcpip', <remote address>, <source address>).

requests has its transport adapters, which let you substitute your own transport. But the send interface provided by BaseAdapter just accepts a PreparedRequest object, which (a) doesn't provide the remote address in any useful way, so you need to parse the URL to find the host and port, and (b) doesn't provide an encoded version of the request, only a dictionary of headers and the encoded body (if any). It also gives you no help in decoding the response. HTTPAdapter defers the whole lot to urllib3: encoding the request, making the network connection, sending the bytes, receiving the response bytes and decoding the response.
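For point (a), the host and port can be recovered from the URL with the standard library. This is a minimal sketch; the helper name host_and_port and the DEFAULT_PORTS table are assumptions made for illustration, not part of requests.

```python
from urllib.parse import urlsplit

# Assumed defaults for URLs that do not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url):
    """Return (host, port) for a URL, falling back to the scheme default."""
    parts = urlsplit(url)
    # parts.hostname strips userinfo and the brackets around IPv6 literals.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port
```

With a PreparedRequest, the argument would be its url attribute.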

urllib3 likewise defers to http.client and http.client's HTTPConnection class has encoding and network operations all jumbled up together.

Is there a simple way to say, "Give me a bunch of bytes to send to an HTTP server," and "Here's a bunch of bytes from an HTTP server; turn them into a useful Python object"?


Solution

This is the simplest implementation I can come up with:

from http.client import HTTPConnection
import requests
from requests.structures import CaseInsensitiveDict
from urllib.parse import urlparse
from argparse import ArgumentParser

class TunneledHTTPConnection(HTTPConnection):
    def __init__(self, transport, *args, **kwargs):
        self.ssh_transport = transport
        HTTPConnection.__init__(self, *args, **kwargs)

    def connect(self):
        self.sock = self.ssh_transport.open_channel(
            'direct-tcpip', (self.host, self.port), ('localhost', 0)
        )

class TunneledHTTPAdapter(requests.adapters.BaseAdapter):
    def __init__(self, transport):
        self.transport = transport

    def close(self):
        pass

    def send(self, request, **kwargs):
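        # Let urlparse split out host and port; splitting on ':' by hand
        # breaks on IPv6 literals and on URLs that carry user info.
        parsed = urlparse(request.url)
        host = parsed.hostname
        # Fall back to the default HTTP port when the URL does not name one.
        port = parsed.port or 80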

        connection = TunneledHTTPConnection(self.transport, host, port)
        connection.request(method=request.method,
                           url=request.url,
                           body=request.body,
                           headers=request.headers)
        r = connection.getresponse()
        resp = requests.Response()
        resp.status_code = r.status
        resp.headers = CaseInsensitiveDict(r.headers)
        resp.raw = r
        resp.reason = r.reason
        resp.url = request.url
        resp.request = request
        resp.connection = connection
        resp.encoding = requests.utils.get_encoding_from_headers(resp.headers)
        requests.cookies.extract_cookies_to_jar(resp.cookies, request, r)
        return resp

if __name__ == '__main__':
    import paramiko

    parser = ArgumentParser()
    parser.add_argument('-p', help='Port the SSH server listens on', type=int, default=22)
    parser.add_argument('host', help='SSH server to tunnel through')
    parser.add_argument('username', help='Username on SSH server')
    parser.add_argument('url', help='URL to perform HTTP GET on')
    args = parser.parse_args()

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect(args.host, args.p, username=args.username)

    transport = client.get_transport()

    s = requests.Session()
    # Route requests for this URL through the SSH tunnel.
    s.mount(args.url, TunneledHTTPAdapter(transport))
    response = s.get(args.url)
    print(response.text)

It doesn't handle any of the options that BaseAdapter.send accepts (stream, timeout, verify, cert, proxies), and it completely ignores issues like connection pooling, but it gets the job done.
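If only the bytes-in, bytes-out part of the short version is needed, one possible approach is to hand http.client a fake socket. The sketch below assumes that HTTPConnection only calls sendall on its socket and that HTTPResponse only calls makefile, which is how CPython's http.client behaves but is not a documented guarantee. CaptureSocket, ReplaySocket, encode_request and decode_response are hypothetical names chosen for this example.

```python
import io
from http.client import HTTPConnection, HTTPResponse


class CaptureSocket:
    """Hypothetical socket stand-in that records bytes instead of sending them."""

    def __init__(self):
        self.sent = io.BytesIO()

    def sendall(self, data):
        self.sent.write(data)

    def close(self):
        pass


class ReplaySocket:
    """Hypothetical socket stand-in that serves bytes received elsewhere."""

    def __init__(self, data):
        self._data = data

    def makefile(self, mode="rb", *args, **kwargs):
        return io.BytesIO(self._data)


def encode_request(method, host, path, body=None, headers=None, port=80):
    """Return the bytes http.client would send for this request."""
    conn = HTTPConnection(host, port)
    # With sock already set, HTTPConnection does not call connect().
    conn.sock = CaptureSocket()
    conn.request(method, path, body=body, headers=headers or {})
    return conn.sock.sent.getvalue()


def decode_response(raw, method="GET"):
    """Parse raw response bytes into an http.client.HTTPResponse."""
    response = HTTPResponse(ReplaySocket(raw), method=method)
    response.begin()  # reads the status line and the headers
    return response
```

The encoded bytes could then be written to the SSH channel directly, and the bytes read back from it passed to decode_response, whose result offers the usual status, getheader() and read().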



Answered By - Tom