Thank you; java.net (e.g. java.net.http.HttpRequest) looks like it might be on the right track, but I’m still not sure it’s exactly what I’m after.
Hopefully this concrete pseudocode/example will make it crystal clear:
I need to be able to take a raw HTTP request like this:
GET /hello/world HTTP/1.1
Host: www.google.com
user-agent: curl/7.68.0
accept: */*
…parse it into some data structure that can be inspected and manipulated, e.g.:
request.method == "GET"
request.host == "www.google.com"
...
request.headers.set('Authorization', 'abc')
…and then serialize it back into a raw HTTP request like the one it started as.
I also need to be able to do this for HTTP responses.
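To make the shape of what I'm after more concrete, here is a minimal Java sketch of the parse/inspect/serialize round trip for the request above. RawHttpRequest and its members are hypothetical names I made up for illustration; they are not part of java.net or any library I know of. The sketch assumes CRLF line endings, no body, and no repeated header names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class for illustration only; not part of java.net or any library.
final class RawHttpRequest {
    String method;
    String target;
    String version;
    // Insertion-ordered so headers serialize in the order they were received.
    // Assumption: no repeated header names (a repeat would overwrite the earlier value).
    final Map<String, String> headers = new LinkedHashMap<>();

    // Parses the request line and headers. Assumes CRLF line endings and ignores any body.
    static RawHttpRequest parse(String raw) {
        String head = raw.split("\r\n\r\n", 2)[0];
        String[] lines = head.split("\r\n");
        String[] requestLine = lines[0].split(" ", 3);
        if (requestLine.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }
        RawHttpRequest request = new RawHttpRequest();
        request.method = requestLine[0];
        request.target = requestLine[1];
        request.version = requestLine[2];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed header line: " + lines[i]);
            }
            String name = lines[i].substring(0, colon);
            String value = lines[i].substring(colon + 1).trim();
            request.headers.put(name, value);
        }
        return request;
    }

    // Header names are case-insensitive in HTTP, so look them up ignoring case.
    String header(String name) {
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            if (entry.getKey().equalsIgnoreCase(name)) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Writes the request line, each header, and the terminating blank line.
    String serialize() {
        StringBuilder out = new StringBuilder();
        out.append(method).append(' ').append(target).append(' ').append(version).append("\r\n");
        headers.forEach((name, value) -> out.append(name).append(": ").append(value).append("\r\n"));
        return out.append("\r\n").toString();
    }
}
```

With something like this, RawHttpRequest.parse(raw).header("host") would give "www.google.com", and calling headers.put("Authorization", "abc") before serialize() would append the new header to the output. A response type would look the same apart from having a status line instead of a request line.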
Something subtle I glossed over, but which the Python h11 library provides (and which I require), is the ability to know when a request/response has been completely received/parsed. E.g. if I’m reading a request/response from a TCP socket, I need to know when to stop reading from the socket (i.e. when I have the whole request/response).
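As a rough illustration of the "when do I stop reading" check, here is a small Java sketch that inspects the bytes received so far and reports whether a whole message is present. HttpFraming and completeMessageLength are hypothetical names, not an existing API. The sketch only understands Content-Length framing: it refuses Transfer-Encoding, and it treats a missing Content-Length as "no body", which is a simplification that does not hold for every response.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of java.net or any library.
final class HttpFraming {
    // Returns the length in bytes of the first complete message in buf[0..len),
    // or -1 if more data must be read from the socket.
    // Assumption: the body, if any, is framed by Content-Length only.
    static int completeMessageLength(byte[] buf, int len) {
        int headerEnd = indexOfBlankLine(buf, len);
        if (headerEnd < 0) {
            return -1; // the header section has not fully arrived yet
        }
        String head = new String(buf, 0, headerEnd, StandardCharsets.ISO_8859_1);
        String[] lines = head.split("\r\n");
        int contentLength = 0;
        // Start at 1 to skip the request line or status line.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                continue; // malformed header line; ignored in this sketch
            }
            String name = lines[i].substring(0, colon).trim();
            String value = lines[i].substring(colon + 1).trim();
            if (name.equalsIgnoreCase("Transfer-Encoding")) {
                throw new UnsupportedOperationException("Transfer-Encoding is not handled in this sketch");
            }
            if (name.equalsIgnoreCase("Content-Length")) {
                contentLength = Integer.parseInt(value);
            }
        }
        int total = headerEnd + 4 + contentLength; // 4 = length of CRLF CRLF
        return len >= total ? total : -1;
    }

    // Finds the index of the CRLF CRLF that ends the header section, or -1.
    static int indexOfBlankLine(byte[] buf, int len) {
        for (int i = 0; i + 3 < len; i++) {
            if (buf[i] == '\r' && buf[i + 1] == '\n' && buf[i + 2] == '\r' && buf[i + 3] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

A read loop would append each chunk from the socket to a buffer and call completeMessageLength after every read, stopping as soon as the result is non-negative. What I'm hoping for is a library that does this properly, including chunked bodies, the way h11 does.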
This is copied from the README of the h11 GitHub project:
At a high level, working with h11 goes like this:
- First, create an h11.Connection object to track the state of a single HTTP/1.1 connection.
- When you read data off the network, pass it to conn.receive_data(...); you’ll get back a list of objects representing high-level HTTP “events”.
- When you want to send a high-level HTTP event, create the corresponding “event” object and pass it to conn.send(...); this will give you back some bytes that you can then push out through the network.
For example, a client might instantiate and then send a h11.Request object, then zero or more h11.Data objects for the request body (e.g., if this is a POST), and then a h11.EndOfMessage to indicate the end of the message. Then the server would then send back a h11.Response, some h11.Data, and its own h11.EndOfMessage. If either side violates the protocol, you’ll get a h11.ProtocolError exception.
The java.net classes I’ve seen seem to be coupled to the notion of sending/receiving requests/responses (to get objects of the classes) or to manually creating the objects (rather than the objects being created by parsing raw HTTP requests/responses).