Thank you, java.net (e.g. java.net.http.HttpRequest) looks like it might be on the right track, but I’m still not sure it’s exactly what I’m after.
Hopefully this concrete pseudocode/example will make it crystal clear:
I need to be able to take a raw HTTP request like this:
GET /hello/world HTTP/1.1
Host: www.google.com
user-agent: curl/7.68.0
accept: */*
…parse it into some data structure that can be inspected and manipulated, e.g.:
request.method == "GET"
request.host == "www.google.com"
...
request.headers.set('Authorization', 'abc')
…and then serialize it back into a raw HTTP request (with any modifications applied).
I also need to be able to do this for HTTP responses.
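To make the round trip above concrete, here is a minimal Java sketch of the kind of API I have in mind. `RawHttpRequest` and its methods are hypothetical names, not part of java.net or any existing library. The sketch assumes CRLF line endings, ignores the message body, and does not handle repeated header names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; NOT part of java.net or any library.
final class RawHttpRequest {
    String method;
    String target;
    String version;
    // Insertion-ordered so headers serialize in the order they were parsed.
    final Map<String, String> headers = new LinkedHashMap<>();

    // Parses the request line and headers only (assumption: CRLF line endings, no body).
    static RawHttpRequest parse(String raw) {
        String head = raw.split("\r\n\r\n", 2)[0];
        String[] lines = head.split("\r\n");
        String[] requestLine = lines[0].split(" ", 3);
        if (requestLine.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }
        RawHttpRequest request = new RawHttpRequest();
        request.method = requestLine[0];
        request.target = requestLine[1];
        request.version = requestLine[2];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed header: " + lines[i]);
            }
            String name = lines[i].substring(0, colon);
            String value = lines[i].substring(colon + 1).trim();
            request.headers.put(name, value);
        }
        return request;
    }

    // HTTP header names are case-insensitive, so look them up ignoring case.
    String header(String name) {
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            if (entry.getKey().equalsIgnoreCase(name)) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Replaces any existing header with the same (case-insensitive) name.
    void setHeader(String name, String value) {
        headers.keySet().removeIf(key -> key.equalsIgnoreCase(name));
        headers.put(name, value);
    }

    // Writes the request line, the headers, and the blank line that ends the head.
    String serialize() {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(target).append(' ').append(version).append("\r\n");
        headers.forEach((name, value) -> sb.append(name).append(": ").append(value).append("\r\n"));
        return sb.append("\r\n").toString();
    }
}
```

With this sketch, `RawHttpRequest.parse(raw)` exposes `method` and `header("host")`, `setHeader("Authorization", "abc")` adds the header, and `serialize()` produces the raw text again. A response version would look the same except for a status line in place of the request line.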
Something subtle I glossed over, but which the Python h11 library provides (and which I require), is the ability to know when a request/response has been completely received/parsed. For example, if I’m reading a request/response from a TCP socket, I need to know when to stop reading from the socket (i.e. when I have the whole request/response).
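As a rough illustration of the "when do I stop reading" problem, the check could be sketched in Java as below. `HttpFraming` and `completeMessageLength` are hypothetical names, not a java.net API. The sketch assumes the body length is given by a `Content-Length` header (or is zero when the header is absent); it does not handle chunked transfer encoding or responses delimited by the connection closing, which a real parser would have to cover.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; NOT part of java.net or any library.
final class HttpFraming {
    // Returns the byte length of the first complete message in buffer[0..length),
    // or -1 if more bytes must be read from the socket first.
    // Assumption: the body is framed by Content-Length only (no chunked encoding).
    static int completeMessageLength(byte[] buffer, int length) {
        int headEnd = indexOfHeadEnd(buffer, length);
        if (headEnd < 0) {
            return -1; // the blank line ending the headers has not arrived yet
        }
        String head = new String(buffer, 0, headEnd, StandardCharsets.ISO_8859_1);
        int bodyLength = 0;
        for (String line : head.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                bodyLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        int total = headEnd + 4 + bodyLength; // 4 bytes for the CRLF CRLF terminator
        return total <= length ? total : -1;
    }

    // Finds the offset of the CRLF CRLF sequence that terminates the header block.
    private static int indexOfHeadEnd(byte[] buffer, int length) {
        for (int i = 0; i + 3 < length; i++) {
            if (buffer[i] == '\r' && buffer[i + 1] == '\n'
                    && buffer[i + 2] == '\r' && buffer[i + 3] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

A read loop would keep appending socket data to a buffer and call `completeMessageLength` after each read, stopping once it returns a non-negative value.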
This is copied from the README of the h11 GitHub project:
At a high level, working with h11 goes like this:
- First, create an h11.Connection object to track the state of a single HTTP/1.1 connection.
- When you read data off the network, pass it to conn.receive_data(...); you’ll get back a list of objects representing high-level HTTP “events”.
- When you want to send a high-level HTTP event, create the corresponding “event” object and pass it to conn.send(...); this will give you back some bytes that you can then push out through the network.
For example, a client might instantiate and then send a h11.Request object, then zero or more h11.Data objects for the request body (e.g., if this is a POST), and then a h11.EndOfMessage to indicate the end of the message. Then the server would then send back a h11.Response, some h11.Data, and its own h11.EndOfMessage. If either side violates the protocol, you’ll get a h11.ProtocolError exception.
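A Java equivalent of the receiving half of that workflow might look roughly like the sketch below. It is not a port of h11: `SketchConnection`, `Event`, and `Kind` are invented names, only requests framed by `Content-Length` are handled, the sending side and protocol-error checks are omitted, and bytes left over after one message are only examined on the next call.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical analogue of the receive side of h11.Connection; all names are invented.
final class SketchConnection {
    enum Kind { REQUEST_HEAD, DATA, END_OF_MESSAGE }

    static final class Event {
        final Kind kind;
        final String payload; // raw head text or body fragment; empty for END_OF_MESSAGE
        Event(Kind kind, String payload) { this.kind = kind; this.payload = payload; }
    }

    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private boolean headParsed = false;
    private int bodyRemaining = 0;

    // Feed bytes as they arrive from the socket; returns the events they completed.
    List<Event> receiveData(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        List<Event> events = new ArrayList<>();
        // ISO-8859-1 maps bytes to chars one-to-one, so no data is lost here.
        String buffered = new String(pending.toByteArray(), StandardCharsets.ISO_8859_1);
        if (!headParsed) {
            int headEnd = buffered.indexOf("\r\n\r\n");
            if (headEnd < 0) {
                return events; // head incomplete: keep buffering
            }
            String head = buffered.substring(0, headEnd);
            events.add(new Event(Kind.REQUEST_HEAD, head));
            bodyRemaining = contentLength(head);
            headParsed = true;
            buffered = buffered.substring(headEnd + 4);
        }
        int take = Math.min(bodyRemaining, buffered.length());
        if (take > 0) {
            events.add(new Event(Kind.DATA, buffered.substring(0, take)));
            bodyRemaining -= take;
        }
        if (bodyRemaining == 0) {
            events.add(new Event(Kind.END_OF_MESSAGE, ""));
            headParsed = false; // ready for the next message on this connection
        }
        // Keep any unconsumed bytes for the next call.
        byte[] rest = buffered.substring(take).getBytes(StandardCharsets.ISO_8859_1);
        pending.reset();
        pending.write(rest, 0, rest.length);
        return events;
    }

    // Assumption: body length comes from Content-Length, defaulting to 0 when absent.
    private static int contentLength(String head) {
        for (String line : head.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                return Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        return 0;
    }
}
```

The point of the event model is that the caller never has to decide when a message is finished: it keeps feeding bytes to `receiveData` until an `END_OF_MESSAGE` event comes back.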
The java.net classes I’ve seen so far seem to be coupled to actually sending/receiving requests/responses: you get objects of those classes either as a result of a network exchange or by constructing them manually, rather than by parsing raw HTTP requests/responses.