Jonathan. Frech’s WebBlog

Hoisting HTTP headers home (#249)

Jonathan Frech,

While developing my new minimalistic HTTP backend (a web server called vanadium), scouring MDN’s HTTP header documentation and inspecting the headers sent by web servers serving public websites, I first stumbled upon the header “Server” and decided for my server to tell the world its name. Yet most servers send a lot of headers, many of which are non-standard and most of unclear purpose to me. This discovery made me think: HTTP headers may be the most commonly sent textual data virtually invisible to most computer operators due to common web browsers’ failure to communicate them. Thus I will in this post shine a light on my findings examining various web servers’ initial banter.

Interrogation of web servers hinges on using a capable client, as without the server’s headers, no web browser could fulfill its role as an HTTP client. Yet this communication is often hidden from the user. One client which allows viewing headers is curl, although one has to read its man page thoroughly:

$ curl -fsSLD- -o/dev/null https://...

Above, curl is invoked to fail silently, be silent about its network progress yet still Show errors, fol­low Location redirects and Dump the HTTP response header to stan­dard output (indicated by a singular dash). Fur­ther­more, the response’s body is output to the null device.
Note that web servers may respond differently to dif­fer­ent user agents, about which intel is acquired via the HTTP header “User-Agent” sent by the connecting client. One may specifically remove or set this header using $ curl -A '' or the same with a non-empty flag argument. Since I did nei­ther, curl sent its own name to­geth­er with its version, for me “curl/7.77.0” and “curl/7.74.0”.
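
The flag cluster can also be spelled out with curl’s long options, which makes the invocation easier to read back later. The following is a minimal sketch and not a canonical recipe: the function name dump_headers and its optional second parameter are illustrative assumptions, while the long options are the documented equivalents of -f, -s, -S, -L, -D, -o and -A.

```shell
# dump_headers URL [USER_AGENT]
# Hypothetical helper: prints the response headers of URL to standard
# output and discards the body. Equivalent to: curl -fsSLD- -o/dev/null URL
dump_headers() {
    url=$1
    if [ "$#" -ge 2 ]; then
        # --user-agent is the long form of -A; an empty argument removes
        # the User-Agent header from the request.
        curl --fail --silent --show-error --location \
             --dump-header - --output /dev/null \
             --user-agent "$2" "$url"
    else
        curl --fail --silent --show-error --location \
             --dump-header - --output /dev/null "$url"
    fi
}

# Usage (requires network access):
#   dump_headers https://example.org
#   dump_headers https://example.org ''    # send no User-Agent header
```

Wrapping the call in a function keeps the later pipelines short; whether the default user agent or a custom one is sent remains an explicit choice at the call site.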

Amusing HTTP headers

Looking at the sometimes dozens of headers sent by var­i­ous websites, among the cookies and entity tags one finds sparsely sprinkled innocent items of in­for­ma­tion. Heise for one, a German publishing house operating “heise.de” and “ct.de”, configured some of their Apache and nginx servers to spew a whopping four non-stan­dard inert HTTP headers at unsuspecting surfers:

$ curl -fsSLD- -o/dev/null heise.de | grep ^X
X-Cobbler: servo65.heise.de
X-Pect: The Spanish Inquisition
X-Clacks-Overhead: GNU Terry Pratchett
X-42: DON'T PANIC
-=-
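
HTTP header names are case-insensitive, and servers answering over HTTP/2 deliver them in lowercase, so a case-sensitive pattern such as ^X may miss headers depending on the server. A small sketch of a more forgiving filter follows; the name x_headers is hypothetical, and it assumes a header dump on standard input as produced by curl’s -D-.

```shell
# x_headers: read a header dump on standard input and print only the
# headers whose name starts with "X-" or "x-" (hypothetical helper name).
x_headers() {
    tr -d '\r' |        # header lines end in CRLF; drop the carriage returns
    grep -i '^x-'       # -i matches the prefix regardless of case
}

# Usage (requires network access):
#   curl -fsSLD- -o/dev/null heise.de | x_headers
```

Stripping the carriage returns first also keeps the output from confusing terminals or later text processing.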

Calling those who neglect to be wary of brutal Catholics foolish is wise in any situation, even when serving web pages. Some find referencing an overused popular culture integer to be fulfilling, and panicking is seldom advisable. The first header is a total mystery; it seems to have nothing to do with the build and deployment system written in slow snake speak and I neither get the joke nor manage to find anything conclusive about it. The second-from-last header, however, has a real story behind it.
First seeing “X-Clacks-Overhead: GNU Terry Pratchett”, I thought it referenced the hoofed non-Unix animal. Yet upon closer inspection, it is a different beast entirely: the three consecutive uppercase letters instruct a fictional network to keep its own author’s name and thereby legacy alive in the real world.¹²³ However, I fear the commands G, N and U are never truly executed, since HTTP servers generate these bytes which are then probably not understood by any client and promptly discarded.

Rather than paying tribute to a deceased fantasy writer by sending commands to ma­chines which cannot process them, Automattic appears to em­ploy an unconventional hiring strat­e­gy, setting “x-hacker”:

$ curl -fsSLD- -o/dev/null wordpress.com | grep ^x-h
x-hacker: If you're reading this, you should visit automattic.com/jobs and apply to join the fun, mention this header.

Whilst the idea is original and somewhat charming, I am doubtful of the effectiveness of this targeted advertising campaign. Though considering the prevalence of WordPress instances, the above message may be one of the most-viewed ads of mankind, if overwhelmingly read by non-humans.
Doing the suggested, one is instructed to go where one is — “x-hacker” is sent with the same message — yet redirected to “/work-with-us” and faced with a rather quirky header:

$ curl -fsSLD- -o/dev/null http://automattic.com/jobs | grep ^x-n | uniq
x-nananana: Batcache

Most bizarrely, “wordpress.org” seems to reference a character from Disney’s Frozen, yet I can nei­ther confirm this in­ter­pre­ta­tion nor make sense of it.

$ curl -fsSLD- -o/dev/null wordpress.org | grep ^x-o
x-olaf: ⛄

Functionally in­trigu­ing HTTP headers

When I first polled “mit.edu”, I was certain to have found a typo in one of the headers:

$ curl -fsSLD- -o/dev/null mit.edu | grep ^X-C
X-Cnection: close

How­ev­er, some brief search-engining revealed it to be an F5 custom header⁠⁴, very reminiscent of HTTP’s stan­dard “Referer” header spell­ing oddity⁠⁵.

Visiting “gnu.org”, I was sur­prised to both learn about the existence of and see in action the stan­dard HTTP header “Content-Location”, allowing a web server to inform clients about alternate URLs to the requested resource.

$ curl -fsSLD- -o/dev/null gnu.org | grep ^Content-Lo
Content-Location: home.html

Both “gnu.org” and “gnu.org/home.html”, to which “gnu.org/index.html” redirects, serve the GNU homepage.

Yet not on­ly standardized headers which are used by nigh no one are sent; deprecated non-stan­dard headers are as well: “google.com” defines the following P3P policy:

$ curl -fsSLD- -o/dev/null google.com | grep ^P
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."

Which led me to ask what “P3P” is in the first place. And no wonder I had never heard about it, since it had already been retired in late summer of 2018.⁶ It seems to have attempted to standardize websites’ privacy policies. A noble goal indeed, if potentially doomed to be all but a pipe dream in the modern usage of the web.
Google’s goals, however, are unclear to me. One may be led to believe that political symbolism is at play. After all, this neither is a P3P policy nor would it make sense to send such a policy in 2021.

Cookie clusters

One of the most infuriating findings is many websites’ handling of cookies. It truly is the worst of both worlds: upon loading, the websites clutter themselves with a cookie banner or lock themselves behind a cookie screen even though cookies have already been set using the “Set-Cookie” header.
But most irritatingly, the Bundesbeauftragte für den Datenschutz und die Informationsfreiheit (BfDI, Germany’s Federal Commissioner for Data Protection and Freedom of Information) operates a web server which sets two cookies via HTTP headers, does not show a cookie banner as far as I can tell and writes in their privacy notice “8. Sonstige Informationen; Es besteht hinsichtlich der Datenverarbeitung des BfDI kein Beschwerderecht bei einer Aufsichtsbehörde. Eine automatisierte Entscheidungsfindung findet nicht statt.”⁷, which translates to “8. Other information; With regard to the data processing of the BfDI, there is no right of complaint to a supervisory authority. Automated decision-making does not take place.” In other words, it informs about the nonexistence of any right to complain about its data processing.
I find the en­tire situation somewhat shady.

$ curl -fsSLD- -o/dev/null bfdi.bund.de | grep -i cookie
Set-Cookie: AL_SESS-S=AeWgHd5rUVOyjSxVj2CjbYUabfPTQS8Ed_n3FbXJzoAhYqQCn1gJDf91LTcH_AgT0bze; Path=/; Secure; HttpOnly; SameSite=Lax
Set-Cookie: AL_SESS-S=AWtabijCgd7AlQ!92sxZTaxs5Pr_5Clc3Rnn3mlt7Pn3qXTbuqO2i1vExDt4C3qceYd!; Path=/; Secure; HttpOnly; SameSite=Lax

Terse toilers

Lastly, I would like to mention websites run by reserved servers not eager to force their metadata, cookies or jokes onto potential passersby. Of note because of their brevity are “openbsd.org”, “oeis.org” and “fefe.de”. Contrastingly, “ibm.com” sets an obscene number of headers.

$ curl -fsSLD- -o/dev/null www.openbsd.org | sed '1d;$d' | wc -l
6

$ curl -fsSLD- -o/dev/null oeis.org | sed '1d;$d' | wc -l
5

$ curl -fsSLD- -o/dev/null www.fefe.de | sed '1d;$d' | wc -l
7

$ curl -fsSLD- -o/dev/null https://www.ibm.com/de-de | sed '1d;$d' | wc -l
29
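
One caveat worth keeping in mind with such counts: because -L follows redirects, -D- dumps one header block per response in the chain, and sed '1d;$d' only trims the very first status line and the very last blank line. If a site redirects, the pipeline therefore also counts the intermediate responses’ lines. A possible alternative, sketched under the assumption that only the final response is of interest, resets the counter at every status line; the name count_final_headers is hypothetical.

```shell
# count_final_headers: read a curl -D- header dump on standard input and
# print the number of header lines in the last response block only.
count_final_headers() {
    tr -d '\r' | awk '
        /^HTTP\// { n = 0; next }   # status line: a new response block begins
        /^$/      { next }          # blank line: end of a block, not a header
                  { n++ }           # anything else is one header line
        END       { print n + 0 }   # n + 0 prints 0 instead of an empty line
    '
}

# Usage (requires network access):
#   curl -fsSLD- -o/dev/null https://www.ibm.com/de-de | count_final_headers
```

For servers that answer directly without redirecting, this yields the same number as the sed and wc pipeline.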

Closing thoughts

I am routinely fascinated by what RFCs have defined in the HTTP standard over the years; features I never knew existed or nearly no browser supports. Yet as I started to look at running servers’ responses, I saw a whole parallel world of non-standard headers, with their meaning and aspirations as inspiring as they are wacky and waning. A whole world seen by only the fewest of people, with you now amongst them.


[1] https://xclacksoverhead.org/home/about [2021-10-01]
[2] http://gnuterrypratchett.com/ [2021-10-01]
[3] https://www.bbc.com/news/technology-31907768 [2021-10-01]
[4] https://support.f5.com/csp/article/K6997 [2021-10-01]
[5] https://pkg.go.dev/net/http#Request.Referer [2021-10-01]
[6] https://www.w3.org/TR/P3P11/ [2021-10-01]
[7] https://www.bfdi.bund.de/DE/Meta/Datenschutz/datenschutz_node.html [2021-10-01]