path: root/content-org/bsd.org
Diffstat (limited to 'content-org/bsd.org')
-rw-r--r-- content-org/bsd.org 380
1 files changed, 379 insertions, 1 deletions
diff --git a/content-org/bsd.org b/content-org/bsd.org
index c786b7e..386cb7f 100644
--- a/content-org/bsd.org
+++ b/content-org/bsd.org
@@ -1,4 +1,382 @@
-* OpenBSD
+#+TITLE: BSD
+#+AUTHOR: Michał Sapka
+#+URL: https://michal.sapka.me/bsd/
+#+STARTUP: show2levels indent logdone
+
+#+HUGO_BASE_DIR: ~/ghq/vcs.sapka.me/michal-sapka-me/
+#+HUGO_WEIGHT: auto
+#+HUGO_SECTION: bsd
+
+* OpenBSD :@bsd:
+** DONE OpenBSD webstack: Relayd, Httpd and Acme-client
+CLOSED: [2023-07-19 Mon 19:08]
+:PROPERTIES:
+:EXPORT_FILE_NAME: open-bsd-web-stack
+:EXPORT_HUGO_CUSTOM_FRONT_MATTER: abstract How to set up the web server stack?
+:EXPORT_HUGO_MENU: :menu bsd-openbsd :name "Webstack: Relayd, Httpd and Acme-Client"
+:END:
+OpenBSD comes with three great tools out of the box:
+- httpd(8) - an HTTP daemon
+- relayd(8) - a relay daemon
+- acme-client(1) - a client for Automatic Certificate Management Environment (ACME)
+
+With those free things, we can serve static webpages over TLS. While you most likely already use [[https://www.nginx.com/][NGINX]] or [[https://httpd.apache.org/][Apache]][fn:win], those solutions are complex.
+They work amazingly in enterprise environments where you have people with doctorates in NGINX configuration, but most real-world examples don't need that complexity.
+A static blog most likely doesn't.
+
+Let's set it up.
+
+Due to security concerns, OpenBSD comes with doas(1) instead of sudo(1).
+Copy the =/etc/examples/doas.conf= file to =/etc/doas.conf=.
+For all intents and purposes, from now on doas(1) will work the same as sudo(1).
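+
+A minimal sketch of that copy step, assuming you are in a root shell (doas(1) is not configured yet, so it cannot be used for this one command). The example file is expected to permit members of the wheel group; check its contents before relying on it.
+
+#+begin_src shell
+# run as root; afterwards doas(1) reads its rules from /etc/doas.conf
+cp /etc/examples/doas.conf /etc/doas.conf
+#+end_src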
+
+When the system boots for the very first time, ports 80 and 443 are closed, and only the SSH port is open.
+This alone was a nice surprise for me.
+But it gets better: since all utilities are part of the OS, they work together perfectly.
+
+Assuming your domain is already pointing at the correct IPs, let's start listening for unencrypted HTTP traffic.
+I will use "michal.sapka.me" as the domain in all examples.
+
+First, open =/etc/httpd.conf= in your favorite editor and add:
+
+#+begin_src shell
+server "michal.sapka.me" {
+ listen on * port 80
+ root "/htdocs/michal-sapka-me"
+}
+#+end_src
+
+Then create a simple HTML file under =/var/www/htdocs/michal-sapka-me/index.html=.
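+
+One way to create that file and sanity-check the configuration, as a sketch (the page content is just a placeholder):
+
+#+begin_src shell
+$ doas mkdir -p /var/www/htdocs/michal-sapka-me
+$ echo '<h1>It works</h1>' | doas tee /var/www/htdocs/michal-sapka-me/index.html
+# -n only parses the configuration and reports errors; it starts nothing
+$ doas httpd -n
+#+end_src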
+
+Httpd(8) works chrooted to /var/www/, so it treats this directory as root.
+This makes the "root" option shorter to write, but it also means that the process doesn't have access to anything outside of /var/www/.
+Even if an attacker can break in via the daemon, he will be locked in the www folder, so there is no risk to the rest of the system.
+As I said, OpenBSD is secure by default[fn:nginx-sec].
+
+All we need to do now is to enable the daemon via the handy rcctl(8) tool.
+
+#+begin_src shell
+$ doas rcctl enable httpd
+#+end_src
+
+and to start it
+
+#+begin_src shell
+$ doas rcctl start httpd
+#+end_src
+
+And boom. Opening http://michal.sapka.me shows our site over both IPv4 and IPv6.
+One thing to note here is the limitation of up to HTTP/1.1.
+HTTP/2 is not yet supported.
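+
+To check the result from the command line, something along these lines should do. This assumes curl is installed on the machine you test from; it is not part of the OpenBSD base system.
+
+#+begin_src shell
+# ask rcctl whether the daemon is running
+$ doas rcctl check httpd
+# fetch only the response headers, once per address family
+$ curl -4 -I http://michal.sapka.me/
+$ curl -6 -I http://michal.sapka.me/
+#+end_src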
+
+Let's add TLS, so we have this cute lock icon.
+For this, we will request a certificate from [[https://letsencrypt.org/][Let's Encrypt]] using acme-client(1). If you used certbot, this will look familiar - just tidier.
+
+First, let's add config to =/etc/acme-client.conf=
+
+#+begin_src shell -n
+authority letsencrypt {
+ api url "https://acme-v02.api.letsencrypt.org/directory"
+ account key "/etc/acme/letsencrypt-privkey.pem"
+}
+
+authority letsencrypt-staging {
+ api url "https://acme-staging-v02.api.letsencrypt.org/directory"
+ account key "/etc/acme/letsencrypt-staging-privkey.pem"
+}
+
+domain michal.sapka.me {
+ domain key "/etc/ssl/private/michal.sapka.me.key"
+ domain full chain certificate "/etc/ssl/michal.sapka.me.crt"
+ sign with letsencrypt
+}
+#+end_src
+
+Lines 1-9 tell our acme-client(1) how to talk with Let's Encrypt, while lines 11-15 allow us to request a certificate for our domain.
+OpenBSD comes preconfigured for Let's Encrypt, so we just enable provided settings.
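+
+The staging authority is defined but not used above. A possible dry run, assuming you want to stay clear of the production rate limits while testing, is to point the domain block at it temporarily:
+
+#+begin_src shell
+domain michal.sapka.me {
+ domain key "/etc/ssl/private/michal.sapka.me.key"
+ domain full chain certificate "/etc/ssl/michal.sapka.me.crt"
+ # temporary: staging certificates are not trusted by browsers
+ sign with letsencrypt-staging
+}
+#+end_src
+
+Once the flow works, switch back to =sign with letsencrypt= and request the certificate again; acme-client(1) has a =-F= flag to force a renewal.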
+
+Nice!
+Next, we need to allow Let's Encrypt challenges.
+Acme-client(1) will manage all required files, and Let's Encrypt can read them via httpd(8).
+Again, like cogs in a well-oiled machine.
+By default, acme-client(1) will write to =/var/www/acme=, so we need to redirect =/.well-known/acme-challenge/*= there. Let's change our =httpd.conf=:
+
+#+begin_src shell
+server "michal.sapka.me" {
+ listen on * port 80
+ root "/htdocs/michal-sapka-me"
+
+ location "/.well-known/acme-challenge/*" {
+ root "/acme"
+ request strip 2
+ }
+}
+#+end_src
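+
+To see why this works, here is how a challenge request should map to the disk, with =TOKEN= standing in for whatever file name Let's Encrypt asks for:
+
+#+begin_src shell
+# requested path:           /.well-known/acme-challenge/TOKEN
+# after "request strip 2":  /TOKEN
+# resolved under root:      /acme/TOKEN
+# outside the chroot:       /var/www/acme/TOKEN
+#+end_src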
+
+We can now either restart httpd(8) or reload it. Let's go for the latter.
+
+#+begin_src shell
+$ doas rcctl reload httpd
+#+end_src
+
+Now we can request the certificates
+
+#+begin_src shell
+$ doas acme-client -v michal.sapka.me
+#+end_src
+
+OpenBSD's supplied tools don't print unnecessary information to the user, so we add =-v= to see what's happening.
+Assuming everything went fine, let's start serving the page with TLS!
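+
+A quick way to confirm what was issued, as a sketch using openssl(1) from the base system and the certificate path configured above:
+
+#+begin_src shell
+# print the subject, issuer and validity window of the new certificate
+$ openssl x509 -in /etc/ssl/michal.sapka.me.crt -noout -subject -issuer -dates
+#+end_src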
+
+For this, we will use relayd(8).
+We could use only httpd(8), but moving it one layer up is easier.
+Relayd(8) also gives us nice options for changing headers or moving some locations to a different process, like we will do with Plaprox soon.
+This also shows us the big difference between this simple solution and NGINX: while NGINX shovels everything into one process and config, OpenBSD splits it into narrow focus areas.
+
+Let's open =/etc/relayd.conf= and add:
+
+#+begin_src shell -n
+table <httpd> { 127.0.0.1 }
+
+http protocol "https" {
+ tls keypair "michal.sapka.me"
+
+ match request quick header "Host" value "michal.sapka.me" forward to <httpd>
+}
+
+relay "https" {
+ listen on 0.0.0.0 port 443 tls
+ protocol https
+ forward to <httpd> port 8080
+
+}
+relay "https6" {
+ listen on :: port 443 tls
+ protocol https
+ forward to <httpd> port 8080
+}
+#+end_src
+
+Now, I won't go into much detail here, but what happens here is:
+1. We create two relays, one for IPv4 and one for IPv6. One relay can listen on a single port for a given IP.
+ Each relay uses protocol "https" to modify and steer the request to a given process.
+2. Both relays set up forwarding to httpd (IP taken from the table on the head of the file) on port 8080.
+3. https protocol adds a TLS key pair for the session. We've got the files from Let's Encrypt in the step above.
+4. We then test each request, and if the host matches "michal.sapka.me" it will be forwarded to httpd(8).
+
+You can also see that relayd(8) can listen on a given IP or on all IPs (0.0.0.0 for IPv4, :: for IPv6).
+
+But our httpd(8) listens only on port 80! Let's fix that by changing the =httpd.conf= file:
+
+#+begin_src shell
+server "michal.sapka.me" {
+ listen on * port 8080
+ # the root and location settings stay as before
+}
+#+end_src
+
+We also need to redirect HTTP to HTTPS. Since we use relayd(8) only for HTTPS, this will be done in httpd(8). Let's add a second server to our =httpd.conf=:
+
+#+begin_src shell
+server "michal.sapka.me" {
+ listen on * port 80
+ location * {
+ block return 301 "https://$HTTP_HOST$REQUEST_URI"
+ }
+}
+#+end_src
+
+Now, when the user enters the site, the flow will look like:
+
+1. httpd(8) will respond to :80 requests and return a 301 redirect to HTTPS
+2. relayd(8) will catch the request to :443 and forward it on port :8080 to httpd(8)
+3. httpd(8) will serve our site and pass the response to relayd(8) again
+4. relayd(8) can modify headers before returning the response to the client.
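+
+For this flow to work, relayd(8) still needs to be enabled and started. A sketch of applying everything, assuming both config files are saved as shown above:
+
+#+begin_src shell
+# parse both configurations without starting anything
+$ doas httpd -n
+$ doas relayd -n
+# pick up the new httpd(8) ports, then enable and start relayd(8)
+$ doas rcctl restart httpd
+$ doas rcctl enable relayd
+$ doas rcctl start relayd
+#+end_src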
+
+Talking about modifying headers, let's apply some extra security!
+We can expand our https protocol with the following:
+
+#+begin_src shell
+ # Return HTTP/HTML error pages to the client
+ return error
+ match request header set "X-Forwarded-For" value "$REMOTE_ADDR"
+ match request header set "X-Forwarded-By" value "$SERVER_ADDR:$SERVER_PORT"
+ match response header remove "Server"
+ match response header append "Strict-Transport-Security" value "max-age=31536000; includeSubDomains"
+ match response header append "X-Frame-Options" value "SAMEORIGIN"
+ match response header append "X-XSS-Protection" value "1; mode=block"
+ match response header append "X-Content-Type-Options" value "nosniff"
+ match response header append "Referrer-Policy" value "strict-origin"
+ match response header append "Content-Security-Policy" value "default-src https:; style-src 'self' \
+ 'unsafe-inline'; font-src 'self' data:; script-src 'self' 'unsafe-inline' 'unsafe-eval'"
+ match response header append "Permissions-Policy" value "accelerometer=(), camera=(), \
+ geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=()"
+
+ # set recommended tcp options
+ tcp { nodelay, sack, socket buffer 65536, backlog 100 }
+
+ # set up certs
+ tls { no tlsv1.0, ciphers "HIGH:!aNULL:!SSLv3:!DSS:!ECDSA:!RSA:-ECDH:ECDHE:+SHA384:+SHA256" }
+#+end_src
+
+I won't discuss details here, as each header has a dedicated MDN web doc.
+Most of the headers here are considered standard.
+
+Besides adding headers, we configure TLS here, disabling weak ciphers and old TLS versions and adding some standard config.
+
+Lastly, we can automate refreshing the certificate via cron(8):
+
+#+begin_src shell
+0~59 0~23 * * 1 acme-client michal.sapka.me && rcctl reload relayd
+#+end_src
+
+It looks almost like a normal crontab entry.
+The "0~59" and "0~23" parts are an OpenBSD feature: for each such field, cron(8) picks a random value within the given range, so that jobs like this one don't all fire at the same moment.
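+
+The =&&= matters because of how acme-client(1) reports its result: as far as I read the man page, it exits 0 when the certificate changed, 2 when nothing needed renewing, and 1 on failure. The same logic spelled out as a script (a sketch, not something the setup above requires):
+
+#+begin_src shell
+#!/bin/sh
+# renew the certificate and reload relayd(8) only when it changed
+acme-client michal.sapka.me
+case $? in
+0) rcctl reload relayd ;;   # new certificate written
+2) ;;                       # still valid, nothing to do
+*) echo "acme-client failed" >&2
+   exit 1 ;;
+esac
+#+end_src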
+
+We have now created a fully working web server without any third-party packages.
+All OpenBSD provided, all secure, all simple, all cool as ice.
+
+To further your knowledge, you can challenge the assumption that BSD has the best doc and read man pages for =httpd.conf(5)=, =relayd.conf(5)=, and =acme-client.conf(5)=.
+
+I also can't recommend enough "Httpd and Relayd Mastery" by Michael W. Lucas[fn:mwl2].
+
+
+[fn:nginx-sec] The ports collection of OpenBSD contains a fork of NGINX with a similar security treatment.
+[fn:mwl2] yeah, the one from the top of this article. He's a household name and a staple of the BSD community. I'm primarily a software engineer, and all this sysadmin thing I am doing is a side quest for me. His books make it so much easier. I've already read four of his books, and I will read more as they are amazing. Even a dense person like yours truly comes out smarter after reading them. While I'm not a [[https://www.tiltedwindmillpress.com/product/full-michael-2023-06/][Full Michael]] kind of person, it seems my library will soon have a very strong representation of his work.
+[fn:win] because there is no fourth way. Please repeat after me: there is no webserver in Windows.
+** DONE OpenBSD: Blocking bad bots using Relayd
+CLOSED: [2023-12-11 Mon 19:08]
+:PROPERTIES:
+:EXPORT_FILE_NAME: blocking-bad-bots-openbsd
+:EXPORT_HUGO_CUSTOM_FRONT_MATTER: abstract How do I fight bad crawlers?
+:EXPORT_HUGO_MENU: :menu bsd-openbsd :name "Blocking bad bots using Relayd"
+:END:
+
+The bane of existence for most small sites: web crawlers.
+They create most of the traffic this site sees and make my [[https://michal.sapka.me/site/info/#site-stats][site stats]] overly optimistic.
+We can go with [[https://en.wikipedia.org/wiki/Robots_Exclusion_Protocol][robots.txt]], but what if it's not enough?
+I can tell a valuable bot to not index some part of my site, but:
+a) some bots ignore it
+b) what if I don't want some bots to even have the chance to ask?
+
+Get that SEO scanning and LLM training out of here!
+
+*** Blocking crawlers
+
+The rest of this guide assumes the webstack of Relayd and Httpd.
+Relayd is great, and since it works at a higher level than pf, it can read headers.
+Luckily, those crawlers send usable "User-Agent" headers, which we can block on.
+
+First, let's see who uses my site the most. Assuming you use "forwarded"[fn:log-style] style for logs, we can do:
+
+#+begin_src shell
+ awk -F '"' '{print $6}' <path to log file> | sort | uniq -c | sort -rn
+#+end_src
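+
+To see why field 6 is the User-Agent, here is the same split applied to a single made-up log line shaped like a forwarded-style entry (the host, addresses and agent are invented for illustration):
+
+#+begin_src shell
+$ echo 'example.org 203.0.113.7 - - [11/Dec/2023:19:08:00 +0100] "GET / HTTP/1.1" 200 1234 "" "ExampleBot/1.0" 203.0.113.7 443' | awk -F '"' '{print $6}'
+ExampleBot/1.0
+#+end_src
+
+Splitting on the double quote makes the request field 2, the referrer field 4, and the agent field 6.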
+
+Then we need to manually select agents we want to block.
+It won't be easy, as the strings are long and contain a lot of unnecessary information - which includes plain lies.
+You need to define which part of the full User-Agent is common and can be used for blocking.
+
+Then we can create block rules in a Relayd protocol.
+Relayd doesn't use regexp here, and instead allows using case-sensitive shell-style globs.
+Stars will match everything.
+
+#+begin_src shell
+ block request method "GET" header "User-Agent" value "*<common part>*"
+#+end_src
+
+Remember that the config assumes last-one-wins, so the block rules should be the last ones to match.
+I just put those at the end of my config.
+You can mark a block rule as =quick= if you want - it will short-circuit the entire protocol.
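+
+For comparison, a quick variant of such a rule might look like this. The keyword order mirrors the =match request quick= rules from the webstack article; treat it as a sketch and verify it with =relayd -n= before relying on it:
+
+#+begin_src shell
+ # later rules are not evaluated for requests that match this one
+ block request quick method "GET" header "User-Agent" value "*<common part>*"
+#+end_src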
+
+Therefore, my "https" protocol now has a series of blocks:
+
+#+begin_src shell
+ http protocol "https" {
+ # most of the protocol omitted
+ block request method "GET" header "User-Agent" value "*Bytespider*"
+ block request method "GET" header "User-Agent" value "*ahrefs*"
+ block request method "GET" header "User-Agent" value "*censys*"
+ block request method "GET" header "User-Agent" value "*commoncrawl*"
+ block request method "GET" header "User-Agent" value "*dataforseo*"
+ block request method "GET" header "User-Agent" value "*mj12*"
+ block request method "GET" header "User-Agent" value "*semrush*"
+ block request method "GET" header "User-Agent" value "*webmeup*"
+ block request method "GET" header "User-Agent" value "*zoominfo*"
+ }
+#+end_src
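+
+To check that a rule bites, one option is to replay a request with a blocked agent string. This assumes curl on the testing machine; the exact rejection you see depends on the rest of your protocol (for example =return error=):
+
+#+begin_src shell
+$ doas relayd -n && doas rcctl reload relayd
+# -A sets the User-Agent; this is a GET, which is what the rules match
+$ curl -s -o /dev/null -w '%{http_code}\n' -A 'Bytespider' https://michal.sapka.me/
+#+end_src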
+
+(The usage of globs was proposed to me on the [[https://marc.info/?l=openbsd-misc&m=170206886109953&w=2][OpenBSD mailing list]].)
+
+[fn:log-style] see https://man.openbsd.org/httpd.conf.5#style
+
+** DONE OpenBSD: Forwarding requests from Relayd to a custom webserver
+CLOSED: [2023-07-19 Mon 19:30]
+:PROPERTIES:
+:EXPORT_FILE_NAME: relayd-custom-webserver
+:EXPORT_HUGO_CUSTOM_FRONT_MATTER: abstract How to forward requests to a webserver?
+:EXPORT_HUGO_MENU: :menu bsd-openbsd :name "Forwarding requests from Relayd to a custom webserver"
+:END:
+One thing that OpenBSD doesn't provide (yet?) is an HTTP proxy.
+I use [[https://plausible.io/][Plausible]][fn:nope] for basic visitor analytics[fn:privacy] here, and one of the cool things you can do is to break all adblockers by serving Plausible from my own domain[fn:adblock].
+
+After two evenings of failed attempts, I reminded myself that I am a programmer, and I wrote one myself.
+You can find it on my [no longer available].
+It was a great learning exercise and a chance to touch Golang[fn:ruby] for the first time.
+
+Assuming you have it running (it works on my machine!), let's adjust our relayd(8).
+Plaprox listens on port 9090, and we want to relay all requests to =/js/script.js= there.
+
+Let's add it to our relays in =relayd.conf=:
+
+#+begin_src shell -n
+table <plausibleproxyd> { 127.0.0.1 }
+
+http protocol "https" {
+ # all our previous content omitted
+ match request quick path "/js/script.js" forward to <plausibleproxyd>
+ match request quick path "/api/event" forward to <plausibleproxyd>
+}
+
+relay "https" {
+ listen on 0.0.0.0 port 443 tls
+ protocol https
+ forward to <httpd> port 8080
+ forward to <plausibleproxyd> port 9090
+}
+relay "https6" {
+ listen on :: port 443 tls
+ protocol https
+ forward to <httpd> port 8080
+ forward to <plausibleproxyd> port 9090
+}
+#+end_src
+
+You can also move the port number to a table.
+
+Remember that in Relayd(8) last one wins.
+We already have a match for the domain and added another matcher for the path.
+The request will be forwarded to the last matching matcher - so we put our new matchers at the end of the protocol definition.
+
+*** Updates
+
+2023-07-28: remove wrong information about PF.
+2023-07-30: fix invalid cron format
+2023-12-12: extracted to a dedicated article
+
+[fn:privacy] Yes, I want to know what people are reading!
+For details, refer to my [[https://michal.sapka.me/about/#privacy-policy][two-sentence-long privacy policy]].
+[fn:nope] [[https://michal.sapka.me/site/updates/2023/removed-plausible/][this is no longer the case]]
+[fn:adblock] yes, it's a dick move.
+But my reasoning was simple: Plausible gathers so little information that the harm is almost nonexistent, and I really want to know what people are reading.
+[fn:ruby] I am a Ruby developer by trade and heart, but I will try anything that is not an IDE-driven language.
+LSP for Java/Scala is still a joke, and I refuse to pollute my system with IntelliJ.
+[[https://go.dev/][Go]], on the other hand, is a modern language designed for humans. I am not good at it, but I am infinitely[fn:infinit] better than a week ago.
+[fn:infinit] Any positive number would be infinite progress compared to zero, or as an old wise man once said: "to have a nickel and to not have a nickel is already two nickels".
+* FreeBSD :@bsd:
+* Thinkpad :@bsd:
+* Unix history :@bsd:
+
+* WIP
** TODO XMPP (Jabber) server on OpenBSD
/intro/
*** Installing prosody