
System Administration

robots.txt as an Insight Into Web Administration Wars


robots.txt, or the Robots Exclusion Protocol, is one of the oldest protocols on the web. It's a plain-text file, usually stored at the top level of a domain, that provides a list of rules politely informing web crawlers what they are and are not allowed to do. This simple file offers a great insight into the kinds of struggles web administrators face in maintaining their websites.
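
As a rough illustration, a minimal robots.txt might look like the following; the crawler names, paths, and sitemap URL are made up for the example, and Crawl-delay is a widely recognised but non-standard directive that not every crawler honours:

    # Rules for all crawlers
    User-agent: *
    Disallow: /admin/
    Crawl-delay: 10

    # Example of shutting out one specific crawler entirely
    User-agent: ExampleBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml

Compliance is entirely voluntary: a well-behaved crawler reads these rules before fetching anything else, while a badly behaved one simply ignores them.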

Prefer Systemd Timers Over Cron

[Screenshot: systemctl list-timers output in a terminal]

System administrators often find themselves needing to run services on their bare-metal machines. Services can be broken down into roughly two broad categories (illustrated with unit-file sketches after the list):

  1. Long-running services that are started once and will run for the lifetime of the machine.
  2. Short-running services that are started one or more times and run only for a short amount of time.
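
To make the distinction concrete, here is a sketch of how the two categories map onto systemd unit types; the unit names and binary paths are invented for illustration:

    # /etc/systemd/system/my-daemon.service -- long-running service
    [Unit]
    Description=Example long-running daemon

    [Service]
    # "simple" units start once and are expected to keep running
    Type=simple
    ExecStart=/usr/local/bin/my-daemon

    [Install]
    WantedBy=multi-user.target

    # /etc/systemd/system/nightly-cleanup.service -- short-running task
    [Unit]
    Description=Example short-running cleanup task

    [Service]
    # "oneshot" units run to completion and then exit
    Type=oneshot
    ExecStart=/usr/local/bin/cleanup-old-files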

Long-running services make up the majority of the services on a typical Linux system. One of the challenging aspects of operating long-running services in a production environment is the set of questions around monitoring and reliability (each is addressed in the sketch after this list):

  1. How do you know if your service is running?
  2. How will you be alerted if your service dies?
  3. How do you handle automatic retries should the service die?
  4. How do you enable automatic start-up of your services when a machine boots, and how do you make them start up in the right order?
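
systemd has directives that speak to most of these questions directly in the unit file. The fragment below is only a sketch: my-daemon is an invented service, and failure-notify@.service is a hypothetical alerting unit you would have to write yourself.

    [Unit]
    Description=Example long-running daemon
    # 4. Start only after the network is up, giving a predictable boot order.
    After=network-online.target
    Wants=network-online.target
    # 2. If this unit enters a failed state, start a (hypothetical) alerting unit.
    OnFailure=failure-notify@%n.service

    [Service]
    ExecStart=/usr/local/bin/my-daemon
    # 3. Restart automatically if the process dies, with a short back-off.
    Restart=on-failure
    RestartSec=5

    [Install]
    # 4. Enable automatic start-up at boot with "systemctl enable my-daemon.service".
    WantedBy=multi-user.target

Question 1 is usually answered outside the unit file, for example with systemctl status my-daemon.service or journalctl -u my-daemon.service.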