NFS Mounts and Wake-on-LAN

DJ Does
5 min read · Oct 28, 2020

tl;dr code with ELI5 comments: https://gist.github.com/dj-mcculloch/9e097535ea35df8e2ec1e6e32f7f73ac

My media server is a constant target of my need to tinker and perfect. I cannot help but try to make it as fire-and-forget as I possibly can. I just want it to always work. Turns out making something that is reliable (consistently good) and functional (provides utility) is hard — there’s an entire profession dedicated to this (ehlo, SREs).

For my use case, I want my Synology DS1819+ NAS to go to sleep, like all the way, when it has not been used for an extended period, and to wake up automatically when I need it. I don’t want to waste electricity on 8 disks and some logic boards that hardly do anything between 5AM and 3PM. Sure, I could have the NAS power off and on at fixed times, but that’s dumb.

Enter Wake-on-LAN.

How does Wake-on-LAN (WOL) work? When properly configured, devices that support WOL will power on when they receive a special packet called a Magic Packet. On Ubuntu you can send one with etherwake, a utility for sending Magic Packets.

sudo etherwake 00:11:aa:bb:cc:55

etherwake works by sending an AMD format Magic Packet to the target MAC address; in this case the 00:11:aa:bb:cc:55 dummy MAC address. So how does this work with NFS mounts in Ubuntu?
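For the curious, the Magic Packet payload itself is simple: six 0xFF bytes followed by the target MAC address repeated 16 times, 102 bytes in total. The sketch below only prints that payload as hex so you can see its shape; it does not send anything, and magic_packet_hex is a name I made up for illustration, not part of etherwake.

```shell
#!/bin/sh
# Illustrative only: builds the Magic Packet payload as a hex string.
# magic_packet_hex is a hypothetical helper, not an etherwake feature.
magic_packet_hex() {
    # Strip the colons: 00:11:aa:bb:cc:55 -> 0011aabbcc55
    mac_hex=$(printf '%s' "$1" | tr -d ':')
    # Synchronization stream: six 0xFF bytes
    payload="ffffffffffff"
    # Then the target MAC, repeated 16 times
    i=0
    while [ "$i" -lt 16 ]; do
        payload="${payload}${mac_hex}"
        i=$((i + 1))
    done
    printf '%s\n' "$payload"
}

# Dummy MAC address from the example above
magic_packet_hex "00:11:aa:bb:cc:55"
```

The output is 204 hex characters, which is 102 bytes on the wire.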

If you’re mounting NFS volumes in Ubuntu you could be using autofs. I recently learned that autofs does not handle WOL for you, but it is flexible enough to let you handle it automatically. autofs describes its mounts and their local paths in /etc/auto.master. For your NFS volume your auto.master should end up with something like this in it:

/nfs /etc/auto.nfs --timeout=5

That line, /nfs /etc/auto.nfs --timeout=5, tells autofs where to go to find the details on where and how to mount the NFS volume at /nfs.

--timeout=5 tells autofs when to unmount a volume, in this case after 5 seconds of inactivity. 5 seconds may seem quick, but if your NAS enters ACPI S3 (otherwise known as sleep) and something tries to access /nfs, it may take a long time for /nfs to be unmounted and then mounted again, or the new mount may fail entirely. An aggressive timeout here is crucial for this to work.

/etc/auto.nfs is a map file that details the mount options and target. In general they kinda look like this:

video -fstype=nfs4,retry=0,timeo=50,hard,intr,tcp 192.168.1.102:/volume1/video

Instead of pointing auto.master at a file that simply describes the NFS volume, like the example above, we point it at an executable map: a script that sends a Magic Packet to the NAS (or whatever) to wake the device up, and then returns the options and target for the mount:

auto_wol.nfs

An important note on executable maps: autofs handles these files differently than normal maps in some not so obvious ways.

  1. autofs passes in a key (sub-directory) as an argument to the executable map
  2. The executable map must return (echo) the mount options and target, but not the key (sub-directory)

What does this mean? echoing the following…

video -fstype=nfs4,retry=0,timeo=50,hard,intr,tcp 192.168.1.102:/volume1/video

…is invalid. Including the key video will give you the following error, as seen via sudo journalctl --unit=autofs.service -f:

validate_location: invalid character " " found in location video -fstype=nfs4 192.168.1.102:/volume1/video

So when you’re using an executable map, whatever sub-directory you ask for under the mount point is passed in as the key. For example, ls /nfs/video or cd /nfs/fubar will create the sub-directories video and fubar as long as the executable map returns mount options and a target. Because of this, the script I’ve created will not echo mount options and a target unless the key argument is what we expect.
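To make those two rules concrete, here is a stripped-down sketch of an executable map. It is not the gist linked at the top, just the bare idea, and it assumes etherwake is installed and that autofs runs the script as root. The variable and function names (nas_mac, nas_ipv4, expected_key, wake_nas, map_lookup) are mine, and the values are the placeholders used throughout this post.

```shell
#!/bin/sh
# Hypothetical executable map sketch; all values are placeholders.
nas_mac="00:11:aa:bb:cc:55"
nas_ipv4="192.168.1.102"
expected_key="video"
mount_opts="-fstype=nfs4,retry=0,timeo=50,hard,intr,tcp"
export_path="/volume1/video"

wake_nas() {
    # Assumes etherwake is on PATH and we are running as root
    etherwake "$nas_mac"
}

map_lookup() {
    key="$1"
    # Unknown keys get no output and a non-zero status,
    # so autofs does not create a mount for them
    [ "$key" = "$expected_key" ] || return 1
    # Best effort: a failed wake attempt is ignored in this sketch
    wake_nas >/dev/null 2>&1 || true
    # Echo the options and target only, never the key
    echo "$mount_opts ${nas_ipv4}:${export_path}"
}

# autofs passes the key as the first argument
if [ "$#" -gt 0 ]; then
    map_lookup "$1"
fi
```

Running the script by hand with video as its only argument should print the options and target; any other key should print nothing.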

Once you’ve configured all the variables in the executable map, make the map executable (surprise) via sudo chmod 755, set the file’s ownership via sudo chown root:root, and you’re good to go.
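If the mount never appears, a forgotten executable bit is an easy thing to rule out first. A small sanity check, assuming the map lives at /etc/auto_wol.nfs (that path is my assumption, use wherever you saved yours); check_map is a hypothetical helper, not something autofs ships:

```shell
#!/bin/sh
# check_map is a hypothetical helper, not an autofs tool.
check_map() {
    map="$1"
    if [ ! -f "$map" ]; then
        echo "missing: $map"
        return 1
    fi
    if [ ! -x "$map" ]; then
        echo "not executable: $map (try: sudo chmod 755 $map)"
        return 1
    fi
    echo "ok: $map"
}

# /etc/auto_wol.nfs is an assumed location for the map file
check_map /etc/auto_wol.nfs || true
```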

If you’re successful, a quick look at the logs for autofs will show you the executable mapping at work:

Oct 26 23:48:55 Media systemd[1]: Starting Automounts filesystems on demand...
Oct 26 23:48:55 Media automount[20560]: Starting automounter version 5.1.2, master map /etc/auto.master
Oct 26 23:48:55 Media automount[20560]: using kernel protocol version 5.02
...
Oct 26 23:48:55 Media automount[20560]: mounted indirect on /nfs with timeout 5, freq 2 seconds
Oct 26 23:48:55 Media sudo[20593]: root : TTY=unknown ; PWD=/nfs ; USER=root ; COMMAND=/usr/sbin/etherwake 00:11:aa:bb:cc:55
Oct 26 23:48:55 Media sudo[20593]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 26 23:48:55 Media sudo[20593]: pam_unix(sudo:session): session closed for user root

Nice.

Caution

If you’re like me you probably thought, “I can programmatically grab the MAC address!”

arp -an | grep -F "($nfs_ipv4)" | awk '{print $4}'

If your target NAS (or whatever) has been offline for more than about 60 seconds, arp may no longer have the MAC address of your NAS. ARP cache entries in Ubuntu go stale after roughly 60 seconds by default, so don’t count on the lookup working while the NAS is asleep.
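If you still want the lookup, one option is to treat it as best effort and fall back to a configured MAC address when the cache comes up empty. A sketch, assuming arp -an prints lines shaped like "? (192.168.1.102) at 00:11:aa:bb:cc:55 [ether] on eth0"; mac_from_arp, resolve_mac, and fallback_mac are hypothetical names:

```shell
#!/bin/sh
# Hypothetical helpers; fallback_mac is the dummy MAC used in this post.
fallback_mac="00:11:aa:bb:cc:55"

# Reads `arp -an` style output on stdin and prints the MAC for IP $1.
# The IP is matched exactly, parentheses included, so 192.168.1.10
# does not match 192.168.1.102. Incomplete entries are skipped.
mac_from_arp() {
    awk -v ip="($1)" '$2 == ip && $4 != "<incomplete>" { print $4; exit }'
}

# Prints the cached MAC for IP $1, or fallback_mac if there is none.
resolve_mac() {
    found="$(arp -an 2>/dev/null | mac_from_arp "$1" || true)"
    echo "${found:-$fallback_mac}"
}
```

Calling resolve_mac 192.168.1.102 then always gives you something to hand to etherwake, even when the ARP entry has aged out.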

Troubleshooting

Turning on verbose, or even debug, logging for autofs is critical for troubleshooting any issues with this.

sudo vim /etc/default/autofs

OPTIONS="--verbose" or OPTIONS="--debug" is your friend.

Additionally, dmesg output will help you track the status or errors of things trying to interact with the NFS mount itself.

dmesg -H -w

If your --timeout in auto.master is too high or set to the default value you may see a lot of messages like this:

[Oct26 22:47] nfs: server 192.168.1.102 not responding, still trying
[ +6.144254] nfs: server 192.168.1.102 not responding, still trying
[Oct26 22:48] nfs: server 192.168.1.102 not responding, still trying
[Oct26 22:49] nfs: server 192.168.1.102 not responding, still trying
[ +36.860554] nfs: server 192.168.1.102 not responding, timed out
...
[ +24.576531] nfs: server 192.168.1.102 not responding, timed out
[Oct26 23:04] nfs: server 192.168.1.102 not responding, timed out
[ +0.000300] nfs: server 192.168.1.102 not responding, still trying
[ +15.360871] nfs: server 192.168.1.102 not responding, still trying

This means that autofs has not successfully run umount on the mount. Because the mount has gone stale and has not been removed, anything trying to use it will time out over and over and over… You may think, “oh, well I should use soft for the mount options,” but that’s not a good idea with read/write mounts. That’s why we opt for hard and intr in our mount options instead, which helps to alleviate this issue but doesn’t totally solve it alone.

One more thing…

This took me a hilariously long time to figure out. If you find any bugs or can suggest improvements to this entire thing, including burning autofs to the ground, I am open to feedback.

Thanks for stopping by!
