Working with remote files as if they're local: SSHFS

The problem #

A common task for both digital nomads and research scientists is accessing large, often heterogeneous datasets on remote systems. There are plenty of powerful tools that synchronize data between two computers, but those are tailored to copying changes back and forth, with a physical copy present on both systems. Often you do not need a local copy of the entire dataset, or even part of it: you just want to browse, load, view, make small edits, and save. If your data is stored on a compute cluster, SSHFS can be very useful.

The solution #

SSHFS is a filesystem in user space (FUSE) that takes a mountpoint, a local directory, and translates any file operation on it into a remote operation over SSH, yet to the local operating system (and to you) the files appear to be local. SSHFS falls into the category of network-mounted filesystems. Other variants exist, e.g. NFS, but that requires an NFS server on the remote side. SSHFS, on the other hand, works wherever the remote machine runs an SSH server, which almost all clusters do, as SSH is the default way to securely authenticate and access them.

Dependencies #

You need SSHFS installed on your local machine. On Fedora, the package is called fuse-sshfs:

sudo dnf install fuse-sshfs

Next, we create a mountpoint, essentially an empty directory that will be our access point to the remote filesystem.

mkdir -p /home/$USER/remotefiles

Suppose you have access to a remote system called remote.com, with a user account remoteme, and a directory myfiles containing a file accounts.txt:

ls /home/remoteme/myfiles
accounts.txt

Next, we establish the connection:

sshfs remoteme@remote.com:/home/remoteme/myfiles /home/$USER/remotefiles

Here remoteme is your remote user name. You would normally first configure key-based access in /home/$USER/.ssh/config.
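As a rough sketch of that configuration step, the snippet below prints a Host stanza you could add to your SSH config. The alias `remote` and the key path `~/.ssh/id_ed25519` are assumptions made for illustration; substitute whatever alias and key you actually use.

```shell
# Print a Host stanza for the SSH client config.
# The alias "remote" and the key path are hypothetical placeholders.
ssh_host_stanza() {
    cat <<EOF
Host remote
    HostName remote.com
    User remoteme
    IdentityFile ~/.ssh/id_ed25519
EOF
}

# Review the output first, then append it to your config, e.g.:
#   ssh_host_stanza >> ~/.ssh/config
ssh_host_stanza
```

With such an entry in place, the mount command can use the alias instead of the full user and host, e.g. `sshfs remote:/home/remoteme/myfiles /home/$USER/remotefiles`.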

Now, if we list the mountpoint locally, we see the remote file:

ls /home/$USER/remotefiles
accounts.txt

You can now do anything you’d like with this file, and your changes are written back to the remote system.

So far this isn’t that different from using, say, rsync or other copy-based syncing tools. However, if your remote directory holds a million files and 20 PB of datasets, and you simply want to browse files or quickly view some images, the advantage is clear: SSHFS only transfers what is actually accessed, whereas syncing tools copy everything (there are exceptions).

Advanced usage #

Caching for low latency #

Because your file operations are now translated to network operations, you can suffer high latency on slow networks. If you know you are the only one accessing these files, you can afford to let SSHFS cache them for a set period instead of checking with the server on every access. This dramatically speeds up file operations. Enabling this is straightforward; you add the options:

-o cache=yes -o cache_timeout=$CACHE -o kernel_cache

CACHE is an integer number of seconds for which the cache is kept, e.g. 300 for 5 minutes.
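To make the placement of these options concrete, here is a minimal sketch that assembles the full mount command with a 5 minute cache. The remote path and mountpoint reuse the example from earlier in the post; the value 300 is only an illustration. Option names have changed between SSHFS releases, so it is worth checking `man sshfs` on your system before relying on them.

```shell
# Illustrative values; adjust to your setup.
CACHE=300                                            # cache lifetime in seconds
REMOTE="remoteme@remote.com:/home/remoteme/myfiles"
MOUNTPOINT="$HOME/remotefiles"

CACHE_OPTS="-o cache=yes -o cache_timeout=${CACHE} -o kernel_cache"

# CACHE_OPTS is left unquoted on purpose so it splits into separate arguments.
# This only prints the command for review; remove the leading 'echo' to mount.
echo sshfs $CACHE_OPTS "$REMOTE" "$MOUNTPOINT"
```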

Robustness with reconnecting #

If your network connection goes down, the link with the remote server is broken, and you have to reissue the command. Instead, you can tell SSHFS to reconnect by itself, by adding:

-o reconnect
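Reconnecting only helps once the SSH client has noticed that the connection is dead. A common companion, sketched here as an assumption rather than something the setup above requires, is to pass the standard ssh_config keepalive options `ServerAliveInterval` and `ServerAliveCountMax` through SSHFS. The values 15 and 3 are illustrative, not tuned recommendations.

```shell
# Keepalive values are illustrative: probe every 15 s, give up after 3 misses.
RECONNECT_OPTS="-o reconnect -o ServerAliveInterval=15 -o ServerAliveCountMax=3"
REMOTE="remoteme@remote.com:/home/remoteme/myfiles"
MOUNTPOINT="$HOME/remotefiles"

# RECONNECT_OPTS is left unquoted on purpose so it splits into separate arguments.
# This only prints the command for review; remove the leading 'echo' to mount.
echo sshfs $RECONNECT_OPTS "$REMOTE" "$MOUNTPOINT"
```

With these settings, a silent connection is dropped after roughly 45 seconds, at which point the reconnect option can take over.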

Quite often, directories on Linux systems are set up as symlinks, i.e. symbolic links. By default SSHFS does not follow these, but it’s easy to enable:

-o follow_symlinks

Compression #

If your data is compressible, e.g. text files or uncompressed images (tiff), then you can further reduce transfer time (and bandwidth use), at the cost of CPU, by enabling on-the-fly compression:

-C

Debugging #

If you want to check if a directory is still mounted:

mount | grep sshfs
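For use in scripts, a check on one specific mountpoint is often more convenient than scanning the whole list. The helper below is a small sketch, assuming a Linux system where SSHFS mounts appear in `/proc/mounts` with the filesystem type `fuse.sshfs`. The function name `is_sshfs_mounted` is made up for this example.

```shell
# Succeeds if the given directory is an active SSHFS mount.
#   $1: mountpoint to look for
#   $2: mount table to read (optional, defaults to /proc/mounts)
# Note: paths containing spaces are escaped in /proc/mounts and will not match.
is_sshfs_mounted() {
    awk -v mp="$1" '
        $2 == mp && $3 == "fuse.sshfs" { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

if is_sshfs_mounted "$HOME/remotefiles"; then
    echo "mounted"
else
    echo "not mounted"
fi
```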

Bringing it all together #

A script that does all of the above can be found here. For example, to mount:

./remotemount.sh remote.server.com:/home/$REMOTEUSER /home/$USER/mountpoint 240

and to unmount:

./remotemount.sh /home/$USER/mountpoint
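The linked script itself is not reproduced here. The following is a hypothetical stand-in, not the author's script, that mimics the two invocations above under some assumptions: the optional third argument is taken to be the cache timeout in seconds (defaulting to 300), unmounting goes through `fusermount3 -u` or `fusermount -u` depending on which is installed, and a `DRY_RUN` variable (invented for this sketch) prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical sketch of a mount/unmount helper; NOT the script linked above.
#   Mount:   remotemount REMOTE MOUNTPOINT [CACHE_SECONDS]
#   Unmount: remotemount MOUNTPOINT
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# FUSE 3 ships fusermount3; older installs only have fusermount.
unmount_tool() {
    if command -v fusermount3 >/dev/null 2>&1; then
        echo fusermount3
    else
        echo fusermount
    fi
}

remotemount() {
    case $# in
        1)  # One argument: treat it as a mountpoint and unmount it.
            run "$(unmount_tool)" -u "$1"
            ;;
        2|3)
            remote=$1
            mnt=$2
            cache=${3:-300}   # assumed default: keep the cache for 300 s
            run mkdir -p "$mnt"
            run sshfs "$remote" "$mnt" \
                -o cache=yes -o cache_timeout="$cache" -o kernel_cache \
                -o reconnect -o follow_symlinks -C
            ;;
        *)
            echo "usage: remotemount REMOTE MOUNTPOINT [CACHE_SECONDS]" >&2
            echo "       remotemount MOUNTPOINT" >&2
            return 2
            ;;
    esac
}

# When run as a script with arguments, act on them directly.
if [ $# -gt 0 ]; then
    remotemount "$@"
fi
```

Running it first with `DRY_RUN=1` shows the exact commands that would be issued, which is a cheap way to check the options before touching a real mount.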

Conclusion #

SSHFS offers an elegant, fast, low-latency way to access files interactively on a remote machine, relying solely on SSH. Apart from installing SSHFS and FUSE on your local machine, you do not need root access; mounting works as a regular user.

Alternatives & Resources #