This is essentially a running log that explains why my SSH config is the byzantine cipher it is right now.

SSH is a program that sets up a secure (encrypted), authenticated connection between 2 hosts. It’s THE default for almost all remote administration. Sure, you may have RDP running somewhere, or VNC, but for command-line work SSH will be your friend or foe.

But SSH comes with a wealth of advantages:

Key based login, so no more passwords
Scriptable, so anything you do twice can be automated
Works everywhere, mostly because every remote *NIX system relies on it
Sets up tunnels between hosts
Allows you to jump across systems, so there can be any number of systems between you and your target
Finally, allows you to circumvent firewalls that block anything except 443 (TLS/SSL).

Whether circumventing a firewall is allowed by law or policy is your responsibility to check.

Use cases #

The below all assume you have set up key-based login. I don’t think there is a single tutorial or readme for Azure, DigitalOcean, your cluster, AWS, or most of Google that does not mention how to do that, and with good reason.

Automating login #

Automation saves time, so this

ssh -i <file> you@somehost.country -p 123 -o ....

becomes

# ~/.ssh/config
Host somehost.country
  User you
  Port 123
  IdentityFile <somefile>

Let’s also make our life a bit easier, by unlocking the keyfile in advance.

ssh-add <somefile>

Now the above has become

ssh somehost.country

SSH will figure out what to do by looking up the target host, which can be an IP as well, and applying your config.
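
If you want to see what SSH figured out without actually connecting, `ssh -G` prints the options it would apply to a host. A minimal sketch, assuming OpenSSH 6.8 or newer; the scratch file and the `resolved_options` helper are my own names, used here so the check does not depend on your real ~/.ssh/config:

```shell
# Scratch copy of the entry above, so the check does not touch ~/.ssh/config.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
Host somehost.country
  User you
  Port 123
EOF

# Print the user, hostname and port ssh would use for a host.
# -G resolves the configuration and exits without opening a connection;
# -F points ssh at the scratch file instead of the default config.
resolved_options() {
  ssh -G -F "$cfg" "$1" | grep -E '^(user|hostname|port) '
}
# Usage: resolved_options somehost.country
```

Drop the `-F "$cfg"` part to inspect your real configuration instead of the scratch copy.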

Bells and whistles #

There are a few ‘nice to haves’ you’ll want to add:

# ~/.ssh/config
Host ...
  ...
  ForwardAgent yes
  PermitLocalCommand yes
  LocalCommand printf ">>Not your local shell<<\n\n"

First, ForwardAgent. We have ssh-agent on your machine with all of your 50 or so private keys unlocked, but as soon as you jump, the remote system has no access to those keys. So you would need keys and an agent there as well. If you specify ForwardAgent, you’re essentially telling the remote system to disregard its own agent and pass key requests back to the one you have running, so your private keys never leave your machine. This is especially useful when you have multiple jumps. Another use case is:

you’re on a low-power laptop Y
you log in to a remote development machine X to do some heavy computing
you need to pull the latest source, but X does not have the private key for the GitHub repo
you have that key at Y

Another way of thinking about this is that ForwardAgent means you’re ‘you’ on any remote system.

That also means there are cases where you do NOT want to do this, ever. I’ll leave it to you to figure out when that’s the case.
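
On the remote machine (X in the example) a quick way to tell whether an agent is available is to look at SSH_AUTH_SOCK, which sshd sets when forwarding is active. A small sketch; `agent_reachable` is a hypothetical helper name, and it cannot tell a forwarded agent from one that was started on X itself:

```shell
# Succeeds only if SSH_AUTH_SOCK is set and points at an existing socket.
agent_reachable() {
  [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

if agent_reachable; then
  echo "agent socket found at $SSH_AUTH_SOCK"
else
  echo "no agent socket: this host can only use keys stored on it"
fi
```

If it reports a socket, `ssh-add -l` lists the keys that agent offers.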

The multijump #

A compute cluster usually has a topology (~ map of machines)

graph TD
  A[You] -->|SSH| B(Cluster Login Node)
  B -->|SSH| F[N3]
  A ==>|?| F
  subgraph Cluster
    B
    D[Compute Node X]
    E[Compute Node Y]
    F[Compute Node Z]
  end

The compute nodes are shielded to prevent exploitation. The split also partitions machines by task: heavy loads on a login node are unpleasant at best.

Let’s say you have a long-running job on Z. You’d need to do:

laptop>> ssh you@login.node
login.node>> ssh you@z.node
...

The above assumes you already saved yourself 75% of the typing by putting the options in the config file, so let’s use that to our advantage.

# ~/.ssh/config
Host login.node
  Port 123
  User you
  ...
  ForwardAgent Yes

Host compute.node
  Port 245
  User you
  ProxyJump login.node

Then you’d do

ssh compute.node

graph TD
  A[You] -->|SSH| B(Cluster Login Node)
  B -->|SSH| F[N3]
  A ==>|Automatic 2-hop Tunnel| F
  subgraph Cluster
    B
    D[Compute Node X]
    E[Compute Node Y]
    F[Compute Node Z]
  end

If you pass “-v” to ssh, it will show you in all its crypto glory what it is doing, but in short it will:

Look up compute.node in ~/.ssh/config
Find that it needs to jump to login.node first
Look up login.node in ~/.ssh/config
Use the options and perform the jump

With one command you now have a 2-step tunnel.
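
For a one-off you can get the same 2-step tunnel without any config entries by using `-J`, the command-line form of ProxyJump (OpenSSH 7.3 or newer). A sketch with the hosts, user and ports from the example above; `ssh_compute` is just a hypothetical wrapper name:

```shell
# One-off equivalent of the two Host entries above:
# -J names the jump host as user@host:port, -p is the port on the target.
# Extra arguments are passed through, so a remote command can be appended.
ssh_compute() {
  ssh -J you@login.node:123 -p 245 you@compute.node "$@"
}
# Usage: ssh_compute            # interactive shell on the compute node
#        ssh_compute hostname   # run a single command there
```

The config file version is still the better choice for anything you do twice, since scp, rsync and git pick it up as well.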

More fun jumping #

Sometimes you run into cases like this:

graph TD
  A[You] -->|SSH| X(Blocking ISP)
  X -->|Blocked| B(Cluster Login Node)
  B -->|SSH| F[N3]
  A ==>|?| F
  subgraph Cluster
    B
    D[Compute Node X]
    E[Compute Node Y]
    F[Compute Node Z]
  end

There are plenty of cases where intermediate routes filter, throttle, or even outright ban SSH. SSH is a latency-sensitive protocol: for it to be useful, you as a human expect it to respond within 30-100ms or so. If you type a letter, you don’t want to wait 2 seconds for it to appear, even if that happens only once every 10 letters.

A similar case occurs where your ISP is being banned or throttled by the cluster (and rightly so) because it has too many clients launching DDoS attacks. There is no way for you to fix this, so let’s work around it. Let’s assume you have access to another host. You don’t need to be rich to have this: AWS/Azure/Linode etc. all provide cheap VMs that serve this purpose, and they tend to reside on fast connections.

All you need to do now is add 1 more entry to your config:

Host aws.node
  Port 443
  User you
  ForwardAgent Yes

Host login.node
  Port 123
  User you
  ...
  ForwardAgent Yes
  ProxyJump aws.node

Host z.node
  Port 245
  User you
  ProxyJump login.node

graph TD
  A[You] -->|SSH| X(Blocking ISP)
  A[You] -->|SSH| Z(AWS.node)
  Z -->|SSH| B(Cluster Login Node)
  B -->|SSH| F[N3]
  A ==>|Automatic 3-hop Tunnel| F
  subgraph Cluster
    B
    D[Compute Node X]
    E[Compute Node Y]
    F[Compute Node Z]
  end

More advanced usage #

So far we assumed you can reach at least one host on the internet with SSH. Sometimes Wifi operators block any VPN/VNC/SSH-like protocols, for liability, information extraction, throttling and so forth. The one port and protocol that will almost always work is TCP 443 with TLS/SSL, which is what you use to access https:// sites. If that’s blocked you should move.

We know that HTTPS traffic is allowed, and we also know it’s opaque, because it’s encrypted using the target’s certificate. So if we send SSH traffic instead of HTTP payloads, we can reach the internet, and from there bunnyhop all the way to our target.

First, tell SSH to use a different port.

Host escape.me
  Port 443
  ...


Host work.me
  ...
  ProxyJump escape.me

Next you need to tell escape.me to treat 443 traffic differently. You will usually have a web server running there, so it needs to do:

Decrypt packet @ 443
If packet.type SSH
- Forward to local SSH daemon (which you configured)
If packet.type HTTP(S)
- Forward to web server

How to do this depends on your web server (nginx, Apache, and so forth). For nginx see:

https://nginx.org/en/docs/stream/ngx_stream_ssl_preread_module.html

https://www.linode.com/docs/guides/getting-started-with-nginx-part-1-installation-and-basic-setup/
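
For nginx, one variant uses the stream module with ssl_preread (the first link above). Note that this variant does not decrypt anything: it peeks at the first bytes of each connection, and a plain SSH client shows up with an empty `$ssl_preread_protocol`. That also means the SSH traffic is not wrapped in TLS here, so it only gets you past filters that look at the port number. A sketch, assuming sshd listens on 127.0.0.1:22, the web server has been moved to 127.0.0.1:8443 (both addresses are my assumptions), and nginx 1.15.2 or newer with the stream and ssl_preread modules:

```shell
# Write the snippet to a scratch file for review; nothing is installed.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
# Belongs at the top level of nginx.conf, next to (not inside) the http block.
stream {
    # Empty protocol: the client did not open with a TLS handshake,
    # which is what a plain SSH client looks like. Anything else is TLS.
    map $ssl_preread_protocol $upstream {
        ""      127.0.0.1:22;
        default 127.0.0.1:8443;
    }

    server {
        listen      443;
        ssl_preread on;
        proxy_pass  $upstream;
    }
}
EOF
echo "wrote $conf"
```

After merging it into your real configuration, `nginx -t` checks the syntax before you reload.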

On the client side (you), you’ll need a proxy tunnel program. There are plenty of options, and tutorials can be found with ease, so I’m not going to replicate them here.

What all of those instructions do is a lot easier to represent graphically:

Conclusion #

While all of this may seem painful to set up, and it can be at times, it’s a one-time setup. You gain productivity because it’s very hard to block your SSH access, which I assume you’re using for work. In other words, it’s a one-time cost, that pays dividends continuously, so ROI is infinite, assuming you work forever. Often you simply can’t do without this, so automating a critical part of your setup is worth the investment. Finally, what if you have a new laptop? Having a backup of your byzantine ~/.ssh/config file will make this a 2-minute setup.

Hard to block but not impossible: the weaknesses you should be aware of #

Certificate MITM #

SSL depends on certificates. If your employer has their own installed on your machine, they can easily decrypt your traffic and see, and thus block, your tunnels. That goes for every certificate in your chain, so a poorly maintained certificate store is a huge security hole. Ask yourself: do you know where they are installed?

Let’s assume you know and have verified the certificates. You think you have a secure tunnel with authentication, assuming the certificates work, all public/private key pairs are verified, and all host signatures are trusted. While a successful attack on SSL, when set up correctly, is very unlikely, an access point usually doesn’t need one to block you.

DNS #

Every step so far used DNS, and usually that means you trust your ISP and/or Wifi access point to handle DNS. If you can’t control DNS, you need to start from scratch, for example with DNSSEC, DNS over HTTPS, an offline DNS cache, hardcoded IP addresses and so forth. SSH does protect you here in that it will refuse to connect if the target’s host key changed, but that just leaves you with a blocked tunnel.
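
One way to take DNS out of the picture for a single host is to pin its address. A sketch, assuming 203.0.113.10 (a documentation-only address, not a real one) stands in for the IP of login.node; `ssh_pinned` is a hypothetical wrapper name:

```shell
# Connect to login.node by a pinned IP address, so no DNS lookup is needed.
# HostKeyAlias keeps the known_hosts check keyed on the name, not the IP.
ssh_pinned() {
  ssh -o HostName=203.0.113.10 -o HostKeyAlias=login.node "$@" login.node
}
# Usage: ssh_pinned
```

The same two options can live in the `Host login.node` entry as `HostName` and `HostKeyAlias` lines, at the price of updating them by hand when the address changes.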

Traffic analysis #

SSH, when used to interact with command-line remote systems, is a low-latency protocol that sends a lot of small packets. Sure, you can send data over it, but at least a large section of the traffic will look like bursts of small packets going back and forth. That’s not really what web traffic looks like, at least not classic web. If you watch YouTube, or read a page, there are a few requests outbound and a burst of data inbound. This is in part the reason why downstream bandwidth is or was 10x higher than upstream for consumer internet. With a large section of traffic being advertising, that just adds to the downstream link. There are other kinds of web traffic that will look like SSH encapsulated traffic in terms of packet size and frequency, e.g. messenger clients using https, or an interactive web app.

SSH has another ‘strength’ that becomes a weakness here: it has keepalive settings at both ends. You can configure the server and/or client to keep connections alive by sending an ‘are you there?’ packet every k seconds. That’s a tiny packet with a very specific signature, and the exchange itself is also very specific to SSH. All the access point has to ask, for each TCP connection to a remote host at port 443, is:

Which always have bursts of small packets?
Which have the giveaway ‘keepalive’ exchange?
Which sequence of packets looks like the connection establishment exchange?

An access point does not need to be 100% sure to ban you. More often than not you’re using a free resource that does not owe you anything. Good luck complaining to a coffee shop owner that your twice-encrypted SSH tunnel isn’t working.

Active defense #

The access point can take this one step further: it can drop packets at will. We’re using TCP, which guarantees acknowledged delivery of each packet. Let’s say you own the access point, and you suspect a given TCP connection is using SSH tunneled over SSL. If you break that connection, or start dropping packets, the tunnel will be re-established with a specific exchange of packets that is almost certainly different from what you’d see when losing a connection to YouTube. So not only do you have near infinite training data, by setting up a tunnel yourself and learning what it looks like, you can also poke it live to see if it really is SSH. That also prevents customers complaining that a certain website does not work, because you would not ban the target host, just drop the connection and pay careful attention to what happens next. You repeat this until your confidence, using statistics, increases to the point where you issue a blanket ban.

Countermeasures #

You can keep up an arms race, but at that point ROI goes negative. For example, you can force SSH traffic into larger packets, but then your interaction latency will spike. During idle times you can send large packets of nothing more than noise, e.g. " \n", which has no effect on the target shell. You can switch off ‘Keepalive’, but it’s there for a reason, so you now end up with a lot of dead tunnels. You can set up a whole group of targets, hopping from one to the other. But if you have enough money and time to do all that, losing productivity at the same time, why not just use a 5G access point? SSH is low bandwidth so it would almost certainly be cheaper, and a lot faster, and it’d work everywhere.
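
Switching off keepalive for a single connection might look like the sketch below, with `ssh_quiet` as a hypothetical wrapper name. `ServerAliveInterval=0` disables the SSH-level ‘are you there?’ messages sent by the client, and `TCPKeepAlive=no` disables the TCP-level probes. The server has its own settings (`ClientAliveInterval` in sshd_config) that you’d have to change on that side.

```shell
# Run ssh with client-side keepalives disabled for this connection only.
# All arguments are passed through to ssh unchanged.
ssh_quiet() {
  ssh -o ServerAliveInterval=0 -o TCPKeepAlive=no "$@"
}
# Usage: ssh_quiet escape.me
```

Options given with `-o` take precedence over the config file, so this also overrides a keepalive interval set in a catch-all `Host *` entry.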

Latency #

SSL/SSH encryption will cost CPU cycles, but on modern CPUs with hardware acceleration for common ciphers that should be minimal extra overhead. The true cost will be in extra network hops. If you jump a continent, expect your latency to jump as well, to 150-200ms depending on routes etc. If your intermediate host is heavily loaded, similar problems arise. If you need to jump 4 hosts, it may be better for your sanity to just move yourself.

Conclusion #

SSH is very powerful, and can be automated to do very complex tasks with very little work, once it’s set up. While it can be configured to circumvent a slew of complex roadblocks, this will usually only work if those roadblocks are there inadvertently, to block other traffic, and are not targeted at you. In short, the above workarounds are useful to get work done when you are inadvertently hindered from doing so, not when someone is specifically prohibiting you from using SSH.
