“S” Stands for Security in IoT

The thoughtful decisions to make sure you don’t let others control your home

When you start working on your home automation project, always weigh the security risk of each function you implement against the benefit it brings. Do you really need to disable your alarm remotely if you will be at home anyway whenever you want to do that?

Also take your own level of knowledge into account before you trust a technology. Do you really understand what the small lock in front of your URL means?

The defaults

When working on your project you will be using lots of existing frameworks, downloaded code snippets and projects. Ideally you should read all of the code and understand it, but with a huge project like Home Assistant that is not realistic.

At a minimum you should know which IDs, passwords and tokens have default values in the downloaded code, and make sure to change them. A few examples:

Zigbee2MQTT’s default network key
Devices on Zigbee networks identify the network with a network key. If you use Zigbee2MQTT it comes with a default network key. If you don’t change it, you may let unwanted devices join your network.
advanced:
  network_key: [7, 3, 5, 7, 9, 11, 13, 15, 0, 2, 4, 6, 8, 11, 12, 13]
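One way to get a replacement key is to let a cryptographic random generator pick the 16 values. The following is only a minimal sketch; the function name generate_network_key is made up for this example, and it assumes the key is written as a list of 16 byte values like the one above.

```python
import secrets


def generate_network_key() -> list[int]:
    """Return 16 cryptographically random byte values (0-255)."""
    return list(secrets.token_bytes(16))


key = generate_network_key()
# Print the key in the same list form as the network_key setting.
print("network_key:", key)
```

Copy the printed list into your configuration before pairing any devices, since changing the key later usually means re-pairing them.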

Wireless Routers’ default admin passwords
All it takes to get into the admin interface of a router whose default admin password was never changed is a look at its manual.

Tokens

You will find many devices that have some level of authentication secured with a token. Tokens are usually generated for a specific purpose and have a predefined lifetime. The basic principle is that tokens are long and random enough that it would take forever to find them by brute forcing (trying all the combinations). The authentication check happens when the connection is established, and if you can’t provide the correct token you are simply rejected. So tokens are like a username and password bundled together.
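To get a feeling for "long and random enough", here is a small sketch that creates a token, checks a presented token, and estimates the brute-force time. The token length (32 bytes) and the attacker speed are assumptions chosen for illustration, and is_authorized is a hypothetical helper, not the API of any specific device.

```python
import hmac
import secrets

# A 32-byte random token, hex-encoded (64 characters).
token = secrets.token_hex(32)


def is_authorized(presented: str, expected: str) -> bool:
    """Compare tokens in constant time so timing does not leak information."""
    return hmac.compare_digest(presented.encode(), expected.encode())


# Rough brute-force estimate: all possible tokens vs. guesses per second.
possible_tokens = 2 ** (32 * 8)
guesses_per_second = 10 ** 12  # assumed, deliberately generous attacker
years = possible_tokens / guesses_per_second / (3600 * 24 * 365)
print(f"about {years:.1e} years to try every token")
```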

In certain cases you can generate the tokens yourself, like the Home Assistant tokens in the administration interface, or when you are interfacing with Google services. In other cases it is quite difficult to find tokens, especially if the developer didn’t want you to access the device from anything other than the original application. A good example is Xiaomi’s vacuum cleaners, where you either need to root your phone and dig through databases, or install a specific earlier version of the official app that has debug output enabled by accident so you can find the token in the logs. ( https://www.home-assistant.io/integrations/vacuum.xiaomi_miio/#retrieving-the-access-token )

The security level of tokens is defined by the device’s vendor. What you need to look out for is not to share your token with anyone, and especially not to make it public by accident, for example with a git commit. Most frameworks have options to keep these tokens out of the main configuration, like using !secret in Home Assistant ( https://www.home-assistant.io/docs/configuration/secrets/ )
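The same idea applies to your own scripts: keep the token out of the file you commit. A minimal sketch, assuming you store the token in an environment variable; the variable name HA_TOKEN and the function load_token are made up for this example.

```python
import os


def load_token(name: str = "HA_TOKEN") -> str:
    """Read a token from an environment variable instead of the source file."""
    token = os.environ.get(name)
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first")
    return token
```

With this approach the script can be shared or committed freely, because the token itself only lives on the machine that runs it.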

TLS/SSL

Ever wondered what the small lock icon next to a webpage’s URL is? Here is a brief overview.

Sending information over a network

By default the information you send over the internet or other networks is not encrypted. Why should you care about this? Without encryption, the passwords you use for logging in to your bank account, email, etc. would be visible to third parties. Thus the task is to encrypt the data in a way that only you and the intended recipient can access it.

Private-key (Symmetric encryption)

Symmetric private key encryption means that before the actual communication the two (or more) parties share a key with each other and use the same key to encrypt and decrypt the messages.

Pros:

Simple algorithms, easy to implement
Doesn’t need much computational resources

Cons:

You need to keep the key secret, otherwise everyone can decrypt the messages
If you have just got in touch with a totally new party, how do you share the key over the network in a secure way?
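The shared-key idea can be sketched in a few lines. This is a toy cipher built only to show that the same key encrypts and decrypts; it is not a secure algorithm and real systems use vetted ciphers such as AES. The function names keystream and xor_crypt are made up for this example.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stretch key + nonce into `length` bytes using SHA-256 with a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_key = secrets.token_bytes(32)  # both parties must already hold this
nonce = secrets.token_bytes(16)       # fresh per message, may be sent openly

ciphertext = xor_crypt(shared_key, nonce, b"turn on the lights")
plaintext = xor_crypt(shared_key, nonce, ciphertext)
```

Anyone who learns shared_key can run the same function and read the message, which is exactly the weakness listed in the cons above.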

Public-key (Asymmetric encryption )

Asymmetric public key encryption means that there is a pair of keys: a public and a private one. The two keys belong together as they are generated by a specific mathematical formula. At the same time it is practically impossible to work out the private key from the public one. You can freely distribute the public one, and any party can use it to encrypt messages. These messages can then only be decrypted with the paired private key, not even with the same public key that was used for the encryption!

Pros:

Only the private key needs to be kept confidential, you can distribute the public one freely
Can be used to append digital signatures to the message so the sender’s identity can be validated too

Cons:

Complex algorithms, will make your small ESP sweat
Could slow the communication down due to computational needs
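A toy version of the key pair idea, using the RSA formula with tiny primes so the numbers stay readable. This is a sketch for intuition only: real keys use primes that are hundreds of digits long plus padding schemes, and the helper name apply_key is made up here.

```python
# Toy RSA key pair built from two tiny primes.
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

public_key = (e, n)          # safe to hand out
private_key = (d, n)         # must stay secret


def apply_key(value: int, key: tuple[int, int]) -> int:
    """Raise value to the key's exponent modulo n (encrypts or decrypts)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                  # a number smaller than n
ciphertext = apply_key(message, public_key)   # anyone can do this
recovered = apply_key(ciphertext, private_key)  # only the key owner can
```

Applying the public key a second time does not give the message back, which is the property the paragraph above describes.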

How to combine benefits?

A straightforward conclusion from the above point is to:

Use Asymmetric Public-Key encryption only at the beginning of the communication to generate and securely share a Symmetric Private Key over the network. This initial phase is relatively short, so its computational needs are not a big deal. It also enables the parties to validate each other’s identities.
Now that you have a securely shared Symmetric Private Key, you have overcome the main challenge of Symmetric encryption! Switch to Symmetric Private Key encryption and continue your communication with it. This will be comfortable even for a small ESP device.
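The two steps above can be sketched end to end with toy building blocks: a tiny RSA key pair to transport a random session key, then a simple XOR cipher for the actual data. Everything here is illustrative and insecure by design (tiny primes, a home-made cipher, names invented for the example); it only shows the order of operations.

```python
import hashlib
import secrets

# Server's toy RSA key pair (tiny primes, illustration only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Step 1: the client picks a random session key and encrypts it, byte by
# byte, with the server's PUBLIC key (e, n).
session_key = secrets.token_bytes(16)
wrapped = [pow(byte, e, n) for byte in session_key]

# Step 2: the server recovers the session key with its PRIVATE key (d, n).
server_key = bytes(pow(c, d, n) for c in wrapped)

# Step 3: both sides switch to the cheap symmetric cipher.
ciphertext = xor_crypt(session_key, b"sensor reading: 21.5 C")
received = xor_crypt(server_key, ciphertext)
```

The expensive asymmetric math runs only 16 times here, once per key byte, while the bulk data goes through the cheap symmetric step.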

So this is TLS

TLS or Transport Layer Security evolved from SSL or Secure Sockets Layer.

TLS practically combines the benefits of Symmetric and Asymmetric encryption: at the beginning of the communication it negotiates a Private Key and validates identity via Asymmetric Encryption, then it moves on to Symmetric encryption with the securely shared Private Key.

Of course in the background this is more complex, as the parties also negotiate the actual encryption algorithms to be used, the protocol version and such. They also use two pairs of Asymmetric Private and Public keys, one per party, to mutually negotiate the Symmetric Private Key via an elliptic-curve key exchange (elliptic-curve Diffie-Hellman). Random numbers play a significant role in this process too, so a third party cannot simply record the communication and play it back as an attack. However for everyday usage the information above is perfectly sufficient. If you are interested you can see the whole process in detail here: https://tls.ulfheim.net/
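To illustrate how two key pairs can produce one shared secret, here is a sketch of the classic Diffie-Hellman exchange with plain modular arithmetic. Note that this is a swapped-in, simpler relative of the elliptic-curve variant TLS actually uses, and the parameters P and G are toy values picked for this example.

```python
import secrets

# Public parameters known to everyone (toy values; real TLS uses
# standardized elliptic curves or much larger groups).
P = 2**127 - 1   # a prime modulus
G = 5            # base

# Each side creates a fresh random private value and a matching public value.
client_private = secrets.randbelow(P - 2) + 1
server_private = secrets.randbelow(P - 2) + 1
client_public = pow(G, client_private, P)
server_public = pow(G, server_private, P)

# Only the public values cross the network. Each side combines its own
# private value with the other side's public value.
client_secret = pow(server_public, client_private, P)
server_secret = pow(client_public, server_private, P)
```

Both sides end up with the same number without ever sending it, and because the private values are random for every connection, a recorded session cannot simply be replayed.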

Certificates

We have solved many challenges related to authentication and encryption but there is one more thing: who guarantees that the public key and the domain really belong together? In other words: how do you know you are communicating with the party it claims to be?

This is where certificates help. A certificate creates a logical link between a public key and the domain it belongs to. For example, the certificate you download from google.com tells you that the public key it contains belongs to google.com, so the domain you visited and the certificate match. Of course you do not need to download certificates and check their content manually; your browser does all of this in the background. If you visited google.com and received a certificate that says it belongs to google1.com, the certificate could not be trusted and your communication would not be safe. Likewise, if you visited google1.com and the downloaded certificate said it belongs to google.com, the google1.com website shouldn’t be trusted.
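If you write your own scripts, Python’s standard ssl module can do the same checks a browser does. A minimal sketch: the default context verifies the certificate and compares it with the host name you asked for. The function name fetch_peer_certificate is made up for this example, and calling it needs a network connection.

```python
import socket
import ssl

# The default context loads the system's trusted CA certificates and turns
# on both certificate verification and host name checking.
context = ssl.create_default_context()


def fetch_peer_certificate(host: str, port: int = 443) -> dict:
    """Connect with TLS and return the server certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate is not trusted
    or does not match `host`.
    """
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()
```

The important design choice is to keep these defaults; turning verification off to "make an error go away" removes exactly the protection described above.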

Certificate Authorities (CA)

Now we have Certificates that say a certain public key belongs to a certain site. But why should you trust the Certificate?

Because the Certificate is Digitally Signed. The good news is that we already covered the topic needed to understand Digital Signatures, and that is Asymmetric Encryption. Signing something digitally happens like this:

You generate a special segment of data based on the message to be sent and a mathematical algorithm – this is called a hash. This hash is typically much shorter than the message itself and its main characteristic is that if you change even a small piece of the message then the hash significantly changes too, in a way that to get the new hash you need to recalculate it for the whole new message.
After the hash is calculated you encrypt it with your Private Key, append it to the original message and send it to the recipient party.
The recipient party then decrypts the hash with the sender’s Public Key, the pair of the Private Key used for signing. If this succeeds, the sender’s identity is validated.
Then the recipient party recalculates the hash for the message. If the recalculated hash equals the decrypted hash, it is safe to process the message because it was not changed after it was signed.
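The signing steps above can be sketched with a hash from the standard library and a toy RSA key pair. The primes are well-known Mersenne primes chosen only to keep the code short; they are far too small and too special for real use, and real signature schemes add padding. The function names are made up for this example.

```python
import hashlib

# Toy RSA key pair for signing (illustration only, not secure).
p, q, e = 2**127 - 1, 2**89 - 1, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent


def digest(message: bytes) -> int:
    """SHA-256 hash of the message, reduced so it fits the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Sender: 'encrypt' the hash with the private key."""
    return pow(digest(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the hash with the public key and compare it."""
    return pow(signature, e, n) == digest(message)


message = b"unlock front door"
signature = sign(message)
```

Calling verify(message, signature) succeeds, while the same signature attached to a changed message fails, because the recalculated hash no longer matches.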

So the Certificate Authority signs your Certificate and thus the other party knows it can be trusted.

But how can you trust a Certificate Authority?

Because another Certificate Authority vouches for its identity. And how can you trust that one? Because there is yet another one above it. Fortunately this is not an endless cycle but a well-defined chain of trust in a hierarchy.

There are so-called Root Certificate Authorities and there are only a few of them. Their certificates come preinstalled in your operating system or browser, which is where the chain of trust starts. Industry rules and regular audits (and in some countries legal regulations) require them to put real effort into validating the Certificate Authorities lower in the hierarchy so those can be trusted too. Those lower ones are in turn required to validate the authenticity of the ones below them, all the way down to the very Certificate you download from a specific website.
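Walking such a chain can be pictured as following "issued by" links until a trusted root is reached. The sketch below uses invented certificate names and skips the actual signature checks, so treat it as a picture of the hierarchy rather than a real validator.

```python
from typing import Optional

# Hypothetical certificates: each entry names who issued (signed) it.
certificates = {
    "example.org": {"issuer": "Intermediate CA"},
    "Intermediate CA": {"issuer": "Root CA"},
    "Root CA": {"issuer": "Root CA"},   # root certificates are self-signed
    "shady.example": {"issuer": "shady.example"},  # self-signed, not trusted
}

# Assumed set of roots that shipped with the operating system or browser.
trusted_roots = {"Root CA"}


def build_chain(subject: str) -> Optional[list]:
    """Follow issuer links from subject up to a trusted root, if possible."""
    chain = [subject]
    while chain[-1] not in trusted_roots:
        cert = certificates.get(chain[-1])
        if cert is None or cert["issuer"] in chain:
            return None  # unknown issuer or a loop: no path to a trusted root
        chain.append(cert["issuer"])
    return chain


print(build_chain("example.org"))
print(build_chain("shady.example"))
```

A real client additionally verifies the digital signature on every link, along with expiry dates and revocation status.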

Summary

To summarize what we have learnt so far:

The main goals of TLS and its Certificates are to encrypt the data sent over the network so it stays confidential, and to validate the identity of the holder of the Certificate
It means that you will still need your Username and Password to log into your bank account. At the same time this data will be sent in a secure way, and you can be sure that you are really sending it to the bank and not to some other website that wants you to believe it is your bank
The TLS Certificates are usually provided by the server side of the communication. However there are cases when Client Certificates are used, and in those cases they validate the identity of the client. As you won’t be using them a lot in Home Automation systems, we are not covering this topic here. Once you understand server side certificates, it is quite easy to guess how it works from the client side too. More information can be found here: https://en.wikipedia.org/wiki/Transport_Layer_Security#Client-authenticated_TLS_handshake