

01. [??p] Commit signing

When looking at the git log of a repository, you would normally see a sequence of entries such as this:

Each commit is identified by a 40-character hexstring. A hexstring is the hexadecimal representation of a sequence of bytes, where every nibble (i.e. 4 bits) is represented by one character ranging from '0' (= 0b0000) to 'f' (= 0b1111). Since every byte holds two nibbles, 40 characters correspond to 20 bytes. But how is this identifier calculated?
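To make the byte/nibble relationship concrete, here is a minimal Python sketch. The four sample bytes are made up for illustration and do not come from any real commit.

```python
# Four arbitrary example bytes (hypothetical data, not a real digest).
digest_bytes = bytes([0x1F, 0xA0, 0x0B, 0xFF])

# Each byte becomes two hex characters, one per nibble.
hex_string = digest_bytes.hex()
print(hex_string)

# The conversion is lossless: the hexstring maps back to the same bytes.
round_trip = bytes.fromhex(hex_string)

# A 20-byte digest therefore needs 20 * 2 = 40 characters.
sha1_hex_length = 20 * 2
```

The same reasoning explains the length of a git commit identifier: 20 bytes of digest, two characters per byte.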

Well… this commit identifier is called a hash value or a digest, and is the output of a hash function. A hash function takes an arbitrary amount of data and outputs a fixed-size bit array that is representative of the input. Normally, one would be wary of collisions: if the function's domain is virtually infinite and the co-domain is not only finite but rather small in comparison, wouldn't it be possible to create two commits with the same digest? Possible, yes. Likely, no. Git uses SHA-1, a cryptographic hash function. What makes a cryptographic hash function special is that it is designed to provide certain guarantees. For example, it should be computationally infeasible to recover potential messages from a digest (meaning that the function is non-invertible in practice). Moreover, any change in the input, no matter how small, should change the hash value so extensively that the new value and the old one appear uncorrelated. As such, a commit that is not only comprehensible but also produces a specific desired digest is so unlikely to occur naturally that it should not pose any risk.
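The sketch below illustrates both points using Python's hashlib. It relies on a detail the text above does not spell out: Git hashes an object as a short header (`<type> <size>` followed by a NUL byte) plus the raw content. The helper names `git_object_id` and `differing_bits` are made up for this example.

```python
import hashlib


def git_object_id(kind: str, content: bytes) -> str:
    """SHA-1 over '<kind> <size>' + NUL + content, returned as a hexstring."""
    header = f"{kind} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Two inputs that differ by a single character.
first = git_object_id("blob", b"hello world\n")
second = git_object_id("blob", b"hello world!\n")

print(first)
print(second)
# A well-behaved hash flips roughly half of the 160 output bits.
print(differing_bits(first, second), "of 160 bits differ")
```

A commit identifier is obtained the same way, with `commit` as the object type and the commit's text (tree, parents, author, committer, message) as the content. You can compare the blob result against `git hash-object` on a file with the same contents.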

There is, however, a more significant risk here. What guarantee do you have that the author of the commit above is actually Linus himself? Using something like git-blame-someone-else, you could rewrite commits and change their author, then git push --force to replace the remote history. The answer to this problem is commit signing.

Cryptographic algorithms can be loosely categorized as symmetric and asymmetric. Symmetric algorithms like AES use the same key for both encryption and decryption. Asymmetric algorithms like RSA, on the other hand, use a pair of keys: what one key transforms, only the other can reverse. One of them is the private key and the other is the public key. When signing, the private key produces the signature (with RSA, by encrypting a digest of the data) and the public key verifies it. When encrypting a message for someone, the roles are reversed: the recipient's public key encrypts and their private key decrypts. The private key is your identity, so you don't share it with anyone. The public key you configure on remote servers to give them the ability to verify your identity (e.g.: configuring SSH keys on fep.grid.pub.ro).

As a rule, you use asymmetric cryptography only in cases where you need to prove your identity to a remote host, establish secure communication channels, negotiate session parameters, etc. The reason for this is that asymmetric algorithms are orders of magnitude slower than their symmetric counterparts. While encrypting the output of a SHA256 function (32 bytes) with a 4096-bit RSA key takes about 5ms on regular CPUs, it takes almost a full minute on an Arduino Mega (with an MCU running at 16MHz). AES-256, on the other hand, encrypts the same amount of data in less than 5 microseconds. The downside of symmetric algorithms is that you need to share your key with other systems for them to extract the plaintext message.
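To see the sign/verify idea in action, here is a toy sketch of textbook RSA signing in plain Python (3.8+ for the modular inverse). Everything in it is an assumption made for illustration: the key is built from two well-known Mersenne primes, is far too small to be secure, and uses no padding, so it is not how gpg or any real tool signs data. The commit text and the names `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy key pair: N is public, D is the private exponent, E the public one.
P = 2**61 - 1          # Mersenne prime
Q = 2**89 - 1          # Mersenne prime
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of the message, reduced so it fits below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Private-key operation: 'encrypt' the digest with D."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Public-key operation: undo the signature with E and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


commit_text = b"author Alice <alice@example.com>\n\nFix typo\n"
signature = sign(commit_text)

# The untouched commit should verify against its own signature.
print(verify(commit_text, signature))

# Changing the author changes the digest, so the old signature
# should no longer match.
forged_text = commit_text.replace(b"Alice", b"Mallory")
print(verify(forged_text, signature))
```

Note that only the short digest goes through the slow asymmetric operation, never the whole commit. That is why the timing comparison above talks about encrypting 32 bytes.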

In the following tasks we will introduce GNU Privacy Guard (gpg), an open-source encryption and signing tool. git can use gpg to sign your commits as you create them. For GitHub to be able to verify those signatures, you will also have to upload your public key to https://github.com/settings/keys.

[??p] Task A - GNU Privacy Guard

ii/labs/04/tasks/01.1638729853.txt.gz · Last modified: 2021/12/05 20:44 by radu.mantu
CC Attribution-Share Alike 3.0 Unported