03. [50p] Solving a substitution cipher

Time to get your hands dirty! Below, you have a ciphertext. Specifically, this output is the result of a substitution cipher, meaning that every letter in the English alphabet has been assigned a random, unique correspondent. As you may have noticed, digits and special characters remain unchanged.

Click to display ⇲

Click to hide ⇱

HXY KIRNUJ SPHFO LTPHPSIOUJ KTLZ OFU HXSQLRBUT QVLO LK 1605, I KIPVUB
WLSJQPTIWY GY I HTLXQ LK QTLEPSWPIV USHVPJF WIOFLVPWJ OL IJJIJJPSIOU OFU
QTLOUJOISO NPSH MIZUJ P LK USHVISB ISB EP LK JWLOVISB ISB TUQVIWU FPZ RPOF
I WIOFLVPW FUIB LK JOIOU. PS OFU PZZUBPIOU IKOUTZIOF LK OFU 5 SLEUZGUT ITTUJO
LK HXY KIRNUJ, WIXHFO HXITBPSH I WIWFU LK UDQVLJPEUJ QVIWUB GUSUIOF OFU FLXJU
LK VLTBJ, MIZUJ'J WLXSWPV IVVLRUB OFU QXGVPW OL WUVUGTIOU OFU NPSH'J JXTEPEIV
RPOF GLSKPTUJ, JL VLSH IJ OFUY RUTU "RPOFLXO ISY BISHUT LT BPJLTBUT". OFPJ ZIBU
1605 OFU KPTJO YUIT OFU QVLO'J KIPVXTU RIJ WUVUGTIOUB.

OFU KLVVLRPSH MISXITY, BIYJ GUKLTU OFU JXTEPEPSH WLSJQPTIOLTJ RUTU UDUWXOUB,
QITVPIZUSO, IO OFU PSPOPIOPLS LK MIZUJ P, QIJJUB OFU LGJUTEISWU LK 5OF SLEUZGUT
IWO, WLZZLSVY NSLRS IJ OFU "OFISNJHPEPSH IWO". PO RIJ QTLQLJUB GY I QXTPOIS
ZUZGUT LK QITVPIZUSO, UBRITB ZLSOIHX, RFL JXHHUJOUB OFIO OFU NPSH'J IQQITUSO
BUVPEUTISWU GY BPEPSU PSOUTEUSOPLS BUJUTEUB JLZU ZUIJXTU LK LKKPWPIV
TUWLHSPOPLS, ISB NUQO 5 SLEUZGUT KTUU IJ I BIY LK OFISNJHPEPSH RFPVU PS OFULTY
ZINPSH IOOUSBISWU IO WFXTWF ZISBIOLTY.[4] I SUR KLTZ LK JUTEPWU RIJ IVJL IBBUB
OL OFU WFXTWF LK USHVISB'J GLLN LK WLZZLS QTIYUT, KLT XJU LS OFIO BIOU. VPOOVU
PJ NSLRS IGLXO OFU UITVPUJO WUVUGTIOPLSJ. PS JUOOVUZUSOJ JXWF IJ WITVPJVU,
SLTRPWF, ISB SLOOPSHFIZ, WLTQLTIOPLSJ (OLRS HLEUTSZUSOJ) QTLEPBUB ZXJPW ISB
ITOPVVUTY JIVXOUJ. WISOUTGXTY WUVUGTIOUB 5 SLEUZGUT 1607 RPOF 106 QLXSBJ (48 NH)
LK HXSQLRBUT ISB 14 QLXSBJ (6.4 NH) LK ZIOWF, ISB OFTUU YUITJ VIOUT KLLB ISB
BTPSN RIJ QTLEPBUB KLT VLWIV BPHSPOITPUJ, IJ RUVV IJ ZXJPW, UDQVLJPLSJ, ISB I
QITIBU GY OFU VLWIV ZPVPOPI. UEUS VUJJ PJ NSLRS LK FLR OFU LWWIJPLS RIJ KPTJO
WLZZUZLTIOUB GY OFU HUSUTIV QXGVPW, IVOFLXHF TUWLTBJ PSBPWIOU OFIO PS OFU
QTLOUJOISO JOTLSHFLVB LK BLTWFUJOUT I JUTZLS RIJ TUIB, OFU WFXTWF GUVVJ TXSH,
ISB GLSKPTUJ ISB KPTURLTNJ VPO.

Your task is to write a Python script that will help you break the cipher and decode the original text:

Remember ANSI codes?

$ echo -e "\033[1;34m I'm blue, da ba dee, dabba daa-ee, dabba dee-a dabba da \033[0m"

In breaking a short substitution cipher like this while also knowing the original language, you need to look at bigrams and trigrams. Small groups of letters that have a limited amount of possible values that make sense: “to”, “and”, “the”, etc. As you reveal more and more of the original text, words will begin to form, making everything progressively easier.

If you need an extra hint:

Click to display ⇲

Click to hide ⇱

“5 SLEUZGUT” looks like a date. Hmm… “SLEUZGUT”

Already done? Try this challenge as well.


Alternatively, if you want to keep learning python, take a look at these labs for an introduction into NumPy (a fundamental module for numeric computation) and matplotlib (a module for plotting).