This shows you the differences between two versions of the page.
|
ii:labs:02:tasks:03 [2021/11/07 16:29] radu.mantu |
ii:labs:02:tasks:03 [2024/11/01 10:59] (current) radu.mantu [03. [50p] Solving a substitution cipher] |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ==== 03. [50p] Solving a substitution cipher ==== | ==== 03. [50p] Solving a substitution cipher ==== | ||
| + | |||
| + | [[https://ocw.cs.pub.ro/courses/_media/ii/labs/02/tasks/crack_ciphertext.png|{{ :ii:labs:02:tasks:crack_ciphertext.png?700 |}}]] | ||
| Time to get your hands dirty! Below, you have a [[https://en.wikipedia.org/wiki/Ciphertext|ciphertext]]. Specifically, this output is the result of a [[https://en.wikipedia.org/wiki/Substitution_cipher|substitution cipher]], meaning that every letter in the English alphabet has been assigned a random, unique correspondent. As you may have noticed, digits and special characters remain unchanged. | Time to get your hands dirty! Below, you have a [[https://en.wikipedia.org/wiki/Ciphertext|ciphertext]]. Specifically, this output is the result of a [[https://en.wikipedia.org/wiki/Substitution_cipher|substitution cipher]], meaning that every letter in the English alphabet has been assigned a random, unique correspondent. As you may have noticed, digits and special characters remain unchanged. | ||
| - | Your task is to write a //Python// script that will help you break the cipher and decode the original text. | + | <spoiler> |
| + | <code> | ||
| + | HXY KIRNUJ SPHFO LTPHPSIOUJ KTLZ OFU HXSQLRBUT QVLO LK 1605, I KIPVUB | ||
| + | WLSJQPTIWY GY I HTLXQ LK QTLEPSWPIV USHVPJF WIOFLVPWJ OL IJJIJJPSIOU OFU | ||
| + | QTLOUJOISO NPSH MIZUJ P LK USHVISB ISB EP LK JWLOVISB ISB TUQVIWU FPZ RPOF | ||
| + | I WIOFLVPW FUIB LK JOIOU. PS OFU PZZUBPIOU IKOUTZIOF LK OFU 5 SLEUZGUT ITTUJO | ||
| + | LK HXY KIRNUJ, WIXHFO HXITBPSH I WIWFU LK UDQVLJPEUJ QVIWUB GUSUIOF OFU FLXJU | ||
| + | LK VLTBJ, MIZUJ'J WLXSWPV IVVLRUB OFU QXGVPW OL WUVUGTIOU OFU NPSH'J JXTEPEIV | ||
| + | RPOF GLSKPTUJ, JL VLSH IJ OFUY RUTU "RPOFLXO ISY BISHUT LT BPJLTBUT". OFPJ ZIBU | ||
| + | 1605 OFU KPTJO YUIT OFU QVLO'J KIPVXTU RIJ WUVUGTIOUB. | ||
| + | |||
| + | OFU KLVVLRPSH MISXITY, BIYJ GUKLTU OFU JXTEPEPSH WLSJQPTIOLTJ RUTU UDUWXOUB, | ||
| + | QITVPIZUSO, IO OFU PSPOPIOPLS LK MIZUJ P, QIJJUB OFU LGJUTEISWU LK 5OF SLEUZGUT | ||
| + | IWO, WLZZLSVY NSLRS IJ OFU "OFISNJHPEPSH IWO". PO RIJ QTLQLJUB GY I QXTPOIS | ||
| + | ZUZGUT LK QITVPIZUSO, UBRITB ZLSOIHX, RFL JXHHUJOUB OFIO OFU NPSH'J IQQITUSO | ||
| + | BUVPEUTISWU GY BPEPSU PSOUTEUSOPLS BUJUTEUB JLZU ZUIJXTU LK LKKPWPIV | ||
| + | TUWLHSPOPLS, ISB NUQO 5 SLEUZGUT KTUU IJ I BIY LK OFISNJHPEPSH RFPVU PS OFULTY | ||
| + | ZINPSH IOOUSBISWU IO WFXTWF ZISBIOLTY.[4] I SUR KLTZ LK JUTEPWU RIJ IVJL IBBUB | ||
| + | OL OFU WFXTWF LK USHVISB'J GLLN LK WLZZLS QTIYUT, KLT XJU LS OFIO BIOU. VPOOVU | ||
| + | PJ NSLRS IGLXO OFU UITVPUJO WUVUGTIOPLSJ. PS JUOOVUZUSOJ JXWF IJ WITVPJVU, | ||
| + | SLTRPWF, ISB SLOOPSHFIZ, WLTQLTIOPLSJ (OLRS HLEUTSZUSOJ) QTLEPBUB ZXJPW ISB | ||
| + | ITOPVVUTY JIVXOUJ. WISOUTGXTY WUVUGTIOUB 5 SLEUZGUT 1607 RPOF 106 QLXSBJ (48 NH) | ||
| + | LK HXSQLRBUT ISB 14 QLXSBJ (6.4 NH) LK ZIOWF, ISB OFTUU YUITJ VIOUT KLLB ISB | ||
| + | BTPSN RIJ QTLEPBUB KLT VLWIV BPHSPOITPUJ, IJ RUVV IJ ZXJPW, UDQVLJPLSJ, ISB I | ||
| + | QITIBU GY OFU VLWIV ZPVPOPI. UEUS VUJJ PJ NSLRS LK FLR OFU LWWIJPLS RIJ KPTJO | ||
| + | WLZZUZLTIOUB GY OFU HUSUTIV QXGVPW, IVOFLXHF TUWLTBJ PSBPWIOU OFIO PS OFU | ||
| + | QTLOUJOISO JOTLSHFLVB LK BLTWFUJOUT I JUTZLS RIJ TUIB, OFU WFXTWF GUVVJ TXSH, | ||
| + | ISB GLSKPTUJ ISB KPTURLTNJ VPO. | ||
| + | </code> | ||
| + | </spoiler> | ||
| + | |||
| + | Your task is to write a //Python// script that will help you break the cipher and decode the original text: | ||
| + | * [[https://www.pythontutorial.net/python-basics/python-read-text-file/|read]] the ciphertext from a file specified as a command line argument | ||
| * use a __dictionary__ to map each encoded character back to its original value | * use a __dictionary__ to map each encoded character back to its original value | ||
| - | * populate this dictionary as you progress in your attempt | + | * __manually__ populate this dictionary as you progress in your attempt and reveal new characters |
| * whenever you run the script, it should print the text to the screen, with a few minor changes: | * whenever you run the script, it should print the text to the screen, with a few minor changes: | ||
| * any character that exists as a key in the dictionary should be replaced with what you think the correspondent is. | * any character that exists as a key in the dictionary should be replaced with what you think the correspondent is. | ||
| * any replaced character should be highlighted in __bold red__. | * any replaced character should be highlighted in __bold red__. | ||
| - | |||
| - | <spoiler> | ||
| - | <file text uncrackable.txt> | ||
| - | MIT YSAU OL OYGFSBDGRTKFEKBHMGCALSOQTMIOL. UTFTKAMTR ZB DAKQGX EIAOF GY MIT | ||
| - | COQOHTROA HAUT GF EASXOF AFR IGZZTL. ZT CTKT SGFU, MIT YSACL GF A 2005 HKTLTFM | ||
| - | MODTL MIAF LMADOFA GK A CTTQSB LWFRAB, RTETDZTK 21, 1989 1990, MIT RKTC TROMGKL | ||
| - | CAL WHKGGMTR TXTKB CGKSR EAF ZT YGWFR MIT EGFMOFWTR MG CGKQ AM A YAOMIYWS | ||
| - | KTHSOTL CITKT IGZZTL, LMBST AOD EASXOF, AMMAEQ ZGMI LORTL MG DAKQL, "CIAM RG | ||
| - | EGFMKGSSOFU AF AEMWAS ZGAKR ZGVTL OF MIT HKTHAKTFML FADT, OL ODHWSLOXT KADHAUTL | ||
| - | OF CIOEI ASCABL KTYTKTFETL MIT HALLCGKR, CIOEI DGFTB, AFR MITB IAR SOMMST YKGFM | ||
| - | BAKR IOL YKWLMKAMTR EGSGK WFOJWT AZOSOMB COMI AFR OFROLHTFLAMT YGK MTAEI GMITK | ||
| - | LMWROTL, AKT ACAKRL ZARUTL, HWZSOLITR ZTYGKT CTSS AL A YOKT UKGLL HSAFL CTKT | ||
| - | GKOUOFASSB EIAKAEMTKL OF MIT LMKOH MG CIOEI LTTD MG OM CITF MTDHTKTR OF AFR | ||
| - | IASSGCOFU MITB'KT LODHSB RKACOFU OF UOXTL GF" HKOFEOHAS LHOMMST ROLMGKM, | ||
| - | KTARTKL EGDOEL AKT WLT, CAMMTKLGF MGGQ MCG 16-DGFMIL AYMTK KTLOLMAQTL A DGKT | ||
| - | EKTAM RTAS MG EASXOF GYMTF IGZZTL MG ARDOML "LSODB, "ZWM OM'L FADTR A FOUIM GWM | ||
| - | LIT OL HGOFM GY FGM LTTF IGZZTL MIT ZGGQL AM MIAM O KTDAOFOFU ZGGQ IADLMTK IWTB | ||
| - | AKT AHHTAKAFET: RTETDZTK 6, 1995 DGD'L YKADTL GY EASXOF UOXTF A CAUGF, LGDTMODTL | ||
| - | MIAM LG OM'L YAMITKT'L YADOSB FG EAFETSSAMOGFLIOH CAL HKTLTFML YKGD FGXTDZTK 21, | ||
| - | 1985 SALM AHHTAK AZLTFET OF AFGMITKCOLT OM IAHHB MG KWF OM YGK MIOL RAR AL "A | ||
| - | SOMMST MG MGSTKAMT EASXOF'L YADOSB RKACF ASDGLM EGDDTFRTR WH ZTOFU HTGHST | ||
| - | OFLMAFET, UTM DAKKOTR ZB A RAFET EASXOF'L GWMSAFROLOFU MIT FTCLHAHTK GK MAZSGOR | ||
| - | FTCLHAHTK ZWLOFTLL LIGC OL GF!" AFR LHKOFML GY EIOSRKTF'L RAR'L YKWLMKAMTR ZB | ||
| - | MWKF IWDGK, CAL HWZSOE ROASGU MITKT'L FGM DWEI AL "'94 DGRTKFOLD" CAMMTKLGF IAL | ||
| - | RTSOUIML GY YAFMALB SOYT CAMMTKLGF LABL LTKXTL AL AF AKMOLML OL RTLMKWEMOGF | ||
| - | ZWLOFTLL, LHAETYAKTK GY MIT GHHGKMWFOMOTL BGW ZGMI A MGHOE YGK IOL IGDT | ||
| - | MGFUWT-OF-EITTQ HGHWSAK MIAM OM CAL "IGF" AFR JWAKMTK HAUT DGKT LHAEOGWL | ||
| - | EAFETSSAMOGF MIT HAOK AKT ESTAKSB OF HLBEIOE MKAFLDGUKOYOTK'L "NAH" LGWFR TYYTEM | ||
| - | BGW MIOFQTK CAMMTKLGF ASLG UKTC OFEKTROZST LHAET ZWBL OF EGDDGFSB CIOST GMITKCOLT | ||
| - | OM'L FADT OL FGMAZST LMGKBSOFT UAXT MIT GHHGKMWFOMOTL BGW EAFETSSAMOGF MIT "EASXOF | ||
| - | GYYTK MG DAQT IOD OFEGKKTEM AFLCTKL CAMMTK AKMCGKQ GMITK GYMTF CIOEI OL TXORTFM MG | ||
| - | GMITK LMKOH OL MG MITOK WLT GY KWSTL MIAM LIGCF GF LAFROYTK, CIG WLTL A EKGCJWOSS | ||
| - | ZT LTTF "USWTR" MG MIT GFSB HTKL AFR IOL YAMITK LWHHGKM OL SWFEISOFT UAXT MITLT | ||
| - | MIOF A BTAK OF DWSMODAMTKOAS AFR GZMAOF GF LAFMALB, IOL WLT, CAMMTKL ROASGUWT OL | ||
| - | AF "AKMOLM'L LMAMWL AL "A ROD XOTC OF MIT TLLTFMOASSB MG DAQT IOD LTTD MG OFESWRTR | ||
| - | MIAM EASXOF OL AF GRR ROASGUWT DGLM GY MIT ESWZ IAL TVHKTLLOGF GWMLORT AXAOSAZST MG | ||
| - | </file> | ||
| - | </spoiler> | ||
| <note tip> | <note tip> | ||
| - | You can either [[https://www.pythontutorial.net/python-basics/python-read-text-file/|read from a file]] or hardcode the text in your script. The choice is yours. | ||
| - | |||
| - | ---- | ||
| Remember ANSI codes? | Remember ANSI codes? | ||
| <code bash> | <code bash> | ||
| - | $ echo "\033[1;34m I'm blue... aba di... aba die... aba di aba die...\033[0m" | + | $ echo -e "\033[1;34m I'm blue, da ba dee, dabba daa-ee, dabba dee-a dabba da \033[0m" |
| </code> | </code> | ||
| ---- | ---- | ||
| - | In breaking a short substitution cipher like this while also knowing the original language, you need to look at bigrams and trigrams. Small group of letters that have a limited amount of possible values that make sense: //"to"//, //"and"//, //"the"//, etc. | + | In breaking a short substitution cipher like this while also knowing the original language, you need to look at bigrams and trigrams: small groups of letters that have a limited number of possible values that make sense, such as //"to"//, //"and"//, //"the"//. As you reveal more and more of the original text, words will begin to form, making everything progressively easier. |
| If you need an extra hint: | If you need an extra hint: | ||
| - | <spolier> | + | <spoiler> |
| - | //"RTETDZTK 6, 1995"// looks like a date. Hmm... | + | //"5 SLEUZGUT"// looks like a date. Hmm... //**"SLEUZGUT"**//... |
| </spoiler> | </spoiler> | ||
| </note> | </note> | ||
| + | |||
| + | <note> | ||
| + | Already done? Try [[https://ctflearn.com/challenge/238|this challenge]] as well. | ||
| + | |||
| + | ---- | ||
| + | |||
| + | Alternatively, if you want to keep learning Python, take a look at [[https://ocw.cs.pub.ro/courses/ep/labs/05|these labs]] for an introduction to //NumPy// (a fundamental module for numeric computation) and //matplotlib// (a module for plotting). | ||
| + | </note> | ||
| + | |||
| + | <solution -hidden> | ||
| + | <file python cracker.py> | ||
| + | #!/usr/bin/python3 | ||
| + | |||
| + | import sys | ||
| + | |||
| + | inv = { | ||
| + | # 'A' : '', | ||
| + | 'B' : 'D', | ||
| + | # 'C' : '', | ||
| + | 'D' : 'X', | ||
| + | 'E' : 'V', | ||
| + | 'F' : 'H', | ||
| + | 'G' : 'B', | ||
| + | 'H' : 'G', | ||
| + | 'I' : 'A', | ||
| + | 'J' : 'S', | ||
| + | 'K' : 'F', | ||
| + | 'L' : 'O', | ||
| + | 'M' : 'J', | ||
| + | 'N' : 'K', | ||
| + | 'O' : 'T', | ||
| + | 'P' : 'I', | ||
| + | 'Q' : 'P', | ||
| + | 'R' : 'W', | ||
| + | 'S' : 'N', | ||
| + | 'T' : 'R', | ||
| + | 'U' : 'E', | ||
| + | 'V' : 'L', | ||
| + | 'W' : 'C', | ||
| + | 'X' : 'U', | ||
| + | 'Y' : 'Y', | ||
| + | 'Z' : 'M', | ||
| + | } | ||
| + | |||
| + | def main(): | ||
| + |     # cli arguments check; stop here if the input file is missing | ||
| + |     if len(sys.argv) != 2: | ||
| + |         print("Usage: ./cracker.py <input_file>") | ||
| + |         sys.exit(1) | ||
| + | |||
| + |     # read contents of file | ||
| + |     with open(sys.argv[1]) as f: | ||
| + |         enc = f.read() | ||
| + | |||
| + |     # make sure text is uppercase | ||
| + |     enc = enc.upper() | ||
| + | |||
| + |     # create (partially) translated text; known letters become bold red | ||
| + |     dec = [ it if it not in inv else '\033[1;31m%s\033[0m' % inv[it] | ||
| + |             for it in enc ] | ||
| + |     dec = ''.join(dec) | ||
| + | |||
| + |     # print translated text | ||
| + |     print(dec) | ||
| + | |||
| + | # program entry point | ||
| + | if __name__ == '__main__': | ||
| + |     main() | ||
| + | </file> | ||
| + | </solution> | ||
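
The tip above suggests starting from short words such as //"to"//, //"and"//, //"the"//. As a rough illustration only, the sketch below counts the most frequent short words in a ciphertext so you can see which ones are worth guessing first. The helper name `short_words` and the `sample` string (one line copied from the ciphertext above) are assumptions made for this example; they are not part of the lab's solution script.

```python
import re
from collections import Counter


def short_words(text, n, limit=5):
    """Return the `limit` most common words of exactly `n` letters.

    Hypothetical helper for illustration. Words are runs of A-Z, so an
    apostrophe splits a word in two (e.g. "NPSH'J" -> "NPSH", "J").
    """
    words = re.findall(r"[A-Z]+", text.upper())
    counts = Counter(w for w in words if len(w) == n)
    return counts.most_common(limit)


# One line of the ciphertext from this task, used as sample input.
sample = ("I WIOFLVPW FUIB LK JOIOU. PS OFU PZZUBPIOU IKOUTZIOF "
          "LK OFU 5 SLEUZGUT ITTUJO")

if __name__ == "__main__":
    print(short_words(sample, 2))  # candidate two-letter words
    print(short_words(sample, 3))  # candidate three-letter words
```

Running it over the whole file instead of a single line should give more reliable counts; a three-letter word that dominates the list is a reasonable first candidate for //"the"//, but treat that as a guess to verify with the dictionary, not a certainty.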