RubyConf 2017: That time I used Ruby to crack my Reddit password by Haseeb Qureshi


(upbeat music) – I think it’s probably
about time to get started. So, how’s everybody doing? (audience murmurs) Yeah, good morning. Alright, awesome. Uh, so I am here to talk to you guys about the time I used Ruby
to crack my Reddit password, kind of. So my name is Haseeb Qureshi,
I’m a software engineer at Earn.com and I am going
to tell you guys a story. So, I used to be addicted
to useless websites. I still am, but I used to be too, and I’m sure you guys
know what this is like. Here’s just an artist’s depiction of what a useless website might look like. We all have our own poisons you know, you probably have something like this. Also I should note that I have
no self control to speak of. So that’s pretty bad,
so this is actually a. This is kind of a nightmare
but this is actually, if you guys remember your English class from like high school or something when you read The Odyssey, this is a famous story about when Odysseus tried to maintain some self control by tying himself to his own mast. My problems are not quite this severe, but they’re kind of
similar in their own way. And so, you know, keying off the story of the Odyssey I decided to use that ancient psychological technique that Odysseus used
which is locking himself out of his online accounts. What he did was a little bit different, but basically I feel like
it’s kind of the same, right? Basically now I guess. So let me tell you exactly what I did to try to spend less time
wasted on useless websites. So on most of these websites
you have some kind of password system and
obviously you can change your password. So what I did was I typed in
just some random gibberish. So I just grabbed my keyboard
and started smacking away. Came up with something sufficiently weird and entered that in as my new password. And of course I don’t
remember what this string is, it’s just a bunch of random characters. I hold onto it though and I also change my account recovery e-mail. So now I have a new password
that I have no idea what it is, also I’ve changed my
account recovery e-mail to a throwaway e-mail
that I’ve just created that I also threw away the password to. So there’s no way to get my account back save for this password, okay. This password is the key to my kingdom. Alright, so, what I do, now my plan is to prevent
myself from having access to this password until some
later date in the future. So for now I want to go into crunch mode, I want to study, I want to practice, I want to do whatever it is I gotta do that you know these time wasting websites are keeping me away from. So I need some way to
receive this password at a later date. Now unfortunately, normally
the way you do this is through this mechanism
known as friends. Now, I figure there’s probably some way to automate this, right? You don’t need friends. So I go and use the Google
to try to figure this out and I come across this
great, wonderful website called “Letter Me Later”. And this kind of sounds nice, like, okay, it allows you to send e-mails
at a future date and time you choose, no friends required. This is pretty perfect for me. So you know it’s a little 1995 looking, but that’s okay, you know. Maybe they’re just really
focusing on what they do. So what I do is I go ahead
and compose a new e-mail to myself; I create an account. And what I do is I fill in the subject line, I’m gonna call it password,
because it’s my password. Set a date in the future where I’m gonna e-mail it to myself. Put in my password in there, and I set it onto hide mode. And with hide mode what
that allows me to do is not actually click on it when I log in to Letter Me Later. And so when it’s hidden, I have actually no way to see it until it gets sent to me. Right, so it’s gotta be foolproof. I know that I’m just an awful human being, and so the only way that I
will not access my password is if there’s literally
no way I can get to it. So for a while, I
actually used this system to keep myself from wasting time on highly addictive useless websites. So, my talk is actually
not about, you know, productivity techniques, this
is a programming conference after all, so why am I
telling you all this. So actually the story is a
little bit more involved. So I used this for a while
it was pretty effective, but later on it ended up
coming back to bite me. So cut to two years later. So two years later I was working at Airbnb and I had a job, I was gainfully employed, surprise surprise. I know, I don’t really believe it either, but that’s fine. And so you know what that means, I’m working at Airbnb, they have a huge Rails app, lot of tests. And of course that means a lot of waiting. And waiting means it’s now time to start wasting company time. So uh what can I do to, you know, obviously I can’t do work
while the test suite is running that would be silly. So what I do instead is I want to go and get back into some online
time wasting activities. So I want to go log
back in to this website. So I go back to letter me later, I remember that I locked my password away, go back to letter me later,
retrieve my password. And it’s been a while, it’s
been like a couple years. So I kicked the habit a good bit. But now it’s time to go
back and get another fix. So when I logged back in I realize that it’s still grayed out. That’s kind of weird, because you know I think usually I’d send it to myself like a month later at a time, you know, give myself a period of time to just focus and then a little bit of time to get back to the candy. And I realized oh shit,
I scheduled it for 2018, the last time that I put this in. I guess I didn’t remember that I’d gotten so annoyed at myself that
I actually set some date like super far in the future, just be like screw you Haseeb, like you need to get a grip on yourself. And I was like, I can’t wait that long. This test suite isn’t that slow. So, I got to thinking, maybe there’s some way around this. So right here is Letter Me Later, here is the actual real website. And so I can log into my account. You can see here this thing is hidden, there is no way for me to see it, it’s scheduled for June. Other than that I’m screwed. Um so as I was clicking around on this, I realized, I was about ready to give up, but of course you gotta
try a few things first. I realized that there’s a search bar here. Okay, well what can I search for. So maybe I can search for my name and see if my name is in there. And it’s not, so it’s
not indexing my name. Alright fine, now what
if I search for “a”. Okay so that popped up. Right so maybe, might be looking
for subject lines, right? So I can ascertain that pretty quickly by typing password I can see, okay yeah it’s definitely, it’s
indexing the subject line, that makes sense. But remember the body of my e-mail was just the actual password. So if I search for a letter
that’s not in password, let’s look for e. E is not in there, okay. What about one? One is not in there, what about two? Two is in there. Okay, so what’s going on here. (audience laughs) What’s going on here. Alright, hold on, hold on. Let’s think for a second. So I think what’s going on here, correct me guys if I’m wrong, but I have a way to do substring
queries into my password. I have an oracle. And my oracle will basically
give me this query, right. It will tell me if the
body plus the subject. The subject was password, the
body was the actual password. So if any of that can
kind of knit together includes any string that I ask it, okay. So when I realize this, I ran home. I was off work at this
point, so I didn’t just run home from work. But I ran home, busted out
a piece of paper and a pen and was like okay, let me
see if I can figure this out. How can I retrieve my password? So here’s the algorithm. Alright, let’s think of this like, think about it Wheel
of Fortune style, okay. So I have this big long thing and I have this subject at the
top and the body down here. And I don’t know any of
the characters in the body but I do know the
characters in the subject. You can imagine that I
have like a word bank. And the word bank is all the letters. Except I’ve already, you
can imagine I’ve already tried the letters in password, because if I do a substring query for “p” and I find that it returns true, I don’t know if it’s true
because it was in the body or if it was because
it was in the subject. The subject would
automatically give a hit. But if you think about it,
if I try all the letters that are not in password,
then I know for certain that I’ve hit a letter
that is only in the body and not in the subject. Okay, so, if I just keep trying letters that are not in the string password, eventually I will make a hit. Once I make a hit, I know I’m in. I have one of the characters
somewhere in my password. On average this will take, actually it won’t take A/2
guesses it’ll take less but you can imagine it being somewhere in the middle of the
alphabet I’ll try letters until I get a hit. Then what I do is I try
to append another letter and do a longer search, because I know that you know that letter plus the letter after it
will be a valid substring. And so I just keep iterating
through every single letter including the letters in password, ’cause now any letter
can be appended to this to create a substring. And I just keep going down, one by one, until I find the next character. And on average I’ll find
the character somewhere around the middle of the alphabet. And so then I just keep repeating that. And every single time it’s
gonna take A/2 guesses where A is the size of the alphabet until I finally have a letter where no other letter works. And if no other letter works, then I know I’ve fallen off the end. That’s the end of this part of the string. Okay, so, what that means I have is not the entire string, it means I have a suffix
to the string, right? But I don’t know where in
the string I started with. So what I can do after that, is I can repeat the process
going backwards, right? Instead of appending to
the end of the string, I prepend to the beginning of the string. And I just keep going
until again I fall off going in the other direction. And then I know, okay, cool, that should be my entire string. So, if I do this, now if you think of it sort of like on this
illustration, it’s not exactly correct, because on the ends it
would take A guesses rather than A/2 guesses, ’cause I have to know, I have to exhaust the entire search space and know, okay, there’s no other letter that extends this string any longer. So, 2A guesses for the ends, A/2 times N minus two for
everything in the middle, because there are N minus two characters if you don’t include the ends. Where A is the alphabet
and N is the length. So if you assume that A, if
the alphabet is a through z and zero through nine all lower case because that’s how you slam on keyboards. Also the password length is 22, then basically that means
to do this entire thing will take about 432 queries. That’s actually doable. That’s like a reasonable number of things that you can just, you can
do entirely through API calls. Um, so let’s do it. How’s that sound? Yeah, okay, alright, let’s do this. Okay so here’s what I’m gonna do. So I’m going to create a lettermenow.rb and let’s go ahead and open lettermenow. That is, that’s the wrong folder. Oh it’s because I’m in the desktop. Livecoding and let’s touch
lettermenow.rb, there we go. Okay so, so first things first. I gotta figure out how
I’m going to actually do this querying, right. I need some kind of access to this oracle. So I’m gonna go ahead
and build this first. So you can see here that
the way this is set up it’s like lettermelater.com/account.php and then a query string and
it’s like qe equals query. So if I change this query to ab, then I can see qe become ab. Okay, so obviously this thing has no API, so I’m gonna have to
scrape this thing directly, that’s cool I can do that. So let’s go ahead and start this. So I’m gonna create an API class. And here I’m gonna have a URL that I used for the API class and I’m gonna remove this part so I can put the query
string in programmatically. So what I’m gonna use is, I’m
gonna use the Faraday gem, which is just a nice simple
gem for making HTTP requests. And I’m gonna have a def self.get method and it’s gonna take in a query. And what I’m gonna do is I’m gonna say Faraday.get the URL
and the second argument to Faraday is going to
be the query string. And the query string here
is just going to be qe, should be the query, okay. So what I’m gonna do right
now is I’m just gonna run this and kind of see if I can make this work. So let’s open up pry, I’m
gonna paste this code in and I’m gonna say api.get and let’s say I’m gonna search for the string password. Okay so that didn’t work and the reason that it
didn’t work of course is because it’s giving me a 302 redirect and it’s saying you must be signed in to see this page, so obviously I don’t
have any of my cookies. So in order for this to work, I’m gonna need to make sure I patch those in in the headers. So that should be easy enough. Just go to inspect, go to network, grab any, okay, let’s go
ahead and refresh here. And we can go ahead and look, whoops, we can go ahead and look and see okay we’ve got this cookie, I can just go ahead and grab it. I will make sure to sign out when this talk is done. That way this cookie is invalidated, but let’s go ahead and do this. So cookie equals this. Alright great, it’s
always good when they’re storing the user id client side. Um, but oh well, you know. Any port in a storm. So first argument is the query string second argument is the headers so I need to provide cookie as cookie and now if I am not mistaken, this should do the trick. So if I do api.get, hello. Boom alright this looks
like the actual webpage. So this is a 200 and you can see, yeah. It looks good, I don’t see
my name anywhere in there but I’m sure it’s somewhere in there. Okay great, so that’s fine. Now of course what I’m getting is I’m getting all the html of this webpage. That’s kind of not really what I want. I want to know, I want to somehow know did this query return true or false. So the way I can figure that out is the easiest way is to just look. Okay in the HTML that comes back, is there some unique
string that I can find that will uniquely identify that yes in fact this returned true. So you can see there are a
few things that showed up when like they’re scheduled. I don’t think it shows up
anywhere else on the page, there’s also password,
that doesn’t show up anywhere else on the page. So let’s just use password for simplicity. And we’ll say instead of having the self. Self.include query and
this will just return get query.include cause I gotta make sure this return a .body and if the body of the
htp request includes the string password, then I’m good to go. And so let’s go ahead and
paste this in one more time. Api.include now a, turns true and ab which turns false. Cool so I’ve got my oracle. So now as I’m testing this I’m not gonna want to use the real oracle ’cause that’s gonna be really slow I’m making a bunch of http requests. I don’t want to do that
to myself or to them. So I’m gonna go ahead and
create just a stubbedapi that I can use while I’m testing and I can sub that out
later with a real api. So this one is gonna have a fake password, which is just gonna be
some random characters. Okay great. And then we’re gonna
have a def self.include same interface, it’s just gonna be fake password.include the query. Okay and so now I use the stubbed API, so it’s gonna be much faster. Right I don’t need to make an
HTTP request while I’m testing. Okay, so I need to build that algorithm. So let’s go ahead and do that. So we’ll just have a password cracker. And the password cracker
is going to be stateful. It’s going to take in the API, so we’ll inject that dependency just to make things a little bit easier. So we’ll have the API equal that, and we’ll also set the password to start off as an empty string and
we’ll successively find the password through different stages of this algorithm.
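A rough sketch of the setup described so far: a stub oracle with the same interface as the real one, injected into the cracker. This is not the code typed on stage. The class names, the `include?` spelling, and the fake password value are assumptions made for illustration, and the iteration counter is the one introduced in the next step.

```ruby
# Stand-in for the real search endpoint, so the algorithm can be exercised
# without HTTP requests. FAKE_PASSWORD is an arbitrary made-up value.
class StubbedAPI
  FAKE_PASSWORD = "qj3k9x0aa7"

  # Same interface as the real oracle: does the hidden text contain the query?
  def self.include?(query)
    FAKE_PASSWORD.include?(query)
  end
end

class PasswordCracker
  attr_reader :password, :iterations

  # The oracle is injected, so the stub can later be swapped for the real API.
  def initialize(api)
    @api = api
    @password = ""   # grows as the algorithm discovers characters
    @iterations = 0  # number of oracle queries made so far
  end
end
```

Because the cracker only ever calls `include?` on whatever it is given, anything that answers that one question can stand in for the website.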
And I also want to set the number of iterations so
I can keep count of that. Initialize that to zero. Okay, so, what is this actually gonna do? So we’re gonna have, let’s
call it a crack method. And in the crack method we’re gonna sort of go through each of the steps of that algorithm. So if you remember, the
first thing we talked about was getting the first letter. And I need to like do some logic to figure out what the first letter is. So we’ll just do that, we’ll
just say find first letter. Okay great, actually
let’s put a bang on it. So find first letter. Okay so once I found the first letter, then I can keep building forward appending different characters
and seeing if it works. So we’ll just call that build forward. Okay, once I’ve built all the way forward I’ve fallen off the end of the string, it’s time to go backwards, right? So then we’ll do build backward. And then at the very end I’ll
just return the password. So that’s roughly the code that I want. Obviously it doesn’t do anything yet, but let’s fix that. So find first letter, so in
order to find the first letter I need to know the
alphabet I’m working with. I also need to know the
subject line, right? Because remember I’m not gonna try any of the characters in the subject line because they don’t necessarily tell me the characters in the password. So let’s just make those constants so say subject line equals password.chars. Those are all the characters
in the subject line if you guys remember. And the alphabet, for the alphabet I’m just gonna use the letters a through z all lowercase because I’m gonna assume everything was lowercase
if there was uppercase this would fail and I would
try uppercase as well. And I’m gonna also do zero through nine. Zero through nine. And I need to splat that. And actually what I’m gonna do as well is I’m gonna shuffle all this just so that in case, just to kind of make
the numbers more round in the distribution in
case you know the letters are all on one side or something. Um, okay, cool. So that’s my alphabet,
I’ve got my subject line. Now to find the first letter, I want to iterate through
each of the characters that are not in the subject line, but still in my alphabet. Okay, so, should be easy enough. So alphabet minus subject
line .each do character. And for each character
I’m going to look it up in the oracle, so I’ll
say @api.include char. So that’s to check if this
is the first character. I’ll say if that’s true, then what do I want to do? If that’s true then I wanna say @password equals that character so
I have my first character and then I can return, I go and jump out of here. And if that completely fails
and I don’t find anything, I probably want to raise and say “could not find a first letter”. Okay, so this should be
able to find a first letter. Now, alright cool, now
one thing I want to do is I want to keep count of
how many iterations I’m doing throughout this whole algorithm. So make that a little
bit easier for myself, I’m also just gonna
have the include method that both does all the api calls but also logs the iterations. Whoops, okay. I’ll say @iterations plus equals one. Great now instead of include
api I’ll just do include there. So that should return the same thing but also just handle
all the logging for me. Cool, so, alright so that should handle getting the first letter. Now what if I want to build forward. So to build forward it’s
pretty similar logic, right. So now I’m not worried about
the subject line anymore. I’m just gonna iterate
throughout the entire alphabet. So I’ll say alphabet each do character and for each character in the alphabet, what I want to do is I want to try out the current password plus one more character on the end. Alright so I’ll say query
equals @password plus char. Just appending it to
the end of the currently known good password. So once I have that query, I can say if include the query then, then I know okay great the
password should be the query now. The query is correct. So I’ll say @password equals
query, tack that char on, and the easiest way that
I like to just do this is to do it recursively. It’s actually quite
elegant to do recursively. So what you do is you
just build forward again. And once you’re done with
all the building forwards and eventually build
forward finally terminates, then you just jump out
of all the stack frames and just return all the way back up. So because we don’t expect
the password to be very long, this is totally fine. And this is actually just really nice. This is pretty much all you need to implement building forward. Now, to build backward, because remember we have to, whoops, build backward. To build backward we
basically do the same thing except, so we can just
copy this code right here. But instead of the password plus the char, we prepend the char to the password. So we just flip this around
we say char plus @password. And instead of build forward
we build backward, okay. And I think this should
basically do the trick. So provided that we’re, we
didn’t mess anything up, which is quite likely, but let’s go and see. So we can puts PasswordCracker.new, pass in the stubbed API so
we can see what’s going on and call .crack. Okay, let’s just cross some fingers. Alright, letter me now,
let’s see what happens. Oh, that, was that it? Yeah, that’s it okay, cool. So that seemed to work
with the stubbed api. That was a little bit. (audience applauds) A little bit underwhelming. Okay, so, hold on, hold your applause hold your applause. Um so that’s pretty good, but we like to have some
more logging, right? That was a little bit anti-climactic having that thing just
plop out on the screen. So let’s get a little more context going. So we’ll puts here cracking
password beep boop. Okay cool, so I like my
programs to talk to me and sound like computers. Alright fine for first letters we’ll puts found first letter and we’ll put the @password. Alright okay so then we’ll say, puts building forward okay
so it’s building forward. We’ll puts building
backward, builds backwards and then at the end we’ll say, Congratulations your password was found in @iterations iterations. Okay and then basically as I build forward and build backward I want to always mention once I say okay great your password is elongated, let’s just print out the password. So we’ll just puts @password
every time that happens. So cool, now one thing before we, before we actually run this. You know we’re all engineers, we notice there’s some code duplication. Let’s refactor this thing, let’s dry it up. Alright so there’s a nice way to do that. Instead of build forward
and build backward, we can just have one method called build. And build uh the thing goes on this side. Build will just take an
argument that’s like forward okay we’ll have a keyword argument. And we’ll just say, if forward then we’ll append the char
to the end of the password otherwise we’ll say
it’s char plus @password will prepend if it’s not
forward, if it’s going backwards. Alright so pretty straightforward. And then basically instead
of doing this build forward we just build and we pass forward whatever was forward originally. And now we can just
delete the two of these, uh whoops, yeah we can
just delete the old one and now instead of build forward this just becomes, just becomes build. Build forward true and this
becomes build forward plus. So now we should be able to try this out with the real api. So let’s see what happens. Okay fingers crossed. Hope the internet here is good. Otherwise we’ll be in trouble. Alright here goes nothing. Cracking password, beep boop. Alright it’s talking to us, found first letter, first letter is j. Here we go coming at j2. Alright this is exciting. (audience member cheers) Excited, yeah, alright! (labored breathing) (audience laughs) This is completely out of
my control at this point. It’s just all up to internet connection. Okay alright, it’s going look at that. We got a 2492, there’s a
little bit of symmetry there, that’s kind of cool. Um, okay we got a g. Things are moving. Um, I don’t remember
actually what my password is, so I’m just as clueless as you guys are. Alright cool, it’s still moving forward, it’s still going. Two nine, okay. I think it’s like 20 something characters, so it’s not, we’re not gonna
be here all day I promise. We’re gonna get out on time. Okay, okay, good, I’m, I get worried when it
sticks there too long, I’m like shit did one of the checks fail? Alright another j nice, wow. Alright, still going yeah. How are you guys doing by the way? (audience laughs) Doing alright? Yeah, we’ve got a minute here to kind of chat and check
in with one another. Alright oh nice, off the end r, p, o. Okay, oh no, no please don’t. Oh, okay, okay. So it fell off the end, so now it’s going to build backwards. It’s figured out that’s
the complete suffix. And it’s building backwards now. Okay it’s got the I, things
are moving, this is good. That’s a good sign. This is hundreds of
API calls made serially, none of this is parallelized, so that’s, these poor guys are
running their servers. Yeah, I kind of expect
nobody’s at the wheel right now but that’s fine. You know maybe in a few days
when they check the logs they’ll see what the hell is going on. (audience laughs) Okay, a,s,d,o, okay. Oh and that’s it, 403 iterations. (audience applauds) Oh my god, we did it, yeah, yeah, woo. Oh that’s so, feels so good. Oh my goodness. Yeah, okay, so that was
me cracking my password. Now note I just wanted to
quickly go back to the math. So we decided that it would
take 432 expected queries just doing the math straightforwardly and our actual number was 403, so we were within 10%
of the actual answer. So holy shit math actually works. (audience laughs) That’s pretty cool. You know, sometimes it doesn’t so, that’s the thing about math. Um alright so takeaways. This is mostly just a funny story, but I felt like there was something here. So the cool thing at least to me, this is actually the first problem that I really actually had
other than unemployment that I solved through programming. And it’s actually kind of a magic moment in a programmers life when there’s like something that is like broken or lost or just you can’t do that’s
not you know your job related or launching a web app or whatever. I think we kind of get
acclimated to that sort of thing. But the idea that, I
feel like I lost my keys in a desert or something like that. In a way that was totally unretrievable and with the power of programming I can fix it, I can solve it. I can do this amazing thing that before I was able to program I was never able to do. And um it’s just kind of
awesome to have that power. And I think as programmers it’s easy to get complacent or get
pissed off at our tools or our abilities, but there’s something really awesome and magical about that that I think you know. It takes an experience like that to really drill that
in and drive that home. And I think that’s just pretty crazy that we can do shit like that. So that’s it for me, just
wanted to share that story with you all, I’m Haseeb Qureshi, I work at Earn.com, which
is a blockchain company, you should check it out if
you haven’t heard of it. You can find me on Twitter @hosseeb you can read my blog
where I write about this and other stuff at haseebq.com and thank you so much for listening guys. (audience applauds)
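
For readers who want the whole thing in one place, here is a minimal sketch of the algorithm walked through in the talk: find one character that is not in the subject line, extend forward until nothing fits, then extend backward. It is a reconstruction under assumptions, not the code from the stage. `StubbedOracle`, the method names, and the use of a loop instead of recursion are choices made here, and the stub assumes the search matches the subject and the body separately.

```ruby
# Hypothetical stand-in for the search box: true when the query is a
# substring of the subject line or of the hidden e-mail body.
class StubbedOracle
  SUBJECT = "password"

  def initialize(secret)
    @secret = secret
  end

  def include?(query)
    SUBJECT.include?(query) || @secret.include?(query)
  end
end

class PasswordCracker
  SUBJECT_CHARS = "password".chars
  # Lowercase letters and digits, shuffled as in the talk.
  ALPHABET = (("a".."z").to_a + ("0".."9").to_a).shuffle

  attr_reader :iterations

  def initialize(oracle)
    @oracle = oracle
    @password = ""
    @iterations = 0
  end

  def crack!
    find_first_letter!
    build!(forward: true)   # extend to the right until we fall off the end
    build!(forward: false)  # then extend to the left until we fall off the start
    @password
  end

  private

  # Wraps the oracle so that every query is counted.
  def include?(query)
    @iterations += 1
    @oracle.include?(query)
  end

  # Only characters outside the subject line prove a hit came from the body.
  def find_first_letter!
    (ALPHABET - SUBJECT_CHARS).each do |char|
      next unless include?(char)
      @password = char
      return
    end
    raise "could not find a first letter"
  end

  # Keeps adding one character on the chosen side until no candidate matches.
  def build!(forward:)
    loop do
      extended = ALPHABET
        .map { |char| forward ? @password + char : char + @password }
        .find { |query| include?(query) }
      break if extended.nil?
      @password = extended
    end
  end
end
```

Running `PasswordCracker.new(StubbedOracle.new("some made-up secret")).crack!` against a lowercase alphanumeric secret that contains at least one character outside "password" should return that secret; the query count varies from run to run because the alphabet is shuffled.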

12 thoughts on “RubyConf 2017: That time I used Ruby to crack my Reddit password by Haseeb Qureshi”

  1. Quick correction: my slide analyzing the number of queries was close, but incorrect.

    The total number of queries is more correctly 2A + A/2 * N; I erroneously claimed that str[0] and str[-1] should take A queries. It’s actually AFTER we hit those characters that we get an extra factor of A when we fall off the end of the string. But that’s after hitting every character in roughly A/2 guesses.

    If you ignore the first character (which should be hit very quickly for any reasonably large password), then a tighter estimate is 2A + A / 2 * (N – 1). That leads to a final estimate of 450 queries, rather than 432. 🙂
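
As a quick check of the arithmetic, the two estimates can be computed directly. This is only a sketch: the helper name `expected_queries` is made up here, and the values A = 36 and N = 22 are the ones assumed in the talk.

```ruby
# Expected number of oracle queries: 2A for the two exhaustive searches
# (per the talk and the comment above), plus roughly A/2 guesses for each
# of the remaining characters.
def expected_queries(alphabet_size, half_cost_chars)
  2 * alphabet_size + (alphabet_size / 2) * half_cost_chars
end

A = 36  # a-z plus 0-9
N = 22  # password length assumed in the talk

puts expected_queries(A, N - 2)  # slide version: 432
puts expected_queries(A, N - 1)  # corrected version: 450
```

The live run took 403 queries, a bit under both estimates.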
