CS50 for Lawyers 2019 – Programming Languages


[MUSIC PLAYING] DAVID MALAN: There’s any number of
languages in which we can communicate, English being just one of them. But computers, of course, only
understand binary zeros and ones. And so somehow we have to communicate
our thoughts– ultimately in binary– in order to solve some
problem with a computer. But certainly, we don’t want to program
computers by writing zeros and ones and memorizing the patterns
that they’ll understand, so we do somehow need to
adopt a process by which we can express our thoughts and
the solutions to the problems that we have in mind but in such a way
that the computer can understand them. Now it turns out it’s
quite meaningful when you see something like
Intel inside or AMD or any number of other
computer manufacturers that make what are called CPUs,
Central Processing Units, which you can think of as the brains
of sorts inside of your computer. Well, turns out that Intel
and AMD and other companies have decided in advance
what patterns of bits– zeros and ones– that
their CPUs understand. A certain pattern of zeros and
ones might represent addition. Another pattern of zeros and ones might
represent subtraction or multiplication or division or the act of moving
information around in memory or saving information from memory. And so those patterns are very
much computer or CPU specific. And frankly, I’d like to be
able to write software and write code that can run on your
computer and my computer and some other manufacturer’s
computer without really having to know all about those zeros and ones. So there too it would be nice if
there’s a process, a workflow, a tool chain via which we can communicate our
thoughts in a fairly accessible way but have them ultimately translated
into those zeros and ones. So what is it that Intel
inside really means? What is it that a CPU
actually understands? Well, it’s what’s called machine code,
zeros and ones that ultimately dictate what the computer should do– add,
subtract, multiply, or something else altogether. That machine code literally
might look something like this. In fact, let me give you just a moment
and from all of these zeros and ones, can you glean perhaps what this very
program would do if run on a computer? No? Well, odds are you couldn’t imagine
what this would do because even I can’t read these zeros and ones. But if it turns out you fed these
very zeros and ones to a computer, it would print out on
the screen quite simply “hello world,” which is sort
of the canonical phrase, one of the first phrases ever printed
on a computer screen back in the day when computer languages
were first being invented. Now you, of course, would never
know that, should never know that, nor should even the most
sophisticated of programmers because this is just too low of a
level to communicate one’s thoughts in. Far more compelling would be to
operate at a level closer to English or whatever your spoken
language might be. And so quickly, when humans
invented computers decades ago, did we decide we need something
different from machine code. We need something at a
higher level of abstraction, if you will, something
that’s more familiar to us but that’s close enough that the
computer can somehow figure out what to do. And so thus was born assembly code. Assembly code is an
example more generally of what’s called source
code, which is typically English-like syntax more familiar to us
humans that can somehow be translated eventually down to machine code. And assembly code, which
is one of the earliest incarnations of this general idea,
looked a little something like this. Now it too looks pretty
cryptic, though hopefully not quite as cryptic as just seemingly
random patterns of zeros and ones because there is some
organization to this text here. It’s not quite English-like
I would say, but there’s some familiar sequences of
characters that I can perhaps ascribe some meaning to. I see words that look a little
familiar– pushq or at least push, movq, or perhaps move. Sub– maybe that means subtract
or call, as in to call a function or a procedure, xor and
add and pop and others– these seem to be reminiscent
of English words. And so while cryptic– and I would surely need a manual in
order to figure out what these mean– this is a standardization of how
you might communicate instructions to a computer. Indeed, all of those keywords pushq,
movq, subq, and so forth are literally called
instructions, and those are the names given to the instructions,
the commands that Intel and AMD have decided that their brains,
their CPUs shall understand. To the right of these instructions
is some cryptic looking syntax now– dollar signs and percent
signs, commas, and others. Well, those are used to communicate
what are called registers. It turns out that the
smallest unit of useful memory typically inside of a computer– in particular inside of a CPU– is what’s called a register. A register might be eight
bits back in the day, 32 bits more modernly, or even 64 bits. And that is really the
smallest piece of information that you can do some operation on,
the smallest unit of information that you can add to something else or
subtract from something else and so forth. So when a CPU is doing its arithmetic–
addition, subtraction, multiplication, division, and so forth– it’s operating on pretty small values. They might be big numbers, but they
only take up maybe 32 or 64 bits. And those registers, those
chunks of memory have names. The names to be fair are
cryptic, but they’re expressed in the same language, assembly code. So that you’re telling the computer
in this language what should you move to where and what
should you add to what. And so once you acquire a taste,
if you will, for this language, does all of this begin
to make more sense. And frankly, if I really scour
it, aha, down here at the bottom, I do see explicit mention
of that phrase hello world. And these other lines simply call in
to action the printing of that phrase on the screen. But frankly, this doesn’t look
all that compelling still. It’s certainly better
than zeros and ones, but assembly code is
generally considered to be fairly low level, not as low
level as zeros and ones and not as low level as electricity from the wall. But it’s still low level
enough that it’s not really that pleasant to program in. Now back in the day, decades ago,
this was all you had at your disposal. And so surely, this was
better than nothing else. And in fact, some of the earliest
games and some of the earliest software were written in assembly language. So it truly was experts back
in the day writing frequently in this low-level language, and
you might still use it today for the smallest of details. But on top of assembly language
have evolved more modern forms of source code– newer languages
with easier to understand syntax and more and more features. And one of the first successors
to something like assembly code was a language called C, quite simply. C looks like this. Now I dare say this too
remains fairly cryptic, but I feel like we’re
walking up a set of stairs here where things are finally
starting to look a little more familiar and a little more comfortable,
even though there might still be some distractions of syntax. These angled braces
and these curly braces and quotes and parentheses
and a semicolon– all of that you might get to in an
actual course on programming itself. But here we use this as demonstrative
of a fairly more English-like syntax with which you can
express the same program. Printing hello world
in a language called C can be implemented
with precisely this code. But frankly, we’re starting to stray
pretty far from that low level language that computers ultimately understand and
need to accept as their input binary. So how do we get from this higher
level language, so to speak, called C down to those zeros and ones? Well, frankly, the process
by which this happens tends to make an intermediate stop in
what’s called that assembly language. So a human might write code
like this, like I did here in C. You might then use a program that
converts the C code to assembly code and then another program that
converts that assembly code down to those zeros and ones. And frankly, you could
probably use a tool that does both of those
steps at once so that it creates the illusion of going directly
from this to so-called machine code, those zeros and ones. So let’s take a moment and actually
do that on my computer here. This is a process that
you can do on a Mac or PC running Mac OS, Windows, Linux, or
any number of operating systems. I happen to be doing it on a Mac here. And I’m going to use a fairly common
program these days called a text editor. This is a very lightweight
version of a word processor. It doesn’t have bold facing
and underline and italics. It really just allows
you to type text, but it does tend to colorize it for
you to draw your attention to the disparate parts of a program. And I’m also going to open up
what’s called a terminal window. A terminal window is a keyboard-only
interface to what your computer can do. So while I still have
access to my mouse, it’s not going to be all that
useful for this environment because anytime I want to run a program
or execute a command or make my Mac do something, I’m going to have
to do it from my keyboard alone. Let’s take a look. Now here I am in front
of my text editor, and I’m going to go ahead and
create a file called, say, hello.c, dot c being a conventional file
extension to indicate to the computer that the code I am
about to write is going to be implemented in that
language called C. Now at the top of this program is
where I’m going to write my code, and I’m simply going to
transcribe what we just saw. Include standard IO dot h int
main void open curly brace followed by a closing
curly brace ultimately and then printf quote unquote hello
comma world backslash n, finally, a semicolon. So herein I’ve written my source code. And indeed, it’s been
colorized by the text editor simply to draw my attention to
disparate parts of this program and were we to dive deeper
into C, in particular, we’d see what each of these
different symbols mean. But for now, let me
propose that we only care about what this program
is meant to do, which is to print ultimately hello world. But all I’ve done is
write a program in C. I somehow have to get it to
its form of zeros and ones that the computer
ultimately understands. So it’s not sufficient just to
save the file because, indeed, all that’s been saved are these
letters and symbols in some file called hello.c. I somehow have to convert
that file to zeros and ones. Well, it turns out that there
exists tools called compilers. A compiler is simply
a piece of software– written presumably by someone else– that knows how to understand C,
perhaps knows about assembly code, but definitely knows about
those patterns of zeros and ones that Intel and AMD ultimately
require that I output in order to get them to execute commands. So how do I go about compiling hello.c? Well, it turns out installed
on my computer– and perhaps even yours– is a program
called CC, the C compiler, if you will– compile simply referring
to this process of translating one language to another. And so I’m going to go down here
to the lower portion of my screen wherein I have a prompt,
dollar sign that for whatever historical reason simply represents
a prompt in a terminal window. And it’s in here that I can
type these textual commands, cc -o hello hello.c– a fairly cryptic
incantation of commands. But ultimately, this
has created a new file on my computer called quite simply
hello with no file extension. Now on a typical Mac or PC,
when you want to run a program, you would typically just double click
its icon, and it would be loaded up. And you would see its ultimate behavior. But here in this command
line interface, so to speak, wherein I can only type
commands textually, I have to tell the computer to run
this program only via my keyboard. And the convention via
which you can do this is quite simply to say dot slash hello
where dot refers to the current folder or directory in which this file is. The slash just separates from
its name, and hello, of course, is the name of the program I’ve written. And here we go. With the stroke of enter,
I now see hello world. And thus was born my
very first program in C. Of course, it took me quite
a while to get to this point, and I didn’t necessarily even
understand all of those lines of code along the way. But I did understand that my goal
at hand and the problem to be solved was to print quite simply hello. But there’s a bit of
overhead, of course, to a language like C wherein there’s
not only the syntactic complexity of it, but that frankly gets much
more familiar over time. There’s also this additional
step, this middleman, a compiler that has to exist and
somehow translate your source code to machine code. In fact, what we’ve effectively
done just now is this. If up here is my so-called
source code stored in any file, for instance, hello.c, and I
want to convert it ultimately to so-called machine code, the zeros
and ones that my computer understands, I somehow have to get
from input to output. And the middleman here
is again this tool called a compiler in the context of my
having written this program in C. I used a program, a compiler, called CC,
but any number of other options exist. You might have heard of Visual Studio
perhaps or Eclipse or yet others still. This middleman simply takes
as input that source code in C and produces as its output that
machine code that the computer expects. And so when I type that
command cc -o hello hello.c, it ultimately was telling my
computer take as input hello.c, produce as output a new
file called hello, inside of which are those zeros and ones. Now not all languages operate like this. It turns out that more modern
languages skip that step of compilation altogether or at least hide
that detail from the user so that he or she doesn’t necessarily
need to know how to compile their code. It’s handled more automatically. Now some languages instead use
not a compiler but an interpreter. Whereas a compiler takes as input one
language like C and produces as output another language like
machine code, an interpreter instead takes as input some
source code and then runs it or interprets it line by line
top to bottom, left to right. And whenever it sees an instruction
like print, it thinks to itself, oh, I know how to print
something on the screen, and it goes and does it on
behalf of that source code. It does not, strictly speaking,
convert those instructions instead to zeros and ones. It is instead the
interpreter itself which is just a program that itself is
implemented with zeros and ones that the CPU understands. And those zeros and
ones collectively know how to recognize keywords or
functions in that source code language it takes as input in order to
execute it on the program’s behalf. So what is an example of
an interpreted language? Well, among the most popular
ones today is that called Python. Python is especially popular in
the world of data science and web programming and in
command-line applications, ones written at the so-called terminal
window via which I can solve problems. And so Python is notable too for its
relative simplicity– certainly vis-à-vis something like C. In fact, in order to implement
the equivalent program in Python that I just implemented in C,
it suffices to write just this. Say what you mean and little more. There’s less syntax here. There’s fewer keywords
that are unfamiliar. It’s just instead the verb or function
print followed by hello world– no semicolon, no curly braces,
fewer symbols altogether. But how do I go about
running a program in Python? Well, it turns out that typically
Python is interpreted, not compiled. So I’m not going to run it
through a compiler per se, but I am instead going to interpret
it line by line– albeit just one line with this particular program. So let me go back into my text
editor and terminal window and this time create a
file called hello.py– dot py being the conventional file extension
for any program written in Python. And in hello.py, I am now
going to write that one line program print open parenthesis
quote unquote hello world. How do I interpret this
file called hello.py? Well, it turns out I run
a program that itself is called Python, which is my interpreter. And so I run in my terminal
window Python space hello.py, and the output is now the same. So what has just happened? Albeit just one line, what that
program called Python has done is open up this file called
hello.py, read it top to bottom, left to right, albeit
quite quickly, recognized that it knows this keyword or function
called print and therefore knew what to do next. It went ahead and printed hello world on
the screen and then automatically quit. So this seems like a nice thing. No longer do I have to remember and
take the time to compile my code, but surely, there must be some price. And indeed, one of the implications
of saving that step no longer having to compile your code but
instead just jumping right to its execution or interpretation
is that you pay potentially a performance penalty. You certainly can’t quite see it in
a program as short as hello world. But if you were to write a
program with hundreds or thousands or millions of lines,
the overhead required to read that file top to
bottom, left to right, and to figure out based
on the instructions therein what it is the
programmer intended actually does take non-zero amount of time. And it can surely add up for the most
computationally complex of problems– anything involving an analysis,
anything involving loops or cycles, you can certainly begin
to feel its effects. But that’s OK because we humans
have been fairly creative over time. And as we’ve invented more and
more programming languages, we have fortunately also
invented more and more solutions to problems like these. And so it turns out that even though
this all happened quite quickly, when I ran this interpreter called Python,
odds are if my computer were smart, it was probably doing me a
favor underneath the hood without my even knowing. And in fact, what Python
and some other interpreters do is actually compile
your code for you, save the results in a temporary file
that you yourself might not even see, and the next time I run this program,
especially if it’s large and complex, Python will skip this step of
reinterpreting the file again and again and instead look at that precompiled
version of my same program– therefore, saving some time
but achieving the same results. Only if I go back and change my
code and make changes to my program does Python need to regenerate that
cached version of code, if you will, in order to reuse that again and again. And this intermediately cached
code is generally called byte code. It’s not quite zeros and ones, but
it’s closer to it than Python itself. And so for this same program
in Python, were my computer to actually compile it for me, what I
would actually see is code like this. Much like assembly code is it
fairly cryptic, but at least in there is some familiar
phrase hello world as well as the function I ultimately called,
which is that known as print. Now Python and C are not the
only languages out there. In fact, there are dozens
in vogue these days. And there are hundreds–
if not thousands– that humans have created over time. For instance, depicted here
is perhaps one with which you yourself are familiar
or at least heard of. It’s called Java, and it happens to
be an object-oriented programming language,
which means it has features beyond those earliest of
ones like C. Here, though, is perhaps the simplest way via which
you can implement that same program hello world, but it–
not unlike C– has a bit of overhead, a number of symbols
and words that at first glance certainly are not obvious. But ultimately, that’s
all this program does. But Java is distinct in that it took a
different approach to another problem that we’ve not yet tripped over. I’ve been running and running
this program thus far in my Mac, and I compiled it particularly
for this Mac on an Intel CPU. But it certainly stands to
reason that you or someone else might not have the same computer
or operating system as I, and it would seem to be quite
the burden on the programmer if they have to compile their code
in a different way for you and for me and for everyone else. And so it turns out that this cost
of doing business, if you will, shipping different shrink wrapped boxes
back in the day of the same program for different computers
and OSes is ultimately solved by way of a virtual machine. A virtual machine, as
the name implies, is not a physical machine but a virtual one
implemented, as they say in software– software that humans have
written that mimics the behavior of a virtual imaginary machine. And then companies like Sun and
others have implemented support for that virtual machine
for Macs and for PCs and for multiple operating systems. And so Java subscribes to the
mantra of write once, run anywhere. You needn’t compile it again and
again for different platforms, rather you can install on each
platform its own virtual machine and run the exact same code. So it’s simply a different approach
to an otherwise omnipresent problem, but Java falls into a class of languages
that uses that particular technique. And what other languages are out there? Well, very popular these days on both
servers and clients is a language called JavaScript, such
as that pictured here. Here is yet another
language– this one called Ruby– that replaces the
word print with just puts, but it too is syntactically
simpler, much like Python itself is. On the other hand, here is
C++, incredibly common still, especially on PCs, along
with other languages as well. But in C++, you see code reminiscent of
C itself, also conventionally compiled. Now if you’d like to see any number of
overwhelming examples of how you might quite simply in hundreds of
languages say hello world, take a look at this URL here. And in fact, there’s
so many other languages that are popular and powerful,
sometimes in different ways. In fact, it’s not just the case
that you use one particular language for one particular job. There are many tools that
you might bring to bear on the exact same problem at hand. In fact, the reason that
so many languages exist is that over time, we humans have
perhaps rightly or arrogantly decided that we can do
better than languages past. And so humans invent new languages
that have other features, different approaches, and of course,
reasonable people can disagree. And so you have some languages that
can achieve the very same task. They just do it differently. The text looks a little different. The features are a little bit different. And so it’s up to the programmer
as part of the design process to decide what tool is best for
the job, not unlike a physical tool that you might have
in a home tool chest. Among the most popular these days
perhaps are these here: Bash and C and C++ and C#, Clojure and Erlang and
F# and Go, Haskell, Java, JavaScript, Objective-C, OCaml, and then PHP, Python,
R, Ruby, Scala, Scheme, SQL, Swift, and so many more. In fact, if you’d like to see
a nearly exhaustive list of all of the languages humans have
invented, take a look at Wikipedia here, which goes into
so much more detail. So suffice it to say that
computers can print hello world, but they can do so much more as well. So what are the basic building blocks
that can be found in languages like C, like Python, Java, C++,
and any number of others? Well, let’s return to the pseudocode
with which we began to program, albeit verbally, some time ago. Here we had the algorithm via which
to find Mike Smith in a phone book and to search more generally. Recall, though, that laced
throughout this pseudocode or program really were a few constructs that were
semantically distinct from each other. Let me go ahead and highlight in yellow
some of the verbs that we saw last. Pick up and open to and look at,
call, open and open and quit– all of these were calls to
actions, verbs or functions, as we described them previously. Well, it turns out that in C and in
Python and other languages as well, we have not necessarily these
same words but these same actions. For instance, we’ve
seen in two languages already that you can print
hello world on the screen. That function or verb in C
happens to be called printf. In Python, it happens to be
called more simply print. And so those are examples of
functions in those languages as well. What else did we see last time? Well, we had if and else
if and else if and else, and we describe these as
conditions or branches via which you can make decisions
and either go left or right down the fork in the road. Well, in Python in C
and in other languages too do we have those
same features as well. We then looked at Boolean
expressions, questions that you can ask to which there are
yes and no or true and false answers. In Python and C, we’re
going to see those as well. And lastly, we saw loops– go back
to step two, go back to step two– a programming construct via which
you can do something again and again. Now here we did this in order
to keep looking for Mike. But in the real world, might you be
searching for data in a large database or an Excel file, a CSV
file, Comma Separated Values. And so you might want to have
some programming code that opens that file and then
iterates or loops over one line after another, doing some
calculation or analysis. Loops enable us to do exactly that. So let’s go ahead now
and see if we can see these constructs in a common
modern programming language. We’ll use Python if only because it’s
incredibly commonly used at the command line within a terminal window. It’s incredibly commonly used
nowadays for web programming. And it’s ultimately incredibly useful
in the context of data analysis and data science, more generally. So let’s begin where we left off
recreating a program in Python that quite simply says hello world. To do that in my text
editor here, I’ve created a file now called hello0.py to
suggest that we’re going to do this a few times, starting at zero, in order
to iteratively improve on this program. Well, herein I’m going to do print quote
unquote hello world, saving my file. No need to compile it but
I’m now going to go ahead and type Python space hello0.py. And there we have hello world. Now this program, of course, is by
definition not terribly dynamic. No matter how many times
I run this program, it’s going to print hello world,
hello world again and again. Suppose that I instead
wanted to get user input. Well, it turns out that Python– like other languages– has a function
via which you can do exactly that. And a function, of course,
takes optionally inputs and optionally produces some output. Now in Python, perhaps the easiest
function with which to get a user’s input is called quite simply input. So let’s see how we might use that. Let me go ahead and create
a new file called hello1.py. And in this file am I first going
to call that function called input, and input is called exactly that. But it takes as input some input, which
is to say the string or the phrase, the sentence via which you want to
prompt the human for his or her input. So for instance, I’ll
go ahead here and ask what is your name question mark space. And now this function is going to
return to me a value, so to speak. I don’t know how input is implemented. Someone else yesteryear
implemented this for me. But I do know per its
documentation that when I call it, it’s going to do the equivalent
of handing me back a slip of paper on which is written the user’s
input from his or her keyboard. But for me to make use of this
input, I need to store it somewhere. And in a programming
language, much like in math, you have access to what
are called variables. Now a mathematician might call
their variables x or y or z. And in programming could
we also call our variables the same, but more
conventional is to use names that are a little more
descriptive– actual words that describe their contents. And the means
by which I can store the return value, so to speak, what is handed back from a function
like input, is in a variable that I’ll call, perhaps aptly, name. And in order to store the return value
of input into a variable called name, it suffices simply to
use a single equal sign. And this equal sign is not to
be confused with the equal sign that you and I know in math. After all, in math,
using an equal sign would imply that what is on the left
equals what is on the right. And ultimately, that’s our goal. But initially, we only know
the value on the right– the so-called return value from input
that’s handed back to me, so to speak, from the function. So in fact, what the equal sign in
many programming languages is called– Python among them– is the
so-called assignment operator. And so you can consider what’s
happening here as being this. On the right-hand side is a function. It’s handing back some
value, the user’s name. The equal sign implies
copy that value from right to left into whatever is on the left. And so if on the left I
have a variable called name, that variable– much
like x or y or z– is going to store the
user’s input ultimately, so I can use it subsequently
in some other line of code. Now what might that
subsequent line of code be? Well, suppose I’d like to print not
just hello world but hello, David, or hello, someone else. Well, now I can simply structure
the second line as taking as input the contents of that variable. So I might go ahead here and say
print quote unquote hello comma, and I now somehow need to
append to that shorter phrase the name that has
actually been typed in. And there’s a number
of ways we can do this. For instance, we might do it as follows. It turns out in Python
that you can use plus not only to add two numbers together
but also, in some sense, to add two words or two strings
as they’re called in a programming language. And if I do plus in between
hello and that variable name, the effect is going to be to join or to
concatenate those two strings of text together so that what
ultimately is printed is hopefully hello comma
David or hello comma yourself. So let me go ahead and save this file
and in my terminal window now run Python of hello1.py. I’ll
input my name David. And now notice my input is separated
by one space from that question mark because on line one did
I preemptively include that space between those double quotes. Now I’m going to go ahead
and hit enter, the effect of which is to have input
return that value, storing it in the variable called name, and
then in line two, just go ahead and print hello comma followed by name. Now just to be clear that this
is not hardcoded somewhere else. Let me go ahead and
run it one more time. Python hello1.py. And let’s go ahead and type
your name and there too do we see hello your name. So the program is now dynamic,
and I’ve outputted dynamically whatever it is the human has typed in. But it turns out there’s any number
of other ways we can do this. And so just so that we’ve
seen a few different ways, let me go ahead and create
a new file called hello2.py wherein we have almost the same
code, but we’re displaying your name or mine a little bit differently. As before, I’ll go ahead and define
my variable called name, assign to it the return value of input asking
the user for his or her name. And then on my second line
of code, we’ll again go ahead and say print hello comma. And then following that,
I actually have a choice. I don’t necessarily
have to just concatenate one input onto that first string hello. I can actually combine these as
two separate inputs to print. It turns out that print,
like many functions, can take either zero or one or
two or any number of inputs. You simply, as the programmer,
need to separate them by commas– not the comma
that’s inside of the quotes, which is just my English grammar
separating hello from your name, but the one outside of the quotes. And so my text editor is displaying
it a little bit disparately in white. So in this case, am I
using the same function print passing in not one but two
arguments or inputs, the second of which is that actual name. Now there’s a minor bug here
that I’m not yet anticipating, but let’s see what happens next. If I go ahead now and run Python
hello2.py, type in my name, voila it’s so close and almost right. But here perhaps is my first bug. It’s not necessarily a bug that’s
breaking the program altogether. But to be a little bit nit-picky, I
think we could do a little bit better, say, grammatically. There seem
to be two spaces instead of one, but where are they coming from? Well, it turns out that
functions, when they take inputs, can also have default behavior. And in this case, print
is designed by its authors whenever you pass in
two or more arguments to just separate them by
default with one space. Now I can fix this mistake
in a couple of ways. But the simplest way is just
to remove the space from my own input, saving the file again and
rerunning Python of hello2.py gives me now what is your name? David. And we’re back– hello, David. Now it turns out there are yet more
ways that you can format information on the screen. And just so that we’ve
seen one other way, allow me to create a third
file called hello3.py. In hello3.py, am I going to start almost
exactly the same declaring a variable called name, assigning to it
the return value of input, asking the user for their name. And then in my second line
of code am I again going to use print starting with a
parenthesis and a close parenthesis. In there am I ultimately going
to say hello comma something, and the something this time is literally
going to be name but not quite name on its own. Indeed, if I were to
run this program now, what do you think might be printed? I’ve said to the program print
hello comma name, but unfortunately, name is as written, N-A-M-E. And so I
would literally see that on the screen no matter what I typed in. But it turns out in Python, if you
surround your variable’s name with curly braces, typically found
just above the Enter key– at least on a US keyboard– you can tell
Python that this N-A-M-E is actually the name of a variable, not just
a string of text on its own. But in order to tell Python that it
should treat this string, this input to print a little bit
differently than usual, you have to fairly cryptically
prefix it with a single F, thereby telling Python
that this is what’s called a format string or f-string– that is, a string of text,
a sequence of characters, that should be treated as
formatted in a special way. And according to Python’s
own documentation, if you format your input using
these curly braces inside of which is the name of a
variable, Python will go through the trouble of plugging
in the value of that variable there and therefore
formatting it for you. And so I’ll again go ahead and run
Python of hello3.py, input my name, and voila, we’re back to Hello, David. So what are the takeaways then
from the simplest of programs? Well, we clearly have the ability
to print information on the screen. But we have at our disposal
a function, a piece of code that someone wrote long before me
that takes as input perhaps just a string like hello world. But if you pass it multiple strings
will it handle those as well. And if you pass it a special type of
string will it handle that as well too. And so depending on the
documentation for some language do you have these and so many
more features available to you. And so one of the first steps in
learning any programming language is not to take a formal
lesson or class but simply, quite honestly to read
the documentation. And once you have under your
belt some knowledge of one or two or a few programming languages, it is
much easier in the computer world to pick up new ones than I daresay
it is in the human world where you have a much larger vocabulary at hand. All right, enough about printing. What more can we do? Well, programming
languages can typically handle any number of arithmetic
or mathematical operations– addition, subtraction, and
many others among them. So why don’t we go ahead and create
a program that quite simply prompts the user not just for
their name or string but rather two inputs, two numbers. And we’ll call them
more familiarly x and y. I’m going to go ahead and
declare a variable called x, assign to it from right
to left the result of asking for the user’s input of x. And then I’m going to go ahead
and do precisely the same thing on another line of code
defining a variable called y and assigning to it from right to
left the return value of another call or invocation of input– this time, prompting the user for y. And then I quite simply am going to
go ahead and print out the result. So print x plus y– this time not using any
double quotes at all because what I literally
want to print is x plus y, not the concatenation of two
strings, not a string at all, but rather just a number. In my terminal window now
will I go ahead and run Python on the name of this file, arithmetic.py. I’m prompted for x, which I’ll say one. I’m prompted for y, so I’ll say two. And indeed, 1 plus 2 is 12,
but no, 1 plus 2 is not 12. So what’s actually happened here? Well, this is the first real bug, if you
will, that I’ve introduced to my code. Now step one here has x equals input,
prompting the user for x, and step two has y equals input
prompting the user for y– little different from before except that
we’ve chosen different variable names. Now on line four– and I’ve separated with
this blank space just to make it a bit easier to
read– do I have print x plus y. But 1 plus 2 is surely not 12. But what appears to be happening? Well, it’s no coincidence
that the answer Python’s giving me is my first input
followed by my second. It would seem, in fact,
that Python is concatenating my first input to my second,
that is, joining them just like hello comma David. So why is that happening? Well, I would hope that
plus when given two numbers would, in fact, add those two together. But here’s a catch. It turns out that underneath the
hood, Python, like many languages, actually has what are
called types or data types– that is to say, different ways
of representing information if that information is
a number like an integer or even a real number
with a decimal point or a string of text that is a
sequence of characters or maybe more generally, a value like true or false,
a so-called Boolean value, and even others still. And in this case, indeed, it turns out–
and you would only know this by trial and error or by having read
the documentation first– does the input function
built into Python return to you not a number, but a string. Even though what I typed on my
keyboard looks like a number and surely is in practice, it’s actually
not being stored as such by Python. It’s being stored instead in such a
way that the computer is interpreting it as a string of text. So we have to be ever more explicit. And indeed, computers don’t
necessarily know what I intend. Maybe the goal at hand was to
write a program that concatenates one number against another. And so if I really want the user’s
input to be treated as numbers, I somehow have to coerce
it to such or convert it or more technically, to cast
it from string to an int. Now it turns out Python has
other functions with which I can fix this mistake, and I might
do this in between these lines here. X should not just be
whatever the user typed in. I want x to be the result
of converting whatever the human typed in into the integer
or int version of that string. Meanwhile, can I fix y by
the same y equals int of y, thereby telling Python even though,
yes, the human typed something in a keyboard, therefore
implying a string, go ahead and convert
that one and that two to an actual integer underneath the
hood, a pattern of bits, if you will, that represents not the ASCII
value or the Unicode value that the user typed in but
the underlying pattern of bits that represents one and represents two. Let’s go ahead now and rerun this
file as Python arithmetic.py, inputting again 1, inputting again
2, and lastly, hitting Enter. This time, we have three. Unfortunately, we’ve paid a bit of a
price, albeit just a matter of style. I seem to have increased the length
of this program from just three lines to five just to fix a simple mistake. But it turns out that just as functions
take input and produce outputs, so can you pipeline them, so to speak,
or nest them so that one function’s output is another function’s input. The result of which is that we
can kind of tighten this up. I can instead on line 2
here delete what I have and on line one alone
simply say that once I get back the return value of input– the user’s input, if you will, returned
on a conceptual sheet of paper– go ahead and pass it immediately
to that function called int and surround the whole
thing with parentheses, thereby passing the output of
input into the input of int and then assigning the
result from right to left. And again, here can I get rid of line
3, passing the output of this input to the input of this int
and again surrounding it with parentheses in order to
pass its output into its input, ultimately assigning from
right to left that value. Let’s to be sure go
ahead and save this file. And then in my terminal window, run
one final time Python of arithmetic.py, inputting 1, inputting
2, and still do we get 3. Suppose, though, that we not
only want to take a user’s input but make a decision based
on it, taking out for a spin this notion of a condition
with some Boolean expressions. How might we do that? Well, let me go ahead and create
here a file called conditions.py, inside of which I’m again going
to ask the user for two inputs, call it x and y, and then I’d like to
determine whether x is less than y, greater than y, or exactly the same. Well, as before, I can declare my
variable called x, assign to it– preemptively this time– the result of passing to int
the return value of input asking the user for just x and
then define another variable called y, assigning
to it the return value of int, which is passed, in
turn, the return value of input, asking the user for y. And now with these two
numbers in hand, am I going to proceed to do the following. If x is less than y followed by
a colon, indented below that, I’m going to say quite
simply, well, you know what? X is less than y, simply
to inform the user as much. And then, back aligned with the if, am I going to say else if x
is greater than y with a colon and then indented below that,
print x is greater than y. Now those are the
semantics that I intend, but it turns out you have
to read the fine print. In Python, it is not
else if that you can use. Humans some years ago decided that
slightly more succinct than else if would be literally elif. And that, in fact, is the correct syntax
to use when you have a second fork here in the road. But indeed when we have now a
third, we might do this else if– there I go again– elif x equals y, let me go
ahead and say print x equals y. But there’s a bug here already because
equal does not mean what you think. Indeed, before we’ve been using the
equal sign as the assignment operator, so to speak, copying from
right to left some value. And in fact, here on line
8, there’s already a bug. Using a single equal
sign between my x and y here would have the effect not
of comparing those two values but instead copying from
right to left that value in y into that value for x, thereby
making them equal no matter what. And so it turns out that
we humans painted ourselves into a bit of a corner some years ago. We’ve already used equal to
assign one thing to another. So it turns out that
humans in many languages decided well, let’s still use
equal but two of them back to back. And so if you want to use not the
assignment operator but the equality operator, do you want to use two
of these things back to back. And so now I have a program that asks
quite simply if x less than y, print as much. Elif x greater than
y, print as much then. Elif x equals y, print
the same, x equals y. But we don’t necessarily need this
third condition, it turns out. Logically, if you’ve got
two numbers, two integers, I’m pretty sure, by definition,
one will either be greater than the other, less than the other,
or exactly the same. And so we can save a
tiny bit of time here by not even asking that third question. If I know that x is not less than y and
I know that x is not greater than y, I might as well just infer logically
that they must be actually equal. And so here we have the
first of my Python programs that much like my pseudocode
for finding Mike Smith allows me to take users’ input
and then compare it in order to take a different fork
in the road based on its value. So my Boolean expressions here are
x less than y and x greater than y and that’s it. And my conditions here or the syntax
with which I induced these branches are my if, my elif and my else. But important in Python– unlike in pseudocode–
is some of this syntax. The colon very specifically says
do the following if this is true, and the indentation,
the multiple spaces– here I typed four– is ever so important as well. Whereas many programming
languages are a bit loose when it comes to
whitespace, so to speak, how many times you hit
the spacebar or tab, Python enforces that you have the
same amount of indentation everywhere. And so if I want lines four and six and
eight here to all line up logically, they must literally do so in the file. Similarly, must lines five
and seven and nine line up just right so that
they are only executed if the lines just
above them are true. Let me go ahead and save this file. And in my terminal window,
run Python of conditions.py, typing in shall we say 1 for x, 2 for y. And indeed, x is less than y. Let’s run it again, this
time, flipping things around. X is 2. Y is 1. And indeed, x is greater than y. And one third time, which should
be inferred if x is 1 and y is 1, then, indeed, x equals y. Now programming is
not all about numbers. In fact, pictured here
in a program I wrote already is answer.py wherein we
have lines of code that, again, prompt the user for input but
this time, leave it as a string. And so on line four am I asking
the user for his or her answer to a yes or no question. Presumably the user might type little
y or little n or perhaps capital Y or capital N. And indeed, I’d like this program
to handle any number of those cases, as any program might. On line 7 here am I asking
then two possible questions. If c– the name of the variable
I gave to the users’ input– equals equals capital Y. Or to
be robust, that variable c equals equals lowercase y. Let me go ahead and conclude and
print that the user meant yes. Meanwhile, if c equals
equals n capitalized or if c equals equals n
in lowercase, similarly do I want to conclude
that the user meant no. And so here the new operative
word is quite literally or. In Python, it tends to
be fairly English-like, ever more so than C and
other languages where if you want to do something
or something else, you literally say quite simply or. If I wanted both situations to
be true, albeit illogically, could I use the actual word and. But of course, the user’s input
can’t simultaneously be capital Y and lowercase y. And so here is using or apt. Now when programming,
you don’t have to use only those functions that are handed
to you by the particular language. You yourself can invent
functions of your own. For instance, let me go
ahead and in a file called return.py implement a program
that takes as input an integer or number from the user and then quite
simply prints out the square thereof. So if the human types
in 2, I’ll print out 4. If the human types in
3, I’ll print out 9. And so how might
we go about doing this? Well, perhaps in a
familiar way now, x shall equal the result of converting to an
integer whatever the user’s input is after prompting them for x. And then I’m going to go ahead quite
simply and print out, well, x times x. That is the square of x. I’ll go ahead and save this
and in my terminal window run Python on return.py and as proposed,
square 2 and as proposed, square 3. And indeed, this program works exactly
like this but to square a value has kind of a nice ring to it. And the fact that it happens
to be implemented as x times x is really just an mathematical
implementation detail– something that I shouldn’t really
have to worry about or remember. I would just like to
square the user’s input. So wouldn’t it be nice if there
were a function in Python– or any language for that matter– quite simply called square. Indeed, I can make that happen,
whether or not it exists, and simply define it myself. And so here I’m going to
go and do the following. Using Python’s keyword called def– short for define– I’m going to go ahead and
define a function called square. I’m going to specify to Python that
that new function shall take input that I’ll arbitrarily but
conventionally call n for number. And then with a colon as
before and some indentation too am I going to go ahead and
return quite simply n times n. In other words, the math is the same. The implementation details are the same. But what’s new here is
this new keyword return. Just like with the input
function built into Python, some human in that
function’s own implementation had a line of code that said return to
the user whatever they have typed in. Here am I doing the same but returning
not some users’ input but rather n times n. And so you can think of this function
called square much like that input function, jotting down on a
digital piece of paper that value, handing it back to the caller
or whoever uses this function, and letting them use it as they see fit. So now rather than use x times x myself
can I more conceptually clearly say square of the user’s input x. And the fact that x is not
the same as n is quite OK. It is this function square that
presumes to call its own input n. I can call my own input to
square whatever I want, say, x. So let’s go ahead now and run
this program and see what happens. Based on these definitions, it would
seem that I could square x in this way. If I go ahead and run this program
again, Python return.py, and type 2, I get more output than
I surely intended. In fact, this is the first
of my truly bad mistakes. Here do I see what’s called a
traceback, which is sort of a trace or a log of everything
the computer tried to do. And you’ll see some hints, however
arcane this output, that line two is where my mistake probably is. In particular, I have some
kind of name error in Python where the name square is not
defined, and yet it’s right here. Well, it turns out that
Python and a lot of languages take things fairly literally. And if, when
interpreting your file, they’re reading top to
bottom, left to right, unfortunately, it’s too late
to define square on line four if you yourself want
to use it on line two. But we could surely fix this logically, as by moving that code up top down
below, defining square at the top, writing my own logic below, now trusting
that Python will see square and only use it on line five. Let’s go ahead and save this file,
rerunning Python of return.py, again typing 2. And voila, now it works. But this isn’t quite the most
conventional way to solve this problem. As naive as Python is reading
your code top to bottom, it’s a bit of a regression
now, a bit of a mistake that I’m putting the
actual code that I care about at the bottom and
the actual code that I was trying to abstract away, if you
will, and give a name to, at the very top. At the end of the day,
the program I care about is these lines here, not
my implementation of square. And so I would actually prefer,
albeit a bit nit pickily, to put that code actually where it was. And so if you do that, thereby
keeping the main part of your program at the top, as is convention, you
still have to solve the same problem. So how might we do that? Well, the Pythonic way or the
conventional way in Python is to do this– to define a function that most
people call itself main, again, ending it with a colon, indenting
below that the lines you have written. And then at the bottom of the
file is the very last thing you do, telling Python to call
that function called main. Because here now is what Python will do. It will read this file in interpreting
it top to bottom, left to right, defining a function called main and then
here defining a function called square and then here calling one of
those two functions, which in turn, calls the second. But in this way have you taken care
to define all of your functions first and never calling any of them
until everything’s been defined. Now so that you’ve seen it too, it’s
not quite conventional to just run main at the bottom. Instead, you’ll typically see a
more magical incantation like this. If underscore underscore name underscore underscore
equals equals quote unquote underscore underscore main underscore underscore,
colon, indented below that will be your actual call to main. This, for more arcane reasons,
ensures that if you’re using a small program
as part of a bigger one, it won’t necessarily get
executed at the wrong time. But logically,
the key takeaways here are what you can actually do
by defining your own functions. Here too do we have an abstraction. What does it mean to square two values? Simply multiplying
one against the other. But it would be nice to just refer to
that as a verb unto itself like square. And so by defining this function do we
now abstract away that multiplication and just treat this as the
idea we actually care about. Now, of course, we’re not
saving all that much time, and my program’s even
bigger than need be. But it’s demonstrative of this principle
of abstracting away implementation details and building your more
interesting product on top of work that you or someone
else has already done. Now what if we want to do more
than just square a number? We instead want to prompt the user for
input, convert that input to a number, and then ensure that that number
is the type of number we want. Indeed, it’s not uncommon in a
program to ask the user for a positive integer– something useful– again and again until he
or she provides just that. For instance, the user might type 0
or negative or something else still, but you want to pester them
again and again until they provide exactly the input you expect. Much like in a website where
you’re forced to type a number or email address or
something else, similarly can we do that in Python in code. So let’s go ahead and do exactly that. And assume for the moment that there
exists already a function called, say, get positive int– a function whose purpose in
life is to get from the human an integer from one on up. Let me go ahead and
preemptively this time define my own main function with def. And inside of that code, go ahead and
declare a variable called i for integer and then just presume
for the moment to call a function called get positive int,
which itself will take a prompt as before asking the user,
say, for i. And what do I want to do with this number? Well, let’s keep it simple for
now and just print i itself. But I now need to implement that
function called get positive int. So for that, I can use def
and say def get positive int. But I need this function
to take itself a prompt. And I’m going to go
ahead and call it exactly that, which is to say
when I call this function, as I’ve done on line two
passing in some string of text that I want the user to see, well, in my
definition of get positive int on line five, I need to tell Python
to give that input a name, so I can refer to it ultimately. Because, indeed, what I’m going
to do after adding that colon and indenting underneath
is ultimately, we want to call input, passing
in precisely that prompt. After all, get positive
int is not going to presume to know in advance what the programmer
wants to prompt the user with. Instead, it’s going to get it just
in time via that input or argument. But I need to pester the user again
and again if they don’t actually give me a positive int. And so how might I do that? Well, just like in pseudocode, we
might define for ourselves loops– blocks of code that do something again. And again as we go to step two, as
before, so can I do that in Python in any number of ways, but perhaps
the simplest here is this– to simply say you know what, Python? Go ahead and give me an
infinite loop while true– while being my operative word here,
inducing a loop while something is true. Well, you know what’s true
always is the word true. And indeed, built into
Python are Boolean values– true and false literally– by
definition, capital T and capital F. So by saying while true colon,
I’m saying Python, please go ahead and do something again
and again until I tell you to stop. Well, what do you want
Python to do in this loop? I want to go ahead and declare
a variable called say n, assign to it the return value
of calling that int function, passing to it the output of input. And then and only then if the human has
obliged and given me a positive int, I’ll go ahead and say, well,
if n greater than zero, go ahead and break out of this loop. So a different approach than we saw
in pseudocode where I simply said go to and go to and go to again. Here I’ve instead said, Python,
do this forever until I say break. And only once n is greater than
zero, as per the user’s input, do I break out of this loop
entirely and therefore return when I’m ready that value called n. And so here on my last line of code
am I again using return, handing back, if you will, a sheet of paper
on which is that number. But I only reach this line 10
after I’ve said break on line 9, and so does this function get positive
int ultimately return exactly that. So as always, let me go ahead
now and save this file but only after adding that last cryptic line. If the name of this file is
implicitly underscore underscore main underscore underscore, then do I want to go
ahead and call main so that we avoid all of those issues
of code in the wrong order. I’ll go ahead and click Save
and do Python of positive.py, providing an input,
say, negative 1, being prompted again for a number
so negative 2– still not a positive int nor a zero. But if I finally type
a value like one do I actually see the one that I inputted. But it turns out Python
supports other types of loops as well, not just via this
keyword called while but actually via a preposition called for. For instance, suppose that I
want to implement a program that, not unlike a charting
program like Excel, prints for me some kind of bar chart. These bar charts will be
purely textual using, say, hash marks to represent values. But to do this, I’m going to
have to prompt the user for input and then print out precisely
that many hashes horizontally. Well, let’s see what I get. I’m going to go ahead and as
always prompt the user for input. We’ll call it say n. And that user’s input shall be
converted via int after asking them for that value of n. And then once I have that
value do I want to iterate that is loop some number of times– some number of times equal to
whatever the user’s input was. So if the user has inputted one,
I’ll print just one hash mark. If the user inputs 10, I want
to print 10 of those hashes. But how do I do this? A while true loop or forever loop
that infinitely loops is probably not the right approach here. But rather I want to iterate
some finite number of times. And so a for loop allows
us to do exactly that with built-in functionality as follows. Let me go ahead and say
Python, for i in the range of n, which is the user’s
input, go ahead per the colon and do the following next. Go ahead and print out a
single hash for each value. And so what is this line of code doing? Here on line 3 do I have
for i in the range of n. Well, it turns out that range
is a function built into Python that returns to you
effectively a range of values. By default, that range starts at 0 and
goes up to but not through the value you ask for. So if you pass to range the value
like 1, you will iterate only one time the range of 0
to but not through 1. If you instead input a value of 10,
you’ll iterate over a range of 0 through 9 up to but not through 10
and get precisely that many hashes. My goal, again, is to
print a bar chart of sorts with one hash representing
each of these values from left to right, a
horizontal bar chart. So let me go ahead and save this
file here and in my terminal window run Python of score.py. We’ll input a number like 10. And unfortunately, they
all seem to be vertical. And if I scrolled up higher in my
terminal window would I see all 10 but again, one on top of the other. So how do I somehow keep my cursor,
if you will, on the same line? Well, all this time,
I’ve been using print. I’ve been getting a new
line for free, so to speak. At the end of printing
anything has Python been moving my cursor, not
unlike an old school typewriter, to the bottom left of the next line. But sometimes I want my cursor
to stay on the same line, even as I do something again and again. And it turns out Python– and knowably know this by having
looked at the documentation, therefore is that the
print function can take a second input that is
not necessarily just some other string you want to print. But instead it’s a named parameter– that is, an input that has
a predetermined named– in this case called n– that you can set equal to a
specific value like nothing. It turns out, albeit non-obviously,
that, by default, Python ends each line with a carriage
return, if you will, or a new line, otherwise represented here technically
as a backslash n, which itself is technically distinct from
an old school carriage return but has the effect of moving that
cursor down to the next line. So this is implicit. It would be incredibly annoying
if any time you wrote Python code and wanted to print
something that you had to type out that sequence of symbols. And so you get those
for free, so to speak. But if you want to override
that default behavior, you need to instead tell Python’s
print function, you know what? End your lines with nothing at all,
quote unquote with nothing in between. But when I’m done
printing all of them, it would be nice to move my cursor to
the next line so that my next prompt– that dollar sign we keep
seeing in my terminal window– is at least on its own. So I’m going to go ahead and say
print open paren closed paren with nothing inside that because
if I get for free a blank line, I don’t need to pass anything to print. That is as the very last step just going
to move my cursor to the next line. So let me go ahead and save
this program now and again, in my terminal window, run Python
of score.py, typing in this time 10. And there do I get my
10 hashes horizontally. So it turns out Python has what are
called types, and the only time you really need to know or care
about this is when, frankly, it starts to bite you, like it did us. Indeed, when I asked
for the user’s input and expecting an integer
but the user typed exactly that but I didn’t convert
it in advance to an int, I got ultimately that weird behavior
of concatenating one string to another. So underneath the hood are there are
any number of types built into Python– a bool like true/false, integers like
numbers, strs or strings of text, and even floats, real numbers
that have a decimal point and some number of digits after. But beyond that are more sophisticated
data types or data structures still. Dict or dictionary, which is as
we’ll call it a hash table of sorts, list which can be any number
of values back to back, a range of values as we’ve
just seen, or a set wherein you have no duplicate values and a
tuple, not unlike x comma y or latitude comma longitude. But it turns out that an
appreciation of these types can help you avoid some very
serious mistakes in code because it turns out that depending on
how you store your data in a computer’s memory, you might actually get behavior
that you didn’t actually intend. For instance, let me quite simply
write a program that prints out the value of, oh, say, 1 divided by 10. I have here a file called, say,
imprecision.py in which I quite simply am going to do this– prompt the user for an input called
x, converting their input to an int, as always, asking for x, and then
defining another variable called y– this time, converting to an integer
the user’s input after prompting for y. And then quite simply, I’m
going to print x divided by y. Let me go ahead and save this
and in my terminal window run Python of imprecision.py,
hitting Enter here, typing in, say, 1 divided by 10. And so 0.1 is the answer,
just as you’d expect. Let’s go ahead and print out more
digits than 1 after that decimal point, just to make sure that 1/10 is indeed
0.1 with implicitly an infinite number of zeros to the right. Well, let me go ahead
just for simplicity’s sake and first store the value x divided
by y in a third variable z. And then in my print statement here,
let me print out exactly that z, but let me format it a bit
differently than usual. Using Python’s support
for an f-string or format string, using that prefix f, which
connotes give me a format string, am I going to print exactly
that quote unquote z. But recall that you need to surround
it with those so-called curly braces to make clear to Python that you want to
plug in its value and not literally z. Well, it turns out there’s additional
syntax we can use, albeit cryptic, via which to tell Python, yes, print
z but to this many decimal places. And the syntax for that is a
colon right after the variable, followed by a literal period,
and the number of decimal places that you’d like to print. And because this is a so-called
floating point value or real number, we need one additional
f. Now with this syntax should I be able to print that
same value but to a specific number of decimal places. Let’s see. In my terminal window, let me go ahead
and run Python of imprecision.py, again inputting 1, again
inputting 10, and whew, I indeed see point 1
followed by nine more zeros– a total of 10 digits. Well, let me get a little more curious
and instead print out, oh, shall we say 20 decimal places– again running my
program imprecision.py, inputting 1, followed by 10, and all
looks almost good until wow, 5, 5, 5. I’m a little curious now
as to what’s going on. Let me go ahead and run this one last
time after printing 30 decimal places. Here I’m going to go ahead and
run Python of imprecision.py, hoping that it’s not going
to get worse, and it does. It seems if you look far enough out,
you start to see some weirdness. In fact, let me go as far out as– I don’t know– 55 decimal
places, going ahead and running Python of imprecision.py,
inputting 1 and 10. Oh, my god, it does get worse. So it would seem that what all of us were taught
in grade school, that one divided by 10 indeed equals 1/10, is not quite true. If you look far enough beyond
the decimal point, eventually, things go horribly, horribly awry, and
that is because computers quite often, as powerful and sophisticated as
they are, can’t quite do everything and can’t quite do
everything that we humans do. Now why is this? It would seem that Python
is ever so slightly off when it comes to the
representation of this floating point or this real value. Now why is that? Well, inside of a computer is hardware
like this, RAM or Random Access Memory, which is a little chip of memory
inside of your computer wherein files and programs are stored when
they’re open or running. And inside of each of these black
chips is some number of bytes or bits that are ultimately used to represent
any of the values in your program. The catch here, though, is that this
device, like any physical device in the real world, has only a finite
amount of space or capacity, which is to say, no matter how big or
expensive this particular RAM is, it has a finite number of bytes– maybe one billion if it’s a gigabyte
or two billion if it’s two gigs. But it’s a finite number in total. And by default, what Python
and most languages do is they decide a priori
how many bits or bytes to use to represent any of
the values in your program. And so if your number
is so precise or so big that it can’t quite be represented
in only that many bits, the language, like Python, is
going to come as close as it can and represent that value
with some approximation. And that’s what we’re seeing. One divided by 10 is surely a
mathematically well-defined number. Indeed, it’s 1/10 or 0.1 and
mathematically should be 0.10000 ad nauseam infinitely. But in Python, if you’re only using,
say, 32 or 64 or any finite number of bits, you can’t possibly represent
the infinite number of numbers that exist in the
world and represent all of them perfectly, precisely. To do so, you would surely need
an infinite number of bits, and we don’t have that
in our physical world. And so you have to
suffer, unfortunately, this potential for floating point
imprecision where values you care about are going to be close
but not quite what you intend, unless you, the programmer
or designer of the system, are willing to spend more
than just 32 or 64 bits but more and more and
more and enough that you can get that decimal point and those
values as far off to the right, so to speak, as you can. But darn it, if there isn’t
another problem that derives from precisely the same constraint. Not only can you have imprecision when
it comes to floating point values, even integers are potentially
flawed, not necessarily in Python because in the latest version of Python
have they designed in the language the ability to use as many bits as
you need to represent integers– specifically, numbers like negative
1 and 0 and 1 and everything to the left and everything to the right. The language itself will
use more and more bits to store exactly the
integer you want, but that did not used to be the case in
Python and is still not the case in some languages. Some languages are vulnerable to
what’s called integer overflow whereby if in that language you try
to count so high that you need to represent a number that’s too
big to fit in the amount of storage you’ve allocated– 32 or 60 or some number of other bits– you’re going to overflow the value. The result of which is that you might
be going up and up and up and up and representing a
bigger and bigger value. But it gets so big that all of the
ones in that number become zeros. And somehow, accidentally, you end up overflowing and
starting all over numerically. Now how might that be? Well, consider a number that’s
represented with only three digits, and let’s start counting,
for instance, from 123. Adding ones to that gives you 124,
followed by 125, 126, 127, 128, 129. And what do we do in our
human mathematical world? Well, if you were about to hit
nine and we now need to go to 10, you don’t just write 10. Rather you write zero, and you
carry the one, so to speak, continuing now with your logic, adding
that one to that two, giving you 130. And that’s OK. We’ve stayed within the confines
of that three-digit number. But of course, if we go count up long
enough, we’ll eventually reach 999. But if we have decided to only
allocate three digits for this number, what is going to happen
when we add one to this? Well, you might be inclined to carry
the one and then carry the one again. And in the world where you have
the luxury of pen and paper, you might simply write down
1,000, which is the right value. But if your computer or device is only
representing values with three digits, you have overflowed
this particular value, and you’ve overflowed in the sense
that even though I have it here on the screen, that leading 1 doesn’t
actually have room in which to fit. And so your number 1,000
becomes mistaken for 0 0 0, whereby you’ve overflowed
and wrapped around from a big number to a small. Now you might think that this is
fairly contrived and why would you ever do something so foolish
as to only represent numbers with three digits. Well, we humans have done worse. Now it wasn’t all that long ago that
we humans made precisely this mistake using just two digits to store years. After all, if almost all of your
dates start with 1900-something, you might as well just
store those last two digits. Unfortunately, by December 31, 1999,
were many of us quite a bit nervous that we hadn’t found all of the code
and all of the devices in the world that were still using
just two digits because, as with integer overflow, if you have
a number already counted up to 99 and you only have two digits, you
might run the risk that 99 rolls over, overflowing to 0 0. And all of a sudden, it is not
the year 2000 but 1900 again– quite simply the result
of integer overflow by using a fixed amount
of memory to represent something and not having anticipated
that you might eventually need more. But you can certainly design for this
and engineer defenses against this. Indeed, some games have done
better than we as a society. Indeed, this game here Lego Star
Wars has a point system wherein you can accumulate coins over time. And if you play this
game long enough, you can accumulate apparently as
many as 4 billion of these coins but unfortunately no more
because the engineers who designed this game decided that the
maximum number of points you can accrue is just that– 4 billion. But why? Well, it turns out that in many
computers and game consoles, it’s conventional to store your
integers or ints as 32-bit values– 32 zeros or ones back
to back, which means you have 2 to the 32
possible permutations thereof, which means you have
roughly four billion possible values. You actually have a few
more than 4 billion, but it’s perhaps cleaner
in a game to just choose a clean value with lots of zeros. But in this game did they anticipate
that you’d play this game too long, and you might eventually overflow. And who knows what might
happen to that gamer if he or she plays the game
so long, and all of a sudden, their high score becomes zero. So these problems are solvable. You just have to anticipate and
actually engineer those solutions. But sometimes we don’t,
including companies like Boeing. It wasn’t all that long ago that the
Boeing 787 had a software bug in it
whereby the plane’s power system might actually turn off while the plane, in the
worst case, was actually flying. One article put it as follows. A model 787 airplane that has been
powered continuously for 248 days can lose all alternating current– AC electrical power– due to
the Generator Control Units, GCUs, simultaneously going into
failsafe mode, the memo stated. This condition is caused
by the software counter internal to the GCUs that will overflow
after 248 days of continuous power. Boeing is in the process of
developing a GCU software upgrade that will remedy the unsafe condition. Another website analyzed
the situation as follows. A simple guess suggests that the
problem is a signed 32-bit overflow as 2 to the 31st power is the number of
seconds in 248 days multiplied by 100– that is, a counter in
hundredths of a second. Which is to say it is presumed that
Boeing had stored some form of integer in its own software,
and that integer was representing the number
of hundredths of seconds for which the power had been on. But if you’re only using 32 bits
and, with a signed integer, only have at your disposal roughly 2 billion one hundredths of
seconds, turns out mathematically, after 248 days, that counter,
which is clearly important, might overflow and wrap around
to zero, the result of which is that the power of an
airplane might shut off. And so the temporary work around, before
the software upgrade was deployed, was to quite literally reboot
the airplane while on the ground before that 248th day. Now we’ve only just
scratched the surface of what you could do with
Python, and in fact, we’ve looked at some of those characteristics
that are demonstrative of the features that you might find in any
number of programming languages. But we’re not even limited
to the functions that come built into the core language itself. It turns out there are what are called
libraries and frameworks and yet more in any number of languages that provide
additional features that you somehow have to load or import
manually in order to use. One such example might be in Python. If you want to generate random or
technically pseudorandom numbers to create some kind of variation
in how your program behaves, it turns out that you can’t just call
a random function right out of the box. You need to tell Python to please
load or import that feature for you. So let’s go ahead and write a program
in a file called pseudorandom.py that allows us to generate a random
number between, say, 1 and 10. I want to go ahead,
though, first and do this. From a library called random,
go ahead and import a function called randint for random int. And then if I want to go ahead
and generate or select and then print a pseudorandom number
between 1 and 10 inclusive, I can simply do print
randint 1 comma 10, and the effect will be to
let Python somehow figure out how to choose a number
between those two values and return it to me
so that I can print it. Let’s go ahead and save the file
and then in my terminal window run Python of pseudorandom.py, and I get 10. Let’s go ahead and run it again. And this time, I get six. Let’s go ahead and run it yet
again, and this time, I get 10. Yet again, let’s go ahead and
run it, and this time, I get one. And if I were to run it, shall we say,
an infinite number of times, over time, I would see a uniform distribution,
hopefully, of those 10 possible values. And that’s what’s meant
by pseudorandom itself. It turns out that languages and
computers more generally can’t really pick a random number off the
top of their head like you and I can, rather they need to
use algorithms, which themselves are deterministic processes– code
that does the same thing again and again in order to create, if
you will, the illusion of randomness by creating a statistically uniform
distribution over some range of values. Now often, a computer
will use something that’s changing, like the clock
that’s built into it, taking a look at the current
time, and then generating a random number or again pseudorandom
number based on that variation or if it has a microphone or a camera
taking some ambient noise of sorts and using that to feed into
whatever algorithm it’s using to choose something randomly. But suppose I want to now do something
with this value and not just print it. Suppose that we want to implement
a bit of a game for a user– pick a random number between 1 and 10
and see if they can guess it correctly. Well, let’s see how a program
like that might be implemented. Let’s first go ahead and, from that library called
random, import, as before, a function called randint. Although it turns out if you want to
use not only this function but others, you can more succinctly
just say import random. The difference being if
you only import random, we are going to have to prefix
with the word random followed by a dot every use of a function. So we’ll do it that way this time. Let’s go ahead and declare
a variable called n and assign to it right to left
the result of calling randint. But this time, because we’ve
not imported randint by name, I need to qualify this symbol
and say random dot randint, thereby making clear to Python that
the function I’d like you to call is actually inside of that
library called random. But I can otherwise use it as
before passing in two values– a lower and upper bound
like this, one comma 10– and that should give me an n,
a random number in that range. Now I want to go ahead and
ask the user for their guess, and I’ll go ahead and define
another variable called, say, guess. That is the result of
converting to an int whatever the user’s input is for their guess. And now, quite simply for this game,
I want to compare those two values and print, say, correct or incorrect. And so I’ll go ahead and say if
the user’s guess equals equals n, go ahead and print out as much: correct. Else, if the user’s guess is
not right, let’s implicitly infer that, nope, incorrect. And so we’ll print that instead. Let me go ahead and save this file
now and run Python of guess.py. This time, I’ll be
prompted for my guess. I’ll say five. Unfortunately, it’s incorrect. Let’s go ahead and play again
this time running Python guess.py. This time, I’ll go with 10 since we saw
it so many times before and correct. I’ll run it a third time
and see what’s going on. This time, I’ll guess one but incorrect. Unfortunately, I’ve implemented
perhaps the most frustrating game ever because I’m not even telling the
user what the actual number was. But surely, I could do that by
printing the value of the variable I stored that random int in, but that
would be yet another game altogether. All right, let’s now bring all of this
together and solve an actual problem– one say from yesteryear. You might recall this game here, Super
Mario Brothers, the original, and this was a two-dimensional world
built up of images like this. Well, there in the sky, so to
speak, do I see four question marks. And that seems like an opportunity
to do something again and again. How might I print out four question marks
in a row, not nearly as graphically as that here but just with my
terminal window and text editor? Well, here let me go
ahead and open those two. And in a file called, say, mario0.py,
let me go ahead and print out, quite simply, four question marks. I’ll go ahead and print out quote
unquote question mark question mark question mark question mark. Saving that file again in mario0.py,
running in my terminal window Python of mario0.py and voila
do I get an approximation of what Nintendo did in yesteryear. Now of course, doing
something again and again is clearly an opportunity for, say,
a loop and not just printing it all at once but just doing
something again and again. And while this will
complicate the code initially, it sets us up for a more
interesting solution thereafter. In a file then called
mario1.py, let me go ahead and implement that same
sequence of question marks but this time using that familiar loop,
not a while loop or an infinite loop but perhaps just a for loop like this. For I in the range of 0
up to, but not through 4, go ahead and print out
just one question mark. Saving that file brings me now to my
terminal window in Python of mario1.py enter. Unfortunately, I have created
not quite the right level. But that’s OK because
remember that with print, you get one new line for free
every time you call it, unless you override
that default behavior. So let’s say, no, Python, instead
end your line with nothing, and only once I’m completely
done do I want you to print one of those free new lines for me. I’ll go ahead and save my
file again and, in my terminal window, rerun Python of mario1.py, and
now we have the same exact result. But later in the game do we see
different aspects of Mario’s world, not unlike this thing here underground. Pictured here are a number
of blocks in the underworld, and it looks to me
like that bigger block there is a composition of, say, 4. So let’s go ahead now and
print out a block of bricks, all represented by hashes
so that I have four horizontally, four vertically as well, and
everything else filled in too. We’ve not yet printed out
anything on multiple axes, if you will, both rows
and columns of sorts. So how to do this? Well, in my terminal window, I’m going
to create a file called mario2.py. And in this file, I’m going to decompose
this problem conceptually, so to speak, into two different problems. Built into that
underworld are some number of rows of bricks within
which are these columns. And I bet I could bite those
off each one at a time. So let me go ahead and do this. For I in range of 4, go
ahead and print what? Well, for every row of bricks, do I want
to print some number of columns too. Because it’s a square, the same
number of columns as rows, and so I know how to print that many things too. I can simply use another loop
perhaps with a different variable with which to count like for j, as
is conventional in range also of 4. And then inside of this inner
nested loop, so to speak, might I go ahead and print out just
one hash ending each of my lines with, as before, nothing. In fact, I only want to move
my cursor to the next line after I’ve printed
each of those columns. And so only underneath that innermost
loop do I want that call to print. Let me go ahead and save this file
now and run Python of mario2.py. And if I’ve gotten this logic
right, I can go ahead and print out those rows and columns too. And indeed, that’s what I get. It’s not quite a perfect
square because those hashes are a little taller than they are wide. But via a for loop, one
nested inside of another can I handle two problems at once– the act of iterating from row to row
to row and within each row, iterating left to right via column. And in fact, to make
that ever more clear, why don’t I even name my variables more aptly? For each row in the range of four,
for each column in the range of four, go ahead and print each of those hashes. Frankly, it doesn’t even matter what I
call these variables because I’m never actually using them per se. I’m simply telling Python to
count up from 0 to 4 using those particular names. Those then are some
programming languages– Python especially among them. And just as we saw in pseudocode,
the ability to express functions and conditions with Boolean expressions
and loops and then things like variables and more, so can we express
those exact same ideas in Python, in C, C++, and Java. And with each of those languages
do you get different features, with each of those languages do
you get different techniques. But ultimately, those
languages are all just tools for one’s toolkit with which to
solve any number of problems with data.
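
The programs walked through above were only described aloud, so here is a minimal sketch of how they might look when gathered into one Python file. The function names (row, grid, divide_to_places, add_one, check_guess) are hypothetical and chosen just for this illustration; they are not the names used in the lecture's own files. The sketch covers the nested loops from the Mario examples, the format string with a number of decimal places from imprecision.py, a counter limited to a fixed number of decimal digits as in the 999 and year-99 examples, and the comparison at the heart of the guessing game.

```python
import random


def row(width, symbol="#"):
    """Build one line of symbols, as the loop with end="" does on screen."""
    line = ""
    for _ in range(width):
        line += symbol
    return line


def grid(height, width, symbol="#"):
    """Nested loops: the outer loop walks rows, the inner loop walks columns."""
    lines = []
    for _row in range(height):
        line = ""
        for _column in range(width):
            line += symbol
        lines.append(line)
    return "\n".join(lines)


def divide_to_places(x, y, places):
    """Divide x by y and format the result to a given number of decimal places."""
    z = x / y
    # Colon, period, number of places, then f for a floating point value.
    return f"{z:.{places}f}"


def add_one(value, digits):
    """Add 1 to a counter that only has room for `digits` decimal digits."""
    return (value + 1) % (10 ** digits)


def check_guess(n, guess):
    """Compare the user's guess against the number that was picked."""
    return "Correct" if guess == n else "Incorrect"


if __name__ == "__main__":
    print(row(4, "?"))                     # four question marks in a row
    print(grid(4, 4))                      # a 4-by-4 block of hashes
    for places in (10, 20, 55):
        print(divide_to_places(1, 10, places))
    print(add_one(129, 3))                 # 130: carrying stays within three digits
    print(add_one(999, 3))                 # 0: 1,000 does not fit, so it wraps around
    print(add_one(99, 2))                  # 0: the two-digit year problem
    print(2 ** 32)                         # how many values 32 bits can represent
    print(2 ** 31 / 100 / 60 / 60 / 24)    # days until a signed counter of hundredths overflows
    n = random.randint(1, 10)              # pseudorandom, 1 through 10 inclusive
    print(check_guess(n, 5))
```

Running the file prints each demonstration in turn without prompting for input, which is a deliberate simplification: the lecture's versions call input and convert the result to an int. Python's own integers do not wrap around, so add_one only imitates a fixed-width counter by taking a remainder.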
