Introducing the Shell
|
A shell is a program whose primary purpose is to read commands and run other programs.
The shell’s main advantages are its high action-to-keystroke ratio, its support for automating repetitive tasks, and its capacity to access networked machines.
The shell’s main disadvantages are its primarily textual nature and how cryptic its commands and operation can be.
|
Navigating Files and Directories
|
The file system is responsible for managing information on the disk.
Information is stored in files, which are stored in directories (folders).
Directories can also store other directories, which forms a directory tree.
cd path changes the current working directory.
ls path prints a listing of a specific file or directory; ls on its own lists the current working directory.
pwd prints the user’s current working directory.
whoami shows the user’s current identity.
/ on its own is the root directory of the whole file system.
A relative path specifies a location starting from the current location.
An absolute path specifies a location from the root of the file system.
Directory names in a path are separated with ‘/’ on Unix, but ‘\’ on Windows.
‘..’ means ‘the directory above the current one’; ‘.’ on its own means ‘the current directory’.
Most files’ names are something.extension. The extension isn’t required, and doesn’t guarantee anything, but is normally used to indicate the type of data in the file.
Most commands take options (flags) which begin with a ‘-’.
|
Working With Files and Directories
|
cp old new copies a file.
mkdir path creates a new directory.
mv old new moves (renames) a file or directory.
rm path removes (deletes) a file.
Use of the Control key may be described in many ways, including Ctrl-X, Control-X, and ^X.
The shell does not have a trash bin: once something is deleted, it’s really gone.
Depending on the type of work you do, you may need a more powerful text editor than Nano.
|
Pipes and Filters
|
cat displays the contents of its inputs.
head displays the first few lines of its input.
tail displays the last few lines of its input.
sort sorts its inputs.
wc counts lines, words, and characters in its inputs.
* matches zero or more characters in a filename, so *.txt matches all files ending in .txt.
? matches any single character in a filename, so ?.txt matches a.txt but not any.txt.
command > file redirects a command’s output to a file.
first | second is a pipeline: the output of the first command is used as the input to the second.
The best way to use the shell is to use pipes to combine simple single-purpose programs (filters).
|
Loops
|
A for loop repeats commands once for every thing in a list.
Every for loop needs a variable to refer to the thing it is currently operating on.
Use $name to expand a variable (i.e., get its value). ${name} can also be used.
Do not use spaces, quotes, or wildcard characters such as ‘*’ or ‘?’ in filenames, as it complicates variable expansion.
Give files consistent names that are easy to match with wildcard patterns to make it easy to select them for looping.
Use the up-arrow key to scroll up through previous commands to edit and repeat them.
Use Ctrl-R to search through the previously entered commands.
Use history to display recent commands, and !number to repeat a command by number.
|
Shell Scripts
|
Save commands in files (usually called shell scripts) for re-use.
bash filename runs the commands saved in a file.
$@ refers to all of a shell script’s command-line parameters.
$1, $2, etc., refer to the first command-line parameter, the second command-line parameter, etc.
Place variables in quotes if the values might have spaces in them.
Letting users decide what files to process is more flexible and more consistent with built-in Unix commands.
|
Finding Things
|
find finds files with specific properties that match patterns.
grep selects lines in files that match patterns.
--help is a flag supported by many Bash commands, and by many programs that can be run from within Bash, to display more information on how to use them.
man command displays the manual page for a given command.
$(command) inserts a command’s output in place.
|
Running and Quitting
|
The python, ipython, and idle commands all give an interactive Python shell (REPL).
Python programs are plain text files.
You can use IDLE for creating and running Python programs.
|
Variables and Assignment
|
Use variables to store values.
Use print to display values.
Variables must be created before they are used.
Variables can be used in calculations.
Use an index to get a single character from a string.
Use a slice to get a substring.
Use the built-in function len to find the length of a string.
Python is case-sensitive.
Use meaningful variable names.
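A minimal sketch of the points above. The names and values (age, first_name, element) are made up for illustration.

```python
# Create variables before using them; these values are hypothetical.
age = 42
first_name = "Ahmed"
print(first_name, "is", age, "years old")

# Variables can be used in calculations.
age_in_three_years = age + 3

# An index gets a single character; a slice gets a substring.
element = "helium"
first_char = element[0]    # index 0 is the first character
substring = element[0:3]   # characters at positions 0, 1, and 2
length = len(element)      # number of characters in the string
print(first_char, substring, length)
```

Because Python is case-sensitive, writing Age instead of age here would refer to a different (undefined) variable and raise a NameError.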
|
Data Types and Type Conversion
|
Every value has a type.
Use the built-in function type to find the type of a value.
Types control what operations can be done on values.
Strings can be added and multiplied.
Strings have a length (but numbers don’t).
Numbers must be converted to strings (or vice versa) before the two can be combined in an operation.
Integers and floats can be mixed freely in operations.
Variables only change value when something is assigned to them.
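The following sketch walks through these points in order. All values are made up for illustration.

```python
# Every value has a type.
print(type(52))
print(type("fitness"))

# Strings can be added (joined) and multiplied (repeated).
full_name = "Ahmed" + " " + "Walsh"
separator = "=" * 10

# Adding an int and a str directly raises a TypeError,
# so convert one of them explicitly first.
as_number = 1 + int("2")
as_text = str(1) + "2"

# Integers and floats can be mixed; the result is a float.
mixed = 1 + 2.5

# Variables only change value when something is assigned to them.
first = 1
second = 5 * first
first = 2  # second is not recalculated
print(first, second)
```

After the last assignment, second still holds the value computed from the old value of first.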
|
Built-in Functions and Help
|
Use comments to add documentation to programs.
A function may take zero or more arguments.
Commonly-used built-in functions include max, min, and round.
Functions may only work for certain (combinations of) arguments.
Functions may have default values for some arguments.
Use the built-in function help to get help for a function.
Every function returns something.
Python reports a syntax error when it can’t understand the source of a program.
Python reports a runtime error when something goes wrong while a program is executing.
Fix syntax errors by reading the source code, and runtime errors by tracing the program’s execution.
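A short sketch of several of these points, using made-up arguments. It catches the runtime error on purpose so that the example runs to completion.

```python
# This line is a comment: Python ignores it.
largest = max(1, 2, 3)
smallest = min("a", "A", "0")  # characters are compared by their character codes

# round has a default: with one argument it rounds to a whole number.
rounded_default = round(3.712)
rounded_one_place = round(3.712, 1)

# Every function returns something; print returns None.
result = print("example")

# Functions may only work for certain combinations of arguments.
# Comparing a number with a string is a runtime error (TypeError).
try:
    max(1, "a")
except TypeError as error:
    message = str(error)
    print("runtime error:", message)

# help(round) would display the documentation for round.
```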
|
Libraries
|
Most of the power of a programming language is in its libraries.
A program must import a library module in order to use it.
Use help to learn about the contents of a library module.
Import specific items from a library to shorten programs.
Create an alias for a library when importing it to shorten programs.
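The three import styles above can be sketched with the standard math module. The alias m is an arbitrary choice for illustration.

```python
# Import the whole module and refer to its contents by the module name.
import math
print("pi is", math.pi)

# Import specific items to shorten later code.
from math import cos, pi
cosine_of_pi = cos(pi)

# Create an alias for the module when importing it.
import math as m
root = m.sqrt(16)

print(cosine_of_pi, root)
# help(math) would list the contents of the module.
```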
|
Reading Tabular Data into DataFrames
|
Use the Pandas library to do statistics on tabular data.
Use index_col to specify that a column’s values should be used as row headings.
Use DataFrame.info to find out more about a dataframe.
The DataFrame.columns variable stores information about the dataframe’s columns.
Use DataFrame.T to transpose a dataframe.
Use DataFrame.describe to get summary statistics about data.
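A minimal sketch of these calls, assuming pandas is installed. To stay self-contained it reads a small made-up table from a string rather than from a file on disk; the column names and numbers are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a data file.
csv_text = """country,gdp_1952,gdp_2007
Australia,10000.0,34000.0
New Zealand,11000.0,25000.0
"""

# index_col makes the 'country' column the row headings.
data = pd.read_csv(io.StringIO(csv_text), index_col="country")

data.info()             # prints a summary of the dataframe
print(data.columns)     # the column labels
print(data.T)           # transposed: rows become columns
summary = data.describe()
print(summary)          # count, mean, std, min, quartiles, max
```

With a real file, the first argument to pd.read_csv would be the file's path instead of the io.StringIO object.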
|
Pandas DataFrames
|
Use DataFrame.iloc[..., ...] to select values by integer location.
Use : on its own to mean all columns or all rows.
Select multiple columns or rows using DataFrame.loc and a named slice.
The result of slicing can be used in further operations.
Use comparisons to select data based on value.
Select values or NaN using a Boolean mask.
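The selection methods above can be sketched on a small dataframe, assuming pandas is installed. The row labels, column names, and numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical data: three rows, two columns.
data = pd.DataFrame(
    {"gdp_1952": [1600.0, 6100.0, 8300.0],
     "gdp_2007": [5900.0, 36000.0, 33000.0]},
    index=["Albania", "Austria", "Belgium"],
)

# Select one value by integer location (row 0, column 0).
first_value = data.iloc[0, 0]

# ':' on its own means all rows (or all columns).
one_column = data.loc[:, "gdp_1952"]

# A named slice with loc includes both of its end points.
subset = data.loc["Albania":"Austria", "gdp_1952":"gdp_2007"]

# A comparison produces a Boolean mask of the same shape.
mask = subset > 5000

# Applying the mask keeps values where it is True and gives NaN elsewhere.
masked = subset[mask]
print(masked)
```

Note that the slice in loc is inclusive at both ends, unlike iloc and ordinary Python slices, which stop before the upper bound.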
|
Plotting
|
matplotlib is the most widely used scientific plotting library in Python.
Plot data directly from a Pandas dataframe.
Select and transform data, then plot it.
Many styles of plot are available.
Many sets of data can be plotted together.
|
Lists
|
A list stores many values in a single structure.
Use an item’s index to fetch it from a list.
Lists’ values can be replaced by assigning to them.
Appending items to a list lengthens it.
Use del to remove items from a list entirely.
The empty list contains no values.
Lists may contain values of different types.
Character strings can be indexed like lists.
Character strings are immutable.
Indexing beyond the end of the collection is an error.
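A short sketch of the list operations above, with made-up values. The two errors are caught on purpose so that the example runs to completion.

```python
# A list stores many values; an index fetches one of them.
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]
first = pressures[0]

# Values can be replaced by assigning to them.
pressures[0] = 0.265

# Appending lengthens a list; del removes an item entirely.
primes = [2, 3, 5]
primes.append(7)
del primes[0]

# The empty list, and a list holding values of different types.
empty = []
mixed = ["Ahmed", 42]

# Strings can be indexed like lists, but they are immutable.
element = "carbon"
try:
    element[0] = "C"
except TypeError as error:
    print("cannot change a string in place:", error)

# Indexing beyond the end of a collection is an error.
try:
    primes[99]
except IndexError as error:
    print("index out of range:", error)
```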
|
For Loops
|
A for loop executes commands once for each value in a collection.
The first line of the for loop must end with a colon, and the body must be indented.
Indentation is always meaningful in Python.
A for loop is made up of a collection, a loop variable, and a body.
Loop variables can be called anything, but giving them a meaningful name is strongly advised.
The body of a loop can contain many statements.
Use range to iterate over a sequence of numbers.
The Accumulator pattern turns many values into one.
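A minimal sketch of a for loop and of the accumulator pattern, using made-up numbers.

```python
# The loop variable 'number' takes each value in the collection in turn.
# The first line ends with a colon and the body is indented.
for number in [2, 3, 5]:
    doubled = number * 2
    print(number, doubled)   # the body can contain many statements

# Accumulator pattern: start with an initial value,
# then update it once per item so many values become one.
total = 0
for number in range(1, 6):   # range(1, 6) produces 1, 2, 3, 4, 5
    total = total + number
print("total:", total)
```

The same pattern works for building up strings or lists: start from an empty value and add to it inside the loop.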
|
Looping Over Data Sets
|
Use a for loop to process files given a list of their names.
Use glob.glob to find sets of files whose names match a pattern.
Use glob and for to process batches of files.
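The sketch below combines glob.glob with a for loop. So that it can run anywhere, it first creates a few hypothetical files (sample_a.csv, sample_b.csv, notes.txt) in a temporary directory; with real data, only the pattern and the loop would be needed.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create hypothetical files to search for.
    for name in ["sample_a.csv", "sample_b.csv", "notes.txt"]:
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("value\n1\n2\n")

    # Find the files whose names match a wildcard pattern.
    pattern = os.path.join(folder, "sample_*.csv")
    matches = sorted(glob.glob(pattern))   # glob does not guarantee an order

    # Process each matching file in turn.
    line_counts = {}
    for filename in matches:
        with open(filename) as handle:
            line_counts[os.path.basename(filename)] = len(handle.readlines())

print(line_counts)
```

Sorting the result of glob.glob is a design choice made here so that the files are always processed in the same order.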
|
Writing Functions
|
Break programs down into functions to make them easier to understand.
Define a function using def with a name, parameters, and a block of code.
Defining a function does not run it.
Arguments in a call are matched to parameters in the definition.
Functions may return a result to their caller using return.
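A minimal sketch of defining and calling functions. The function names format_date and average are hypothetical, chosen for illustration.

```python
# Defining a function does not run it.
def format_date(year, month, day):
    joined = str(year) + "/" + str(month) + "/" + str(day)
    return joined

def average(values):
    return sum(values) / len(values)

# Arguments are matched to parameters by position...
by_position = format_date(1871, 3, 19)

# ...or by name, in which case the order does not matter.
by_name = format_date(month=3, day=19, year=1871)

mean_value = average([1, 3, 4])
print(by_position, by_name, mean_value)
```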
|
Variable Scope
|
|
Conditionals
|
Use if statements to control whether or not a block of code is executed.
Conditionals are often used inside loops.
Use else to execute a block of code when an if condition is not true.
Use elif to specify additional tests.
Conditions are tested once, in order.
Create a table showing variables’ values to trace a program’s execution.
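The sketch below uses if, elif, and else inside a loop. The masses and the thresholds are made up for illustration.

```python
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
labels = []

for mass in masses:
    # Conditions are tested in order; the first true branch runs
    # and the remaining ones are skipped.
    if mass > 9.0:
        labels.append("HUGE")
    elif mass > 3.0:
        labels.append("larger")
    else:
        labels.append("smaller")

print(labels)
```

Order matters here: if the test mass > 3.0 came first, a mass of 9.22 would be labelled "larger" and the "HUGE" branch would never run.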
|
Programming Style
|
|
Wrap-Up
|
|
Feedback
|
|