Welcome to Software Carpentry Etherpad!
This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.
Use of this service is restricted to members of the Software Carpentry and Data Carpentry community; this is not for general purpose use (for that, try etherpad.wikimedia.org).
Users are expected to follow our code of conduct: http://software-carpentry.org/conduct.html
All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/
Managing Data with SQL
First steps:
Next steps for those wanting to use a relational database:
Problem solving
If up-arrow doesn't get you to earlier commands, ask a helper! You probably need to change your PATH to use a non-Anaconda version of sqlite3, or use "rlwrap".
To keep your conda environment under control, see David's setup at the end of this etherpad.
If .tables stops working for whatever reason:
- exit sqlite3, close terminal
- re-download the db file
- open it again with a fresh terminal, making sure you're in the directory that contains the file!
Some useful sqlite3 commands
.help - show available commands
.quit or .exit - close database session
.databases - list names and files of attached databases
.tables - show all existing tables in the current session
.mode column - set output mode to left-aligned columns
.header on - turn on display of result headers
; - terminate an unfinished statement (use it to get back to the prompt after a mistyped command)
My First steps Programming with Python
Follow the steps in here:
http://rits.github-pages.ucl.ac.uk/2017-04-27-UCL_software_carpentry/python/
David's repo from class: https://github.com/dpshelio/LearningPython
Version control with git
To exit paginated output, press q. To go to the next page, press space. You can also move up and down with the arrow keys.
If for any reason you find yourself in the vi editor, press the Escape key, type :q! (colon, lower-case 'q', exclamation mark), then hit Return to get back to the shell.
Advanced use of 'git add':
- 'git add -u' stages the working copy versions of all files being tracked by git
- 'git add -p' takes you through each change in turn interactively and asks whether to stage it
- Both of these can also take file/folder name arguments, so you can do e.g. 'git add -u chapter1/' to stage only modified files in the chapter1 folder
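A minimal sketch of 'git add -u' with a folder argument, run in a throwaway repository so nothing real is touched. The folder and file names (chapter1/, chapter2/, intro.txt, notes.txt) and the user name and email are made up for illustration.

```shell
# Work in a scratch repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Track one file in each of two (hypothetical) folders.
mkdir chapter1 chapter2
echo "first draft" > chapter1/intro.txt
echo "first draft" > chapter2/intro.txt
git add chapter1/intro.txt chapter2/intro.txt
git commit -q -m "Add both chapters"

# Modify both tracked files and create an untracked one.
echo "second draft" >> chapter1/intro.txt
echo "second draft" >> chapter2/intro.txt
echo "scratch" > chapter1/notes.txt

# Stage only the modified, already-tracked files under chapter1/.
git add -u chapter1/

staged=$(git diff --cached --name-only)
echo "Staged: $staged"
git status --short
```

Only chapter1/intro.txt ends up staged: the change in chapter2/ is outside the folder argument, and notes.txt is untracked, so '-u' leaves it alone.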
Other options for conflict resolution:
git checkout --theirs -- path/to/conflicted-file.txt
git checkout --ours -- path/to/conflicted-file.txt
See `git checkout --help` for more!
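As a rough illustration of the two commands above, the sketch below creates a conflict in a scratch repository and resolves it by taking the incoming version wholesale. The file name (mars.txt), the branch name (side) and the identity are placeholders. During a merge, "ours" is the branch you are on and "theirs" is the branch being merged in.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "original line" > mars.txt
git add mars.txt
git commit -q -m "Start file"

# Change the file on a side branch...
git checkout -q -b side
echo "line from side branch" > mars.txt
git commit -q -am "Change on side"

# ...and differently on the branch we started from.
git checkout -q -
echo "line from current branch" > mars.txt
git commit -q -am "Change on current branch"

# The merge stops with a conflict in mars.txt.
git merge side > /dev/null 2>&1 || echo "merge reported a conflict"

# Take the whole file from the branch being merged in ("theirs"),
# then mark the conflict as resolved and finish the merge.
git checkout --theirs -- mars.txt
git add mars.txt
git commit -q -m "Merge side, keeping their version"

resolved=$(cat mars.txt)
echo "$resolved"
```

Swapping --theirs for --ours would keep the current branch's version of the file instead. Either way the whole file is taken from one side, so this only fits cases where you do not need to combine the two sets of changes.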
Socrative (www.socrative.com) room name for this session: RITS
Links & Resources
The software carpentry lesson this was based on: http://swcarpentry.github.io/git-novice/
More learning resources:
* http://marklodato.github.io/visual-git-guide/index-en.html (Visual Git Reference - pictorial representations of what Git commands do)
* http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
* http://learngitbranching.js.org/?demo
* https://github.com/jlord/git-it-electron
* https://presentate.com/bobthecow/talks/changing-history (from this one I like what it says in slide 21!)
Useful tools:
* Graphical interfaces: https://git-scm.com/downloads/guis
(David has gitg)
* Diff/merge tools: https://sourcegear.com/diffmerge/
(but there are many and git asks you what to use)
Using Word files with git:
* https://github.com/vigente/gerardus/wiki/Integrate-git-diffs-with-word-docx-files
* http://ben.balter.com/2015/02/06/word-diff/
Excel seems to be a bit more complicated, but various people have solutions:
* http://programmaticallyspeaking.com/git-diffing-excel-files.html
* https://wiki.ucl.ac.uk/display/~ucftpw2/2013/10/18/Using+git+for+version+control+of+spreadsheet+models+-+part+1+of+3 (by a former UCL student)
* http://stackoverflow.com/questions/17083502/how-to-perform-better-document-version-control-on-excel-files-and-sql-schema-fil
* https://xltools.net/excel-version-control/ (paid-for product)
Unix Shell
Socrative room name (www.socrative.com):
UCLSWC2017
ls
Example files for Unix Shell:
http://swcarpentry.github.io/shell-novice/data/shell-novice-data.zip
To exit a command in bash: ctrl-c
What is the effect of this loop?
for species in *.pdb
do
echo $species
cat $species > alkanes.pdb
done
- Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
- Prints cubane.pdb, ethane.pdb, and methane.pdb, and the text from all three files would be concatenated and saved to a file called alkanes.pdb.
- Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
- None of the above.
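One way to check your answer is to try the same pattern on small files in a scratch directory. The sketch below uses made-up stand-ins (a.pdb, b.pdb, c.pdb) and made-up output names, and contrasts '>' with '>>' inside a loop.

```shell
# Made-up stand-ins for the .pdb files, in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "contents of a" > a.pdb
echo "contents of b" > b.pdb
echo "contents of c" > c.pdb

# '>' empties the target on every pass, so only the last file's text survives.
for species in *.pdb
do
    cat "$species" > overwritten.txt
done

# '>>' appends, so the text of every file accumulates.
for species in *.pdb
do
    cat "$species" >> appended.txt
done

overwritten_lines=$(wc -l < overwritten.txt | tr -d ' ')
appended_lines=$(wc -l < appended.txt | tr -d ' ')
echo "overwritten.txt: $overwritten_lines line(s)"
echo "appended.txt: $appended_lines line(s)"
```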
This exercise refers to the data-shell/molecules directory. ls gives the following output:
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb
What is the output of the following code?
for datafile in *.pdb
do
ls *.pdb
done
Now, what is the output of the following code?
for datafile in *.pdb
do
ls $datafile
done
Why do these two loops give different outputs?
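If you want to experiment away from the real data, here is a small sketch of the same two loops on three empty, made-up files. It counts how many lines each loop prints rather than showing the listings themselves.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.pdb b.pdb c.pdb

# The body ignores the loop variable: the wildcard is expanded again on every pass.
first=$(for datafile in *.pdb; do ls *.pdb; done)

# The body uses the loop variable: each pass lists just one file.
second=$(for datafile in *.pdb; do ls "$datafile"; done)

first_count=$(echo "$first" | wc -l | tr -d ' ')
second_count=$(echo "$second" | wc -l | tr -d ' ')
echo "first loop printed $first_count lines"
echo "second loop printed $second_count lines"
```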
Please use nano to edit files, unless you are already used to a different editor.
The "^" in the nano commands listed at the bottom of the screen means Ctrl.
In the molecules directory, imagine you have a shell script called script.sh containing the following commands:
head -n $2 $1
tail -n $3 $1
While you are in the molecules directory, you type the following command:
bash script.sh '*.pdb' 1 1
Which of the following outputs would you expect to see?
- All of the lines between the first and the last lines of each file ending in .pdb in the molecules directory
- The first and the last line of each file ending in .pdb in the molecules directory
- The first and the last line of each file in the molecules directory
- An error because of the quotes around *.pdb
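To see what the quotes change, you can try the script on a couple of small files. The sketch below is only an illustration: a.pdb and b.pdb are made-up three-line files in a scratch directory, and script.sh contains the same two commands as in the question.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a first\na middle\na last\n' > a.pdb
printf 'b first\nb middle\nb last\n' > b.pdb

# Same two commands as script.sh in the question.
cat > script.sh <<'EOF'
head -n $2 $1
tail -n $3 $1
EOF

# Quoted: the script receives the literal text *.pdb as $1, and the
# unquoted $1 inside the script is expanded there.
quoted=$(bash script.sh '*.pdb' 1 1)

# Unquoted: the calling shell expands the pattern first, so $1 is a.pdb,
# $2 is b.pdb, and the numbers arrive as $3 and $4.
unquoted=$(bash script.sh *.pdb 1 1 2>&1)

echo "--- quoted ---";   echo "$quoted"
echo "--- unquoted ---"; echo "$unquoted"
```

In the unquoted run, head is handed a file name where it expects a number of lines, so expect an error message from head in that part of the output.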
For this question, consider the data-shell/molecules directory once again. This contains a number of .pdb files in addition to any other files you may have created. Explain what a script called example.sh would do when run as bash example.sh *.pdb if it contained the following lines:
# Script 1
echo *.*
# Script 2
for filename in $1 $2 $3
do
cat $filename
done
# Script 3
Keeping Anaconda environment controlled
Whether you want to keep sqlite separate, keep using your system Python install for something else, or have any other reason, you can add the following to your ".bashrc", ".profile" or similar startup file. If you don't know what that's about - ask :)
Your .bashrc file will contain something like:
PATH=/opt/anaconda/bin:$PATH
Replace that with the following, keeping in mind that your anaconda may have been installed somewhere else (e.g., /anaconda, /home/youruser/anaconda/, ...):
--------------------------------------------------------------------------------------------------------------------------------------
# Anaconda stuff
function conda_up () {
    # Put anaconda on your PATH (only if it is not there already).
    # To take it off again, run conda_down.
    CONDA_PATH=/opt/anaconda/bin
    if [[ ":$PATH:" != *":$CONDA_PATH:"* ]]; then
        PATH=$CONDA_PATH:$PATH
    fi
}
function conda_down () {
    # Remove every PATH entry that mentions anaconda
    # based on: http://stackoverflow.com/a/370192/1087595
    # printf avoids a stray newline in the last entry; sed drops the trailing colon
    export PATH=$(printf '%s' "$PATH" | awk -v RS=: -v ORS=: '/anaconda/ {next} {print}' | sed 's/:*$//')
}
function sac () {
    # Activate one of the environments in your conda installation.
    # Look at https://conda.io/docs/using/envs.html to learn more about conda environments
    # You can do `conda env list` to see all the conda environments.
    conda_up
    source activate "$1"
}
alias sd='source deactivate' # deactivate the environment and go back to the default conda environment (root)
--------------------------------------------------------------------------------------------------------------------------------------
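To see what the two PATH tricks in the functions above do without touching your real PATH, here is a small sketch on a made-up value. It assumes /opt/anaconda/bin as the install location, and it writes the membership test with 'case' so that it also runs in shells without [[ ]]; the colon-wrapping idea is the same as in conda_up.

```shell
# A made-up PATH value; /opt/anaconda/bin is the assumed install location.
sample_path="/opt/anaconda/bin:/usr/local/bin:/usr/bin:/bin"
conda_dir="/opt/anaconda/bin"

# Wrap both sides in colons so that only whole entries match.
case ":$sample_path:" in
    *":$conda_dir:"*) in_path="yes" ;;
    *)                in_path="no" ;;
esac
echo "conda on path: $in_path"

# Same idea as conda_down: split on ':', drop entries mentioning anaconda,
# join what is left, and trim the trailing colon.
stripped=$(printf '%s' "$sample_path" | awk -v RS=: -v ORS=: '/anaconda/ {next} {print}' | sed 's/:*$//')
echo "without conda: $stripped"
```

Note that the awk pattern drops any entry containing the text "anaconda", so it also removes unrelated directories that happen to have that word in their path.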