Welcome to Software Carpentry Etherpad!

This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.

Use of this service is restricted to members of the Software Carpentry and Data Carpentry community; this is not for general purpose use (for that, try etherpad.wikimedia.org).

Users are expected to follow our code of conduct: http://software-carpentry.org/conduct.html

All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/

Managing Data with SQL

First steps:

check SQLite is installed: type "sqlite3" at the command prompt
point browser to lesson materials: http://swcarpentry.github.io/sql-novice-survey/
download the file survey.db linked from the above web page

Next steps for those wanting to use a relational database:

work through the rest of the notes and questions at http://swcarpentry.github.io/sql-novice-survey/
to get data from a spreadsheet into an SQLite database, see https://github.com/swcarpentry/capstone-novice-spreadsheet-biblio

Problem solving

If up-arrow doesn't get you to earlier commands, ask a helper! You probably need to change PATH to use the non-Anaconda version of sqlite3, or use "rlwrap".
To keep your conda environment under control, see David's setup at the end of this etherpad.

If .tables stops working for whatever reason:
    - exit sqlite3, close terminal
    - re-download the db file
    - open it again with a fresh terminal, making sure you're in the directory that contains the file!

Some sqlite3 useful commands

.help - show available commands
.quit or .exit - close database session
.databases - list names and files of attached databases
.tables - show all existing tables in current session
.mode column - set output mode to left-aligned columns
.header on - turn on display of result headers
; - end an unfinished statement (gets you back to the prompt after a wrong command)
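
The dot-commands above can also be passed to sqlite3 straight from the shell, which is handy for a quick check. A minimal sketch, assuming sqlite3 is on your PATH; demo.db and the table "demo" are made up here so the example doesn't depend on survey.db:

```shell
# Work in a throwaway directory so nothing of yours is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny database with one hypothetical table.
sqlite3 demo.db "CREATE TABLE demo (id INTEGER, name TEXT); INSERT INTO demo VALUES (1, 'first');"

# Same as typing .tables at the sqlite> prompt.
tables=$(sqlite3 demo.db ".tables")
echo "$tables"

# Same effect as .header on plus .mode column, set from the command line.
sqlite3 -header -column demo.db "SELECT * FROM demo;"
```

With the class database you would replace demo.db by survey.db, from the directory that contains the file.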

My First steps Programming with Python

Follow the steps in here:
http://rits.github-pages.ucl.ac.uk/2017-04-27-UCL_software_carpentry/python/
David's repo from class: https://github.com/dpshelio/LearningPython

Version control with git

To exit paginated output, press q. To go to the next page, press space. You can also move up and down with the arrow keys.
If for any reason you find yourself in the vi editor, press the escape key, then type :q! (colon, lower-case 'q', exclamation mark) and hit Return to get back to the shell.

Advanced use of 'git add':

'git add -u' stages the working copy versions of all files being tracked by git
'git add -p' takes you through each change in turn interactively and asks whether to stage it
Both of these can also take file/folder name arguments, so you can do e.g. 'git add -u chapter1/' to stage only modified files in the chapter1 folder
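
To see what 'git add -u' with a folder argument does and doesn't stage, here is a small sketch in a throwaway repository. The file names and the commit identity are made up for illustration:

```shell
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir chapter1
echo "draft" > chapter1/intro.txt
echo "notes" > notes.txt
git add chapter1/intro.txt notes.txt
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "Initial commit"

# Modify both tracked files and create one new, untracked file.
echo "more" >> chapter1/intro.txt
echo "more" >> notes.txt
echo "new" > chapter1/untracked.txt

# Stage only tracked, modified files under chapter1/.
git add -u chapter1/

# List what is staged: the modified tracked file in chapter1/ only.
staged=$(git diff --cached --name-only)
echo "$staged"
```

notes.txt is left unstaged because it is outside chapter1/, and chapter1/untracked.txt is left alone because -u only looks at files git already tracks.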

Other options for conflict resolution:

git checkout --theirs -- path/to/conflicted-file.txt

git checkout --ours -- path/to/conflicted-file.txt

See `git checkout --help` for more!
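
A sketch of how the --theirs option plays out in a real conflict, using a throwaway repository. Branch name "feature", file name and identity are made up; during a merge, "ours" is the branch you are on and "theirs" is the branch being merged in:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "Add file"
# Remember the starting branch, whatever its default name is.
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Make conflicting changes on two branches.
git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "Change on feature"

git checkout -q "$main_branch"
echo "main version" > file.txt
git commit -q -am "Change on main"

# The merge stops with a conflict in file.txt.
git merge feature > /dev/null 2>&1 || echo "merge conflict, as expected"

# Take the incoming (feature) version wholesale, then record the resolution.
git checkout --theirs -- file.txt
git add file.txt
git commit -q -m "Merge feature, keeping their version of file.txt"

resolved=$(cat file.txt)
echo "$resolved"
```

Using --ours instead would keep "main version". Note that these options take one side for the whole file, so any non-conflicting changes from the other side in that file are dropped too.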

Socrative (www.socrative.com) room name for this session: RITS

Links & Resources

The software carpentry lesson this was based on: http://swcarpentry.github.io/git-novice/

More learning resources:

* http://marklodato.github.io/visual-git-guide/index-en.html (Visual Git Reference - pictorial representations of what Git commands do)
* http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
* http://learngitbranching.js.org/?demo
* https://github.com/jlord/git-it-electron
* https://presentate.com/bobthecow/talks/changing-history (from this one I like what it says in slide 21!)

Useful tools:

* Graphical interfaces: https://git-scm.com/downloads/guis
(David has gitg)
* Diff/merge tools: https://sourcegear.com/diffmerge/
(but there are many and git asks you what to use)

Using Word files with git:

* https://github.com/vigente/gerardus/wiki/Integrate-git-diffs-with-word-docx-files
* http://ben.balter.com/2015/02/06/word-diff/

Excel seems to be a bit more complicated, but various people have solutions:

* http://programmaticallyspeaking.com/git-diffing-excel-files.html
* https://wiki.ucl.ac.uk/display/~ucftpw2/2013/10/18/Using+git+for+version+control+of+spreadsheet+models+-+part+1+of+3 (by a former UCL student)
* http://stackoverflow.com/questions/17083502/how-to-perform-better-document-version-control-on-excel-files-and-sql-schema-fil
* https://xltools.net/excel-version-control/ (paid-for product)

Unix Shell

Socrative room name (www.socrative.com):
    UCLSWC2017
Example files for Unix Shell:
    http://swcarpentry.github.io/shell-novice/data/shell-novice-data.zip

    To exit a command in bash: ctrl-c

What is the effect of this loop?
for species in *.pdb
do
    echo $species
    cat $species > alkanes.pdb
done

Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
Prints cubane.pdb, ethane.pdb, and methane.pdb, and the text from all three files would be concatenated and saved to a file called alkanes.pdb.
Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb.
None of the above.
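
One way to check your answer is to run the same loop on small stand-in files. This is only a sketch: the three files below are made up and hold one line each, not the real .pdb data.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Three tiny stand-in files.
for name in cubane ethane propane
do
    echo "data for $name" > "$name.pdb"
done

# The loop from the exercise. The *.pdb list is worked out once, before the
# loop starts, so alkanes.pdb (created inside the loop) is not part of it.
for species in *.pdb
do
    echo $species
    cat $species > alkanes.pdb
done

# > replaces the file on every pass, so only the last pass survives.
result=$(cat alkanes.pdb)
echo "$result"
```

Changing > to >> would append on each pass instead of overwriting.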

This exercise refers to the data-shell/molecules directory. ls gives the following output:
cubane.pdb ethane.pdb methane.pdb octane.pdb pentane.pdb propane.pdb

What is the output of the following code?
for datafile in *.pdb
do
    ls *.pdb
done

Now, what is the output of the following code?
for datafile in *.pdb
do
    ls $datafile
done

Why do these two loops give different outputs?
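
To compare the two loops without scrolling through the output, you can capture each one and count the lines. A sketch with three empty stand-in files (made up for the example):

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch cubane.pdb ethane.pdb methane.pdb

# First loop: the loop variable is never used, so each pass lists every file.
first=$(for datafile in *.pdb
do
    ls *.pdb
done)

# Second loop: each pass lists only the file named by $datafile.
second=$(for datafile in *.pdb
do
    ls $datafile
done)

# Count output lines: 3 passes x 3 files versus 3 passes x 1 file.
first_count=$(echo "$first" | wc -l | tr -d ' ')
second_count=$(echo "$second" | wc -l | tr -d ' ')
echo "first loop: $first_count lines, second loop: $second_count lines"
```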

Please use nano to edit files, unless you are already used to a different editor.
The "^" in the nano commands listed at the bottom of the screen means ctrl.

In the molecules directory, imagine you have a shell script called script.sh containing the following commands:
head -n $2 $1
tail -n $3 $1
While you are in the molecules directory, you type the following command:
bash script.sh '*.pdb' 1 1
Which of the following outputs would you expect to see?

All of the lines between the first and the last lines of each file ending in .pdb in the molecules directory
The first and the last line of each file ending in .pdb in the molecules directory
The first and the last line of each file in the molecules directory
An error because of the quotes around *.pdb
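
If you want to try this out away from the molecules directory, here is a sketch that rebuilds the situation with made-up files (a.pdb, b.pdb and notes.txt are stand-ins, three lines each at most):

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'first line of a\nmiddle\nlast line of a\n' > a.pdb
printf 'first line of b\nmiddle\nlast line of b\n' > b.pdb
printf 'not a pdb file\n' > notes.txt

# The script from the exercise.
cat > script.sh <<'EOF'
head -n $2 $1
tail -n $3 $1
EOF

# The quotes stop the calling shell from expanding *.pdb, so the script
# receives the literal pattern as $1. Inside the script $1 is unquoted,
# so it is expanded there against the files in the current directory.
output=$(bash script.sh '*.pdb' 1 1)
echo "$output"
```

head and tail print a "==> filename <==" banner before each file when they are given more than one, so expect those lines in the output too.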

For this question, consider the data-shell/molecules directory once again. This contains a number of .pdb files in addition to any other files you may have created. Explain what a script called example.sh would do when run as bash example.sh *.pdb if it contained the following lines:
# Script 1
echo *.*

# Script 2
for filename in $1 $2 $3
do
    cat $filename
done

# Script 3
echo $@.pdb
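
When reasoning about these scripts it helps to see what the script actually receives. A sketch, where show_args.sh is a made-up helper and the four empty files stand in for the molecules directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch cubane.pdb ethane.pdb methane.pdb octane.pdb

# Hypothetical helper that just reports its arguments.
cat > show_args.sh <<'EOF'
echo "number of arguments: $#"
echo "first three: $1 $2 $3"
echo "all arguments: $@"
EOF

# The shell expands *.pdb before the script starts, so the script sees
# one argument per matching file, not the pattern itself.
output=$(bash show_args.sh *.pdb)
echo "$output"
```

$1, $2 and $3 pick out the first three file names, and $@ stands for all of them.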

Keeping Anaconda environment controlled

Whether you want to keep sqlite separate, keep using your system Python install for something else, or have some other reason, you can add
the following to your ".bashrc", ".profile" or similar startup file. If you don't know what that's about, ask :)

Your .bashrc file will contain something like:

    PATH=/opt/anaconda/bin:$PATH

Replace that with the following, keeping in mind that your anaconda may have been installed somewhere else (e.g., /anaconda, /home/youruser/anaconda/, ...)

--------------------------------------------------------------------------------------------------------------------------------------
# Anaconda stuff

function conda_up () {
    # Add anaconda to your PATH - to undo this, run conda_down
    CONDA_PATH=/opt/anaconda/bin
    if [[ ":$PATH:" != *":$CONDA_PATH:"* ]]; then
        PATH=$CONDA_PATH:$PATH
    fi
}

function conda_down () {
    # Remove conda from the PATH
    # from: http://stackoverflow.com/a/370192/1087595
    # printf avoids a newline sticking to the last entry; sed drops the trailing colon awk leaves behind
    export PATH=$(printf '%s' "$PATH" | awk -v RS=: -v ORS=: '/anaconda/ {next} {print}' | sed 's/:$//')
}

function sac () {
    # Activate a named environment in your conda installation.
    # Look at https://conda.io/docs/using/envs.html to learn more about conda environments
    # You can do `conda env list` to see all the conda environments.
    conda_up
    source activate "$1"
}

alias sd='source deactivate' # alias to deactivate the environment and go back to the default conda environment (root)
--------------------------------------------------------------------------------------------------------------------------------------
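
If you want to see what the PATH handling does before touching your real PATH, here is a sketch that runs the same two ideas on a made-up value. demo_path is hypothetical; the filter is the same awk one-liner as in conda_down, with printf and sed used here so the result has no stray newline or trailing colon:

```shell
# Made-up PATH value; /opt/anaconda/bin is the install location assumed above.
demo_path="/opt/anaconda/bin:/usr/local/bin:/usr/bin"

# Membership test as in conda_up: wrap both sides in colons so that
# only whole entries match, never part of a longer directory name.
case ":$demo_path:" in
    *":/opt/anaconda/bin:"*) status="anaconda already on PATH" ;;
    *) status="anaconda not on PATH" ;;
esac
echo "$status"

# Filtering as in conda_down: split on ':', drop entries mentioning anaconda,
# join again with ':' and remove the colon left at the end.
cleaned=$(printf '%s' "$demo_path" | awk -v RS=: -v ORS=: '/anaconda/ {next} {print}' | sed 's/:$//')
echo "$cleaned"
```

After adding the functions to your startup file and opening a new terminal, "conda_up" and "conda_down" switch anaconda on and off for that terminal only.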