Rosetta Workshop 2020 Preparation

Preparation

Choosing your machine.

Rosetta is designed as a command line program for Unix-like operating systems. It will run best on a Linux machine, or in the Terminal program (or similar) of a Macintosh. It won’t run natively on Windows or Chromebooks. Windows 10 does have a “Windows Subsystem for Linux” which allows you to run Linux programs from within Windows. While you may be able to get this to work, we do not test Rosetta under those conditions, and you may encounter issues which we won’t be able to help with. If you don’t have access to a Linux or Mac machine, we recommend looking into a Virtual Machine (VM) program (such as VirtualBox) and installing a Linux distribution (Ubuntu is a good choice) within the VM program.

While production runs of Rosetta are most often run on remote computing clusters, we encourage you to do the tutorials on a local machine with a graphical interface. Many of the preparation and analysis steps rely on visualizing information in a way that’s difficult under a purely text-based interface. You may be able to use a remote cluster for the heavy processing steps, provided you’re comfortable moving data back and forth to do the preparation and analysis steps locally.

As each computer and operating system is different, the instructors for the Rosetta Workshop will likely be unable to assist with details on working with your operating system. Please consult a local system administrator or operating system specific documentation and resources for help troubleshooting such issues.

Working with the command line.

The tutorial will mainly be run from the command line. A certain level of familiarity with working with the Unix command line and running command line programs is expected. If you don’t have much experience with the command line, we encourage you to look at online resources for learning the command line. Two examples we’ve found helpful in the past are http://www.ee.surrey.ac.uk/Teaching/Unix/ and http://www.linuxcommand.org/lc3_learning_the_shell.php, though there are numerous resources on the web, targeted to different levels.

Obtaining Rosetta

The tutorials are written assuming that you will be using Rosetta version 3.12. Rosetta is released both as a numbered version (like 3.12) once or twice a year and as a series of “weekly” releases (like 2019.42). There is little difference in testing and validation between the numbered releases and the weeklies. (In fact, each numbered release is simply a relabeled weekly.) However, as Rosetta is continually being developed and changed, we settled on Rosetta 3.12 as a consistent version for the tutorials. While other recent versions (such as the most recent weekly release) may also work with the tutorials, they might have subtle differences in how input/options are specified, which may require debugging. (We recommend avoiding Rosetta versions prior to 3.11 and weeklies from before mid-2019.)

To obtain Rosetta, go to RosettaCommons.org and click “Software” and then “License and Download”. In order to install Rosetta on your machine, you’ll need to obtain a license for Rosetta. The license is free for academic users, and the approval process for those with a recognized academic email address (e.g. *.edu) is automated and relatively rapid. For academic users with a non-recognized email domain, the process may take longer due to validation. For commercial users, please consult your local administrators to see if you already have valid Rosetta credentials, and contact license@uw.edu if not.

Once you’ve obtained a license and associated download credentials, you can obtain the Rosetta software via the License and Download page. Scroll down to version 3.12. Several download options are available. If your platform is listed, you can download a pre-compiled version. If not, download the source distribution, extract it using OS tools (e.g. tar) and compile it. See https://www.rosettacommons.org/docs/latest/build_documentation/Build-Documentation for details on how to compile Rosetta. (Generally, running the command ./scons.py mode=release bin in the Rosetta/main/source/ directory works.) Note that Rosetta takes quite some time to compile.

The tutorials will be written assuming that you’ve installed Rosetta under the ~/rosetta_workshop/rosetta/ directory. Rosetta can be installed in any directory (administrator access not needed) – simply change the paths provided by the tutorial to where you’ve installed it.

You can check if Rosetta is installed properly by running the command ~/rosetta_workshop/rosetta/main/source/bin/validate_database.linuxgccrelease (no additional options needed). Note that the “extension” of the program changes with platform and compile type. It may be something else like .macosclangrelease or even .static.linuxclangrelease. Run ls ~/rosetta_workshop/rosetta/main/source/bin/validate_database.* to see what’s present on your machine. Note which extension applies to your machine and adjust the commands in the tutorial accordingly.
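The ls check above can also be done from a script, which is handy if you want to reuse the detected extension later. The following is a minimal sketch, not part of Rosetta: the function name find_rosetta_extensions is hypothetical, and the path assumes the default ~/rosetta_workshop/rosetta/ install location used by the tutorials. It should run under both Python 2.7 and Python 3.

```python
import glob
import os


def find_rosetta_extensions(bin_dir, program="validate_database"):
    """Return the sorted extensions found for `program` in `bin_dir`.

    Hypothetical helper for illustration; it only inspects file names.
    """
    pattern = os.path.join(os.path.expanduser(bin_dir), program + ".*")
    extensions = []
    for path in glob.glob(pattern):
        name = os.path.basename(path)
        # Everything after "<program>." is the platform/compiler/mode part,
        # e.g. "linuxgccrelease" or "static.linuxclangrelease".
        extensions.append(name[len(program) + 1:])
    return sorted(extensions)


if __name__ == "__main__":
    # Assumed install location; change this if you installed Rosetta elsewhere.
    bin_dir = "~/rosetta_workshop/rosetta/main/source/bin"
    found = find_rosetta_extensions(bin_dir)
    if found:
        for ext in found:
            print("validate_database." + ext)
    else:
        print("No validate_database executables found under " + bin_dir)
```

If more than one extension is listed, any of them should work for the installation check, but use the same one consistently when following the tutorials.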

Other needed programs

In addition to Rosetta and the standard Unix command line tools (such as grep/awk/sed/head/tail/sort), you’ll need a number of other tools:

  • Web browser (e.g. firefox)
  • Text editor (e.g. gedit — Not a word processor!)
  • Image viewer (e.g. gthumb or eog)
  • Spreadsheet program (e.g. loffice or Excel)
  • PyMol (Chimera can be substituted)
  • PDF reader (e.g. acroread)
  • Perl
  • R and Rscript
  • Python 2.7 (see below)

The BCL will also be used for conformer generation in the ligand docking tutorial. However, it will not be necessary to install or run the BCL if you do not wish to, as the generated input file will be provided for you.

Consult your local system administrator and/or your operating system documentation/resources to learn how to install these programs for your system.
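To see quickly which of the command line tools are already available, you can search your PATH for them. The sketch below is one way to do this and is only an illustration: the helper names find_on_path and missing_tools are hypothetical, and the list covers only tools whose command names are stated above (graphical programs such as the web browser or PyMol may be installed under different names on your system, so they are left out). It avoids version-specific library calls so that it should run under both Python 2.7 and Python 3.

```python
import os


def find_on_path(program, path=None):
    """Return the full path of `program` if it is an executable file
    in one of the PATH directories, otherwise None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(programs, path=None):
    """Return the subset of `programs` that could not be found."""
    return [p for p in programs if find_on_path(p, path) is None]


if __name__ == "__main__":
    # Command names taken from the tool list in this document.
    tools = ["grep", "awk", "sed", "head", "tail", "sort",
             "perl", "R", "Rscript"]
    missing = missing_tools(tools)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed command line tools were found.")
```

A tool reported as missing may still be installed somewhere outside your PATH, so treat the output as a starting point rather than a definitive answer.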

Python

Python deserves special mention, as the majority of the scripts used for setup and post-processing will be written in Python.

You will need access to a Python 2.7 interpreter. While Python 2.7 is officially end-of-life, Rosetta contains many scripts which have not yet been updated to Python 3, and a significant number of computing clusters still have Python 2.7 as their default Python. As such, the tutorials have been written for a Python 2.7 interpreter. You can check which Python you have installed by running python --version at the command line. If that reports a 3.x version, you may still have Python 2.7 available under the name python2.7; check with python2.7 --version.
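The same check can be made from inside a script, which confirms the version of the interpreter that is actually running it. This is a minimal sketch; the function name is_python27 is hypothetical and not part of any tutorial script.

```python
import sys


def is_python27(version_info=None):
    """Return True if the given version (default: the running interpreter)
    is Python 2.7."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    if not is_python27():
        print("This is not Python 2.7; try running this script with python2.7.")
```

Save it to a file and run it with each candidate interpreter (for example python and python2.7) to see which one the tutorials should use.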

For the Python environment you’ll be using, you’ll need to install the following packages. These may be installable from your OS package manager, your Python environment manager, or by using pip. Talk to your local system administrator or consult your OS documentation for details. Note that if there are multiple Python installations on your system, you will need to install the packages for the Python 2.7 installation you will be using for the tutorial.

  • Biopython 1.73 to 1.76 (`pip install biopython==1.76`)
  • matplotlib (`pip install matplotlib`)
  • pandas (`pip install pandas`)
  • seaborn (`pip install seaborn`)
  • weblogo (optional – `pip install weblogo`)
  • PyMol package (as distinct from the PyMol application, optional, `conda install -c schrodinger pymol`)
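Once the packages are installed, a quick import test with the interpreter you plan to use confirms that they landed in the right Python installation. The sketch below is illustrative only: missing_modules is a hypothetical helper, and the list covers just the four required packages (Biopython is imported under the name Bio; the other three import under their own names). The optional packages are left out because their import names can differ between versions.

```python
import importlib


def missing_modules(module_names):
    """Try to import each module and return the names that fail."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names for the required packages listed above.
    required = ["Bio", "matplotlib", "pandas", "seaborn"]
    missing = missing_modules(required)
    if missing:
        print("Could not import: " + ", ".join(missing))
    else:
        print("All required packages can be imported.")
```

Run it with the same interpreter you will use for the tutorials (for example python2.7), since packages installed for one Python are not visible to another.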

If you don’t have Python 2.7 available, or if you can’t install the required packages to the system Python, you can install a local copy of Python 2.7 in your home directory using the Conda system. Conda allows you to install (potentially multiple) versions of Python (and other programs) in your home directory in such a way that changes (such as installing packages) do not interfere with the system Python. Miniconda is a version of Conda which has a Python 2.7 installer.