Globus
Globus is a service that provides a set of tools for parallel, load-balanced, fault-tolerant data transfers; these tools can be used to transfer, share and/or publish data. CHPC licenses these services and provides dedicated data transfer nodes.
Globus service, endpoints and app
File systems, such as CHPC file servers or drives on your personal laptop, connect to the Globus service through endpoints. CHPC maintains Globus endpoints for both the general and protected environments, using our Data Transfer Nodes (DTNs). The DTNs are secured in such a way that they can bypass the University firewall, which provides much faster access to the national high-speed Internet2 network. In the Globus web app, endpoints are marked with their own icon.
Access to endpoints is provided through collections. Mapped Collections are created by the endpoint administrators; CHPC provides Mapped Collections for the CHPC file servers through the CHPC-defined collections listed below. Guest Collections are created by users and can be used for sharing data. In the Globus web app, collections are marked with their own icon.
Globus’ central user interface is the web app. To log in, one can either select the University of Utah for uNID and campus password authentication, or create a free Globus ID account. A Globus ID is especially useful if one has accounts at multiple institutions, since the Globus ID can be used as a master account covering all the institutional identities. Accounts at multiple institutions can also be managed by clicking Account on the toolbar on the right-hand side. Linking multiple identities gives one access to endpoints from different institutions and allows initiating data transfers between them.
After logging into the app with the University credentials, one can search the Endpoints and Mapped Collections to find those that belong to CHPC. We recommend using CHPC Utah as a search phrase, as there is also a CHPC in South Africa. The Mapped Collection for general environment access is University of Utah - CHPC DMZ Clustered Endpoint, and for the protected environment (PE) it is University of Utah - CHPC PE Clustered Endpoint.
File Transfers using the CHPC Globus Managed Collections
CHPC Managed Collections can be used for data transfer either within CHPC, or between CHPC and other institutions with Globus access.
- Log in to the Globus app with either your Globus or your institution's account.
- In the left menu bar, select File Manager, and in the Collection entry box search for and select a CHPC Managed Collection (search phrase CHPC Utah). If you choose a PE Managed Collection, you may need to re-authenticate with your University of Utah credentials or two-factor authentication, depending on when you last did so in your browser session.
- Click on the Transfer or Sync to button at the right side of the File Manager window, and in the new pane's Collection entry box search for and select the collection you want to transfer to.
- Select the files or directories you want to transfer and click the Transfer or Sync to button to initiate the transfer.
- The transfer is initiated and is performed in the background.
- As the transfer is proceeding, click the Activity button, select your transfer and monitor the progress. When the transfer completes, you will receive an email notifying you it is done and giving you statistics of the transfer.
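For readers who prefer scripting, the web app steps above have a rough equivalent in the Globus CLI's `globus transfer` and `globus task show` commands. The following is only a minimal sketch that assembles the command lines without running them. The collection UUIDs, the paths and the label are hypothetical placeholders, and the helper function names are invented for this illustration.

```python
import shlex

# Hypothetical placeholder UUIDs. Look up the real collection IDs in the
# web app (collection details) before using these commands.
SOURCE_COLLECTION = "00000000-0000-0000-0000-000000000001"
DEST_COLLECTION = "00000000-0000-0000-0000-000000000002"

# Sync levels accepted by `globus transfer --sync-level`.
SYNC_LEVELS = {"exists", "size", "mtime", "checksum"}


def build_transfer_command(src_id, src_path, dst_id, dst_path,
                           recursive=True, sync_level=None, label=None):
    """Return the argument list for a `globus transfer` call (not executed)."""
    cmd = ["globus", "transfer", f"{src_id}:{src_path}", f"{dst_id}:{dst_path}"]
    if recursive:
        # Needed when the source is a directory rather than a single file.
        cmd.append("--recursive")
    if sync_level is not None:
        if sync_level not in SYNC_LEVELS:
            raise ValueError(f"unknown sync level: {sync_level!r}")
        cmd += ["--sync-level", sync_level]
    if label is not None:
        cmd += ["--label", label]
    return cmd


def build_task_show_command(task_id):
    """Return the argument list for checking on a submitted transfer task."""
    return ["globus", "task", "show", task_id]


if __name__ == "__main__":
    command = build_transfer_command(
        SOURCE_COLLECTION, "/scratch/general/vast/u0123456/results/",  # hypothetical path
        DEST_COLLECTION, "/data/results/",                             # hypothetical path
        sync_level="mtime", label="example-transfer",
    )
    # Print a copy-and-paste ready shell command.
    print(shlex.join(command))
```

The printed command can be pasted into a shell where the Globus CLI is installed and logged in. The CLI prints a task ID on submission, which can then be passed to `globus task show` to follow the progress, much like the Activity page in the web app.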
Copying data from local machine to CHPC
For moving data between personal computers and CHPC, we have been recommending file transfer programs that implement the secure copy (scp) protocol, such as WinSCP for Windows. A Globus equivalent is the Globus Connect Personal (GCP). It is a Windows, Mac or Linux program that one installs on a personal computer, which creates a personal collection. One can then use the web app to transfer files between this personal collection and any other Globus collections the user has access to, including the CHPC Managed Collections. The advantage of this approach over the traditional scp approaches is much higher transfer speeds, which is especially important for large files.
Set up Globus Connect Personal on your local machine
To set up GCP, follow the GCP documentation. To add external drives such as USB, follow this guide.
If you have protected data on your local computer that you need to transfer to CHPC, you have to select the High Assurance (HA) option during the installation. If you have already installed GCP without HA, you will first need to uninstall it. In addition to the general GCP documentation, follow these steps to install the GCP with HA:
- First, log into the Globus Web App with your University of Utah identity, or with a Globus ID that is linked to your University of Utah identity.
- In the left panel click Settings. Then click the Subscriptions tab. Click Find a Subscription, search for utah, and select "University of Utah (HA)". Click Join Subscription. Fill out the form and click Submit Application (you may be asked to authenticate again under the identity you chose for the application). CHPC staff will get an e-mail to approve this addition to the CHPC subscription.
- In the meantime, download and install the GCP.
- Once the GCP is installed, the collection setup window appears. Check the High Assurance check box, set the Owner Identity to your University of Utah identity, choose a reasonable Timeout and click the Save button.
- You will receive an e-mail when the subscription is approved by CHPC staff. In the Globus Web App left menu panel, click Collections. Click on the Administered by You tab and find your personal collection, click Edit Subscription Status. You should now be able to select University of Utah (HA). Once that is done the GCP is ready to use. Alternatively, the personal collection information can be accessed by right clicking on the running GCP application's icon in the Taskbar, and selecting Web: Collection Details.
Transfer data from/to local machine
Once the local GCP application is installed, one can see the personal collection in the Globus web app by going to Endpoints–Administered By You.
To transfer data from local machine to CHPC, log into the Globus web app and do the following:
- Choose the File Manager in the left side menu.
- Choose the two-pane view in the upper right.
- Click on the search box in the right pane to see the Collection Search window.
- Choose the Your Collections tab to see all the collections that you have access to.
- Select the GCP on your personal computer. The File Manager will return with the right pane displaying the files on your local computer.
- In the left pane, search for CHPC Utah and choose the appropriate managed collection, which will bring up your CHPC directory contents:
- University of Utah - CHPC DMZ Clustered Endpoint in the general environment for transfers to/from sites external to campus
- CHPC DMZ NO HA in the general environment for transfers to/from sites external to campus; it allows external collaborators without a University of Utah identity to write to a CHPC-based guest collection
- University of Utah - CHPC Inter-Campus Clustered Endpoint in the general environment, for internal to campus transfers
- University of Utah - CHPC PE Clustered Endpoint in the protected environment
- Navigate to the appropriate directories in both the local and CHPC collections, click the files or directories to transfer and click the Start button at the bottom of the page to initiate the transfer
- The transfer will run in the background, and you will receive an e-mail once the transfer is finished. Statistics about the transfer can be obtained by clicking on the Activity menu on the left hand side toolbar, and choosing the appropriate transfer item.
Note: To access data on file systems other than your home directory (e.g. scratch or group spaces), type the file system's path (e.g. /scratch/general/vast) into the Path field of the Globus web-based file manager and hit Enter.
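The choice among the CHPC managed collections listed above can be summarized as a small decision function. This is only an illustrative sketch: the display names are copied from the list, while the function name and the `environment` and `peer_on_campus` parameters are invented here. The CHPC DMZ NO HA collection is left out because it only matters for guest collections written to by external collaborators.

```python
# Display names as they appear in the list of CHPC managed collections.
DMZ = "University of Utah - CHPC DMZ Clustered Endpoint"
INTER_CAMPUS = "University of Utah - CHPC Inter-Campus Clustered Endpoint"
PE = "University of Utah - CHPC PE Clustered Endpoint"


def pick_managed_collection(environment, peer_on_campus=False):
    """Pick a CHPC managed collection to search for in the File Manager.

    environment    -- "general" or "protected" (labels invented for this sketch)
    peer_on_campus -- True if the other side of the transfer is internal to campus
    """
    if environment == "protected":
        # The list names a single collection for the protected environment.
        return PE
    if environment == "general":
        return INTER_CAMPUS if peer_on_campus else DMZ
    raise ValueError(f"unknown environment: {environment!r}")


if __name__ == "__main__":
    collection = pick_managed_collection("general", peer_on_campus=False)
    # The path is what one would type into the Path field to leave the home directory.
    print(f"Collection: {collection}")
    print("Path field: /scratch/general/vast")
```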
Sharing data with collaborators
Within the Globus endpoints, one can choose directories to share with others. These shared directories are called Guest Collections. Guest Collections are built on top of CHPC Globus endpoints: dtn0[5-8] for external transfers to/from the general environment, intdtn0[1-3] for internal-to-campus transfers in the general environment, and pe-dtn0[3-4] for the PE.
Globus provides an e-mail invitation and a web link that the collaborators can use to access the data. The advantage of this approach is that it eliminates the need for data duplication. One can directly share data that are located on the CHPC storage, rather than having to copy it elsewhere for sharing.
To create a share in the general environment:
- In the Endpoints menu, search for CHPC Utah to find the dtn0[5-8] or intdtn0[1-3] endpoints
- Click on the endpoint name, and navigate to the Collections tab.
- Click on the Add a Guest Collection button to create the collection. All your existing Guest Collections are also listed.
- You may be prompted to re-authenticate with the University credentials, and to allow the Collection App to access certain information. Agree to this.
- If prompted, register your account (your uNID) with the Globus Connect Server using uNID@utah.edu as the Globus Identity.
- The Create Guest Collection screen appears. The Base Directory has to be the absolute path to the data you want to share, e.g. /uufs/chpc.utah.edu/common/home/u0123456/sharing. Also enter the Collection Display Name, optionally the Description, Keywords and Default Directory. Then click the Create Collection button to create the collection.
- Once the collection is created, you have several immediate options available, including Share data on this new collection with others, which lets you add collaborators to the collection or get a sharing link that can be sent to a collaborator.
To create a share in the protected environment:
- In the Endpoints menu, search for CHPC Utah to find the pe-dtn0[3-4] endpoints.
- Click on the endpoint name, and navigate to the Collections tab.
- Click on the Add a Guest Collection button to create the collection. All your existing Guest Collections are also listed.
- You may be prompted to re-authenticate with the University credentials, and to allow the Collection App to access certain information. Agree to this.
- If prompted, register your account (your uNID) with the Globus Connect Server using uNID@utah.edu as the Globus Identity.
- The Create Guest Collection screen appears. The Base Directory has to be the absolute path to the data you want to share, e.g. /uufs/chpc.utah.edu/common/home/u0123456/sharing. Also enter the Collection Display Name, optionally the Description, Keywords and Default Directory. Then click the Create Collection button to create the collection.
- Once the collection is created, you have several immediate options available, including Share data on this new collection with others to add collaborators to share the data in the collection.
- Adding collaborators to the collection is the only option in the PE, due to the file permissions limits enforced on the PE file systems.
All your existing Guest Collections are available under the Endpoints menu, Shareable By You tab. Clicking on a collection and moving to the Permissions tab allows you to add users to the share, get the share link, or modify the access permissions.
When collaborators open the link in a web browser (general environment only), or respond to the invitation e-mail sent when you added them to the collection, the Globus web app appears in the web browser and prompts them to authenticate, either with their own institution’s credentials or by creating a new free Globus account. Once the collaborators authenticate with Globus, the shared collection will be accessible to them under the Endpoints menu, Shared With You tab.
Multi-user data collaboration is facilitated by Globus Groups. The Group menu item lists the groups one belongs to. One can also create new groups and add members to them. To share data with a group, choose the Guest Collection, and in the Permissions tab, Add Permissions – Share With, select the newly created group. Here one can also set the data access to read or read/write.
Note that for write access to the Guest Collections, the collaborator needs to authenticate to Globus with a University of Utah identity, since most of our endpoints use High Assurance (HA) security. The only exception is the CHPC DMZ NO HA endpoint, which does not use HA and is only available for the general environment (the PE requires HA). Therefore, for external collaborators without a UofU identity who need write access (e.g. to share their data), set up the Guest Collection through the CHPC DMZ NO HA endpoint.
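The write-access rule in the note above can be expressed as a short decision function. This is a sketch under stated assumptions: it uses the collection display names from the earlier list to stand for the endpoints, the function and parameter names are invented for illustration, and the inter-campus endpoints are omitted for brevity.

```python
# Names taken from this page; using them to label the endpoints is an
# assumption of this sketch.
DMZ_HA = "University of Utah - CHPC DMZ Clustered Endpoint"
DMZ_NO_HA = "CHPC DMZ NO HA"
PE_HA = "University of Utah - CHPC PE Clustered Endpoint"


def guest_collection_endpoint(environment, needs_write, has_uofu_identity):
    """Suggest the endpoint on which to create a Guest Collection.

    environment       -- "general" or "protected" (labels invented here)
    needs_write       -- True if the collaborator must write to the share
    has_uofu_identity -- True if the collaborator can log in with a
                         University of Utah identity
    """
    external_writer = needs_write and not has_uofu_identity
    if environment == "protected":
        if external_writer:
            # The PE requires HA, so there is no non-HA endpoint to fall back on.
            raise ValueError(
                "write access in the PE requires a University of Utah identity"
            )
        return PE_HA
    if environment == "general":
        # Only the non-HA endpoint accepts writes from other identities.
        return DMZ_NO_HA if external_writer else DMZ_HA
    raise ValueError(f"unknown environment: {environment!r}")


if __name__ == "__main__":
    # External collaborator who wants to upload data to a general environment share.
    print(guest_collection_endpoint("general", needs_write=True, has_uofu_identity=False))
```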
Use of Globus with the Command Line Interface (CLI)
Globus has a Command Line Interface (CLI) allowing users access to Globus from the shell. The Globus CLI requires its own installation and is a standalone, open source application.
- Globus's Command Line Interface is maintained as a Python package. It requires Python 2.7+ or 3.3+ to function and relies on pip for installation. Installation instructions for the Globus CLI can be found here: https://docs.globus.org/cli/installation/
- A quick guide to using Globus CLI can be found here: https://docs.globus.org/cli/#getting_started
- Examples of Command Line Interface usage can be found here: https://docs.globus.org/cli/examples/
- A reference to all Globus CLI commands can be found here: https://docs.globus.org/cli/reference/
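As a rough orientation, a first CLI session often consists of logging in, searching for a collection and listing a directory on it. The sketch below only assembles those command lines and prints them; it does not run the CLI. The collection UUID is a hypothetical placeholder, the function name is invented for this example, and the linked reference should be consulted for the options available in the installed CLI version.

```python
import shlex

# Hypothetical placeholder; replace with the UUID reported by the search.
COLLECTION_ID = "00000000-0000-0000-0000-000000000001"


def first_session_commands(search_phrase, collection_id, path):
    """Return argument lists for a typical first Globus CLI session."""
    return [
        ["globus", "login"],                            # authenticate in a browser
        ["globus", "whoami"],                           # confirm the logged-in identity
        ["globus", "endpoint", "search", search_phrase],  # find collection IDs by name
        ["globus", "ls", f"{collection_id}:{path}"],    # list a directory on the collection
    ]


if __name__ == "__main__":
    for argv in first_session_commands("CHPC Utah", COLLECTION_ID, "/scratch/general/vast"):
        # shlex.join quotes arguments containing spaces for safe shell use.
        print(shlex.join(argv))
```

The search phrase CHPC Utah is the one recommended for the web app, and the path is the scratch file system example used earlier on this page.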