The Agave Platform


A common task for ACI facilitators is helping researchers deploy their software packages, or get them running, on local or national resources and work with the job scheduler.  For the most part, using HPC and HTC resources requires getting into the command line, which many new graduate students and post-docs are not familiar with.  Many of us provide intro to Linux/Unix training and Software Carpentry trainings that include some command line modules, but what if you didn’t have to?  The facilitators at the University of Hawaii (UH) have been investigating ways to allow more of our users to interact with ACI storage and compute without the command line.

An obvious way to provide non-command-line interaction with computational infrastructure is to construct a web-based interface.  Such an interface would allow a researcher to select the systems, data and applications to use together, click a submit button, and be notified when the work is finished.  Some tools are available for this, and one of UH’s partners, the Texas Advanced Computing Center (TACC), has developed middleware, the Agave platform, that exposes REST APIs for transferring data, launching execution and moving the results back.  It also provides notifications, role-based access controls, a simple way to add storage and compute systems, and a host of other tools required for a secure and functional system.
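To make the idea concrete, here is a minimal sketch of the kind of request a web frontend might assemble when the researcher clicks submit. The field names (`appId`, `inputs`, `archive`, `notifications`), the application id, the storage URI and the email address are all illustrative assumptions, not a confirmed description of the Agave job schema; check the Agave documentation for the real format. The sketch only builds and prints the JSON body and does not contact any service.

```python
import json


def build_job_request(name, app_id, input_uri, notify_address):
    """Assemble a hypothetical job request body for a REST job-submission endpoint.

    Every field name here is an assumption made for illustration.
    """
    return {
        "name": name,                       # label the researcher sees in the interface
        "appId": app_id,                    # which registered application to run
        "inputs": {"inputFile": input_uri}, # data the platform stages to the compute system
        "archive": True,                    # ask for results to be moved back to storage
        "notifications": [
            # one notification per event the researcher wants to hear about
            {"url": notify_address, "event": "FINISHED"},
            {"url": notify_address, "event": "FAILED"},
        ],
    }


# Hypothetical values standing in for what the user picked in the web form.
request = build_job_request(
    name="demo-run",
    app_id="example-app-1.0",
    input_uri="storage://example-system/home/user/input.dat",
    notify_address="researcher@example.edu",
)

# The frontend would send this JSON body to the platform's job endpoint.
payload = json.dumps(request, indent=2)
print(payload)
```

The point of the sketch is that the researcher only chooses an application, some data and a notification address; staging the data, writing the batch script and returning the results are left to the middleware.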

For the last several months, UH and TACC have been working to bring the Agave Platform (http://agaveapi.co) to researchers at the University of Hawaii.  The Agave Platform is the underlying middleware that powers several science gateways, such as:

  • CyVerse – (http://www.cyverse.org/) whose mission is to design, deploy, and expand a national cyberinfrastructure for life sciences research and train scientists in its use by offering:  a data storage facility; interactive web-based, analytical platform; cloud infrastructure to use remote servers for computation, analysis, and storage; web authentication and security services; support for scaling computational algorithms to run on large, high-speed computers; education and training in how to use cyberinfrastructure; and people with expertise in all of the above.
  • iPlant – (http://www.iplantcollaborative.org/) CyVerse grew out of iPlant, which was originally intended to support the biological sciences but is now domain agnostic and offers the same services, with infrastructure, tools and applications that support biology.
  • BioExtract – (https://www.bioextract.org/) an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatics workflows.
  • DesignSafe – (https://www.designsafe-ci.org/) a cloud-based environment for research in natural hazards engineering.
  • Araport – (https://www.araport.org/) the Arabidopsis information portal, a one-stop shop for Arabidopsis thaliana genomics. Araport offers gene and protein reports with orthology, expression, interactions and the latest annotation, plus analysis tools, community apps, and web services.

The University of Hawaii (UH) is in the process of standing up an initial proof-of-concept science platform that will support authentication/authorization, data storage, data management, metadata management, provenance, computation and workflows.  It leverages the newly released Agave deployer, which automates installation and setup of the Agave Platform dependencies using a combination of DevOps automation and Docker containers.  Additionally, UH plans to put a basic, extendable web frontend on top of the UH Agave platform tenant using Agave ToGo, an AngularJS client-side web application that provides GUI access to the basic Agave API functions.  Eventually, UH will have its own gateway, allowing UH researchers to transfer data from any system they have access to and run computation on any computational system they have accounts and allocations on, such as XSEDE and OSG, all through a single web interface.  In addition to executing their own software, researchers can use a number of applications shared by other users or installed centrally, which further reduces the effort needed to get running on ACI computational resources.

In April 2016 UH and TACC conducted a successful workshop training UH researchers to use the Agave platform through the command line tools and Jupyter notebooks.  Participants launched computation on the UH ITS HPC cluster, with the Agave platform automating the file transfers and the generation of the scheduler batch scripts.  TACC is also going to provide a workshop at XSEDE 2016 on using the Agave platform without the command line, using their publicly accessible Agave API tenant.  If you are going to be at XSEDE 2016 for the workshops, I highly recommend checking it out.  TACC has also been working hard to provide additional documentation and beginner’s guides to using the Agave APIs (command line only at this time) at http://agaveapi.co/documentation/.

We at UH ITS Cyberinfrastructure look forward to providing additional updates and information as this project progresses.