IRNC: AMI: Collaborative Research: Software-Defined and Privacy-Preserving Network Measurement Instrument and Services for Understanding Data-Driven Science Discovery

Grants

ABSTRACT

Data-intensive science discovery at a global scale has imposed new requirements on the speed and management of international research and education networks. At the connection points of these international networks, it is critical to measure the network data flows to understand network traffic patterns, identify network anomalies and provide insights for network control and planning. However, the ever-increasing network speed, the massive amount of network flows and the changing measurement objectives have made flow-level measurement on very high-speed networks extremely challenging. The Advanced Measurement Instrument and Services (AMIS) project leverages many-core, programmable network processors to prototype and deploy an advanced measurement instrument to enable services for accurate network monitoring and in-depth traffic analysis. The instrument supports flow-granularity measurement at line rates up to 100 Gbps and software application programming interfaces to examine selected flows, with no impact on the performance of user traffic. With scalable hardware and an open source software stack, the measurement services equip network operators with effective tools to quantify flow-level network performance and study network flows through privacy-preserving computational analytics. This project is built on a consortium of academia, industrial partners, network operators and international alliances, who bring unique expertise and resources to achieve the objectives of high-performance, programmable, flow-granularity network measurement. The outcomes from this project will significantly benefit data-driven science discovery, such as astronomy and space weather studies, and will promote broadened participation of underrepresented groups (such as Hispanic and female students) through the involvement of multiple universities, including an EPSCoR university and a Hispanic-Serving Institution.

NSF Link

CC*DNI Networking Infrastructure: A Software Defined Networking-Enabled Research Infrastructure

Grants

ABSTRACT

For a growing number of data-intensive research projects spanning a wide range of disciplines, high-speed network access to computation and storage — located either locally on campus or in the cloud (e.g., at national labs) — has become critical to the research. While old, slow campus network infrastructure is a key contributor to poor performance, an equally important contributor is the problem of bottlenecks that arise at security and network management policy enforcement points in the network.

This project aims to dramatically improve network performance for a wide range of researchers across the campus by removing many of the bottlenecks inherent in the traditional network infrastructure. By replacing the existing network with modern software defined network (SDN) infrastructure, researchers will benefit from both the increased speed of the underlying network hardware and also from the ability to utilize SDN paths that avoid traditional policy enforcement bottlenecks. As a result, researchers across the campus will see significantly faster data transfers, enabling them to more effectively carry out their research.

This project builds on and extends a successful initial SDN deployment at the University of Kentucky, adding SDN switches to ten research buildings and connecting each of them to the existing SDN research network core with 40G uplinks. To ensure high-speed access to the Internet/Cloud, the research core network is being linked through a new 100 Gbps link to Internet2. Research traffic in the enabled buildings is automatically routed onto the SDN research network, where SDN flow rules allow research traffic to bypass legacy network infrastructure designed to police normal traffic. By extending the SDN network capabilities to several new buildings on campus, a wide range of researchers are able to achieve significantly higher network throughput for their research data.

NSF Link

CICI: Secure and Resilient Architecture: NetSecOps — Policy-Driven, Knowledge-Centric, Holistic Network Security Operations Architecture

Grants

ABSTRACT

Network infrastructure at university campuses is complex and sophisticated, often supporting a mix of enterprise, academic, student, research, and healthcare data, each having its own distinct security, privacy, and priority policies. Securing this complex and highly dynamic environment is extremely challenging, particularly since campus infrastructures are increasingly under attack from malicious actors on the Internet and (often unknowingly) internal campus devices. Different parts of the campus have very different policies and regulations that govern their treatment of sensitive data (e.g., private student/employee information, health care data, financial transactions, etc.). Furthermore, data-intensive scientific research traffic often requires exceptions to normal security policies, resulting in ad-hoc solutions that bypass standard operational procedures and leave both the scientific workflow and the campus as a whole vulnerable to attack. In short, state-of-the-art campus security operations still heavily rely on human domain experts to interpret high-level policy documents, implement those policies through low-level mechanisms, create exceptions to accommodate scientific workflows, interpret reports and alerts, and react to security events in near real time on a 24-by-7 basis.

This project addresses these challenges through a collaborative research effort, called NetSecOps (Network Security Operations), that assists information technology (IT) security teams by automating many of the operational tasks that are tedious, error-prone, and otherwise problematic in current campus networks. NetSecOps is policy-driven in that the framework encodes high-level human-readable policies into systematic policy specifications that drive the actual configuration and operation of the infrastructure. NetSecOps is knowledge-centric in that the framework captures data, information, and knowledge about the infrastructure in a central knowledge store that informs and guides IT operational tasks. The proposed NetSecOps architecture has the following unique capabilities: (1) the ability to capture campus network security policies systematically; (2) the ability to create new fine-grained network control abstractions that leverage existing security capabilities and emerging software defined networks (SDN) to implement security policies, including policies related to both scientific workflows and IT domains; (3) the ability to implement policy traceability tools that verify whether these network abstractions maintain the integrity of the high-level policies; (4) the ability to implement knowledge-discovery tools that enable reasoning across data from existing security point-solutions, including security monitoring tools and authentication and authorization frameworks; and (5) the ability to automatically adjust the network’s security posture based on detected security events. Research results and tools from the project will be released into the public domain allowing academic institutions to utilize the resources as part of their best-practice IT security operations.

NSF Link

MRI: Acquisition of the Kentucky Research Informatics Cloud (KyRIC)

Grants

ABSTRACT

This project will create a big data cloud infrastructure, the Kentucky Research Informatics Cloud (KyRIC), to accelerate data-driven discovery and computational research education across multiple disciplines. Scientific discovery today is being enabled through computational and data intensive research that exploits enormous amounts of available data. KyRIC will advance a number of exciting research programs across many disciplines, such as Bioinformatics and Systems Biology Algorithms, Large Graph and Evolutionary Network Analysis, Image Processing, and Computational Modeling and Simulation. Breakthroughs in KyRIC-enabled research will have important societal benefits in a number of areas, such as increasing agricultural yields, improving economic competitiveness, and creating new products and markets.

KyRIC will use a hybrid architecture to support massively parallel applications that will address exciting and challenging new data- and memory-intensive research in big data science. The KyRIC hybrid system will consist of two subsystems: a 50-node cluster, each node with four 10-core processors, 3 TB of RAM, and an 8 TB SSD array; and a petascale storage system providing 2 PB of object-based storage. KyRIC will employ leading-edge cloud management software that will allow nodes to be reconfigured, scheduled, and loaded with problem-specific application software based on the current mix of big data jobs being executed by users. As a result, the project will enable and support a wide range of new research activities, each with its own unique characteristics that are beyond the capacity of our existing infrastructure. KyRIC will be readily accessible to researchers across the state via our latest high-performance network, with multiple 100 Gbps links from Lexington to Louisville and Cincinnati. KyRIC will also join XSEDE to better integrate the University of Kentucky (UK) with national multi-petascale capabilities.

KyRIC will provide intuitive access, rapid infrastructure customization, and higher bandwidth and lower latency between the desktop and resources like XSEDE to facilitate improved algorithm design, software development, and interactive data analysis. KyRIC will be used by over 1000 UK researchers (faculty, staff, and students) and by computational research collaborators across the state of Kentucky, notably the University of Louisville (UL), Northern Kentucky University (NKU), and Kentucky State University (KSU). The resource will make exciting data-intensive projects possible, enhance computational research education for graduate and undergraduate students, help attract and retain talented younger faculty, and promote big data science and technology, thus impacting Kentucky's and the nation's economic development.

NSF Link

CC-NIE Integration: Advancing Science Through Next Generation SDN Networks

Grants

This University of Kentucky CC-NIE Integration project is focused on the ever-growing demands for improved cyber infrastructure to support data-intensive scientific research. This project, a partnership led by UK Information Technology using technology and services from UK Computer Science, the Laboratory for Advanced Networking, and the UK Center for Computational Sciences, provides software-defined network (SDN) infrastructure and control to UK researchers and affiliates. The project not only upgrades network components, it also provides a programmable network infrastructure tailored to the needs of researchers. Separation of UK research traffic from administrative and academic traffic enables research traffic to avoid the institutional policy constraints currently placed on all traffic.

The resulting SDN network will have a lasting impact on research projects spanning a wide range of areas including astrophysics, biomedical research, computer vision, visualization, and networking research. The most obvious improvement will be enhanced capacity for data-intensive research applications, with transmission speeds of up to 10 Gbps at the research access layer, 10 Gbps from the distribution layer to the UK research core, and 10+ Gbps from the research core to Internet2 and the regional networks (KyRON and KPEN). In addition, system administrators will be able to apply fine-grained control and prioritization of traffic across the campus backbone. It will also enable end-to-end, user-defined provisioning of network access and capacity so that each research project can obtain precisely the performance it requires of the network. Finally, integration with the GENI network will enable researchers to access additional resources across the GENI network.

Outcomes:

-100G* Internet2 via UK regional research network
-100G* Remote data center for research computing expansion
-Bypass existing firewalls, distributions, and edge devices
-Deployment of GENI Racks
-Deployment of OpenStack cluster for education
-Deployment of 40G Research network SDN switching core
-Deployment of 40/100G Research network SDN routing core
-Deployment of 40G HPC Data Transfer Node (DTN)
-Deployment of 40G Network Address Translation (NAT) server
-Deployment of 40G Virtual Research Cluster (VRC)
-Consolidated replacement of 100Mb and 1Gb access layer with 1Gb access layer in 2.5 buildings (~1400 1Gb ports)

GENI: Network Monitoring

Grants

ABSTRACT (SUBCONTRACT FOR MONITORING)

The major objective of GENI (Global Environment for Network Innovation), a virtual laboratory for exploring future internets at scale, is to create major new opportunities to understand, innovate and transform global networks and their interactions with society. Dynamic and adaptive, GENI opens up new areas of research at the frontiers of network science and engineering, and increases the opportunity for significant socio-economic impact.

GENI will employ a flexible and adaptable framework that incorporates spiral development, i.e., iterative prototyping, and federation, i.e., connecting heterogeneous networks, substrates and technologies. It is expected that each turn of the spiral will take advantage of what currently exists, what has been learned from the previous spiral, what can fruitfully be federated and what might be achieved through new development and prototyping activities. Spiral 1 will allow academic-industry partners to create end-to-end GENI prototypes with a strong emphasis on the design and implementation of multiple GENI control frameworks. The ultimate goal is to design end-to-end prototypes of a virtual laboratory with a suite of infrastructure that will support future experiments and research challenges articulated by the Network Science and Engineering (NetSE) research community.

Intellectual Merit: This award gives funding to BBN Technologies to provide management and oversight of all GENI-related activities. GENI will 1) support at-scale experimentation on shared, heterogeneous, highly-instrumented infrastructure, 2) enable deep programmability throughout the network, promoting innovations in network science, technologies, services and applications, and 3) provide collaborative and exploratory environments for academia, industry, and the public to catalyze groundbreaking discoveries and innovation.

Broader Impact: Encouraging community engagement is critical to success. A GENI outreach plan will be developed and implemented. Each year, three GENI Engineering Conferences will be held. Open application for travel grants will ensure that researchers and students at underserved academic institutions and regions of the country will be able to participate. Potential partners in industry and international funding agencies will also be encouraged to attend. Students and young faculty will be hired as interns at BBN. As needed, interdisciplinary workshops will be convened that bring together researchers that don't normally communicate, but from which GENI and the research community can benefit.

There is no pre-ordained outcome for these activities; the resultant GENI infrastructure suite could be an enhanced Internet, enhanced testbeds, federations of enhanced testbeds, something brand new (from small to large), a federation of all of the above, and/or federation with related international efforts. The goal is to promote innovation, entrepreneurship, and economic growth.

NSF – Subcontract 1939C C.O. Raytheon BBN Technologies

GENI Monitoring Alerts

The GENI monitoring alerts system detects events in metric data polled from remote systems. Raw data is published to a queueing system, which allows multiple complex event queries to operate on the same data stream in parallel. The output of a complex query can generate Nagios alerts, log results to a database, or both.

Poll to raw metric stream

As part of the polling process, raw data is both recorded in a database and pushed to a queue. The queue serves as a fanout interface for a one-to-many raw metric subscription service.

*

In the figure above, P represents our polling agent, which publishes data to an exchange represented by X. Clients, designated C1 and C2, subscribe by binding their own queues to the exchange. In the example, data published by P is replicated by X to the client queues amq.gen-RQ6.. (client C1) and amq.gen-As8… (client C2).
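This fanout pattern can be sketched with the standard RabbitMQ Java client, following the referenced RabbitMQ tutorial. The sketch below is illustrative only: the exchange name "metrics", the broker host, and the sample payload are assumptions, not the actual GENI deployment values.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class MetricFanoutSketch
{
    public static void main(String[] args) throws Exception
    {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumption: broker location
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel())
        {
            // P: declare a fanout exchange and publish one raw metric sample
            channel.exchangeDeclare("metrics", "fanout");
            String sample = "source=rack1 urn=node7 metric=ping_rtt_ms ts=1457000000 value=42.0";
            channel.basicPublish("metrics", "", null, sample.getBytes("UTF-8"));

            // C1/C2: each consumer binds its own anonymous queue to the exchange,
            // so every subscriber receives a full copy of the raw metric stream
            String queueName = channel.queueDeclare().getQueue();
            channel.queueBind(queueName, "metrics", "");
            DeliverCallback onDeliver = (consumerTag, delivery) ->
                System.out.println("received: " + new String(delivery.getBody(), "UTF-8"));
            channel.basicConsume(queueName, true, onDeliver, consumerTag -> { });
        }
    }
}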

Stream query of metric stream

The publish/subscribe queuing system allows streams of raw metric data to be replicated to many processes in parallel. This allows us to instantiate one or more complex event processing engines (CEPEs) per replicated data stream, and one or more queries inside each CEPE. We make use of the Esper CEPE ( http://www.espertech.com/ ).

Esper complex event processing engine

Esper allows us to analyze large volumes of incoming messages or events, regardless of whether the incoming messages are historical or real-time in nature. Esper filters and analyzes events in various ways and responds to conditions of interest. An example of the Esper CEPE architecture is shown in the figure below.

**

Simply put, CEPE queries are pattern-matching subscriptions that describe a possible future event. If the described event occurs, the described output is emitted from the CEPE.

Esper query format

In a typical database we query existing data using a declarative language. We can think of an Esper query as an upside-down SQL query: if matching events occur in the future, results will be emitted. Complex events are described using the Esper query language, EPL, which is similar to SQL. The EPL language reference and examples can be found here: [ http://esper.sourceforge.net/esper-0.7.5/doc/reference/en/html/EQL.html]

Consider the following EPL query:

select count(*) from MyEvent(somefield = 10).win:time(3 min) having count(*) >= 5

  • There exists a stream of events named MyEvent.
  • In the MyEvent stream there are events that contain a field named somefield.
  • In a 3-minute window, if somefield = 10 occurs five or more times, emit data.

Just as traditional relational databases, and their related SQL queries, use specific data-type operations based on column data types, data streams processed by Esper are defined by strongly typed object classes. In the previous EPL query, the somefield field would have to be defined as a numeric type in order for the mathematical comparison to work.
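For illustration only (MyEvent is the generic example stream above, not part of the GENI code), a minimal event class that would make the query type-check could look like the sketch below; Esper resolves the somefield property through its JavaBean-style getter.

public static class MyEvent
{
    private final int somefield;

    public MyEvent(int somefield) { this.somefield = somefield; }

    // Esper reads the "somefield" property via this getter; because it is numeric,
    // the filter MyEvent(somefield = 10) and the count comparison are valid
    public int getSomefield() { return somefield; }
}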

GENI monitoring stream data format

For GENI Monitoring alerts, we use the LogTick class shown in the code block below:

public static class LogTick
{
    String source;
    String urn;
    String metric;
    long ts;
    double value;

    public LogTick(String source, String urn, String metric, long ts, double value)
    {
        this.source = source;
        this.urn = urn;
        this.metric = metric;
        this.ts = ts;
        this.value = value;
    }

    public String getSource() {return source;}
    public String getUrn() {return urn;}
    public String getMetric() {return metric;}
    public long getTs() {return ts;}
    public double getValue() {return value;}

    @Override
    public String toString()
    {
        return "source: " + source + " urn:" + urn + " metric:" + metric + " timestamp:" + ts + " value:" + value;
    }
}
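To show how LogTick events and an EPL statement fit together, here is a minimal, self-contained sketch written against the classic Esper 5.x API (EPServiceProvider and friends); the production GENI code may differ, and the query and event values below are illustrative assumptions. It assumes the LogTick class above is on the classpath.

import com.espertech.esper.client.Configuration;
import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;
import com.espertech.esper.client.EPStatement;
import com.espertech.esper.client.UpdateListener;

public class LogTickEsperSketch
{
    public static void main(String[] args)
    {
        // Register the LogTick event type so EPL statements can reference its fields
        Configuration config = new Configuration();
        config.addEventType("LogTick", LogTick.class);
        EPServiceProvider engine = EPServiceProviderManager.getDefaultProvider(config);

        // Illustrative report-style query: ping round-trip times above 10 seconds
        EPStatement stmt = engine.getEPAdministrator().createEPL(
            "select * from LogTick(metric='ping_rtt_ms') where value > 10000.0");
        UpdateListener listener = (newEvents, oldEvents) ->
            System.out.println("matched: " + newEvents[0].getUnderlying());
        stmt.addListener(listener);

        // Feed one event into the engine; the listener fires because value > 10000.0
        engine.getEPRuntime().sendEvent(
            new LogTick("rack1", "urn:example-node", "ping_rtt_ms", System.currentTimeMillis(), 15000.0));
    }
}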

Example GENI monitoring stream queries

Note how the following data types are used in the example queries.

        String source;
        String urn;
        String metric;
        long ts;
        double value;

There are two types of queries:

Alert Queries are used to send alerts to remote Nagios hosts ( https://www.nagios.org/documentation ). These queries require five explicitly defined values to be emitted by the query: "nagiosserver", "hostname", "servicename", "alertlevel", and "alertmessage". The function used to generate the payload sent to your Nagios server is shown below:

public void alert(String hostName, String serviceName, String alertLevel, String alertMessage)
{
    // Build the Nagios passive-check payload from the values emitted by the alert query
    MessagePayload payload = new MessagePayloadBuilder()
        .withHostname(hostName)
        .withLevel(Level.valueOf(alertLevel))
        //.withServiceName("Service Name")
        .withServiceName(serviceName)
        .withMessage(alertMessage)
        .create();
    // ... the payload is then sent to the configured Nagios server (sending code omitted here)
}
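The builder calls above match the open-source jsendnsca library; assuming that library (2.x package layout), a sketch of how the resulting payload might be delivered to Nagios over NSCA is shown below. The host name, port, and field values are placeholders, not GENI configuration.

import com.googlecode.jsendnsca.Level;
import com.googlecode.jsendnsca.MessagePayload;
import com.googlecode.jsendnsca.NagiosPassiveCheckSender;
import com.googlecode.jsendnsca.NagiosSettings;
import com.googlecode.jsendnsca.builders.MessagePayloadBuilder;
import com.googlecode.jsendnsca.builders.NagiosSettingsBuilder;

public class NagiosSendSketch
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder NSCA settings; in the alert system the host comes from the query's "nagiosserver" value
        NagiosSettings settings = new NagiosSettingsBuilder()
            .withNscaHost("nagiosserver.somedomain.com")
            .withPort(5667)
            .create();

        // Placeholder payload fields corresponding to "hostname", "alertlevel", "servicename", and "alertmessage"
        MessagePayload payload = new MessagePayloadBuilder()
            .withHostname("urn:example-node")
            .withLevel(Level.CRITICAL)
            .withServiceName("gpo:is_available")
            .withMessage("Alert comes from rack source:rack1")
            .create();

        // Deliver the passive check result to the Nagios server
        new NagiosPassiveCheckSender(settings).send(payload);
    }
}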

The following queries are examples of Alert Queries:

  • If metric gpo:is_available is set to 1, emit OK:
    select 'nagiosserver.somedomain.com' AS nagiosserver, urn AS hostname, metric AS servicename, 'OK' AS alertlevel, 'Alert comes from rack ' || ' source:' || source AS alertmessage from LogTick(metric='gpo:is_available') where value = 1
  • If metric gpo:is_available is set to 0, emit CRITICAL:
    select 'nagiosserver.somedomain.com' AS nagiosserver, urn AS hostname, metric AS servicename, 'CRITICAL' AS alertlevel, 'Alert comes from rack ' || ' source:' || source AS alertmessage from LogTick(metric='gpo:is_available') where value = 0
  • If a urn with the metric is_responsive is observed once but not observed again for 60 minutes, emit WARNING:
    select 'nagiosserver.somedomain.com' AS nagiosserver, a.urn AS hostname, a.metric AS servicename, 'WARNING' AS alertlevel, 'Alert comes from monitoring system ' || ' source:' || a.source AS alertmessage from pattern [ every a=LogTick(metric='is_responsive') -> (timer:interval(60 min)) and not LogTick(urn=a.urn) ] group by a

In addition to Alert Queries there are Report Queries. Report Queries do not provide external alerting and do not require a specific output format. The output of a Report Query is stored in a database, which is accessible from the Monitoring site.

The following queries are examples of Report Queries:

  • Ping times greater than 10,000ms:
    select * from LogTick(metric='ping_rtt_ms') where value > 10000.0
  • If a urn is seen and then not seen again for 60min:
    select count(*) from pattern [ every a=LogTick -> (timer:interval(60 min)) and not LogTick(urn=a.urn) ] group by a

Creating stream queries

  1. Log in to the GENI Monitoring site: [ http://genimon.uky.edu]
  2. Click on the Alerting System under the GENI Reporting tab, as shown in the figure below.

  3. On the Alert page, click on Build New Alert at the top right of the screen, shown in the figure below.

  4. You are now on the stream query builder page, shown in the figure below.
  5. On the stream query builder page, click on Query Node under Add Alert Node, shown in the figure below.
  6. In the query node, fill in the Query Name and Query String fields. The query name should describe your query, and the query string should be a valid EPL query that uses the LogTick class (see the worked example after this list).
  7. Click on the left edge of your query node and connect your query node to the source node. The source node is the source of LogTick events, based on raw polling metrics. An example query is shown in the figure below.
  8. You must now provide a destination for the query output. On the stream query builder page, click on Destination Node under Add Alert Node, shown in the figure below.
  9. Using the dropdown box on your destination node, select your query destination, then connect your destination node to your query node, much as you connected your query node to your source node.
  10. Once a source, query, and destination have been configured, as shown in the figure below, click on Submit Alert on the Alert Building Tools toolbar.
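As a concrete illustration of steps 6 through 10, a hypothetical configuration might look like the following; the query name is invented for this example, and the query string reuses the ping-time Report Query shown earlier.

  • Query Name: slow-ping-alert (a name made up for this example)
  • Query String: select * from LogTick(metric='ping_rtt_ms') where value > 10000.0
  • Destination: the database destination, making this a Report Query; choosing a Nagios destination instead would require the query to emit the five Alert Query fields described above.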

References

*Image from RabbitMQ tutorial  https://www.rabbitmq.com/tutorials/tutorial-three-python.html

**Image from Esper  http://www.espertech.com/