\title{Elastic IDS Overall Architecture Explained}
\author{Praveen Kumar Shanmugam\\
    School of Computing\\
    University of Utah}

This describes the overall architecture of the project and the challenges
involved \ldots

The main goal of the system is to make IDS management simple and automated
in a core network. Our approach uses the capabilities of Software Defined
Networking (SDN) to manage the network from a single point, leverages the
elasticity of cloud computing resources, and builds a generic framework for
IDS management.

The campus network hosts many services in its core network. We assume
that the whole network is SDN capable, and that it includes an SDN-capable
cloud which provides a generic cloud service.

\section{Problem Statement}
An IDS installation requires manual setup of tap points in the network and
of the IDS server/sensor nodes, which is a long and painstaking process.
The cloud infrastructure enables elasticity through on-demand instance
creation and deletion, and SDN provides a full network view and control;
with the right kind of framework, together they make it possible to do away
with the manual part.

\section{Proposed Solution}
The SDN Controller in the campus network makes it possible to tap traffic at
any point in the network and deliver it to an IDS instance in the cloud. The
SDN Controller can talk to the Cloud Controller interface to instantiate a
new IDS instance when a large volume of traffic has to be monitored or when
a new tap point is specified by the network administrator.

\section{Campus Network}
The Campus Gateway (CG) serves as the entry point of the core network,
through which all the inbound and outbound traffic passes. The Campus Cloud
Gateway (CClG) forms the boundary between the campus core network and the
Campus Cloud Network. The Cloud Compute Gateway (ClCG) acts as the entry
point for the cloud traffic.\\

The cloud consists of a Cloud Controller, which manages the instances in it,
and it also has an SDN controller which spans the network devices within the
cloud.\\

The campus core network also has an SDN controller, which spans the core
network devices.\\

\section{CNAC Interface}
We introduce a Cloud and Network Access Controller (CNAC) module which forms
the brain of the framework. CNAC interfaces with the Cloud Controller and
the Cloud SDN Controller to control the IDS instances and the network path
to each instance. Without access to the cloud network for the IDS instance,
it is very difficult to deliver the traffic of interest to it the way an
ordinary packet reaches its destination, without manual setup.
The CNAC interface also controls the Campus SDN Controller to enable tap
points and install/delete flows based upon CNAC commands, in order to help
monitor the traffic or mitigate an attack.
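The orchestration described above can be sketched as follows. This is a
minimal sketch, not the actual implementation: the client objects and their
method names (\texttt{spawn\_instance}, \texttt{setup\_path},
\texttt{install\_tap}, and so on) are hypothetical stand-ins for the real
Cloud Controller, Cloud SDN Controller, and Campus SDN Controller interfaces.

```python
# Hypothetical sketch of the CNAC control flow; all client method
# names are placeholders, not an existing API.

class CNAC:
    def __init__(self, cloud_ctl, cloud_sdn, campus_sdn):
        self.cloud_ctl = cloud_ctl      # Cloud Controller interface
        self.cloud_sdn = cloud_sdn      # SDN controller inside the cloud
        self.campus_sdn = campus_sdn    # SDN controller in the campus core
        self.taps = {}                  # tap point -> IDS instance id

    def add_tap(self, tap_point):
        """Bring up an IDS instance and steer tapped traffic to it."""
        # 1. Ask the Cloud Controller for a new IDS instance.
        instance_id = self.cloud_ctl.spawn_instance(image="ids-sensor")
        # 2. Ask the cloud SDN controller to open a network path so the
        #    tapped traffic can actually reach that instance.
        self.cloud_sdn.setup_path(instance_id)
        # 3. Tell the campus SDN controller to install the tap flows.
        self.campus_sdn.install_tap(tap_point, instance_id)
        self.taps[tap_point] = instance_id
        return instance_id

    def remove_tap(self, tap_point):
        """Tear down a tap and release its IDS instance."""
        instance_id = self.taps.pop(tap_point)
        self.campus_sdn.remove_tap(tap_point)
        self.cloud_sdn.teardown_path(instance_id)
        self.cloud_ctl.destroy_instance(instance_id)
```

The point of the sketch is the ordering: the cloud-side path must exist
before the campus tap flows start mirroring traffic into it.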

\section{Architecture Diagram 1.0}

One primary instance monitors the traffic constantly, and there must be
standby instances ready to monitor, rather than instantiating new ones on
demand. When these standbys are used, a new instance is spawned to replenish
them. This is just to make sure that we react to the situation as soon as
possible.
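The warm-standby policy above can be sketched as a small pool that hands out
a ready instance immediately and replenishes itself afterwards. The
\texttt{spawn} callable is a hypothetical hook into the cloud controller,
and the pool size is a placeholder.

```python
# Sketch of the warm-standby pool: take() returns a ready instance with
# no boot delay, then spawns a replacement in the background step.
from collections import deque

class StandbyPool:
    def __init__(self, spawn, size=2):
        self.spawn = spawn              # returns a ready IDS instance id
        self.size = size                # target number of warm standbys
        self.ready = deque(spawn() for _ in range(size))

    def take(self):
        """Hand out a ready standby immediately, then replenish."""
        instance = self.ready.popleft()  # react right away: no boot wait
        self.replenish()
        return instance

    def replenish(self):
        # Spawn new instances until the pool is back at target size.
        while len(self.ready) < self.size:
            self.ready.append(self.spawn())
```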

We are looking into the Bro IDS, as it makes it easy to add rules that alert
when something deviates from normal. The catch is that these rules are
handcrafted.

Another thing to consider is whether the traffic is tapped before or after
the firewall. The traffic after the firewall is generally considered to be
safe, but that may not be the case in a few situations.

The closed loop has two parts: instance creation, and network changes in
reaction to an alert. As of now there is no defined way to attach specific
actions to particular alerts, and defining one is well beyond the scope of
this project; it would require extensive machine learning, extensive parsing
rules, and, importantly, dealing with actions triggered by false positives.
We will concentrate on cloud instance creation for the taps rather than
taking network-level actions.

Another thing to look out for is that Security Onion is not meant to scale.
We have to test the server/sensor combination manually to come up with its
maximum capacity.

A few cases that call for new instance creation are heavy load on one IDS,
or a drastic alert from a single source such as a flood of TCP connections.
We should come up with several such cases; this will be good enough to start
off with.
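The trigger cases above can be sketched as a simple check. The thresholds
here are placeholder values, not the measured sensor capacities that the
manual server/sensor testing is meant to produce.

```python
# Sketch of the two scale-out triggers: sustained heavy load on one IDS,
# and a flood of TCP connections from a single source. Limits are
# illustrative placeholders.
LOAD_LIMIT_MBPS = 800      # assumed per-sensor monitoring capacity
CONN_FLOOD_LIMIT = 1000    # assumed per-source TCP connection limit

def needs_new_instance(ids_load_mbps, conns_per_source):
    """Return the reason a new IDS instance should be created, if any."""
    if ids_load_mbps > LOAD_LIMIT_MBPS:
        return "heavy-load"
    for src, conns in conns_per_source.items():
        if conns > CONN_FLOOD_LIMIT:
            return "tcp-flood:" + src
    return None
```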

Another thought is to have SDN in the cloud as a pluggable entity: have a
tapping gateway and simply deliver the traffic to the instance, to make it
more practical. [TO-DO: make this clearer.]

Provision dedicated fiber connections for tunnelling. We can leverage this
to isolate the inbound traffic from the core network traffic wherever
possible. Terminology for the traffic: inbound, outbound, hybrid.

\subsection{Networking Part}
FlowVisor cannot be used, as it does not allow flows to be created when a
conflict arises. Creating such flows is exactly what we want to accomplish,
but without changing the behavior of the existing network.\\

The main controller which controls the core network will not be touched. The
RYU controller used by our CNAC interface will do the job of pushing and
pulling flows. The communication as of now is done via REST APIs carrying
JSON. [As of now I am testing the capabilities using the REST APIs.]\\
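For illustration, a minimal sketch of how such flow push/pull could be done
through RYU's REST interface (the \texttt{ryu.app.ofctl\_rest} application).
The datapath id, the port numbers, and the NORMAL-plus-tap action pair are
assumptions to be validated against the \texttt{ofctl\_rest} documentation
and the switch's capabilities, not a tested configuration.

```python
# Sketch of building ofctl_rest requests: POST /stats/flowentry/add to
# push a flow, GET /stats/flow/<dpid> to pull installed flows. Values
# are illustrative.
import json
from urllib import request

RYU = "http://127.0.0.1:8080"   # assumed controller REST endpoint

def add_tap_flow(dpid, in_port, tap_port, priority=100):
    """Build the flow-mod that copies traffic from in_port to tap_port."""
    body = {
        "dpid": dpid,
        "priority": priority,
        "match": {"in_port": in_port},
        # NORMAL keeps the existing forwarding (if the switch supports
        # it); the extra OUTPUT is the tap toward the IDS.
        "actions": [{"type": "OUTPUT", "port": "NORMAL"},
                    {"type": "OUTPUT", "port": tap_port}],
    }
    return RYU + "/stats/flowentry/add", json.dumps(body)

def pull_flows(dpid):
    """URL for reading the flows currently installed on a switch."""
    return RYU + "/stats/flow/%d" % dpid

# Sending would look like (not run here, needs a live controller):
# url, body = add_tap_flow(dpid=1, in_port=2, tap_port=5)
# request.urlopen(request.Request(url, data=body.encode(),
#     headers={"Content-Type": "application/json"}))
```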

The system knows all the ports in each switch that are dedicated to the IDS
traffic route. Given a tap point, the CNAC forms a graph of the switches in
which flows are to be installed, compares it with the existing flows in each
of those switches, and tries to come up with equivalent rules that make the
tapping happen while preserving the existing behavior of the core network's
service traffic.\\
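The graph step above can be sketched as a breadth-first search from the tap
point toward the cloud gateway, emitting one rule per switch on the path.
The topology, the port numbering, and the flow format are illustrative, not
the real core network; the flow-equivalence comparison is elided.

```python
# Sketch: BFS over the switch graph, then one tap rule per hop using
# that switch's dedicated IDS port.
from collections import deque

def tap_path(links, tap_switch, gateway):
    """Breadth-first search from the tap switch to the cloud gateway."""
    prev = {tap_switch: None}
    queue = deque([tap_switch])
    while queue:
        sw = queue.popleft()
        if sw == gateway:
            path = []
            while sw is not None:       # walk back to the tap switch
                path.append(sw)
                sw = prev[sw]
            return path[::-1]
        for nbr in links.get(sw, []):
            if nbr not in prev:
                prev[nbr] = sw
                queue.append(nbr)
    return None                          # gateway unreachable

def tap_flows(path, ids_ports):
    """One forwarding rule per switch, using its dedicated IDS port."""
    return [{"switch": sw, "out_port": ids_ports[sw]} for sw in path]
```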

The modified rules and the added rules are to be tracked by the CNAC so that
they can be undone when the tapping has to be removed.

\subsection{Cloud Part}
FlowVisor can be used in the cloud part, where we want the traffic to be
isolated and delivered to a specific VM without affecting the others.

\section{More Motivation}
\begin{itemize}
    \item Tapping issues in the current network: mirroring cost is very high.
    \item If SPAN ports are used, they are the first thing taken down when
        the traffic increases. [+ SDN: we can be selective about what we
        monitor rather than mirroring the whole traffic.]
    \item To have a fully secure network, the cost of monitors, taps, etc.\
        is multiple times the actual network setup cost.
    \item Hospital data paths are not tapped, since they have very high
        bandwidth and would require expensive high-end tapping optics.
    \item Inbound traffic tunnelling increases network usage in the core
        network.
    \item Crossing of the administrative boundary between the cloud and the
        core.
    \item True mirror: is an SDN mirror port a true mirror, as claimed?
    \item Monitor the router usage with and without our architecture.
    \item Numbers always vary between tests with sampled traffic and live
        traffic.
\end{itemize}
\section{Tools Considered}
\begin{itemize}
    \item GENI to be used for the cloud part (inside GENI itself). [Not
        easy; possibly use OpenStack.]
    \item Security Onion (SO)~\cite{securityonion:online}
    \item Chef: to distribute the rules of the IDS installation in Security
        Onion.
    \item CLIPS/Drools: rule-based modules.
    \item RYU SDN Controller~\cite{RYU}
    \item FOAM: aggregate manager for GENI~\cite{FOAM}
    \item TestON: SDN testing suite~\cite{teston}
\end{itemize}

\section{Security Onion}
It is a Linux distribution which forms a generic framework for plugging in
different IDS (Intrusion Detection System) and NSM (Network Security
Monitoring) tools. It also supports ELSA (Enterprise Log Search and
Archive), a query tool for looking at all of the supported IDS and NSM logs
through a single interface.

\section{Plumbing to be done}
\begin{itemize}
    \item Security Onion is a GUI tool: tailor this into a single-line
        installation script that automatically sets it up in sensor mode.
    \item Add Chef support to SO. [Optional at this point in time.]
    \item CNAC to support talking to the Cloud Controller to instantiate
        new IDS instances.
    \item CNAC to support a query interface with the SO server to take
        necessary actions in the campus network.
    \item CNAC to talk to the Cloud SDN Controller to set up the path for
        IDS traffic from the campus core network.
    \item CNAC must have rule-based support for forming the closed-loop
        part of the architecture.
    \item GENI to be used for the networking part.
    \item A RYU controller which supports OpenFlow 1.3 to be used.
\end{itemize}

\begin{itemize}
    \item Two SDN controllers, one in the cloud and the other in the core
        network, need to be configured to tap the traffic without affecting
        the existing traffic.
    \item What kinds of rules should be handled in CLIPS/Drools to take
        action in the network? Examples: load balancing, suspicious TCP
        connections to a single service, etc.
    \item Splitting of traffic is done in two areas: one in the cloud and
        the other in the core network. The inbound network usage must be
        checked when installing new taps.
\end{itemize}

%    \centerline{\includegraphics[width=1.0\textwidth]{ElasticIdsArch.jpg}}
%    \caption{Elastic IDS Architecture Diagram}
%    \label{DM}


\section{Current Promising Approach}
\subsection{Things TO-DO}
\begin{itemize}
    \item Tapping mechanism using the inbound approach.
    \item How to tap without disrupting existing network operations.
    \item Add SDN as a visible cloud resource.
    \item Cloud resource for IDS instantiation.
    \item Inter-domain SDN between the cloud and the core network.
    \item Extend the model to multi-site/multi-domain.
\end{itemize}

\section{Architecture Diagram 2.0}
This is followed by OpenStack, which uses SDN in its underlying network.


\section{FlowVisor}
FlowVisor enforces isolation between slices, i.e., one slice cannot
control another user's traffic. FlowVisor creates rich ``slices'' of network
resources and delegates control of each slice to a different controller.

\section{FOAM: OpenFlow aggregate manager}
FOAM uses GENI v3 RSpecs with the OpenFlow v3 extensions, which GENI uses to
allow experimenters to allocate OpenFlow resources. It uses FlowVisor to
keep the flowspaces of different slivers isolated.

\section{Foam and FlowVisor architecture}

\section{Literature Survey}
\subsection{Playing Nice (TO-DO:2)}
  The idea is to use FlowVisor when pushing rules so that the existing
  network does not get affected by our rules. Though our rules are
  important, any violation of the existing network is not desired; hence a
  rule denied by our CNAC controller is a legitimate outcome, and such
  failures have to be handled.
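  That failure handling can be sketched as follows; \texttt{push\_rule} and
  \texttt{RuleDenied} are hypothetical names, not part of FlowVisor or RYU.

```python
# Sketch: a pushed rule may legitimately be denied to protect the
# existing network; the denial is absorbed and recorded, not treated
# as a crash.
class RuleDenied(Exception):
    pass

def push_with_fallback(push_rule, rule, log):
    try:
        push_rule(rule)
        return True
    except RuleDenied as e:
        # A denial is a legitimate outcome: record it and move on.
        log.append("denied: %s (%s)" % (rule, e))
        return False
```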

  \subsection{Add SDN as a visible cloud resource (TO-DO:3)}

  \subsubsection{OpenStack Neutron}
  Neutron creates virtual networks between tenants in the cloud. This is
  achieved using SDN in two places:
  \begin{itemize}
      \item for top-of-rack (TOR) control, connecting the physical machines
          over the internal network;
      \item on each physical machine, which runs Open vSwitch to connect
          the different VMs.
  \end{itemize}

  The current APIs provided by OpenStack use SDN to achieve the network,
  subnet and port abstractions. They also provide firewall rules to handpick
  the traffic types delivered to the VMs.\\

  This allows tenants to create private networks inside the cloud, achieved
  with OpenFlow rules using VLANs and GRE tunneling.\\
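  For illustration, the request bodies that the Neutron v2.0 API expects for
  these network/subnet/port abstractions. The names and CIDR are made up,
  and authentication and the endpoint URL are omitted.

```python
# Sketch of Neutron v2.0 request bodies, POSTed as JSON to
# /v2.0/networks, /v2.0/subnets and /v2.0/ports respectively.
import json

def network_body(name):
    return {"network": {"name": name, "admin_state_up": True}}

def subnet_body(network_id, cidr):
    return {"subnet": {"network_id": network_id,
                       "ip_version": 4,
                       "cidr": cidr}}

def port_body(network_id):
    return {"port": {"network_id": network_id}}

# e.g. the body for a /24 subnet on an already-created network:
payload = json.dumps(subnet_body("NET-ID", "10.10.0.0/24"))
```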

  Though it uses SDN controllers, we cannot get our hands directly on them.
  They are driven via REST APIs from the OpenStack Quantum server based upon
  the tenants' requests. If we are to make these SDN controllers accept
  rules based on our needs, OpenStack must be tailored to allow custom
  flows. Hopefully such debugging hooks exist; I have to look into it.\\

  \textbf{Example: deliver all traffic matching a signature (5-tuple match)
  to this VM. This must be done in a controlled way, as other VMs' traffic
  might be sucked in. [Think through.]}

   Note: OpenStack has neat documentation for all of its services. :)\\

   Of the open-source cloud options we have, such as Eucalyptus, Emulab and
   OpenNebula, OpenStack is by far the best.

   \subsubsection{Amazon Virtual Private Cloud}
   Amazon VPC also tries to provide the user with the same functionality,
   but the underlying architecture is a closed one. Here again the network
   stays within the cloud, among tenants in the same administrative domain.

   \subsubsection{Related Work}
   \textbf{SDN in Cloud}\\
   These papers talk about using OpenFlow to create a virtual network
   abstraction for the tenants, which is what OpenStack Neutron does.
   \begin{itemize}
       \item An OpenFlow based Network Virtualization Framework for the
       \item Cloud orchestration with SDN/OpenFlow in carrier transport
       \item Cloud computing networking: challenges and opportunities for
   \end{itemize}

   \textbf{Multi-Site Work}
   \begin{itemize}
       \item Cloud Service Delivery Across Multiple Cloud
           Platforms~\cite{6009336} is a work-in-progress infrastructure to
           connect multi-site tenants. This is achieved by OpenFlow switches
           connecting the sites and installing flows based on the
           connectivity requirements of the services.
   \end{itemize}
