Category Archives: Remote Campus

ICASTART, ICAEND “ICA-LIKE!!!”

In 2008 I had a conversation with Jay Tomlin asking him if he would put in an enhancement for ICA Logging on the AGEE. Basically we wanted the ability to see the external IP Addresses of our customers coming through the Access Gateway. As you are likely aware, what you get in the logs is the IP Address bound to the workstation and not the external IP Address that the user is coming through. In the last ten years, it has become increasingly rare for an end user to plug their computer directly into the internet; more often, they sit behind a Netgear, Cisco/Linksys, or Buffalo router. This makes reporting on where the users are coming from somewhat challenging.

Somewhere between 9.2 and 9.3 the requested enhancement was added and it included other very nice metrics as well. The two syslog events I want to talk about are ICASTART and ICAEND.

ICASTART:
The ICASTART event contains some good information in addition to the external IP. Below you see a sample of the ICASTART log.

12/09/2012:14:40:46 GMT ns 0-PPE-0 : SSLVPN ICASTART 540963 0 : Source 192.168.1.98:62362 - Destination 192.168.1.82:2598 - username:domainname mhayes:Xentrifuge - applicationName Desktop - startTime "12/09/2012:14:40:46 GMT" - connectionId 81d1

As you can see, if you are a log monger, this is a VERY nice log!! (Few can appreciate this) With the exception of the credentials everything is very easy to parse and place into those nice SQL Columns I like. If you have Splunk, parsing is even easier and you don’t have to worry about how the columns line up.
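For readers who would rather prototype the parsing outside of KIWI or Splunk, here is a minimal Python sketch of how the ICASTART line above could be split into named fields. The regular expression, the function name `parse_icastart` and the field names are my own illustration, not anything shipped by Citrix, and the sketch assumes the log uses plain ASCII hyphens and quotes exactly as in the sample.

```python
import re

# Sample ICASTART line from the post (plain ASCII hyphens and quotes assumed).
SAMPLE = ('12/09/2012:14:40:46 GMT ns 0-PPE-0 : SSLVPN ICASTART 540963 0 : '
          'Source 192.168.1.98:62362 - Destination 192.168.1.82:2598 - '
          'username:domainname mhayes:Xentrifuge - applicationName Desktop - '
          'startTime "12/09/2012:14:40:46 GMT" - connectionId 81d1')

# Hypothetical pattern written against the sample above; other firmware
# versions may lay the message out differently.
ICASTART_RE = re.compile(
    r'ICASTART \d+ \d+ : '
    r'Source (?P<source_ip>[\d.]+):(?P<source_port>\d+) - '
    r'Destination (?P<dest_ip>[\d.]+):(?P<dest_port>\d+) - '
    r'username:domainname (?P<username>[^:]+):(?P<domain>\S+) - '
    r'applicationName (?P<application>.+?) - '
    r'startTime "(?P<start_time>[^"]+)" - '
    r'connectionId (?P<connection_id>\S+)'
)


def parse_icastart(line):
    """Return a dict of named fields, or None if the line is not ICASTART."""
    match = ICASTART_RE.search(line)
    return match.groupdict() if match else None


event = parse_icastart(SAMPLE)
print(event)
```

Splitting the credentials on the colon is the one fragile spot: a user name that itself contains a colon would need a different rule.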

ICAEND:
The ICAEND event actually has quite a bit more information and, were it not for the need to report ICA Sessions in real time, it is the only log you would need. Below is the ICAEND log.

12/09/2012:14:41:12 GMT ns 0-PPE-0 : SSLVPN ICAEND_CONNSTAT 541032 0 : Source 192.168.1.98:62362 - Destination 192.168.1.82:2598 - username:domainname mhayes:Xentrifuge - startTime "12/09/2012:14:40:46 GMT" - endTime "12/09/2012:14:41:12 GMT" - Duration 00:00:26 - Total_bytes_send 9363 - Total_bytes_recv 587588 - Total_compressedbytes_send 0 - Total_compressedbytes_recv 0 - Compression_ratio_send 0.00% - Compression_ratio_recv 0.00% - connectionId 81d16

Again, another gorgeous log that is very easy to parse and turn into useful information.
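Because the ICAEND_CONNSTAT message is just a run of "key value" pairs separated by " - ", it can also be taken apart generically. The Python sketch below is one way to do that under the assumption that the separators are ASCII hyphens with a space on each side; the function name `parse_icaend` and the output keys are hypothetical.

```python
# Sample ICAEND_CONNSTAT line from the post (ASCII hyphens and quotes assumed).
SAMPLE = ('12/09/2012:14:41:12 GMT ns 0-PPE-0 : SSLVPN ICAEND_CONNSTAT 541032 0 : '
          'Source 192.168.1.98:62362 - Destination 192.168.1.82:2598 - '
          'username:domainname mhayes:Xentrifuge - '
          'startTime "12/09/2012:14:40:46 GMT" - endTime "12/09/2012:14:41:12 GMT" - '
          'Duration 00:00:26 - Total_bytes_send 9363 - Total_bytes_recv 587588 - '
          'Total_compressedbytes_send 0 - Total_compressedbytes_recv 0 - '
          'Compression_ratio_send 0.00% - Compression_ratio_recv 0.00% - '
          'connectionId 81d16')


def parse_icaend(line):
    """Split an ICAEND_CONNSTAT line into a dict; return None for other lines."""
    if 'ICAEND_CONNSTAT' not in line:
        return None
    # The payload is everything after the second " : " separator.
    payload = line.split(' : ', 2)[2]
    fields = {}
    for part in payload.split(' - '):
        key, _, value = part.partition(' ')   # first word is the key
        fields[key] = value.strip('"')        # drop quotes around timestamps
    hours, minutes, seconds = (int(x) for x in fields['Duration'].split(':'))
    user, _, domain = fields['username:domainname'].partition(':')
    return {
        'username': user,
        'domain': domain,
        'source_ip': fields['Source'].rsplit(':', 1)[0],   # strip the port
        'duration_seconds': hours * 3600 + minutes * 60 + seconds,
        'bytes_sent': int(fields['Total_bytes_send']),
        'bytes_received': int(fields['Total_bytes_recv']),
        'connection_id': fields['connectionId'],
    }


session = parse_icaend(SAMPLE)
print(session)
```

Storing the duration as a number of seconds rather than as text makes averages and totals much easier to compute later on.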

Logging the Data:
So, this was going to be my inaugural Splunk blog but I didn’t get off my ass and so my eval of Splunk expired and I have to wait 30 days to use it again (file that under “phuck”). So today we will be going over logging the data with the standard KIWI/SQL (basically a poor man’s Splunk) method.

So the way we log the data, if you haven’t been doing this already, is we configure the Netscaler to send logs to the KIWI Syslog server and we use the custom data source within KIWI to configure a SQL Logging rule. We then create the table, parse the data with a parsing script and voila, instant business intelligence.

Creating the custom KIWI Rule:

First, create the rule “ICA-START/END” with a descriptive filter configured as you see below.

Next you will optionally configure a Display action but more importantly you will configure the Script that parses the data.

Paste the following text (Below) into a file named Script_Parse_AGEE-ICA.txt and save it in the scripts directory of your KIWI install.

Function Main()

    Main = "OK"

    Dim MyMsg
    Dim UserName
    Dim Application
    Dim SourceIP
    Dim DestinationIP
    Dim StartTime
    Dim EndTime
    Dim Duration
    Dim SentBytes
    Dim RecBytes
    Dim ConnectionID

    ' Working variables holding the start and end position of each field
    Dim SrcBeg, SrcEnd, DstBeg, DstEnd, UserBeg, UserEnd, AppBeg, AppEnd
    Dim StartBeg, StartEnd, EndBeg, EndEnd, DurBeg, DurEnd
    Dim SentBeg, SentEnd, RecBeg, RecEnd, ConBeg

    With Fields

        UserName = ""
        Application = ""
        SourceIP = ""
        DestinationIP = ""
        StartTime = ""
        EndTime = ""
        Duration = ""
        SentBytes = ""
        RecBytes = ""
        ConnectionID = ""

        MyMsg = .VarCleanMessageText

        ' ICAEND_CONNSTAT: end-of-session record with duration and byte counts
        If ( Instr( MyMsg, "ICAEND_CONNSTAT" ) ) Then
            ' IP addresses run from the keyword to the ":" before the port
            SrcBeg = Instr( MyMsg, "Source") + 6
            SrcEnd = Instr( SrcBeg, MyMsg, ":")
            SourceIP = Trim( Mid( MyMsg, SrcBeg, SrcEnd - SrcBeg) )

            DstBeg = Instr( MyMsg, "Destination") + 11
            DstEnd = Instr( DstBeg, MyMsg, ":")
            DestinationIP = Trim( Mid( MyMsg, DstBeg, DstEnd - DstBeg) )

            ' Credentials (user:domain) run up to the next "-" separator
            UserBeg = Instr( MyMsg, "domainname") + 10
            UserEnd = Instr( UserBeg, MyMsg, "-")
            UserName = Trim( Mid( MyMsg, UserBeg, UserEnd - UserBeg) )

            ' The extra +1 skips the opening quote; stopping at the first
            ' space drops the trailing " GMT"
            StartBeg = Instr( MyMsg, "startTime ") + 11
            StartEnd = Instr( StartBeg, MyMsg, " ")
            StartTime = Mid( MyMsg, StartBeg, StartEnd - StartBeg)

            EndBeg = Instr( MyMsg, "endTime ") + 9
            EndEnd = Instr( EndBeg, MyMsg, " ")
            EndTime = Mid( MyMsg, EndBeg, EndEnd - EndBeg)

            DurBeg = Instr( MyMsg, "Duration ") + 9
            DurEnd = Instr( DurBeg, MyMsg, " ")
            Duration = Mid( MyMsg, DurBeg, DurEnd - DurBeg)

            SentBeg = Instr( MyMsg, "Total_bytes_send ") + 17
            SentEnd = Instr( SentBeg, MyMsg, " ")
            SentBytes = Mid( MyMsg, SentBeg, SentEnd - SentBeg)

            RecBeg = Instr( MyMsg, "Total_bytes_recv ") + 17
            RecEnd = Instr( RecBeg, MyMsg, " ")
            RecBytes = Mid( MyMsg, RecBeg, RecEnd - RecBeg)

            ' connectionId is the last field, so take the rest of the message
            ConBeg = Instr( MyMsg, "connectionId") + 12
            ConnectionID = Trim( Mid( MyMsg, ConBeg) )

            ' ICAEND does not carry the application name
            Application = "NA"

        End If

        ' ICASTART: start-of-session record with the application name
        If ( Instr( MyMsg, "ICASTART" ) ) Then
            SrcBeg = Instr( MyMsg, "Source") + 6
            SrcEnd = Instr( SrcBeg, MyMsg, ":")
            SourceIP = Trim( Mid( MyMsg, SrcBeg, SrcEnd - SrcBeg) )

            DstBeg = Instr( MyMsg, "Destination") + 11
            DstEnd = Instr( DstBeg, MyMsg, ":")
            DestinationIP = Trim( Mid( MyMsg, DstBeg, DstEnd - DstBeg) )

            UserBeg = Instr( MyMsg, "domainname") + 10
            UserEnd = Instr( UserBeg, MyMsg, "-")
            UserName = Trim( Mid( MyMsg, UserBeg, UserEnd - UserBeg) )

            AppBeg = Instr( MyMsg, "applicationName") + 15
            AppEnd = Instr( AppBeg, MyMsg, "-")
            Application = Trim( Mid( MyMsg, AppBeg, AppEnd - AppBeg) )

            StartBeg = Instr( MyMsg, "startTime ") + 11
            StartEnd = Instr( StartBeg, MyMsg, " ")
            StartTime = Mid( MyMsg, StartBeg, StartEnd - StartBeg)

            ConBeg = Instr( MyMsg, "connectionId") + 12
            ConnectionID = Trim( Mid( MyMsg, ConBeg) )

            ' The session is still open, so the end-of-session fields are unknown
            EndTime = "NA"
            Duration = "NA"
            SentBytes = "NA"
            RecBytes = "NA"

        End If

        ' Hand the parsed values to KIWI's custom fields for the SQL logging action
        .VarCustom01 = UserName
        .VarCustom02 = Application
        .VarCustom03 = SourceIP
        .VarCustom04 = DestinationIP
        .VarCustom05 = StartTime
        .VarCustom06 = EndTime
        .VarCustom07 = Duration
        .VarCustom08 = SentBytes
        .VarCustom09 = RecBytes
        .VarCustom10 = ConnectionID

    End With

End Function

Next you will create the custom DB format exactly as follows:
(IMPORTANT: NOT SHOWN Make sure you check “MsgDateTime” in this dialog box near the top)

Then you will create a new “Action” called “Log to SQL” and select the Custom DB Format and name the table AGEE_ICA and select “Create Table”. If you have not yet, build your connect string by clicking the box with the three periods at the top “…”

Then watch for ICASTART and ICAEND instances.

Then look at the data in your SQL Server:

Now you can report in real-time on external utilization by the following:

  • Utilization by IP Range
  • Utilization by Domain
  • Utilization by UserID
  • Utilization by time of day
  • Average Session Duration
  • You can tell if someone worked or not (“Yeah, I was on Citrix from 9AM to 5PM”)

Most of the queries you can reverse engineer from Edgesight Under the hood but if there is a specific query you are after just email me.

I get the average session duration with the following query:

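-- datediff counts whole seconds, so hours and seconds are included
-- (datepart(mi, ...) alone only looks at the minutes component)
select
avg(datediff(ss, 0, cast([duration] as datetime))) / 60.0 as avg_minutes
from syslog.dbo.agee_ica
where duration <> 'NA'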

 I tried to put everything in one table as you can see from the SQL Data Columns and the parsing script but you can split it up into separate tables if you want.
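To show how the single-table layout supports the reports listed above, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for SQL Server. The column names are my assumption (the real ones come from the custom DB format you define in KIWI), the first two rows mirror the sample logs earlier in this post, and the third row is made up purely so the averages have something to work with.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Assumed column names, one per VarCustom field set by the parsing script.
conn.execute("""
    CREATE TABLE agee_ica (
        username TEXT, application TEXT, source_ip TEXT, destination_ip TEXT,
        start_time TEXT, end_time TEXT, duration TEXT,
        sent_bytes TEXT, rec_bytes TEXT, connection_id TEXT
    )
""")
rows = [
    # ICASTART row: end-of-session fields are 'NA', as in the parsing script
    ('mhayes:Xentrifuge', 'Desktop', '192.168.1.98', '192.168.1.82',
     '12/09/2012:14:40:46', 'NA', 'NA', 'NA', 'NA', '81d1'),
    # ICAEND row taken from the sample log
    ('mhayes:Xentrifuge', 'NA', '192.168.1.98', '192.168.1.82',
     '12/09/2012:14:40:46', '12/09/2012:14:41:12', '00:00:26',
     '9363', '587588', '81d16'),
    # Hypothetical row, for illustration only
    ('jdoe:Example', 'NA', '203.0.113.10', '192.168.1.82',
     '12/09/2012:15:00:00', '12/09/2012:16:30:34', '01:30:34',
     '120000', '9000000', 'a1b2'),
]
conn.executemany('INSERT INTO agee_ica VALUES (?,?,?,?,?,?,?,?,?,?)', rows)

# Convert 'HH:MM:SS' text to seconds so hours and seconds both count.
DURATION_SECONDS = """
    CAST(substr(duration, 1, 2) AS INTEGER) * 3600 +
    CAST(substr(duration, 4, 2) AS INTEGER) * 60 +
    CAST(substr(duration, 7, 2) AS INTEGER)
"""

# Average session duration across completed sessions.
avg_seconds = conn.execute(
    f"SELECT AVG({DURATION_SECONDS}) FROM agee_ica WHERE duration <> 'NA'"
).fetchone()[0]

# Utilization by UserID: completed sessions and total connected time.
per_user = conn.execute(
    f"SELECT username, COUNT(*), SUM({DURATION_SECONDS}) "
    "FROM agee_ica WHERE duration <> 'NA' "
    "GROUP BY username ORDER BY username"
).fetchall()

print(avg_seconds, per_user)
```

The same idea carries over to the other reports: group by the source IP for utilization by IP range, or by the hour portion of the start time for utilization by time of day.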

Thanks for reading!

John

The Evolution of the Remote Campus: HR 1722

In December of 2010 President Obama signed HR 1722, the Telework Enhancement Act of 2010. Basically this means that every Federal Agency now has less than 6 months to come up with a telework strategy for nearly 2 million federal employees. Over the last two years, storms in DC have prompted saber rattling about developing a telework strategy for business continuity. However, in an era of wage freezes, cuts and layoffs, telework eligibility could mean the difference between key personnel staying or trying their luck in the private sector. One day a week at home in the DC Area could easily be the equivalent of $1000 or more back into an employee’s pocket.

Threaded into the legislation were requirements about reporting on participation, providing for accountability and training employees on telework. I wanted to take the time to cover some of the concerns that come with this legislation and dispel the idea that IT organizations are suddenly going to flip a switch and become teleworking hubs overnight. At my agency we recently had snow storms that all but shut down the city, yet well over half of the affected users were able to work at home as if it were business as usual. This did not happen with the flip of a switch; it took a few years of careful planning and painful lessons for us to get in a position to have this kind of success during the recent snow event.

Our solution is Citrix from stem to stern: a user connects to an AGEE and runs a Virtual Desktop, either XenApp or XenDesktop. We use Edgesight to monitor and alert on key metrics as well as to provide reporting and accountability.

There are a large number of resources on how to set up XenApp and XenDesktop, including how to work with profiles and how to size and scale your systems. I am not going to reinvent the wheel here, but I do want to go over some concerns that can easily be forgotten as you plan a transition to having 10-20 percent of your workforce connecting remotely. Also, since most remote access throughout the Federal Government is either VPN or Citrix, I want to contrast the benefits and risks of each technology and point out why I think thin computing may be the best answer when it comes to a large scale remote access solution.

Hopefully your agency has Citrix expertise on hand. If not, please do not be afraid to reach out to Citrix Partners who can work with your incumbent IT Staff or Systems Integrators such as Perot, Lockheed, IBM, EDS, etc. These guys are fiends at implementing Citrix XenApp and XenDesktop and will help train/transition your staff.

Bandwidth:
Prior to my latest non-fiber provider I had used both AT&T Uverse and FIOS. Both of these vendors provided 14+ MB download speeds. My current provider gives me about a 10MB download. This is great for surfing the web, delivering rich content on websites and watching movies on Netflix. For remote access solutions, these new high speed broadband connections can sap your agency’s bandwidth post-haste. You have to ask yourself, is my agency ready to become an ASP? I am currently setting up a Citrix SSL VPN for my agency and as part of the testing I went to my local CIFS share and downloaded a 100MB file. My speed actually got up to 5MB per second! I was thrilled to see how fast the file came down. Now, bring on 1000-3000 of my friends, all of us using VPN, and what we have is a meltdown as my agency’s bandwidth rapidly dwindles. While I was able to get up to 5MB down on my VPN connection, my equally productive Citrix ICA Session hovers between 20K and 60K. Will my YouTube experience be the same? No, but it is good enough and I am consuming at least 125 times less bandwidth.

The table and subsequent chart below were taken from this website showing the number of government employees at a number of DC area agencies. According to Citrix Online in an article here, 61% of all government employees are in a “telework eligible” position. So for example in the table below you see that the department of Veterans Affairs has 8000 DC Area employees.

If 61% of the VA employees are telework eligible and they work at home one day a week, then 8000 employees times .61 divided by 5 means that 976 employees would be teleworking per day.
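The arithmetic above is easy to rerun for any agency. Here is a minimal Python sketch of it; the function name `daily_teleworkers` is just for illustration, and it assumes eligible employees each work from home one day a week, spread evenly across a five-day week.

```python
def daily_teleworkers(dc_employees, eligible_share=0.61, days_per_week=5):
    """Estimate how many employees telework on a given day.

    Assumes every eligible employee teleworks one day a week and that
    those days are spread evenly across the work week.
    """
    return round(dc_employees * eligible_share / days_per_week)


va = daily_teleworkers(8000)     # Veterans Affairs, DC area
hhs = daily_teleworkers(30000)   # Health and Human Services, DC area
print(va, hhs)
```

Changing `eligible_share` or `days_per_week` lets you model a more or less aggressive telework policy.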

Agency                           Employees (thousands)   Metro DC Area employees (thousands)
Executive departments            1,664                   238
Defense, total                   652                     68
  Army                           244                     20
  Navy                           175                     25
  Air Force                      149                     6
  Other                          84                      17
Veterans Affairs                 280                     8
Homeland Security                171                     23
Justice                          108                     24
Treasury                         88                      12
Agriculture                      82                      8
Interior                         67                      7
Health and Human Services        64                      30
Transportation                   55                      9
Commerce                         39                      20
Labor                            16                      6
Energy                           15                      5
State                            15                      12
Housing and Urban Development    9                       3
Education                        4                       3

To calculate the bandwidth I used 1MB per user as the reference for VPN. I feel like this is pretty low, but I think you would have to earmark at least 1MB per person if you were to scale out a VPN solution. I used 60KB for ICA, which is generally pretty accurate for a normal ICA Session that does not have heavy graphics. With that you can see the difference between providing remote access via full VPN and via ICA. In the case of the VA, around 1GB would be needed to support 976 users via VPN, and around 60MB to support the same number of users via ICA. From a bandwidth perspective that is a huge savings.

Agency                           Employees     In Metro DC   Daily teleworkers    VPN BW   ICA BW
                                 (thousands)   (thousands)   (20% of eligible)    (GB)     (GB)
Army                             244           20            2440                 2.50     0.14
Navy                             175           25            3050                 3.12     0.18
Air Force                        149           6             732                  0.75     0.04
Other                            84            17            2074                 2.12     0.12
Veterans Affairs                 280           8             976                  1.00     0.06
Homeland Security                171           23            2806                 2.87     0.16
Justice                          108           24            2928                 3.00     0.17
Treasury                         88            12            1464                 1.50     0.09
Agriculture                      82            8             976                  1.00     0.06
Interior                         67            7             854                  0.87     0.05
Health and Human Services        64            30            3660                 3.75     0.21
Transportation                   55            9             1098                 1.12     0.06
Commerce                         39            20            2440                 2.50     0.14
Labor                            16            6             732                  0.75     0.04
Energy                           15            5             610                  0.62     0.04
State                            15            12            1464                 1.50     0.09
Housing and Urban Development    9             3             366                  0.37     0.02
Education                        4             3             366                  0.37     0.02

Bandwidth chart showing bandwidth requirements for VPN at 1MB vs. ICA at 60KB.
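The VA comparison can be sketched in a few lines of Python. This is only a rough model under my own assumptions: 1MB is treated as 1000KB, every user is counted at the full per-user figure at the same time, and the names are hypothetical. It is not meant to reproduce the rounding used in the table.

```python
VPN_KB_PER_USER = 1000   # 1MB earmarked per VPN user (assuming 1MB = 1000KB)
ICA_KB_PER_USER = 60     # 60KB per ICA session without heavy graphics


def total_bandwidth_kb(users, kb_per_user):
    """Aggregate bandwidth if every user draws the per-user figure at once."""
    return users * kb_per_user


va_users = 976
vpn_mb = total_bandwidth_kb(va_users, VPN_KB_PER_USER) / 1000   # close to 1GB
ica_mb = total_bandwidth_kb(va_users, ICA_KB_PER_USER) / 1000   # close to 60MB
ratio = VPN_KB_PER_USER / ICA_KB_PER_USER

print(f"VPN: {vpn_mb:.0f}MB, ICA: {ica_mb:.2f}MB, ratio {ratio:.1f}x")
```

Whatever per-user numbers you plug in, the ratio between the two is what drives the sizing decision.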

I am not trying to scare anyone with the bandwidth comparisons; rather, I am trying to drive home the paradigm shift that must take place in terms of what you deliver externally. Your agency must be ready to transition from delivering just web content and maybe Remote Access to a few hundred users to becoming a service provider to several hundred remote users. Do you have the bandwidth to support 20% of your eligible workforce working remotely? I know 60KB looks a lot better than 1MB, plus performance of client/server applications is going to be considerably better because transactions can occur on the switched network.

And finally, I want to quickly touch on your switched infrastructure. While you may have a campus of 2500 users, they are likely distributed across as many as 10-20 switches and bandwidth is more than enough per person. While the ICA Bandwidth from the XenApp or XenDesktop machine to the end user may only be 60K, from the XenApp/XenDesktop system to downstream applications it is full SMB, TCP, SSL, HTTP, RTSP, etc. If you are going from supporting 2500 users across 20 switches to supporting 2500 users on two to four switches, you need to make sure that those switches can handle the sudden influx of usage. You need to treat your “Remote Campus” just like any other campus you have, and you will need bandwidth similar to that of a core switch.

Security:
Another big challenge to a large scale remote access solution is security. I think the current status quo is that most VPN users are IT Staff and a few other select users that the agency allows to have VPN Access. Even with today’s endpoint analysis, ensuring a computer is a Government Asset and has antivirus software and even encryption software is no guarantee that it will not have some sort of malware. Cyveillance.com states that AV Vendors detect, on average, less than 19% of malware attacks. 0-day malware will almost certainly go undetected on your government issued workstation if it gets on there, and the VPN Tunnel becomes a definite INFOSEC concern. This is another good reason to use ICA, as it differs in many ways from VPN beyond its lower bandwidth usage.

The ICA Protocol sends screen refreshes over the wire on port 1494 or port 2598. Using the FIPS Compliant AGEE MPX 9700 series you can drastically reduce your attack surface by forcing SSL to the appliance and only allowing ICA protocols to traverse the network. This means no information ever leaves the internal network, only screen refreshes. Agencies can use Smart Access policies to determine whether or not users can print, save data locally or paste text onto their own systems. This, in effect, creates a secure kiosk that keeps data from leaving the network unless it is explicitly allowed. Is there still a role for VPN? Absolutely. For Sys Admins, Network and INFOSEC staff, there will always be a need for VPN, but for the general populace, Citrix with ICA can deliver a full desktop and run applications on the switched network, providing a considerably higher level of security along with better overall performance.

NOTE: During the snow event our Netscaler 9700 MPX had over 5000 connections on it and the impact on the CPU and memory was less than 5%. The device is new and I believe this is the first real test of the FIPS multicore models that Citrix Netscaler has. I would say this is a pretty stout machine!

Support:
Okay, so you have your secure Remote Access Solution, now you have to figure out how to support it. At my agency, the “Remote Access” campus is the 2nd largest at nearly 4000 users a day and over 10,000 users a month. Most campuses have at least 5-10 level II engineers supporting desktop related issues as well as general user questions. Most Citrix teams that I have seen are made up of 3-6 engineers, so this raises the question: can you support 10,000 users with 3-6 Engineers and still get anything done? Keeping your Level III staff out of the Desktop support business is going to take some careful planning and I think is a step that is often overlooked in the VDI/Virtualization realm. For starters, most of my colleagues have not been Desktop Technicians for 7-10 years. We needed a way to ensure that the end users could continue to call the Service Desk as they always have and get the help they need, and to avoid introducing a “blind spot” into our support strategy. One of my “Soapbox” issues with VDI deployments is the lack of consideration given to Desktop support during the implementation. I often wonder if the fact that VDI is so dominated by Architects and Engineers, without being sold to the Desktop staff, is the reason it has not skyrocketed after being called the next big thing by Gartner and other IT Pundits. Architects, Engineers and Sys Admins may not be the only relevant audience in the VDI discussion; in fact, it may be that they are not even the MOST relevant audience in the discussion.

(Stop ranting and move on John). Okay, for our deployment we realized that first, the users were remote so there WAS no desktop support person to come help them, and second, we needed a better and more skilled Level One Service Desk to be able to support the influx of remote users. We engaged in what was, at the time, a unique training regimen for the Service Desk staff. Basically, a remote user who cannot get connected by the person who answers the phone won’t be able to work, or the call will get escalated to your Level III engineers. This will cause considerable dissatisfaction with the end users as well as with Engineers who get overwhelmed with escalations. We have a 90% first call resolution rate as a result of extensive training of our call center. Further, the rate at which the end user can be helped by the first person they talk to on the phone is going to be directly proportional to the success of your remote access endeavor. Our training focused on a number of routine tasks: client installation, routine connectivity issues and credential related issues (resetting passwords, etc.), but it also focused on what the common calls were. To accomplish this we integrated business intelligence (SSRS) to provide a visual representation of our Service Desk call data. Keep in mind, regardless of how talented your team is and how well engineered your solution is, the people answering the phone are the “Virtual face” of your system and they need to believe in it just as much as you do.

Monitoring the Level one calls concerning Citrix was a huge step in the QA of our system and was another major reason for our growth. By monitoring our calls we were able to build out focused training strategies as well as provide ourselves with situational awareness of our system. What we noticed was that 1-2 percent of all users would call the service desk with any number of standard issues regardless of how stable the system was. That means that if you suddenly have 1500 teleworkers each day, you will receive an additional 15-30 service desk calls that day. Keep this in mind, as some call centers are already staffed pretty lean. 30 calls a day is likely another body’s worth of work. Another benefit of monitoring our level one calls was being able to check after a change to make sure we did not see a spike in calls. The basic rule was to assign a “Pit Boss” each day to monitor our Call dashboard and ensure that everything is running smoothly. The standard rule is to look at a call and ask yourself “could we make a system change to prevent this call?” If yes, then take it into consideration, and if not, then don’t worry about it. As I said, 1-2% would always call no matter what (passwords, User Errors, etc). By monitoring the calls we were able to grow by over 50% over the next two years while reducing our call volume by nearly the same number.
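The 1-2 percent rule of thumb above is simple enough to turn into a quick staffing estimate. The Python sketch below is a back-of-the-envelope illustration only; the function name `extra_service_desk_calls` is hypothetical and the rates are the ones we observed, which may not match your environment.

```python
def extra_service_desk_calls(teleworkers, low_rate=0.01, high_rate=0.02):
    """Return the (low, high) range of additional daily service desk calls.

    Uses the observation that 1-2 percent of users call in on any given
    day regardless of how stable the system is.
    """
    return round(teleworkers * low_rate), round(teleworkers * high_rate)


low, high = extra_service_desk_calls(1500)
print(f"Expect roughly {low}-{high} extra calls per day")
```

Running it against your own expected teleworker count gives you a number to bring to the Service Desk manager before go-live rather than after.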

Other important tools we use are Edgesight, to look at historical data concerning a user’s latency and which systems they logged into; GoToAssist, so that support staff can help end users out in the field in the same manner as a Desktop technician; several custom PowerShell scripts to get key metrics from XenApp; and SQL Server Reporting Services, part of Edgesight, to create custom dashboards and integrate other data sources to provide a holistic vision of the entire environment.

Conclusion:
There are telework think tanks and pundits all over the internet right now. I know the amount of information right now is pretty overwhelming. I am trying to supplement some of that information with some real-world experience of moving from a fledgling Citrix farm to the 2nd largest campus at a large federal agency. As I stated, treat your telework environment as a campus. Find out what support your population has at the desktop and make sure you can get as close to that as possible remotely. Again, the person answering the phone HAS to be able to get them back online or things will go downhill from there. Watch your support calls and take an active interest in your system’s impact on your call centers and service desk. Work with them, sell them on the system and be supportive of their concerns. Right now, if we make a mistake, there will be 100 calls to the service desk in less than 30 minutes. Understand the impact of 100 service desk calls in 30 minutes and understand that when Remote Access is down, a whole campus is down.

Thanks for reading.

John