Whether business is driving IT to be more agile, or a fast-changing IT landscape is enabling business to be more agile, is a chicken-and-egg question. Irrespective of who is driving whom, the common view is that the two complement each other and evolve together. Business can’t live without IT support, and IT is irrelevant if there is no user! The IT life cycle starts when a business function develops a need and looks to IT for support. Business needs are converted into technical requirements, which drive the development of a technical specification covering key features, properties and attributes such as performance requirements. Let’s take an example. An insurance giant’s high-level business need is: “Build an application that can process about 1 million claims in parallel, with an average total turnaround time (TAT) of less than 30 seconds per claim and an average web response time not exceeding 1 second at any point in time”. This is a one-line business statement, but it involves tons of infrastructure components: from the user interface application on the client side (e.g. Java, JavaScript, HTML, Android, .Net, MF/CICS etc.) to middleware (JMS, MQ), web servers (WAS, Tomcat, Apache), app servers (WAS, JBoss, Apache), databases, files and storage, along with a great deal of networking components, firewall encapsulation, load balancing techniques, storage RAID architecture, back-end MF data processing engines and many more. Each component plays a crucial role in ensuring that the end-to-end application response meets expectations. Application performance management uses an APM tool to keep close track of application performance. It lets IT staff dig into the setup through packet inspection, data analysis, thread detection etc., take stock of the situation and address the root cause in time, before performance worsens.
The whole IT infrastructure is getting massive and complex. We are in the midst of BYOD (Bring Your Own Device) and SMAC (Social networking applications; Mobile computing and newer application layers; Big Data from the escalating number of devices we call the IoT, or “Internet of Things”; advanced Analytics solutions; and Cloud architecture spanning public, private and hybrid). On top of that come sophisticated network and security systems such as NSX (network hypervisor) and UCS (Unified Computing System), evolving solutions in IDS (Intrusion Detection System), DPI (Deep Packet Inspection) and DLP (Data Loss Prevention), and many newer components in virtualization (ESXi). All of this makes a perfect APM solution a very daunting task unless we develop a strategic blueprint by designing a framework for APM. Such a framework allows APM solutions to evolve as the IT landscape changes and requirements become more complex.
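
The insurance requirement quoted above can be turned into a concrete check. The sketch below is only an illustration: the function names and the sample measurements are hypothetical, and it reads "1 million claims in parallel" as one million claims in flight at once, which is an assumption about the business statement. Under that reading, Little's Law (throughput = items in flight / time in system) gives the claim rate the whole stack has to sustain.

```python
from statistics import mean

# Targets taken from the business statement in the text.
TAT_TARGET_S = 30.0            # average turnaround time per claim
WEB_TARGET_S = 1.0             # average web response time
CLAIMS_IN_FLIGHT = 1_000_000   # assumed meaning of "in parallel"


def required_throughput(in_flight: int, tat_s: float) -> float:
    """Little's Law: throughput = items in flight / time each item spends in the system."""
    return in_flight / tat_s


def check_sla(tat_samples_s, web_samples_s):
    """Compare observed averages against the two targets."""
    avg_tat = mean(tat_samples_s)
    avg_web = mean(web_samples_s)
    return {
        "avg_tat_s": avg_tat,
        "avg_web_s": avg_web,
        "tat_ok": avg_tat < TAT_TARGET_S,     # "less than 30 seconds"
        "web_ok": avg_web <= WEB_TARGET_S,    # "not exceeding 1 sec"
    }


if __name__ == "__main__":
    # Hypothetical measurements, invented for illustration only.
    print(check_sla([22.0, 28.5, 31.0, 25.5], [0.4, 0.9, 1.3, 0.6]))
    print(round(required_throughput(CLAIMS_IN_FLIGHT, TAT_TARGET_S)), "claims/s")
```

With a 30-second turnaround and one million claims in flight, the required rate is roughly 33,000 claims per second, which shows why every component in the chain matters. Averages also hide outliers, so a real check would likely track percentiles as well.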

Let me attempt to define APM (Application Performance Management) for the full range of stakeholders: not only IT professionals but also end users, business honchos, and the security, regulatory and compliance departments. An APM solution monitors and manages the performance of business and software applications by capturing essential data from the transactions that pass through compute, network, database and storage components. This provides the ability to capture and store all activity that traverses the IT infrastructure, and enables IT staff to carry out everything from exploratory root cause analysis to proactive and preventive activities that keep applications healthy and available while exceeding KPIs and meeting SLAs.
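
To make the definition a little more tangible, here is a minimal sketch of the kind of data an APM tool might capture per transaction and how it supports root cause analysis. The `Span` record, the component names and the timings are all hypothetical; real APM products use their own, much richer data models.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Time one transaction spent in one infrastructure component."""
    transaction_id: str
    component: str       # e.g. "compute", "network", "db", "storage"
    duration_ms: float


def breakdown(spans, transaction_id):
    """Total time per component for one transaction."""
    totals = {}
    for span in spans:
        if span.transaction_id == transaction_id:
            totals[span.component] = totals.get(span.component, 0.0) + span.duration_ms
    return totals


def slowest_component(spans, transaction_id):
    """Component that contributed the most time, or None if the transaction is unknown."""
    totals = breakdown(spans, transaction_id)
    if not totals:
        return None
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Hypothetical captured data for a single claim transaction.
    captured = [
        Span("claim-1", "network", 40.0),
        Span("claim-1", "compute", 120.0),
        Span("claim-1", "db", 310.0),
        Span("claim-1", "db", 95.0),
        Span("claim-1", "storage", 60.0),
    ]
    print(breakdown(captured, "claim-1"))
    print(slowest_component(captured, "claim-1"))
```

The point of the sketch is the shape of the data: once time is attributed per component, the question "where is the transaction slow?" becomes a simple aggregation.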

As the title indicates, the IT landscape is changing very fast to address user demands such as BYOD, SMAC and IoT applications, security threats, innovation etc., while key application components are kept hidden, distributed and buried in layers of IT solutions involving SDDC, SDN, virtualization, IaaS, PaaS, SaaS and many more. APM has become an essential, must-have component of Infrastructure Management Services (IMS). This requires IT and business to develop a framework that embraces the importance of APM amid these fast-changing IT requirements. The pace of change will only grow faster as time passes. APM should be introduced right at the requirement-gathering phase and carried through design and development until the application settles into the infrastructure.

The ominous headlines are incessant: corporate networks and IT resources are under ever-increasing attack from those seeking sensitive customer or employee information. Last month we heard about the OPM (Office of Personnel Management) attack, which was not detected even by a strong IDS tool such as the Einstein system. Please check out my earlier blog on security threats in the fast-changing IT landscape: http://www.techmanthan.com/index.php/2015/07/19/it-security-race-between-it-evolution-and-security-threat/ . Whether such attacks come from unfriendly governments, criminal organizations or disgruntled individuals, our critical information system assets are always under threat. Security experts agree that the rapidly changing nature of malware, hack attacks and government espionage practically guarantees that IT infrastructure will be compromised. The question is not whether our corporate network will be compromised, but what to do when the breach is detected. The best APM solutions offer forensic capabilities with post-event intrusion resolution to track and eliminate intrusions, as well as to fortify existing defenses against future attacks.

I have been reading articles and publications by vendors claiming to have robust e2e APM solutions. I strongly feel that the following attributes are essential in the current IT dynamics and must be considered when onboarding any APM solution, whether through a make or a buy approach.
(a) Capacity for fast traffic: the solution should handle data center traffic of up to 40-50 Gbps. Volume and velocity are increasing exponentially, and so are the threats, so the solution must withstand such heavy traffic.
(b) Expert analytics on network activity: the network connects all the dots in the IT dynamics, which makes it the most crucial element for analysis. Please read my blog on this point at http://www.techmanthan.com/index.php/2015/08/09/infrastructure-operation-analytics-healthy-infrastructure-keeps-business-healthy/ . To find a specific illicit event among millions of legitimate packets, we need analysis tools that offer deep packet inspection (DPI) to quickly help determine when and where a particular anomaly or unexpected incident occurred.
(c) Signature filtering: the ability to filter packets that match known threat signatures and archive them for later diagnosis and analysis.
(d) Adequate storage capacity (on the order of petabytes) to retain traffic for post-event analysis. APM needs to work like a surveillance camera, equipped with enough capacity to hold traffic data for a sufficient time frame after the event.
(e) The ability to replay or reconstruct an event.
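
A quick back-of-the-envelope calculation shows why the traffic rate and the petabyte-scale storage requirement go together. The sketch below assumes the 40-50 figure means gigabits per second, that every byte on the link is stored without compression, and that a petabyte is 10^15 bytes; the function names are made up for this illustration.

```python
SECONDS_PER_DAY = 86_400


def capture_bytes_per_day(link_gbps: float, utilisation: float = 1.0) -> float:
    """Bytes written per day when capturing a link at the given average utilisation."""
    bits_per_second = link_gbps * 1e9 * utilisation
    return bits_per_second / 8 * SECONDS_PER_DAY


def retention_days(storage_pb: float, link_gbps: float, utilisation: float = 1.0) -> float:
    """Days of traffic that fit in `storage_pb` petabytes (decimal, 1 PB = 1e15 bytes)."""
    return storage_pb * 1e15 / capture_bytes_per_day(link_gbps, utilisation)


if __name__ == "__main__":
    for gbps in (40, 50):
        tb_per_day = capture_bytes_per_day(gbps) / 1e12
        print(f"{gbps} Gb/s: {tb_per_day:.0f} TB/day, "
              f"1 PB lasts {retention_days(1, gbps):.1f} days")
```

At a sustained 40 Gb/s, full capture produces 432 TB per day, so a single petabyte holds a little over two days of traffic. Real links rarely run at full rate around the clock, which is why the `utilisation` parameter is there, but the arithmetic makes clear that the "surveillance camera" needs a very large tape.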

Besides, APM should be able to establish patterns or recognizable signatures in network packets from viruses, hacker attacks and unauthorized access, and then distribute the matching criteria across the network so that probes raise an alert when they detect such patterns. An APM solution should address the multi-tiered, real-time nature of mission-critical applications by quickly isolating service anomalies and tracing them to the specific application, network segment or infrastructure component involved, in order to avoid any negative revenue impact. In a DDoS (Distributed Denial of Service) situation, APM should be able to pinpoint the originator that is spreading the malicious code and choking the application servers. The origin could be a single PC or a botnet.
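
The two ideas in this paragraph, matching packets against known signatures and pinpointing the sources behind a flood, can be sketched in a few lines. Everything here is a toy assumption: the signature names and byte patterns are invented stand-ins, packets are reduced to `(source address, payload)` pairs, and a real probe would work on live captures with far more sophisticated matching.

```python
from collections import Counter

# Hypothetical byte patterns standing in for real threat signatures.
SIGNATURES = {
    "demo-shell-download": b"wget http://",
    "demo-sql-injection": b"' OR 1=1",
}


def match_signatures(payload: bytes):
    """Names of all signatures whose byte pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def top_talkers(packets, threshold):
    """Source addresses that sent at least `threshold` packets, busiest first."""
    counts = Counter(src for src, _payload in packets)
    return [(src, n) for src, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    # Invented traffic using documentation-only IP ranges.
    packets = [("198.51.100.7", b"GET / HTTP/1.1")] * 500
    packets += [("192.0.2.10", b"GET /claims HTTP/1.1")] * 3
    packets += [("192.0.2.11", b"id=1' OR 1=1 --")]

    for src, payload in packets[-1:]:
        print(src, match_signatures(payload))
    print(top_talkers(packets, threshold=100))
```

In a botnet attack no single source may cross the threshold, so counting per address is only a starting point; grouping by subnet or by request pattern would be the natural next step.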

A great many network monitoring solutions and tools can address the requirements of robust APM. As described in my last blog, the network carries packets between two identities (IP address, MAC address etc.), encapsulating data, transactions, user requests and responses. Because these packets follow well-defined protocols, an APM solution can decipher overall performance and the related data points for further analysis. This newer, network-based APM approach is essentially an agentless system: it taps into network devices and watches content and traffic as it flows across the enterprise, analyzing application response times and identifying errors using wire protocols. It is a departure from conventional approaches to APM, which used agents installed on application servers to obtain a sample of performance metrics from a select number of points throughout the IT environment, including the LAN, the WAN and any relevant databases, to see where a legacy application was being tripped up. APM is evolving and growing rapidly, shifting from the legacy of monitoring the network or specific infrastructure components to ensuring a good user experience. The increased use of web, cloud and mobile applications demands sophisticated products that let network engineers and managers track application performance. A robust APM framework should pull people together for investigation and analysis, so that all IT staff can understand and embrace the underlying data points. Conventional APM is based on the performance that developers and testers see in a controlled environment, whereas network-based APM is based on the real-world situation, observed by tapping directly into the network. We have to aggressively transform the old, conventional approach into the new, real-world, network-based APM; otherwise we will not be able to gauge performance and act proactively in response to the fast-evolving IT landscape.
An APM tool must be able to cut across all components and present a summarized view, with the ability to drill down and trace issues to a specific set of components. The user interface and related features should let any IT team member carry out an investigation without requiring many specialists.
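
The agentless idea described above, measuring response time from what is seen on the wire and then summarizing it with room to drill down, can be sketched as follows. The event tuples, flow identifiers and server names are hypothetical; a real tool would derive them from captured packets rather than from a hand-written list.

```python
from collections import defaultdict
from statistics import mean


def response_times(events):
    """Pair each request with the next response on the same flow.

    `events` is an iterable of (timestamp_s, flow_id, kind, server) tuples in
    time order, where kind is "request" or "response".
    Returns a dict mapping server -> list of response times in seconds.
    """
    pending = {}                       # flow_id -> timestamp of the open request
    per_server = defaultdict(list)
    for ts, flow_id, kind, server in events:
        if kind == "request":
            pending[flow_id] = ts
        elif kind == "response" and flow_id in pending:
            per_server[server].append(ts - pending.pop(flow_id))
    return dict(per_server)


def summarise(per_server):
    """Summary view: average response time per server, slowest first."""
    rows = [(server, mean(times)) for server, times in per_server.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Invented observations from a network tap.
    events = [
        (0.0, "f1", "request", "web"),
        (0.25, "f1", "response", "web"),
        (1.0, "f2", "request", "db"),
        (2.5, "f2", "response", "db"),
        (3.0, "f3", "request", "web"),
        (3.75, "f3", "response", "web"),
    ]
    detail = response_times(events)
    print(summarise(detail))   # summary view
    print(detail["db"])        # drill-down into one component
```

Nothing is installed on the servers themselves: the timings come purely from observing when requests and responses pass the tap, which is the key difference from the agent-based approach.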

Summary: IT and business teams must investigate, analyze and invest time, money and energy to get the whole business and its underlying IT landscape wrapped in a robust and adequate APM solution. Such a solution tracks, monitors and analyzes application behavior, enabling teams to act proactively before the multi-tier infrastructure becomes overly complex and the business gets hurt. It has become increasingly important that the applications companies rely on to drive revenue meet users’ expectations. IT teams need to identify the root cause of problems quickly, preferably before the issues affect users. I have been reading about the features of various APM tools and solutions, with their varying pros and cons, as claimed by vendors like Dynatrace, Riverbed, IBM, CA and many more for e2e application performance monitoring. I don’t want to be critical of any one solution; all of them carry a great many features and benefits. However, we first need to list our requirements in the context of the current IT landscape and transformation strategy, and then carry out a make-or-buy analysis, before approaching any solution provider. Many organizations, including ours, have been modernizing their monitoring setup, including APM. We have to move towards modernization before we are outcompeted. My attempt in this blog is to stress the importance of APM and the key attributes that differentiate it from the conventional setup, given the fast-changing IT landscape involving BYOD, SMAC, IoT, SDDC, SDN, virtualization etc.

Application Performance Management: More critical in SMAC, BYOD & IoT Era