The evolution of the IXP Network Engineer… #peeringweek @lonap

Tales from the rarely sighted and lesser spotted IXP Network Engineer…

From the beginning, the principle of an Internet Exchange Point (IXP) is simple. It’s just a layer 2 network, to which participating service providers connect.

Most IXPs started small, and were managed by volunteer efforts, or by another organisation, until they became large enough to become independent organisations, and maybe employ network engineers.
So what do these engineers working at IXPs do?

In the beginning, we just installed the hardware, plugged in the cables, configured a few things and then went to the pub. Life was good! But those days in the pub weren’t to last!

As soon as an IXP grows to more than a few tens of members, starts to cover multiple sites and switches, and operates with any level of redundancy, a surprising amount of work is needed to keep everything operating efficiently and reliably, scale the platform and think about its future.

The exact role of a ‘Senior Network Engineer’ is wide and varied between different organisations. For the IXP, this usually means everything from cabling and rummaging around in racks, maintenance, testing new technology and hardware, fault diagnosis and fixing, monitoring systems and upgrades, to maintaining and managing all the usual selection of internal systems, such as databases and web sites, e-mail servers, DNS and VoIP, which help the IXP operate.

Regardless of the exact duties, a good network engineer always thinks in terms of at least Layers 1 to 3 of the OSI model, and has a good understanding of the underlying protocols and the network topology, but also needs a fair amount of ‘instinct’ and ‘gut feeling’.

An IXP is just a ‘layer 2 blob’ and essentially looks like a large switched infrastructure to the connected networks, although monitoring this layer 2 blob can be surprisingly tricky. I remember grappling with the difficulties of IXP monitoring back in 2002, and we still grapple with it today, with only slightly better, and mostly home-grown, tools.

There are some cool monitoring systems available – both open source and commercial – for monitoring Layer 3 and 4 servers and services, but monitoring at Layer 2, with its mixed bag of redundancy protocols, topologies, proprietary ring protection protocols, logging formats, management capabilities and hardware limitations, continues to be a challenge for most monitoring systems.

Monitoring the underlying Layer 2 infrastructure to determine the exact cause of a fault, in a timely manner and from the limited data available, is still often quite a dark art. When the network is behaving unpredictably, the monitoring systems can do little more than send a flood of alarms: “Something bad just happened, here’s 200 alarms, have fun!” The monitoring will also usually be several minutes behind the actual situation.
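
To make the alarm flood a little more concrete, here is a minimal Python sketch of one way to tame it: group alarms that arrive close together into a single burst, then show which alarm came first and which devices are making the most noise. The `Alarm` fields, the switch names and the 30 second gap are all assumptions for illustration, not the format of any real monitoring system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    timestamp: float  # seconds, from whatever clock the monitoring uses
    device: str       # hypothetical switch name
    message: str


def group_into_bursts(alarms, max_gap=30.0):
    """Split alarms into bursts.

    A new burst starts whenever the gap between two consecutive alarms
    (in time order) is longer than max_gap seconds.
    """
    bursts = []
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        if bursts and alarm.timestamp - bursts[-1][-1].timestamp <= max_gap:
            bursts[-1].append(alarm)
        else:
            bursts.append([alarm])
    return bursts


def summarise(burst):
    """Return the earliest alarm in a burst and per-device alarm counts."""
    counts = Counter(alarm.device for alarm in burst)
    return burst[0], counts.most_common()


# Made-up example data: three alarms close together, one much later.
example_alarms = [
    Alarm(0.0, "sw1", "link down"),
    Alarm(5.0, "sw2", "BGP session down"),
    Alarm(8.0, "sw1", "topology change"),
    Alarm(200.0, "sw3", "link down"),
]

for burst in group_into_bursts(example_alarms):
    first, per_device = summarise(burst)
    print(len(burst), "alarms, first:", first.device, first.message, per_device)
```

The earliest alarm in a burst is only a hint at the cause, not a diagnosis, especially when the monitoring itself is running behind, but one summary line per burst is easier to reason about than 200 separate alarms.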

Remaining calm in this situation, continuing to think in OSI layers, and fault-finding in a methodical way whilst keeping an open mind, is a key part of being an IXP Network Engineer.

So this is how network engineers need to think, and how they have needed to think for many years. The Ethernet standard recently celebrated the 40th anniversary of its invention, with more than a few changes along the way.

So what has changed, and what challenges have Network Engineers yet to face?

The connections have got faster and the hardware more capable, with more features than ever before, bringing more functionality which is often a mixed bag! What we used to call ‘switches’ and ‘routers’ have long been merged into single devices that do both. And I have to admit, I’m not much of a fan of ‘clouds’ because I’m more inclined to ask “Yes, but what is in the cloud? And how do I configure it?”

Look inside the cloud, and we still find the messy stuff of the Internet, and that’s far too scary to present on a nice marketing PowerPoint slide. So the messy stuff and the engineers who look after it remain largely hidden, at least when things are working as they should!

So, call it a cloud, a platform, a router, a switch, a virtual service… The reality, for now at least, is that there will still be a need for well-designed networks, still a bunch of racks with a bunch of equipment, and still the hidden network engineers working to keep it all running smoothly, make sure the blinking lights keep blinking, and fix it when things break!

I can see current and future trends: the ‘clouds’ and the ‘software defined network’ are requiring network engineers to become something different. They need to be semi-programmers who understand cloud integration, the new technologies and all the old stuff: the network, cabling, servers, developments in Ethernet and hardware, routing protocols and all the usual challenges. At the same time, they need to know how to design and scale a robust and reliable “Cloud” (whatever this means!), which needs to be thought of less as a Layer 2 blob and more as a magic cloud thing offering more services than just the traditional model of an IXP.

Some potentially exciting developments are afoot in the worlds of optical networks and open-architecture switching platforms, and they are causing us to bin the old ways of working and start thinking again about what it is we want. Can this technology be good at what it does, not cost a fortune, and be flexible enough? Switches are here that run Linux, with all the flexibility that brings, and with the hardware capability not too far behind.

But still, as Network Engineers, we need to do one of two things:
a) Become some sort of Super Network Engineer (with optional cape and outside underpants!) learning a small amount about a lot of new technologies and making the mistakes along the way, while never having much sleep!

b) Work out how to tame ‘the layer 2 blob’: come together and automate most of the boring and tricky fault-finding stuff we’ve been grappling with for far too many years, to give us time to learn the new approaches and think about future network designs.
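
As a rough idea of what automating the boring fault-finding could look like, here is a small Python sketch of the ‘think in OSI layers’ habit: run a list of checks from the bottom layer upwards and report the lowest one that fails. The check labels and the stand-in functions are hypothetical; in a real network each one would have to be replaced by a probe of the actual switches, which is exactly the hard part.

```python
from typing import Callable, Optional, Sequence, Tuple

# A check is a label plus a function that returns True when all is well.
Check = Tuple[str, Callable[[], bool]]


def lowest_failing_layer(checks: Sequence[Check]) -> Optional[str]:
    """Run checks in order (bottom layer first).

    Returns the label of the first check that fails, or None when
    every check passes.
    """
    for label, check in checks:
        if not check():
            return label
    return None


# Hypothetical stand-ins for real probes. Here they just return fixed values.
example_checks = [
    ("Layer 1: port has link", lambda: True),
    ("Layer 2: peer MAC address learned", lambda: False),
    ("Layer 3: BGP session established", lambda: False),
]

print(lowest_failing_layer(example_checks))
```

Stopping at the lowest failing layer mirrors the manual habit of not chasing a Layer 3 symptom until Layers 1 and 2 have been ruled out.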

So how to move on? Well, IXP Network Engineers need to become less hidden and work more closely with vendors than before, to convince them to fix the nasty CLIs, broken management, missing monitoring, tiny management CPUs, the inconsistent mish-mash of supported protocols and missing documentation! All too often, we buy the hardware because of its packet-shifting capacity and capability, and live with the limitations forever. Getting these things fixed would enable us to think about the future, make the network easier to run, and then maybe we can go to the pub again once in a while…

Other Peering Week posts on trefor.net include:
UK internet history – The Early Days of LONAP by Raza Rizvi
INEX’s IXP Manager – Tools to help manage an Internet Exchange by Barry O’Donovan
Regional Peering in the UK by James Blessing
Co-operation makes internet exchanges future proof by Pauline Hartsuiker
Experience of launching an IXP in North America by Ben Hedges
