InterWeb.org.uk

Interweb Blog Hyper-V, App-V, Dell Equallogic, Forefront TMG Security, DPM 2010, SCVMM

4 Aug 2011

EqualLogic Replication over Nortel ERS 5500 Series – When disaster strikes!

If you have been keeping up to date with my previous blog articles, you will know that we have recently taken delivery of a 32TB Dell EqualLogic PS6000. My intention is to use this as a replication partner for our existing PS Series group.

With the new EqualLogic initialized and its firmware up to date, I went on to set up replication. As expected with EqualLogic, this was a quick and painless task; I will blog on setting up replication in another article.

For the initial replication we co-located our new EqualLogic in the same Datacentre as our current PS group and created replication schedules, and away it went: all volumes replicated and continued to replicate on an hourly schedule, with no issues whatsoever. All of this was on our backend iSCSI network, of course.

The next task was to move our replication partner to our secondary Datacentre, across the other side of our site in a separate building, to offer onsite replication. The link to this secondary Datacentre is a 6Gb MLT (Multi-Link Trunk) running on Nortel (now Avaya) Ethernet Routing Switch 5500 Series. We needed a way of transporting this backend iSCSI traffic over our current network infrastructure.

We mapped out where we wanted this traffic to go and proceeded to create a non-routing VLAN on our CORE Routing Stack so that we could link our existing backend iSCSI network to our secondary Datacentre. This would provide us with a non-routable SAN iSCSI VLAN and keep our backend iSCSI traffic segregated from our LAN traffic while using existing infrastructure.

What happened next raised an issue which just confirms why I want to keep backend iSCSI network traffic completely separate from existing infrastructure. I am assured that this would not have happened in a Cisco environment (thanks to Andy Irving, CCIE Routing and Switching, amongst others).

As Andy was packing up for the afternoon, one of the switches in our CORE Stack, a Nortel ERS 5530, suddenly went offline, and the second ERS 5530 proceeded to shut down all of its MLT ports, with no information in the logs relating to this, I may add. We have had a complete Nortel infrastructure in place for about 4 years now, including Nortel CS1000, with absolutely no issues whatsoever, so what happened next was just inexplicable. The CORE stack lost its configuration, and the base switch in the stack, the ERS 5530 that originally went offline, reset itself and lost all of its MLT configuration, while the second ERS 5530 was still shutting its ports down.

After some quick re-configuration by Andy from our Technology Partner, our CORE was back and working, but it looks like Spanning Tree on our backend network had taken our EqualLogic volumes offline, including our Hyper-V CSV volume, and with them all our Hyper-V VMs. Now that we had taken our SAN iSCSI network over our LAN, this front-end LAN issue had greatly affected our backend iSCSI network, causing massive TCP retransmits.

With this in mind we decided to remove the SAN VLAN from our CORE stack, ensuring that there was no link between our LAN and our iSCSI network. The SAN VLAN itself did not cause the issue in the first place, as it had been tested and was working fine, but the CORE Stack issue highlighted the fact that LAN connectivity issues at our CORE would affect our SAN network.

So what next? We will proceed to extend our SAN iSCSI network to our secondary Datacentre using dedicated fibre optic, with Dell PowerConnect switches at either end. This simply creates an extension of our existing PowerConnect 5400 Series iSCSI infrastructure and keeps it physically away from our existing LAN, ensuring that if there is a LAN connectivity issue at our CORE Routing Stack, our SAN volumes will be completely segregated from it.
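The reasoning behind this design can be sketched as a simple shared-failure-point check. This is only an illustration: the device names below are hypothetical placeholders, not our actual switch names, and the model ignores everything except which devices each path passes through.

```python
def shared_failure_points(san_path, lan_devices):
    """Return the devices that carry SAN traffic and also belong to the LAN."""
    return sorted(set(san_path) & set(lan_devices))


# Hypothetical names for the two switches in the LAN core stack.
lan_core = {"core-5530-a", "core-5530-b"}

# The design that failed: the SAN VLAN was carried across the LAN core stack.
san_over_core = ["powerconnect-dc1", "core-5530-a", "core-5530-b", "powerconnect-dc2"]

# The planned design: dedicated fibre between PowerConnect switches only.
san_dedicated = ["powerconnect-dc1", "powerconnect-dc2"]

print(shared_failure_points(san_over_core, lan_core))
print(shared_failure_points(san_dedicated, lan_core))
```

With the SAN VLAN on the core, both core switches show up as shared failure points; with dedicated fibre the list is empty, which is the whole point of the change.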

In conclusion, we believe this to be a bug in the Nortel ERS 5500 Series software, and we have yet to find an explanation for its behaviour. In all his years of experience Andy has never seen this type of behaviour before, and his quote of "This wouldn't happen in a Cisco setup!" was a corker.

For me, this is just another reason to keep iSCSI traffic physically separate from any other network traffic.

Onwards and upwards.

    Comments (2) Trackbacks (0)
    1. Mark, what we could have done in this scenario is configure the Nortel ERS5530 switch in Multiple Spanning Tree Mode (MST) rather than the default Nortel STP. MST allows the creation of more than one STP instance so an issue on one VLAN should not affect the other instance and its related VLANs.

      We could have created a new STP instance and mapped the iSCSI VLAN alone into it. Making one of the Dell PowerConnect switches the STP root bridge for that instance would provide a level of isolation, so an STP convergence issue on the LAN would not affect the backend VLAN.

      As mentioned this issue would not have arisen with a Cisco core network as by default it uses per VLAN spanning tree (PVST+) and more commonly now Rapid PVST+.
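The isolation described in this comment can be sketched with a toy model. It assumes hypothetical VLAN IDs (10 and 20 for LAN, 100 for iSCSI) and only models which VLANs share a spanning tree instance; it does not model a switch losing power or configuration, which no spanning tree mode can protect against.

```python
def vlans_reconverging(vlan_to_instance, troubled_vlan):
    """Return the VLANs that share a spanning tree instance with troubled_vlan.

    A topology change in one instance forces reconvergence for every
    VLAN mapped to that same instance, and (in this simplified model)
    for no others.
    """
    instance = vlan_to_instance[troubled_vlan]
    return sorted(v for v, i in vlan_to_instance.items() if i == instance)


# Single spanning tree: every VLAN lives in the same instance.
single_stp = {10: 0, 20: 0, 100: 0}

# MST: the iSCSI VLAN (100, hypothetical) is mapped alone into instance 1.
mst = {10: 0, 20: 0, 100: 1}

print(vlans_reconverging(single_stp, 10))
print(vlans_reconverging(mst, 10))
```

In the single-instance case a problem on LAN VLAN 10 drags VLAN 100 along with it; with the iSCSI VLAN in its own instance, only the LAN VLANs are listed.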

