
FS#5301 — FS#9224 — RBX1 internal network

Attached to Project— Anti-DDoS
Maintenance
Paris DC1
CLOSED
100%
As part of
https://status.ovh.co.uk/?do=details&id=5292
we are going to change the OSPF AREA of all
the small routers in RBX1.
Date:  Friday, 30 August 2013, 16:05PM
Reason for closing:  Done
Comment by OVH - Thursday, 29 August 2013, 18:38PM

rbx-3/4/5/6-m1/m2 done


Comment by OVH - Thursday, 29 August 2013, 18:40PM

rbx-7/8/9/10/11/12/14/15/16/17-m1/m2: done
There is some card damage on rbx-14, rbx-4 and rbx-3.
We are replacing the damaged cards with spares.


Comment by OVH - Friday, 30 August 2013, 01:20AM

We have finished the routers of the first VLAN (70 routers), but nothing is going right.

The OSPF process could not be resumed. We are now trying to simplify the configuration to avoid announcing LSAs, then restart the routers that crashed. Some routers are already UP, but too many remain to be done.


Comment by OVH - Friday, 30 August 2013, 01:24AM

While we try to resume OSPF router by router, we are adding the BGP configuration in order to move OSPF off these small routers.


Comment by OVH - Friday, 30 August 2013, 01:24AM

2 routers remain to be resumed.


Comment by OVH - Friday, 30 August 2013, 01:25AM

At least 1 of the 2 routers is UP. OSPF is cut, and everything works, but over BGP.


Comment by OVH - Friday, 30 August 2013, 01:31AM

Communication between RBX routers doesn't go through the internal network, but through the backbone. We are checking in order to fix this problem.


Comment by OVH - Friday, 30 August 2013, 01:31AM

Internal network resumed. RBX1 network is stable.


Comment by OVH - Friday, 30 August 2013, 03:36AM

RBX1 has a very particular network configuration based on 2 routers, rbx-1-6k and rbx-2-6k.
Those 2 routers manage the interconnection for about 120-130 small routers.
This architecture has been in use since 2006. We have it only in RBX1, and its particularities made it complicated to set up all the new services (VAC, the vrack, etc.).
We had to simplify this configuration until we replace all these routers with 4 big ones, as in all our other DCs (the 4 routers arrived 2 weeks ago and we expect to switch RBX1 over by the end of September).

We knew that simplifying this configuration would have an impact on the availability of the RBX1 DC, and we knew we would have to replace some routers with spares.
So we chose to perform the intervention during the day, when the maximum number of staff is available to intervene quickly on the hardware.
And it worked. In the end, we had to remove OSPF completely and use only BGP.

88 routers have been reconfigured; 42 more remain.
We will perform the final configuration of these last 42 routers without generating new failures, then we will remove the old configuration.
After this experience, we wonder whether we should have done it this way right from the beginning...

The RBX1 problem impacted 2 other routers managing the vrack 1.0, which had not been updated for 2-3 months.
With an uptime of several years combined with today's problems, we hit RAM fragmentation and had to restart them.

The other DCs have not been impacted.
The problem concerned the RBX1 DC and, for part of the evening, the vrack 1.0 / IP LB.
VAC1 didn't work properly during this period.

We are sorry for the failures caused and for their duration.


Comment by OVH - Friday, 30 August 2013, 11:09AM

We will finish the reconfiguration of the
remaining 42 routers. To do so, we will
directly set up the BGP configuration, then
shut down the OSPF after the verifications.
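
The order described above (BGP first, verification, OSPF shutdown last) can be sketched as follows. This is only an illustration under stated assumptions, not our actual tooling: the `Router` class, the `migrate` and `bgp_ok` functions and the router names are all hypothetical, and the BGP session state is simulated with a flag.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical model of a small router; not a real device API."""
    name: str
    protocols: set = field(default_factory=lambda: {"ospf"})
    bgp_sessions_up: bool = False


def bgp_ok(router):
    """Hypothetical verification: BGP is configured and its sessions are up."""
    return "bgp" in router.protocols and router.bgp_sessions_up


def migrate(router, verify):
    """Add BGP, verify it, and only then remove OSPF.

    Returns True if OSPF was shut down, False if the router was left
    running OSPF because the verification failed.
    """
    router.protocols.add("bgp")
    router.bgp_sessions_up = True  # stand-in for real session establishment
    if not verify(router):
        # Verification failed: keep OSPF so the router stays reachable.
        return False
    router.protocols.discard("ospf")
    return True


# Hypothetical router names, for illustration only.
routers = [Router("example-router-1"), Router("example-router-2")]
for r in routers:
    done = migrate(r, bgp_ok)
    print(r.name, "migrated" if done else "left on OSPF", sorted(r.protocols))
```

The point of the ordering is that a router never ends up with no routing protocol at all: if the verification fails, OSPF stays in place.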


Comment by OVH - Friday, 30 August 2013, 15:55PM

Configuration finished. We have no more issues.

We will now isolate RBX1 from the backbone
like all other DC routers.


Comment by OVH - Friday, 30 August 2013, 15:59PM

Done.

We can now isolate the internal network
from the backbone.


Comment by OVH - Friday, 30 August 2013, 16:02PM

Done.

Old SBG/RBX network shut down. Mitigation by VAC2
in SBG and VAC3 in BHS is correctly reaching servers
hosted in RBX1. Yes :)
Packets from SBG to RBX are passing on the new network.


Comment by OVH - Friday, 30 August 2013, 16:04PM

Old RBX/GRA network shut down. The traffic is
flowing well between the DCs and across the new
internal network.


Comment by OVH - Friday, 30 August 2013, 16:04PM

All done.