[pfSense] NIC Failover

Adam Thompson athompso at athompso.net
Mon Sep 12 01:24:42 EDT 2011


If I understand you correctly, you're trying to use pfSense in transparent 
mode, across two independent uplinks that are part of the same [M|R]STP 
domain?

IIRC, pfSense can participate in the [R]STP topology change process (i.e. 
pfSense speaks STP itself), which means you would probably not be seeing 
the behaviour you expected: in that scenario, pfSense looks (to STP) like 
just another switch - albeit a broken switch that doesn't forward all 
packets :-)

In FreeBSD, STP is turned on and off per bridge member via ifconfig(8) 
with the "[-]stp" keyword; the "proto [rstp|stp]" directive controls 
backward-compatibility mode; and "[-]ptp" and "[-]edge" alter the 
behaviour of the state machine ("edge" being the Learning mode bypass, 
aka "portfast").  However, ifconfig(8) states the default is STP disabled 
for all bridged interfaces, yet the Networking Handbook states the default 
is RSTP enabled for all bridged interfaces.  I note that those two facts 
are not - precisely - contradictory: the protocol can default to RSTP 
while remaining switched off on every member until "stp" is set.

+------+      +------+
|switch|      |switch|
+------+      +------+
      \RSTP    /RSTP
       \      /
      +-------+
      |pfsense|
      +-------+
       /     \
      /RSTP   \RSTP
+------+      +------+
|switch|      |switch|
+------+      +------+

In a diagram like this, the question is: does pfSense participate as though 
it were a switch (which, technically, it is while operating in transparent 
mode), or as though it were merely a "bump in the wire" (which it also is 
while operating in transparent mode)?

If you think of the pfSense box in the center of the diagram as an 
RSTP-capable switch, it's immediately obvious why there aren't two 
separate RSTP domains in the network.  If you think of the pfSense box as 
two distinct wires, you would expect two separate RSTP domains.

Which topology are you aiming for?  Since FreeBSD doesn't support MSTP - 
the kernel runs a single RSTP process - I don't think you can achieve two 
distinct STP domains if the pfSense box participates in STP at all.

I don't know if turning off STP in FreeBSD allows STP packets to pass 
through the bridge unhindered, since doing so would technically violate 
IEEE 802.1D (STP packets are sent to a reserved link-local multicast 
address, never to be forwarded by a bridge).
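
To make that forwarding rule concrete, here is a minimal Python sketch of 
the destination-address check a standards-compliant bridge applies.  The 
helper names (parse_mac, is_bridge_filtered) are made up for illustration 
and are not FreeBSD or pfSense code.  The range used is the reserved block 
01:80:C2:00:00:00 through 01:80:C2:00:00:0F, of which STP BPDUs use the 
first address.

```python
# Hypothetical helpers, for illustration only.

# First five octets shared by the reserved bridge group addresses.
RESERVED_PREFIX = bytes.fromhex("0180c20000")


def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' or 'aa-bb-cc-dd-ee-ff' to 6 raw bytes."""
    parts = text.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 octets, got %d" % len(parts))
    return bytes(int(part, 16) for part in parts)


def is_bridge_filtered(mac_text):
    """True if a compliant bridge must not forward frames sent to this MAC."""
    mac = parse_mac(mac_text)
    return mac[:5] == RESERVED_PREFIX and mac[5] <= 0x0F


if __name__ == "__main__":
    for dest in ("01:80:c2:00:00:00", "01:80:c2:00:00:10", "ff:ff:ff:ff:ff:ff"):
        print(dest, is_bridge_filtered(dest))
```

A box that passed the first address through untouched would be acting as 
a "bump in the wire" rather than as a bridge.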

I believe, although I'm not certain, that in this topology, you can never 
have two distinct RSTP domains operating correctly.  The solution is to 
use 802.1q and run a single trunk through the pfSense box, and join the 
switches at the top together with a .1q trunk, and the switches at the 
bottom together with a .1q trunk.  If you need the bandwidth of two links, 
the solution is... LACP (LAGs).  Run 802.1q-over-802.3ad (i.e. 
VLAN-on-LACP) to solve all the topology problems while preserving 
redundancy.

Note that if you have *two* pfSense boxes, you'll still need [R]STP to 
prevent loops as there will then be two independent bridged paths even 
with .1q trunking, but at least you'll only have a single [R]STP domain to 
worry about.

Always remember, when using any form of STP, to a) manually designate a 
root bridge, preferably one near the core of the network - i.e. the one 
with the most uplinks leading to it, b) designate uplinks (more 
specifically, Point-to-Point links) correctly to prevent unnecessary 
outages, and c) keep those port designations 100% up-to-date at all times.
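
As a rough illustration of why point a) matters, the sketch below models 
root election in Python: the bridge with the lowest bridge ID wins, where 
the ID compares priority first and MAC address second.  The switch names, 
MAC addresses, and function names are invented for the example; 32768 is 
the usual default priority.

```python
# Hypothetical model of STP root election, for illustration only.

def bridge_id(priority, mac):
    """Bridge ID as a comparable tuple: priority first, then MAC bytes."""
    return (priority, bytes.fromhex(mac.replace(":", "")))


def elect_root(bridges):
    """Return the name of the bridge with the lowest bridge ID.

    `bridges` maps a name to a (priority, mac) pair.
    """
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


if __name__ == "__main__":
    # Everything left at the default priority: the lowest MAC wins,
    # wherever that switch happens to sit in the network.
    defaults = {
        "core": (32768, "00:1b:00:00:00:30"),
        "edge-old": (32768, "00:0a:00:00:00:01"),
        "pfsense": (32768, "00:25:90:00:00:10"),
    }
    print(elect_root(defaults))

    # Lowering the priority on the core switch makes the outcome explicit.
    tuned = dict(defaults, core=(4096, "00:1b:00:00:00:30"))
    print(elect_root(tuned))
```

With all priorities left at the default, the election falls through to 
the MAC address, so the root can end up on whichever box has the lowest 
one rather than on the switch you would have chosen.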

While it's true that RSTP can achieve sub-10-second convergence, 
particularly in very small networks (<20 switches), LACP (and any other 
form of LAG, normally) still has some benefits IMHO:
  - it's a point-to-point solution, which means you don't have to touch 
every single switch in the network when you introduce or change LAGs, only 
the two devices you're connecting with a LAG
  - it's transparent to any and all protocols (except its own PDUs)
  - it allows, under normal conditions, the full use of all the bundled 
links simultaneously (subject to some statistical assumptions about 
traffic distribution)
  - it doesn't interfere with any other L2 or L3 redundancy protocol, 
including [M|R]STP and ECMP
  - it can be treated, in general, like an ordinary Ethernet interface 
that just happens to have some redundancy built-in
  - failover time can be configured to be as low as one millisecond 
(1ms!), or 10 to 20 packets at full line rate, depending on the vendor and 
switch model involved
  - it (almost) never puts a port into blocking state except when said 
port is misconfigured
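
The "statistical assumptions" caveat above can be sketched in Python.  A 
LAG typically pins each flow to one member link by hashing some header 
fields, so the bundle only fills evenly when there are many distinct 
flows.  This is a toy model: crc32 stands in for whatever vendor-specific 
hash a real switch uses, and the function and interface names are 
assumptions made for the example.

```python
# Toy model of per-flow link selection in a LAG, for illustration only.
import zlib
from collections import Counter


def pick_link(flow, links):
    """Choose a member link for a flow by hashing its header fields."""
    key = "|".join(str(field) for field in flow).encode()
    return links[zlib.crc32(key) % len(links)]


def distribution(flows, links):
    """Count how many flows land on each member link."""
    return Counter(pick_link(flow, links) for flow in flows)


if __name__ == "__main__":
    links = ["em0", "em1"]
    # 50 flows: (source IP, destination IP, source port, destination port)
    flows = [("10.0.0.%d" % i, "10.0.1.1", 1024 + i, 443) for i in range(50)]

    print("both links up:", dict(distribution(flows, links)))

    # Failover: em0 drops out and every flow rehashes onto what is left.
    print("em0 down:", dict(distribution(flows, ["em1"])))
```

A single flow always hashes to the same member, so one large transfer 
never gets more than one link's worth of bandwidth.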

As you can tell, I dislike STP intensely, and I'm a big fan of LAG 
technology, whether that be Etherchannel, LACP, or something else.  I do 
agree that both DCB and TRILL will largely make STP's limitations 
irrelevant.  STP was certainly necessary before the advent of LAG 
technology - and still is in some cases - but IMHO has been made obsolete 
in most designs by multi-chassis LAG.

-Adam Thompson
 athompso at athompso.net



> -----Original Message-----
> From: list-bounces at lists.pfsense.org [mailto:list-
> bounces at lists.pfsense.org] On Behalf Of Joseph Hardeman
> Sent: Sunday, September 11, 2011 21:23
> To: 'pfSense support and discussion'
> Subject: Re: [pfSense] NIC Failover
>
> Interesting
>
> I do know when building out a redundant network so that you have
> multiple paths to the same destination, you have to have some sort
> of method allowing traffic to be able to change its path if a
> switch or fiber in the middle goes down, while VLAN's do help in
> separating traffic RSTP allows for the quickest way for traffic to
> switch between network links.  For instance if you have a circle
> network (basically a loop) Spanning-tree or Rapid Spanning-tree
> helps manage what path is chosen, basically disabling the other
> path, and keeps the network from over running itself by the loop,
> just like OSPF also will help direct traffic by opening the
> shortest path.
>
> Actually the LAGG I was speaking about was the LAGG configured in
> pfSense not on the switch side, when the IP moved over to the
> failover NIC on pfSense then spanning tree would kick in on the
> vlan that is running that network and see that it is now available
> off a different leg than previously.
>
> Now, I of course could definitely be wrong about spanning tree and
> the best way to manage a network, there are a whole lot of smarter
> people out there than me and I am quite aware of my limitations.
> :-)  So I am more than happy to hear and learn of a better way of
> doing things.  Anything I can do to make our lives easier I am
> happy to do.
>
> Joe
>
>
> -----Original Message-----
> From: Jim Thompson [mailto:jim at netgate.com]
> Sent: Sunday, September 11, 2011 9:12 PM
> To: Joseph Hardeman
> Cc: 'pfSense support and discussion'
> Subject: Re: [pfSense] NIC Failover
>
> Most of the issues with STP are dealt with via 802.1w (rapid
> spanning tree)
>
> On Sep 11, 2011, at 9:15 AM, Joseph Hardeman wrote:
>
> > Hey Everyone,
> >
> > So I can do the failover and yes all of the switches are managed.
> I did see where to setup the LAGG on the pfSense system.  I have to
> deconfigure the two nics I want to use and then set them up in
> failover mode I think.  On the switch side, I was using 2 separate
> switches with rapid spanning tree on their uplink ports and ports
> to the pfSense system to assist in fast failover.  I will give it a
> shot on Monday and see how it goes.
> >
> > Thanks.
> >
> > Joe
> >
> > -----Original Message-----
> > From: list-bounces at lists.pfsense.org [mailto:list-
> bounces at lists.pfsense.org] On Behalf Of Chris Buechler
> > Sent: Sunday, September 11, 2011 1:04 AM
> > To: pfSense support and discussion
> > Subject: Re: [pfSense] NIC Failover
> >
> > On Sun, Sep 11, 2011 at 12:46 AM, Austin G. Smith
> <Austin at digitalcompass.com> wrote:
> >> I have had issues with stp on the firewall in this type setup
> previously.
> >> Mileage may vary for others..
> >>
> >
> > If you're bridging, yeah that can be a concern depending on your
> config. Failover lagg without bridging won't cause any issues with
> STP though. May see switches on occasion that have an issue with a
> MAC quickly moving from one port to another related to its CAM
> table, or sometimes with security features on the switch, but
> that's pretty unusual with typical switch configs. And usually in
> that scenario you're going to be on two diff switches anyway with
> failover lagg.
> > _______________________________________________
> > List mailing list
> > List at lists.pfsense.org
> > http://lists.pfsense.org/mailman/listinfo/list
>