p.mayers at imperial.ac.uk
Mon Feb 18 01:57:47 UTC 2008
Andy Hood wrote:
> David W. Hankins wrote on 16/02/2008 06:10:46 AM:
>> On Thu, Feb 14, 2008 at 03:39:50PM -0500, Frank Sweetser wrote:
>>> The *simplest* (not necessarily best, especially in all circumstances -
>>> the simplest) solution is to flat out avoid requiring DHCP on a small
>>> number of core servers that must be brought up to provide essential
>>> services (authentication, DNS, etc).
>> Yes, precisely. It is also simplest to flat out avoid requiring
>> ethernet autonegotiation on a small number of core servers and all
>> inter-switch links that must be brought up to provide essential
>> services.
> Quoting from
> (because it's too hard to find a copy of the IEEE spec):
> According to the IEEE 802.3z specification, gigabit speeds of any duplexity
> must be autonegotiated for copper (TX)-based adapters.
There is no half duplex 1000T (thank god). The only thing "negotiated"
by autoneg on 1000T is flow control. The history behind why there is a
half-duplex 1000SX is fascinating, but not relevant here.
> AIX 5.3 and gigabit ethernet cause us no end of grief because local
> network policy says all switches must be set to fixed speeds, and the
> drivers follow the IEEE spec, so cannot be configured to 1000/Full.
This is way OT. But with respect: that's a bad local policy.
The "autonegotiation is bad in the datacentre" argument is junk
IMNSHO. We suffered through it here; I railed against it and was proven
comprehensively right. We've been much better off since going autoneg
everywhere, on links between servers, switches and routers.
As far as I can tell, the autoneg == bad meme stems from the bad old
days when 100Tx was just getting rolled out and a considerable number of
so-called "good" network cards were in fact s**t (e.g. 3c905b). In a
network of >20k hosts, >150 servers and >1200 edge switches, we have
had zero autoneg problems in the best part of 5 years. We have had
problems on hosts with autoneg disabled and forced by "clever" people
to full duplex, when of course the switch falls back to half, and
performance suffers abysmally, but not badly enough to take the box
down, so the misconfiguration can persist for months.
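One cheap way to audit a fleet for that lingering failure mode is to parse `ethtool`-style output and flag any interface that is running half duplex or has autonegotiation switched off. A minimal Python sketch; the sample output below is illustrative only, since real `ethtool` output varies by driver:

```python
# Flag interfaces that look like duplex-mismatch candidates:
# half duplex, or autonegotiation disabled. The sample text is an
# invented, simplified imitation of `ethtool <iface>` output.

SAMPLE = """\
Settings for eth0:
\tSpeed: 100Mb/s
\tDuplex: Half
\tAuto-negotiation: off
"""

def looks_misconfigured(ethtool_output: str) -> bool:
    """True if the link is half duplex or autoneg is off --
    both worth investigating on a modern switched network."""
    settings = {}
    for line in ethtool_output.splitlines():
        if ":" in line:
            key, _, value = line.strip().partition(":")
            settings[key] = value.strip()
    return (settings.get("Duplex") == "Half"
            or settings.get("Auto-negotiation") == "off")

print(looks_misconfigured(SAMPLE))  # True
```

Run against the real command's output per interface, this would surface the forced-duplex boxes long before anyone notices the abysmal throughput.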
There's a reason the spec made it mandatory. It provides excellent fault
detection and protects against misconfiguration; on fibre links it
protects against some very common problems e.g. uni-directional links.
...and before people start in, UDLD protects against a *different* set
of uni-directional problems at layers above the PHY - e.g. patching TX
and RX fibres to different remote points.
Coming back to the original subject: DHCP for servers is a somewhat
different matter; in many cases there are reasons why it doesn't work.
The main ones I can think of are servers with >1 IP (e.g. terminating >1
SSL website) and any servers which the DHCP service itself depends on
(e.g. DNS).
On the flip side: for simple single-homed servers I have no qualms about
recommending DHCP, because they will be on fixed-address statements on
>1 DHCP server. Our DHCP infrastructure is basically the most reliable
service we run; but judging from the doubts expressed in this thread,
maybe others have more troubles ;o)
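For reference, such a fixed-address entry is just a host declaration replicated into the configuration of each DHCP server; the hostname, MAC and address below are invented placeholders:

```
# dhcpd.conf fragment - duplicate on every DHCP server
host www1 {
  hardware ethernet 00:16:3e:aa:bb:cc;   # placeholder MAC
  fixed-address 192.0.2.10;              # placeholder address
}
```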