
Rebooting the GN production machine(s)

I needed to add the hard disks in the BIOS to make them visible - one of the annoying aspects of these Dell machines. First, on Tux02, I checked the borg backups to see if we have a recent copy of MariaDB, GN2 etc. The DB is from 2 days ago and the genotypes of GN2 are a week old (because of a permission problem). I'll add a copy of both by hand, which is also an opportunity to test the new 10Gbs router.
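The backup age check described above could be scripted. What follows is a minimal sketch, not the procedure actually used here: it assumes the JSON printed by `borg list --json` (an `archives` list whose entries carry an ISO-formatted `time` field), and the helper name `newest_archive_age` is made up for illustration.

```python
import json
from datetime import datetime, timedelta


def newest_archive_age(borg_list_json: str, now: datetime) -> timedelta:
    """Return the age of the most recent archive in `borg list --json` output.

    Assumption: every archive entry has a "time" field holding an ISO
    timestamp without timezone, e.g. 2024-01-30T02:00:00.000000.
    """
    archives = json.loads(borg_list_json)["archives"]
    if not archives:
        raise ValueError("repository contains no archives")
    # Pick the latest archive timestamp and compare it with `now`.
    newest = max(datetime.fromisoformat(a["time"]) for a in archives)
    return now - newest
```

One way to use it is to capture the output of `borg list --json` for the repository in question, pass it in together with `datetime.now()`, and flag anything older than, say, two days.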

Tasks

Before rebooting

On Tux02:

On both:

Info

Reconfiguring disks

We can add a lot of SATA disks. To configure run racadm and set serial settings

: racadm set iDRAC.Serial.Enable 1

Dump everything

: racdump

: get nic.nicconfig.4
: hwinventory

Reboot

: racadm>>serveraction powercycle

: racadm>>serveraction hardreset
: racadm>>serveraction powerstatus
: racadm>>connect com2

racadm>> config -g cfgSerial -o cfgSerialBaudRate 115200
config -g cfgSerial -o cfgSerialCom2RedirEnable 1
config -g cfgSerial -o cfgSerialSshEnable 1
config -g cfgIpmiSol -o cfgIpmiSolEnable 1
config -g cfgIpmiSol -o cfgIpmiSolBaudRate 115200
get iDRAC.serial

Now you should see the menu in

: console com2

Routing

On Tux02, eno2d1 is the 10Gbs network interface. Unfortunately I can't get it to connect at 10Gbs with Tux01, because the latter is using that port for the outside world.
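To confirm what speed an interface such as eno2d1 actually negotiated, the kernel exposes the value in Mb/s under `/sys/class/net/<iface>/speed`. The sketch below reads that file; the function name `link_speed_mbps` and the `sysfs_root` parameter are illustrative assumptions, the latter added only so the function can be tried against a scratch directory.

```python
from pathlib import Path
from typing import Optional


def link_speed_mbps(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[int]:
    """Return the negotiated link speed in Mb/s, or None if it is unknown."""
    try:
        raw = (Path(sysfs_root) / iface / "speed").read_text().strip()
    except OSError:
        # The interface does not exist, or the link is down and the
        # kernel refuses the read.
        return None
    try:
        speed = int(raw)
    except ValueError:
        return None
    # The kernel reports -1 when the speed is unknown.
    return speed if speed > 0 else None
```

A 10Gbs link should report 10000, so `link_speed_mbps("eno2d1") == 10000` is a quick check after recabling or a reboot.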

Playing with 10Gbs on Tux01 sent the hardware into a tailspin. What to think of this solution for the error

bnxt_en 0000:01:00.1 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg {0x0 0x0} len:0

Solution was to power down the server(s) and *remove* power cords for 5 minutes.

The Linux kernel shows some fixes that are not on Tux01 yet

In our case a simple reboot worked, fortunately.
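To notice the bnxt_en timeout quoted above before it takes the network down again, the kernel log can be scanned for that message. This is a rough sketch: the regular expression is derived only from the single log line in this page, and the name `find_bnxt_timeouts` is made up for illustration.

```python
import re

# Matches lines like:
# bnxt_en 0000:01:00.1 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg {0x0 0x0} len:0
BNXT_TIMEOUT = re.compile(
    r"bnxt_en (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
    r".*Error \(timeout: (?P<timeout>\d+)\)"
)


def find_bnxt_timeouts(log_text: str):
    """Return (pci_address, timeout_value) for each bnxt_en timeout line."""
    return [
        (m.group("pci"), int(m.group("timeout")))
        for m in BNXT_TIMEOUT.finditer(log_text)
    ]
```

Feeding it the output of `dmesg` gives the PCI address of each affected port, which tells you which of the two ports on the card is misbehaving.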

Tags

  • assigned: pjotrp
  • status: unclear
  • priority: medium
  • type: system administration
  • keywords: systems, tux01, tux02, production