Troubleshooting Multiple Disks
If a system has multiple disks and pfSense software has been installed on both, it is possible they may conflict in one or more ways. For example, this can happen if an older disk was left in place after adding a new disk of a different type and reinstalling to the new disk.
In these situations, best practice is to remove the unused disk, but that is not always possible, for example when the original installation used an embedded disk such as eMMC. If the disk cannot be removed, the next best solution is to clear the metadata from the unused disk.
A common way multiple disks conflict is when they both use the same ZFS label. In that case, which ZFS pool the OS uses is unpredictable and may change depending on the boot order. Another is when the OS boots the kernel from one disk but mounts the root filesystem from the other, leading to a situation where the installed OS appears up-to-date but is booting with an outdated kernel.
Identify the Disk
To clear the metadata safely, first identify the unused disk. This may take some investigation, but typically disks are listed in the full boot log output in /var/log/dmesg.boot, the output of sysctl kern.disks, and the output of other OS commands such as geom disk list, gpart list, and geom -t. In some cases it’s clear which is which, such as when using an add-on SSD instead of eMMC, where the eMMC disk is named mmcsdX and the SSD is ndaX or adaX.
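As a rough aid, the naming convention above can be captured in a small shell helper. This is only a sketch: the function name classify_disk is hypothetical, and the descriptions attached to nda (NVMe) and ada (ATA/SATA) are assumptions based on FreeBSD driver naming rather than something this page states. Names it does not recognize are reported as unknown.

```sh
# classify_disk NAME: print a rough description based on the device name prefix.
# Hypothetical helper; the prefix-to-type mapping is an assumption.
classify_disk() {
    case "$1" in
        mmcsd[0-9]*) echo "eMMC/SD" ;;
        nda[0-9]*)   echo "NVMe" ;;
        ada[0-9]*)   echo "ATA/SATA" ;;
        *)           echo "unknown" ;;
    esac
}

# Example (FreeBSD/pfSense only): describe each disk the kernel reports.
# for d in $(sysctl -n kern.disks); do
#     printf '%s\t%s\n' "$d" "$(classify_disk "$d")"
# done
```

The prefix only hints at the disk type; it does not say which disk is in use, which is what the following sections determine.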
It’s also possible that the drive that loaded the kernel at boot time is different from the drive mounted as the root of the filesystem (/). Once booted, it’s not possible to determine which drive loaded the kernel, but it is possible to determine which drive holds the root filesystem.
Note
If the unused disk cannot be definitively identified, take a backup, clear the data from all disks, and then reinstall.
When a system has multiple disks, odds are high that the disk holding the live root filesystem is the intended disk and whichever disk is not used for the root filesystem is the one that should be wiped to avoid conflicts.
UFS
On UFS systems, look at the first column (Filesystem) of the output of df /:
If it’s a disk device (e.g. /dev/ada0s2a), then note it and move on. The filesystem device name will include a slice or partition identifier at the end (e.g. s2a in the previous example) but it should be possible to match the disk name against the list in sysctl kern.disks.
In this case, it is located on ada0s2a, which is the first filesystem in the second slice of the disk ada0.
If it contains a label (e.g. /dev/diskid/, /dev/gpt/, or /dev/ufs/), look at the output of glabel status, match the start of the filesystem label with the Name column, and find the disk device in the corresponding Components column.
$ df /
Filesystem                    1K-blocks    Used   Avail Capacity  Mounted on
/dev/diskid/DISK-9D1CEC59s2a    7353532 1664148 5101104    25%    /
$ glabel status
                Name  Status  Components
diskid/DISK-9D1CEC59     N/A  ada0
In this case, the root filesystem is located on /dev/diskid/DISK-9D1CEC59s2a, which is the first filesystem in the second slice of disk ID DISK-9D1CEC59, which corresponds to the disk ada0.
Other label types may not always include a slice or partition identifier.
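The matching steps above can be sketched as two small shell helpers. Both function names (disk_of and label_to_disk) are hypothetical. The suffix pattern assumes slice or partition identifiers of the form s2, s2a, or p4 and could misfire on a label that happens to end in such a sequence, so treat the result as a hint to confirm against sysctl kern.disks.

```sh
# disk_of DEVICE: strip "/dev/" and a trailing slice/partition suffix
# (such as "s2", "s2a", or "p4") to leave the disk or label name.
disk_of() {
    printf '%s\n' "$1" | sed -E -e 's#^/dev/##' -e 's/(s[0-9]+[a-z]?|p[0-9]+)$//'
}

# label_to_disk LABEL: read "glabel status" output on stdin and print the
# Components column for the row whose Name column equals LABEL.
label_to_disk() {
    awk -v name="$1" '$1 == name { print $3 }'
}

# Example (FreeBSD/pfSense only):
# root_dev=$(df / | awk 'NR == 2 { print $1 }')
# glabel status | label_to_disk "$(disk_of "$root_dev")"
```

With the example output above, disk_of reduces /dev/diskid/DISK-9D1CEC59s2a to diskid/DISK-9D1CEC59, and label_to_disk then finds ada0 in the Components column.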
ZFS
For systems using ZFS, check the output of zpool status and look at the disk names in the output:
$ zpool status
pool: pfSense
state: ONLINE
scan: scrub repaired 0B in 00:00:17 with 0 errors on Wed Feb 22 11:03:52 2023
config:
NAME STATE READ WRITE CKSUM
pfSense ONLINE 0 0 0
nda0p4 ONLINE 0 0 0
errors: No known data errors
In this output, the ZFS pool is located on nda0p4, which is the fourth partition on the disk nda0.
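For pools with more than one device, reading the names by eye gets tedious. The following is a minimal sketch, assuming the usual zpool status layout shown above (a NAME header row, then the pool row, then device rows); the function name zpool_devices is hypothetical, and grouping rows such as mirror-0 or raidz1-0 are skipped on the assumption that only leaf devices are of interest.

```sh
# zpool_devices: read "zpool status" output on stdin and print the device
# names listed under each pool, skipping the pool row and grouping rows.
zpool_devices() {
    awk '
        $1 == "pool:"   { seen_pool = 0; in_cfg = 0 }   # a new pool section starts
        $1 == "NAME"    { in_cfg = 1; next }            # header of the config table
        $1 == "errors:" { in_cfg = 0 }                  # config table has ended
        in_cfg && NF >= 5 {
            if (!seen_pool) { seen_pool = 1; next }     # first row is the pool itself
            if ($1 ~ /^(mirror|raidz|draid)/) next      # grouping rows, not devices
            print $1
        }
    '
}

# Example (FreeBSD/pfSense only):
# zpool status | zpool_devices
```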
Using the Geom Tree
If there is any doubt about the devices in question based on the filesystem device, run geom -t. That command outputs a tree-style view of all disks and their components, such as slices/partitions. This can make it relatively simple to narrow down the disk which contains a given ID, partition, or slice with minimal searching through command output:
$ geom -t
Geom                                    Class      Provider
nda0                                    DISK       nda0
  nda0                                  PART       nda0p1
    nda0p1                              LABEL      gpt/efiboot0
      msdosfs.gpt/efiboot0              VFS
      gpt/efiboot0                      DEV
    nda0p1                              DEV
  nda0                                  PART       nda0p2
    nda0p2                              LABEL      gpt/gptboot0
      gpt/gptboot0                      DEV
    nda0p2                              DEV
  nda0                                  PART       nda0p3
    swap                                SWAP
    nda0p3                              DEV
  nda0                                  PART       nda0p4
    nda0p4                              DEV
    zfs::vdev                           ZFS::VDEV
  nda0                                  DEV
mmcsd0                                  DISK       mmcsd0
  mmcsd0                                DEV
  mmcsd0                                LABEL      diskid/DISK-9D1CEC59
    diskid/DISK-9D1CEC59                DEV
    diskid/DISK-9D1CEC59                PART       diskid/DISK-9D1CEC59s1
      diskid/DISK-9D1CEC59s1            DEV
      msdosfs.diskid/DISK-9D1CEC59s1    VFS
    diskid/DISK-9D1CEC59                PART       diskid/DISK-9D1CEC59s2
      diskid/DISK-9D1CEC59s2            DEV
      diskid/DISK-9D1CEC59s2            PART       diskid/DISK-9D1CEC59s2a
        diskid/DISK-9D1CEC59s2a         DEV
        ffs.diskid/DISK-9D1CEC59s2a     VFS
mmcsd0boot0                             DISK       mmcsd0boot0
  mmcsd0boot0                           DEV
mmcsd0boot1                             DISK       mmcsd0boot1
  mmcsd0boot1                           DEV
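On a system with many providers, the same lookup can be automated. The sketch below is an assumption-laden helper (the name disk_for_provider is hypothetical): it relies on every top-level disk appearing as a row with class DISK before its components, and on the Geom, Class, and Provider columns being separated by whitespace, so indentation does not matter.

```sh
# disk_for_provider NAME: read "geom -t" output on stdin and print the
# top-level disk under which NAME (a geom or provider name) first appears.
disk_for_provider() {
    awk -v want="$1" '
        $2 == "DISK" { disk = $1 }                       # remember the current disk
        ($1 == want || $3 == want) && disk != "" { print disk; exit }
    '
}

# Example (FreeBSD/pfSense only):
# geom -t | disk_for_provider diskid/DISK-9D1CEC59s2a
```

Against the tree above, looking up diskid/DISK-9D1CEC59s2a yields mmcsd0 and looking up nda0p4 yields nda0.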
Clear the Disk
In these examples the unused disk is mmcsd0.
The commands in these examples must be run from a console or SSH shell prompt. Do not attempt to execute these commands from the GUI. The best practice is to run them from the console and to have installation media on hand in case a reinstall is necessary.
Tip
If any of the commands generate an error, boot the Netgate Installer and perform the commands from a shell launched through the installer menu (AMD64 and AARCH64). When booted from install media, the disks in the device will not be mounted and can be safely cleared. For ARMv7 devices, boot the recovery installer and use Ctrl-Z to suspend the recovery process and reach a shell prompt to run the commands.
Wipe Metadata
The quickest and easiest way to wipe a disk is to clear its metadata.
The following commands clear the disk partition metadata, ZFS metadata, and also wipe the start of the disk to clear the partition table and other data at the beginning of the disk. Depending on the situation it may only be necessary to clear the ZFS metadata but it’s safer to clear it all.
### Stop a legacy style GEOM mirror and clear its metadata from all disks
### Mirror name may vary, check "gmirror status" output.
# gmirror destroy -f pfSenseMirror
### Clear the ZFS label (exact partition may vary)
# zpool labelclear -f /dev/mmcsd0p4
### Clear the partition metadata
# gpart destroy -F mmcsd0
### Wipe the first 1MB of the disk
# dd if=/dev/zero of=/dev/mmcsd0 bs=1M count=1 status=progress
Note
Alternately, skip the first two commands and omit the count=1 on dd to wipe the entire target disk from start to end.
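Because these commands are destructive, it can help to generate them for review before typing anything. The following is a minimal sketch, not part of the official procedure: the function name wipe_plan and its arguments are hypothetical, it only prints the commands rather than running them, and it refuses when the target is the disk identified earlier as holding the root filesystem. The gmirror step is left out because the mirror name varies.

```sh
# wipe_plan TARGET ROOT_DISK [ZFS_PARTITION]
# Prints (does not run) the metadata-clearing commands for TARGET. Refuses if
# TARGET is empty or is the disk holding the live root filesystem.
wipe_plan() {
    target=$1
    root_disk=$2
    zfs_part=${3:-}
    if [ -z "$target" ] || [ "$target" = "$root_disk" ]; then
        echo "refusing: target '$target' is empty or is the root disk" >&2
        return 1
    fi
    if [ -n "$zfs_part" ]; then
        echo "zpool labelclear -f /dev/$zfs_part"
    fi
    echo "gpart destroy -F $target"
    echo "dd if=/dev/zero of=/dev/$target bs=1M count=1 status=progress"
}

# Example: unused disk mmcsd0, root filesystem on nda0, old pool on mmcsd0p4.
# wipe_plan mmcsd0 nda0 mmcsd0p4
```

Printing first keeps the decision with the operator, who can compare the target against the earlier identification steps before running anything.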
Wipe Start and End of Disk
Another tactic is to wipe only the start and end of the disk. However, this approach is a much more complicated process as it involves calculations based on the sector size and number of sectors on the disk:
### Wipe the first 1MB of the disk
# dd if=/dev/zero of=/dev/mmcsd0 bs=1M count=1 status=progress
### Wipe the last 1MB of the disk
# dd bs=`diskinfo mmcsd0 | awk '{print $2}'` \
     if=/dev/zero \
     of=/dev/mmcsd0 \
     count=`diskinfo mmcsd0 | awk '{print ((1024 * 1024) / $2)}'` \
     seek=`diskinfo mmcsd0 | awk '{print $4 - ((1024 * 1024) / $2)}'` \
     status=progress
Note
Be sure to replace every instance of the target disk in each command, as the disk is referenced numerous times to obtain the necessary calculation numbers.
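The arithmetic behind the count and seek values can be checked separately from dd. As a sketch (the function name tail_wipe_args is hypothetical, and the disk geometry in the comment is an invented example, not a measured value): with 512-byte sectors, 1 MiB is 1048576 / 512 = 2048 sectors, so on a disk of 15269888 sectors the write starts at sector 15269888 - 2048 = 15267840.

```sh
# tail_wipe_args SECTOR_SIZE SECTOR_COUNT
# Prints the dd "count" and "seek" values (in sectors) covering the last 1 MiB.
tail_wipe_args() {
    sector_size=$1
    sector_count=$2
    count=$(( (1024 * 1024) / sector_size ))   # number of sectors in 1 MiB
    seek=$(( sector_count - count ))           # first sector of the final 1 MiB
    echo "count=$count seek=$seek"
}

# Example (FreeBSD/pfSense only): name the disk once and reuse the variable.
# DISK=mmcsd0
# set -- $(diskinfo "$DISK")    # $2 = sector size, $4 = media size in sectors
# tail_wipe_args "$2" "$4"
```

Holding the disk name in a single variable, as in the commented example, avoids having to edit it in several places.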