Bug #28142

Make sure swap mirror is destroyed when installing to disks that previously had a swap mirror

Added by Sam Fourman over 2 years ago. Updated almost 2 years ago.

Status: Done
Priority: Important
Assignee: Ryan Moeller
Category: OS
Target version:
Seen in: TrueNAS - TrueNAS 11.1-U2
Severity: Low Medium
Reason for Closing:
Reason for Blocked:
Needs QA: Yes
Needs Doc: No
Needs Merging: No
Needs Automation: No
Support Suite Ticket:
Hardware Configuration:
ChangeLog Required: No

Description

The first screenshot illustrates the issue; the second shows that `gmirror forget swap` resolves it.

Associated revisions

Revision a5b512bb (diff)
Added by Ryan Moeller almost 2 years ago

[install.sh] Make sure swap mirror is destroyed

Fixes an edge case when installing to disks that previously had a swap mirror set up but the partition table was destroyed without destroying the swap mirror.

Ticket: #28142

Revision 214e5a0c (diff)
Added by Ryan Moeller almost 2 years ago

[install.sh] Make sure swap mirror is destroyed

Fixes an edge case when installing to disks that previously had a swap mirror set up but the partition table was destroyed without destroying the swap mirror.

Ticket: #28142

History

#1 Updated by Dru Lavigne over 2 years ago

  • Assignee changed from John Hixson to Sean Fagan

#4 Updated by Dru Lavigne over 2 years ago

  • Assignee changed from Sean Fagan to John Hixson

#5 Updated by Ben Gadd over 2 years ago

  • Target version changed from TrueNAS 11.1-U2 to TrueNAS 11.2

#6 Updated by Dru Lavigne almost 2 years ago

  • Assignee changed from John Hixson to William Grzybowski

#7 Updated by Alexander Motin almost 2 years ago

  • Status changed from Not Started to Unscreened
  • Assignee changed from William Grzybowski to Ryan Moeller

I am moving this to Ryan since he was in this area recently.

#8 Updated by Ryan Moeller almost 2 years ago

  • Status changed from Unscreened to Not Started

#9 Updated by Sean Fagan almost 2 years ago

Hm. The code does "gmirror destroy -f swap" so why is that not sufficient?

#10 Updated by Ryan Moeller almost 2 years ago

Indeed, that's what I am wondering. In the first screenshot we see that happening near the top. And then a disk gets partitioned below that. I need to look into what might be trying to modify the partitioning again after that. It may have been a section of code for adding a swap partition that was commented out previously and recently got removed by me.

#11 Updated by Ryan Moeller almost 2 years ago

According to git blame, that commented-out code had been commented out for at least 3 years, so that wasn't it.

#12 Updated by Sean Fagan almost 2 years ago

All the glabel stuff is obnoxious, and keeps biting us.

#13 Updated by Ryan Moeller almost 2 years ago

  • Status changed from Not Started to In Progress

The first screenshot shows the error is happening after the existing partition tables on both ada0 and ada1 have been destroyed and cleared with dd, and the error occurs between ada0 and ada1 being partitioned.
The second screenshot shows that `gmirror forget swap` didn't help delete the swap partition that was successfully created on ada0, but `gpart destroy swap` (which we already do) did work.

The problem occurred after ada0 had been partitioned. Each disk is partitioned twice (for a good reason I won't get into here), and this looks to be the first pass. What must be failing is destroying the partitions on ada0 after they have been created. There should be no gmirror metadata on the swap partition at this point, as we have destroyed the swap gmirror metadata. However, if a swap mirror previously existed in the same location, but the partitions had been destroyed without destroying the swap mirror first (i.e., outside of the installer), the metadata would still exist because we didn't originally see a partition there with metadata to destroy. If that is the case, adding yet another invocation of `gmirror destroy -f swap` before destroying the temporary partition table should prevent this error.

I'll try this theory out to see if it reproduces the reported behavior.
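
The proposed fix amounts to clearing any swap mirror once more, immediately before tearing down the temporary partition table. A minimal sketch of that ordering, assuming a hypothetical helper name `destroy_partitions` and a hypothetical `RUN` variable for dry runs (neither is taken from install.sh; only the two commands come from this ticket):

```shell
# Hypothetical helper, not the actual install.sh code.
# Set RUN=echo to print the commands instead of executing them.
destroy_partitions() {
    _disk=$1
    # Stale gmirror metadata can be picked up again once a new swap partition
    # covers the old location, so destroy the mirror a second time.
    # A missing mirror is not an error here, so the failure is ignored.
    ${RUN:-} gmirror destroy -f swap 2>/dev/null || true
    ${RUN:-} gpart destroy -F "$_disk"
}
```

With `RUN=echo` set, calling `destroy_partitions ada0` only prints the two commands in order, which makes the sequence easy to check without touching a disk.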

#14 Updated by Sean Fagan almost 2 years ago

The GEOM metadata is at the end of the partition, I think. And the installer only does 1 MB at the beginning and end of the disk.

BTW, while you're in there with the partitioning code, should we drop the freenas-boot partitions down to the nearest GByte?

#15 Updated by Ryan Moeller almost 2 years ago

The installer does clear 2 MiB at the beginning and end of the disk, and the gmirror metadata is stored in the last sector of the provider, which is the swap partition somewhere in the middle of the disk.
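
To make the arithmetic concrete, here is a rough sketch that checks whether the last sector of a partition falls inside the regions zeroed at either end of the disk. The function name, the 512-byte sector size, and the example partition offset are assumptions for illustration, not values read from the installer:

```shell
# Prints "survives" if the last sector of a partition lies outside the
# regions zeroed at the start and end of the disk, otherwise "wiped".
# Arguments are in bytes: disk size, partition start, partition size.
metadata_after_wipe() {
    disk_bytes=$1
    part_start=$2
    part_size=$3
    wipe=$((2 * 1024 * 1024))                  # 2 MiB cleared at each end
    sector=512                                 # assumed sector size
    meta=$((part_start + part_size - sector))  # offset of the provider's last sector
    if [ "$meta" -ge "$wipe" ] && [ $((meta + sector)) -le $((disk_bytes - wipe)) ]; then
        echo survives
    else
        echo wiped
    fi
}

# 5 GiB disk with a 1 GiB swap partition placed after a 256 MiB boot partition.
# The extra 20 KiB for the partition table is an illustrative assumption.
metadata_after_wipe $((5 * 1024 * 1024 * 1024)) $((256 * 1024 * 1024 + 20480)) $((1024 * 1024 * 1024))
```

For this layout the check prints `survives`: the metadata sector sits roughly 1.25 GiB into the disk, far from either zeroed region.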

For the freenas-boot (freebsd-zfs) partition, I worry that rounding down to the GiB could throw away valuable space on 8 GiB devices, where you might be just under 8 GiB after carving out a partition table and EFI partition. What are the benefits of rounding this? I can see it being convenient if later cloning to a barely smaller storage device. Is there another reason?

I am wondering if the swap partition should be aligned to a 4k boundary? Currently it is not, though the zfs partition is. If it is worth forcing the alignment of the zfs partition, it may be worth aligning the swap partition the same way.
I also noticed the swap mirror is set to use only one device for reads rather than the default load balancing. I don't understand the rationale for this.
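
On the alignment question, a small sketch of what rounding a partition's starting LBA up to a 4 KiB boundary looks like. The function name is hypothetical and 512-byte sectors are assumed unless a sector size is passed; in practice `gpart add -a 4k` does this for you:

```shell
# Round a starting LBA up to the next 4 KiB boundary.
# Assumes 512-byte sectors unless a sector size is given as $2.
align_4k() {
    lba=$1
    sector=${2:-512}
    per_4k=$((4096 / sector))    # number of sectors in one 4 KiB block
    echo $(( (lba + per_4k - 1) / per_4k * per_4k ))
}

align_4k 34    # first usable LBA of a standard GPT with 512-byte sectors; prints 40
```

An LBA that is already a multiple of 8 (with 512-byte sectors) is returned unchanged.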

#16 Updated by Sean Fagan almost 2 years ago

Rounding down the size is helpful when someone goes to add a thumb drive as a mirror. The sizes are so variable that even 1 MB isn't enough to ensure it. 4, 8, or 16 MB as the round-down value would be worth considering.

And yeah, align the swap partition to 4k.
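
For a sense of the trade-off between the round-down values discussed here, a quick sketch. The function name and the 7,900,000,000-byte partition size are made up for illustration and do not come from any real device:

```shell
# Round a size in bytes ($1) down to a multiple of a step in bytes ($2).
round_down() {
    echo $(( $1 / $2 * $2 ))
}

MiB=$((1024 * 1024))
size=7900000000                       # hypothetical partition size on a nominal 8 GB device
round_down "$size" $((1024 * MiB))    # whole GiB: prints 7516192768
round_down "$size" $((16 * MiB))      # 16 MiB granularity: prints 7885291520
```

With these numbers, rounding to a whole GiB gives up about 366 MiB, while rounding to 16 MiB gives up about 14 MiB.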

#17 Updated by Ryan Moeller almost 2 years ago

Ok, I'll include those changes with the related cleanup I am doing.

I've used the following steps to reproduce the scenario I described:

# simulate installer configuration
sysctl kern.geom.label.disk_ident.enable=0
gmirror load

# set up a test disk image
truncate -s 5g test.img
disk=$(mdconfig -a test.img)

# create a standard partition table
gpart create -s gpt $disk
gpart add -t freebsd-boot -s 256m -i 1 $disk
gpart add -t freebsd-swap -a 4k -s 1g -i 3 $disk
gpart add -t freebsd-zfs -a 4k -i 2 $disk

# label the mirror
gmirror label testmirror ${disk}p3
gmirror status testmirror # the partition is active

# temporarily disable gptid labels so the mirror doesn't immediately start back up on the gptid provider
sysctl kern.geom.label.gptid.enable=0

# stop the mirror so it is no longer active but metadata remains on disk
gmirror stop testmirror

# destroy the partition table
gpart destroy -F $disk

# re-enable gptid labels
sysctl kern.geom.label.gptid.enable=1

# Now the image is in the state I expect ada0 in this ticket was in.
# We simulate the installer:

# nothing up my sleeves
gmirror destroy -f testmirror
gpart destroy -F $disk
dd if=/dev/zero of=/dev/$disk bs=1m count=2
size=$(diskinfo $disk | cut -f 3)
dd if=/dev/zero of=/dev/$disk bs=1m oseek=$((size / (1024*1024) - 2))

gpart create -s gpt $disk
gpart add -t freebsd-boot -s 256m -i 1 $disk
gpart add -t freebsd-swap -a 4k -s 1g -i 3 $disk
gpart add -t freebsd-zfs -a 4k -i 2 $disk

# this produces the error message seen in the first screenshot after the partitions were created
gpart destroy -F $disk

# it failed because the mirror is active again
gmirror status testmirror

# the fix:
gmirror destroy -f testmirror

# now it works
gpart destroy -F $disk

# clean up
mdconfig -du ${disk#md}
rm test.img

The behavior matches, so I will apply the fix for this issue.

#18 Updated by Ryan Moeller almost 2 years ago

#19 Updated by Ryan Moeller almost 2 years ago

#21 Updated by Ryan Moeller almost 2 years ago

  • Status changed from In Progress to Ready for Testing
  • Needs Merging changed from Yes to No

#22 Updated by Dru Lavigne almost 2 years ago

  • Project changed from TrueNAS to FreeNAS
  • Subject changed from TrueNAS ISO instalation fails due to swap in use to Make sure swap mirror is destroyed when installing to disks that previously had a swap mirror
  • Category changed from OS to OS
  • Target version changed from TrueNAS 11.2 to 11.2-RC2
  • Needs Doc changed from Yes to No
  • Migration Needed deleted (No)
  • Hide from ChangeLog deleted (No)
  • Support Department Priority deleted (0)

#23 Updated by Dru Lavigne almost 2 years ago

  • Status changed from Ready for Testing to Done
