Bug #4419

GUI reporting incorrect size of Pool

Added by Philip Nunn almost 7 years ago. Updated about 6 years ago.

Status:
Closed
Priority:
Nice to have
Assignee:
William Grzybowski
Category:
GUI (new)
Target version:
Severity:
New
Reason for Closing:
Reason for Blocked:
Needs QA:
Yes
Needs Doc:
Yes
Needs Merging:
Yes
Needs Automation:
No
Support Suite Ticket:
n/a
Hardware Configuration:
ChangeLog Required:
No

Description

In 9.2.1.2 rc/release, the GUI is now reporting an incorrect size for a Pool, although running "zpool list" reports the correct size.

NAME    SIZE   ALLOC   FREE   CAP  DEDUP  HEALTH  ALTROOT
Pool1  1.36T    374G  1018G   26%  1.00x  ONLINE  /mnt

GUI reports 994.9GiB

GUI size error.JPG (35.9 KB) Philip Nunn, 02/28/2014 10:40 PM
Capture.PNG (66.4 KB) eraser -, 02/28/2014 10:43 PM

Related issues

Related to FreeNAS - Bug #6174: The capacity for the volume 'vdev1' is currently at 92%, while the recommended value is below 80%. (Closed: Behaves correctly, 2014-09-17)

Associated revisions

Revision f227eb61 (diff)
Added by William Grzybowski about 6 years ago

Use zfs list to show the size for datasets in the volumes datagrid. The Size column is not quite right yet; we'll probably remove that column since computing that value for datasets is not trivial. Ticket: #4419

Revision 75724243 (diff)
Added by William Grzybowski about 6 years ago

Get rid of the Size column in the volume datagrid. Ticket: #4419

Revision ff381bd7 (diff)
Added by William Grzybowski about 6 years ago

Include the root dataset in the datagrid as well. This will let us differentiate a pool from a dataset. Ticket: #4419

Revision bf6d785f (diff)
Added by William Grzybowski about 6 years ago

Do not show dataset actions for the pool row. Ticket: #4419

Revision 5656cb24 (diff)
Added by William Grzybowski about 6 years ago

Use zpool list to show stats in the datagrid. Ticket: #4419

History

#2 Updated by eraser - almost 7 years ago

I also noticed this problem after upgrading to 9.2.1.2. GUI incorrectly shows my pool as 1 TiB in size. (see attached screenshot)

#3 Updated by TECK - almost 7 years ago

Same here.

#4 Updated by Cyber Jock almost 7 years ago

I just posted to that thread, but no... here's what you need to know about all this stuff:

"zpool list" give you raw disk size. This is including parity data that you will have to write, and parity data written. This number will always be the bigger value because it includes free parity space. But you could never actually use all the space listed as "free".

"zfs list" gives you disk size, after parity. You could also look at this as "user available space" as this is the most space that your users could conceivably write to the pool.

df/du just aren't suitable for ZFS, so don't try to use them (nobody did here; just stating this for future reference).
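
If you want to sanity-check this on your own box, here's a minimal Python sketch (not FreeNAS code; it assumes a pool named "Pool1" and that your zpool/zfs binaries support -H/-p for headerless, exact-byte output) that prints the raw and user-available numbers side by side:

import subprocess

def pool_raw_bytes(pool):
    # "zpool list": raw size, including parity
    out = subprocess.check_output(
        ["zpool", "list", "-Hp", "-o", "size", pool], text=True)
    return int(out.strip())

def pool_usable_bytes(pool):
    # "zfs list" on the root dataset: USED + AVAIL is the space users see
    out = subprocess.check_output(
        ["zfs", "list", "-Hp", "-o", "used,avail", pool], text=True)
    used, avail = (int(v) for v in out.split())
    return used + avail

print("raw (zpool list):  ", pool_raw_bytes("Pool1"))
print("usable (zfs list): ", pool_usable_bytes("Pool1"))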

In terms of what numbers are displayed in the Volume Manager, with 9.2.0 (the version I use), the values are from "zfs list". If versions after 9.2.0 have changed what is parsed and from where, I can't really explain the change. But, in my opinion, the values from "zfs list" are the most reasonable values to provide, because showing free space including parity (and, I think, mirror space) from "zpool list" will give users a number that is bigger than what they could conceivably write to the pool.

This has had MANY tickets written against it. The real problem is that snapshots, dedup, compression, and a few other factors will affect the total amount of disk space used and free. MANY users get confused by all of these numbers and go by a strict "flat" data space check (roughly equivalent to getting "Properties" in Windows and waiting for total disk space to calculate). That isn't an accurate way to get available and used disk space for ZFS, because such a check won't include snapshot data, and this isn't always understood by ZFS users.

Overall, if "zfs list" is being parsed, you can rest assured everything is working properly, and if you think you are missing disk space, the task is to figure out what is actually using it.

Every time we have a RELEASE version and people start looking at their numbers more closely, many conclude the numbers are in error. I couldn't find any tickets that would explain a logic change for the numbers provided, and I don't have a 9.2.1+ system to check the actual numbers against. But my guess, based on historical tickets and questions in the forum, is that this is behaving as designed. A simple check by a developer to determine whether disk space in the Volume Manager is from "zfs list" would be sufficient to close this ticket, in my opinion.

I'll paste this in the appropriate forum posts too.

#5 Updated by Philip Nunn almost 7 years ago

Something still seems amiss in the GUI to me. My pool is a two-disk mirror, and the GUI reports the disks at 1.5 TB each. "zpool list" correctly shows it as 1.36TiB in size, while "zfs list" only shows the pool's available space, not the total size. The GUI is reporting both the size and the available space from zfs list, but in my opinion they should be two different calculations (like they were in 9.2.1). I would expect the Size column to report the size of the pool from zpool list, with any datasets/zvols reported the same way unless they have a quota, in which case the quota size should be reported. The Available column should be calculated from zfs list, though. The Available column and Size column shouldn't match. Whatever they had in 9.2.1 seemed correct to me (and at one point to the devs too ;-).

#6 Updated by Jordan Hubbard over 6 years ago

  • Assignee set to William Grzybowski

This seems to keep coming up. So we need to change how this is calculated?

#7 Updated by Cyber Jock over 6 years ago

Jordan Hubbard wrote:

This seems to keep coming up. So we need to change how this is calculated?

Not necessarily... How are those numbers being calculated? The problem is that users don't really understand all these numbers thrown at them from df/du/zpool list/zfs list. Even I have to look them up to make sure I'm understanding them. It's all very confusing, even more so when you start talking about datasets, quotas, reservations, and snapshots. Suddenly many users get the impression their data is there and free space is just ending up in some abyss.

How do I find out what commands are being parsed to give the volume manager output? That's where the truth will lie....

#8 Updated by eraser - over 6 years ago

Looks like the "Size" column in the GUI is calculated by adding the "Used" and "Available" columns and rounding to one decimal place.

However, the "Used" column values in the GUI are actually the "REFER" values from the output of "zfs list" (?!).

...so the GUI "Size" ends up being the sum of the "REFER + AVAIL" output from 'zfs list'. I think this is wrong, and that the GUI Size column should actually be the sum of the "USED + AVAIL" output from 'zfs list'.
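
Here's a quick sketch to check that theory (hypothetical, not the actual GUI code; it assumes a pool named "Pool1" and -H/-p support in zfs), computing Size both ways:

import subprocess

out = subprocess.check_output(
    ["zfs", "list", "-Hp", "-o", "used,avail,refer", "Pool1"], text=True)
used, avail, refer = (int(v) for v in out.split())

print("GUI today (REFER + AVAIL):", refer + avail)   # the suspect value
print("Proposed  (USED  + AVAIL):", used + avail)    # what it should be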

Thoughts?

#9 Updated by eraser - over 6 years ago

Looks like this information is populated in FreeNAS 9.2.1.2 as follows:

In the 'gui/storage/admin.py' file, it looks like a call to the new API is used to gather the necessary information. The following URL can be run manually to display the "incorrect" size information (look for the "total_si" value):

http://freenas.local/api/v1.0/storage/volume/?format=json
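
For example, a hypothetical helper for poking at that endpoint (this assumes the Python "requests" package, the default "root" account, and your own hostname and password; "total_si" is the field named above, while the other field names may differ):

import requests

r = requests.get(
    "http://freenas.local/api/v1.0/storage/volume/?format=json",
    auth=("root", "password"))
r.raise_for_status()
for vol in r.json():
    # total_si is the value the GUI displays as the volume size
    print(vol.get("name"), vol.get("total_si"))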

However, I can't find the code that actually generates responses to the "/api/v1.0/..." calls. Does anyone know where that is done?

#10 Updated by Cyber Jock over 6 years ago

eraser: Sounds like you might be onto something... hoping that a dev will step in and answer your question. I'm not on 9.2.1+, so I can't really provide much help, and I don't have any spare hardware to set up a 9.2.1 box to test this either. :(

#11 Updated by Anonymous over 6 years ago

The root cause here has always been that the volume size display is fed from the 'df' command, and 'df' does not know how to compute FREE for ZFS; it assumes that, as on UFS, SIZE - USED = FREE, which is obviously wrong, since USED is the used amount in that dataset (except for the root of the pool, which is the overall utilization). It should be fed from 'zfs list -o space' for ZFS datasets, which always shows the FREE from the pool (or quota, if configured).
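
To illustrate, a minimal sketch of the 'zfs list -o space' feed recommended above (assumes a pool named "Pool1" and -H/-p support; these are the standard ZFS property names behind the NAME/AVAIL/USED/USEDSNAP/USEDDS/USEDCHILD columns):

import subprocess

cols = "name,avail,used,usedbysnapshots,usedbydataset,usedbychildren"
out = subprocess.check_output(
    ["zfs", "list", "-Hpr", "-o", cols, "Pool1"], text=True)
for line in out.splitlines():
    name, avail, used, snap, ds, child = line.split("\t")
    # AVAIL here is pool-wide (or quota-limited), never SIZE - USED
    print("%s: AVAIL=%s USED=%s (snapshots=%s, data=%s, children=%s)"
          % (name, avail, used, snap, ds, child))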

This has been a bug since FreeNAS 8.0.0, and possibly into 0.7.

If 'USED' on a specific dataset does not match calculations such as 'du', then the usual UNIX things may be afoot (deleted files being held open, sparse files) as well as ZFSisms (snapshots holding storage for deleted/overwritten files, parity blocks on RAID-Z, COW for files that are being actively overwritten, and compression, which sometimes writes uncompressed blocks and compresses them later).

#12 Updated by William Grzybowski over 6 years ago

  • Status changed from Unscreened to Screened

#13 Updated by Cyber Jock over 6 years ago

Had this issue on TrueNAS too... OLP-295-96716

I recommend this ticket be pulled back into 9.2.1.6 and dealt with there. We've had many tickets complaining about improper free and used disk space since the 8.0.4 days. The problem is we've never really sat down and discussed how to properly present the values. So I think we should have a round-table discussion, hash out how it should work, and then make it behave properly once and for all.

#14 Updated by Jordan Hubbard over 6 years ago

  • Target version changed from 79 to 103

#15 Updated by Cyber Jock over 6 years ago

This is a copy/paste from an email between myself, Jordan and William.

Ok, here are my thoughts. This is definitely open for discussion and debate, as I'm not going to pretend to have covered every base (even though I will try), but in my mind this seems to make the most sense...

The "Available" should list what I could theoretically write if I did a dd
write to that location(excluding compression since we can't easily predict
that anyway). If a quota is set it may be a small fraction of the total
pool or dataset. If a reservation is set in another dataset it may also
mean I have less space than I think I have because of the other reservation.
For example, if I have a 10TB pool with a dataset that has 5TB reserved I
can't possibly write more than 5TB to the pool's root directory because the
dataset has reserved 5TB for itself. If no reservation is set on the
dataset and no reservations are used elsewhere in the entire pool then the
dataset's available space would reflect the same value as the pool's total
free space since the entire pool and all datasets are a "free-for-all".

The "Size" should be:

- For datasets: the size should be blank unless a reservation or quota is set. If a reservation is set, then the reservation value should be shown in parentheses, e.g. (500GB). If a quota is set, then it should be listed between hyphens, e.g. -500GB-. If both a quota and a reservation are specified, then the quota will be displayed, since that is the actual limitation. Next to the quota or reservation, a % will tell the admin what percentage of the quota or reservation is used. To prevent weird numbers like a 110% reservation when someone uses more disk space than the reservation, any dataset with more data than the reservation will simply show a 100% value. (See the sketch after this list.)

- For pools: the total space that would be available if the pool were empty and I wanted to do a dd write (basically total disk space minus space lost to parity). For example, if I had a 4-disk RAIDZ2 pool, I'd have 8TB raw, but 4TB should show up for "size". It's really 3.47TB or something because of the disk size mismatch between binary and base-10, but I think you guys will understand my thought process, and I would expect to see 3.47TB. This value will only change if you add a vdev or swap out your disks for bigger disks (thereby expanding the pool size).
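
Here's a hypothetical sketch of the dataset "Size" rule above (names and formatting are illustrative only: blank with no quota or reservation, quota wins over reservation, and the percentage is capped at 100):

def human(n):
    # format a byte count in binary units
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return "%.0f%s" % (n, unit)
        n /= 1024.0

def dataset_size_cell(used, quota=0, reservation=0):
    if quota:
        limit, fmt = quota, "-%s-"        # quotas shown between hyphens
    elif reservation:
        limit, fmt = reservation, "(%s)"  # reservations in parentheses
    else:
        return ""                         # no limit set: leave Size blank
    pct = min(100, int(round(100.0 * used / limit)))
    return "%s %d%%" % (fmt % human(limit), pct)

print(dataset_size_cell(used=100 * 2**30, reservation=500 * 2**30))
# -> (500GiB) 20%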

"Used":

Datasets: if you have 100GB of data in a dataset with no sub-datasets and the dataset has a 500GB reservation, then the dataset should show the quantity of data in the dataset (100GB). The reservation (or quota) value can be seen in the Size field for easy comparison. If there are datasets underneath a dataset, "used" should display only the data in that dataset and not the sub-datasets, unless the user clicks the caret for the parent dataset to "hide" the sub-datasets. When you hide the sub-datasets, the number for used changes to include the sum of that dataset and all sub-datasets underneath it. This number should reflect the actual quantity of physical storage blocks taken (i.e. if you have 50GB of data that compresses at 2:1, only 25GB of space should show for used). You can easily figure out your disk usage with a calculator if you so desire.

As another option, we could do a "used and compressed" and list the amount of data that the dataset thinks it's holding before and after compression. But since most people consider compression to be "free" and are more concerned with the quantity of uncompressed data they can store, I don't think this is particularly necessary unless we want to be badasses. It might be useful for companies, though, as they can see hard numbers for how much actual uncompressed data they have on the pool.

Pools will have 2 possible outcomes:

Outcome 1: if the pool has datasets underneath, then used will be the same as for the datasets (showing just the data in the pool itself). If you click the caret to hide all of the datasets and below, then you go to outcome 2. ("zfs refer" for the pool I think is best for this.) This is what you would normally see when you go to the Volume Manager, since everything is expanded fully by default.

Outcome 2: the first number will basically be used = size - snapshots - avail - (reserved but not used in datasets). Then a slash (/) and a second number that includes the total sum of all used space on the pool from everything (this is how much space I can truly write to the pool in all locations, taking into account all dataset reservations, quotas, snapshots, etc.). This will be the first number for the pool plus the data consumed by snapshots at the present time. A % can be provided for one or both numbers, depending on what William thinks looks best. Personally I'd include the % on the second number at least, because that will show at a glance how "full" the pool is. Since we are constantly talking about 80% and 95% full, this value and the value in the script should agree (right?), and anyone can see the warning, go to this page, and see the same percentage as the warning.

Loophole: this does mean that if you set a 500GB reservation on a dataset with a 300GB quota, you will see 200GB of space that will not match up with any numbers provided in the chart anywhere, but a good admin shouldn't be setting a reservation higher than the quota unless he's trying to throw away disk space anyway. If we want to be extra user-friendly, we could add a script that checks for this kind of mistake at 3am and emails the admin if they've done something like this. Of course, that should be another ticket and is not something I would recommend for 9.2.1.6-RELEASE.

This gives the most important data that most people care about:

1. Total pool size and actual usage if you collapse the carets. The carets do double duty, since you can get sums of data from a dataset and its children very easily (a major plus when you have a large number of small datasets; some users have 100s or 1000s of them).

2. Usage for the datasets now includes reservations and quotas, since those aren't easily found in the WebGUI (quotas are a particular nightmare because the dataset can be filled while you still have free disk space in the pool, which is why quotas override reservations in the GUI).

Another somewhat related thought... do we want to hide the warden templates by default? I see that .system is hidden by default. I vote "yes", only because they clutter up the Volume Manager with templates that you really shouldn't be messing with anyway.

#16 Updated by William Grzybowski over 6 years ago

  • Target version changed from 103 to 9.3-M3

#17 Updated by Josh Paetzel about 6 years ago

<william> jpaetzel: i'm working on #4419, what do you think about removing the size column? specially now that ufs is not officially supported

That seems reasonable to me.

#18 Updated by William Grzybowski about 6 years ago

There is another point for discussion I want to raise.

The first row is somewhat "overloaded" in the sense that "tank" is both a pool and a dataset; this is not clear at all in the UI.

What we could do is generate two rows for "tank": the main row is the pool information, and the first and only child is the dataset information, e.g.

NAME                 USED        AVAILABLE      COMPRESSION   RATIO   STATUS
- tank               1.5TiB      500GiB  (2TiB) -             -       HEALTHY
  - tank             1.4TiB      500GiB         lz4           1.00x   -
    - tank/dataset   100GiB      490GiB         lz4           1.00x   -

However, I can understand that it could be confusing to the user at first sight as well, but I guess it makes sense.

Thoughts?
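
For discussion, a rough sketch of how those two rows might be assembled (hypothetical, not the actual datagrid code; it assumes a pool named "tank", -H/-p support, and "compressratio" as the standard ZFS property behind the RATIO column):

import subprocess

def run(*cmd):
    return subprocess.check_output(cmd, text=True).split()

def volume_rows(pool):
    # pool row from zpool list, root-dataset row from zfs list
    alloc, free, health = run(
        "zpool", "list", "-Hp", "-o", "alloc,free,health", pool)
    used, avail, compress, ratio = run(
        "zfs", "list", "-Hp", "-o", "used,avail,compression,compressratio",
        pool)
    return [
        {"name": pool, "used": alloc, "avail": free, "status": health},
        {"name": pool, "used": used, "avail": avail,
         "compression": compress, "ratio": ratio, "child": True},
    ]

for row in volume_rows("tank"):
    print(row)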

#19 Updated by Josh Paetzel about 6 years ago

I think that is a fantastic idea TBH.

#20 Updated by William Grzybowski about 6 years ago

Ok, who is willing to check the new datagrid and provide some insight as to whether it is now enough, or whether there are still issues to be addressed?

#21 Updated by William Grzybowski about 6 years ago

  • Status changed from Screened to Resolved

#22 Updated by Dru Lavigne about 6 years ago

  • Status changed from Resolved to Closed

Verified in M4.

#23 Updated by William Grzybowski about 6 years ago

  • Related to Bug #6174: The capacity for the volume 'vdev1' is currently at 92%, while the recommended value is below 80%. added
