
ZFS and Ubuntu not using enough RAM???

Bmoney

I have 64GB of RAM with a 52TB RAIDZ2 zpool. I've been reading, and the system is set up to only use 50% of available RAM, which lines up with my RAM usage sitting at only 34GB. When I do large transfers I get a kernel panic. I need 10GB for Ubuntu and the other services, tops. I want to give ZFS the other 54GB of RAM; how do I do this?


Take a look at this post. It seems you can adjust how much RAM the system uses by editing a config file.


That helps a little. Now I'm wondering what should go in the zfs.conf file I need to create.

 


What do you mean by "give" ZFS the other 54GB? 

Have you configured the ARC? You might want to check zfs.conf for the zfs_arc_max line.

By default it's 50% of your total memory, which would be 32GB in your case, and that's where I'd leave it anyway.

ZFS should have some spare memory available for scrubbing and resilvering when it needs it.
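That said, if you do still want to raise the cap, a minimal sketch of what /etc/modprobe.d/zfs.conf could contain on Ubuntu (the value is in bytes, and 48GiB here is only an example, not a recommendation):

 # /etc/modprobe.d/zfs.conf
 # 48 GiB = 48 * 1024^3 bytes
 options zfs zfs_arc_max=51539607552

After editing, run sudo update-initramfs -u and reboot, or echo the same number into /sys/module/zfs/parameters/zfs_arc_max to change it at runtime.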

 

I guess it's a multi-purpose server if you're already using 10GB of memory for Ubuntu...



P.S. The panic doesn't sound like a memory availability issue.

 

Either you're having a hardware compatibility problem, or there's some sort of configuration fault.

 

Have you checked dmesg to see if there are any obvious system errors at the time of the crash?
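Something along these lines is a reasonable first look (if the box reboots after the panic, the previous boot's kernel log is the interesting one):

 dmesg -T | tail -n 100
 journalctl -k -b -1 | tail -n 100

Look for oops/panic traces or NIC driver errors right before the reset.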



I agree with Jarsky: your kernel panic is not related to the amount of RAM allocated (unless you have bad RAM). I have 32GB and 10Gb networking at home and I'm seeing speeds of around 800-900MB/s.

 

Just reread and saw you're not using FreeNAS. Editing the config may be a good idea to give ZFS more of your RAM if you're not getting the speeds you want. I would install netdata to give you a little more insight into how the resources are being used.
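On Ubuntu that's usually as simple as something like this (package from the standard repos; netdata's own installer may ship a newer version):

 sudo apt install netdata

and then browsing to http://localhost:19999 on the server.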

 


mike, yes, I'm looking for those speeds. I am using Ubuntu with 10 drives in RAIDZ2 and I am only getting 200MB/s transfers.

 


3 minutes ago, Bmoney said:

mike, yes, I'm looking for those speeds. I am using Ubuntu with 10 drives in RAIDZ2 and I am only getting 200MB/s transfers.

 

There could be a couple of things going on.

Are you using a 10Gb switch or directly connecting?

What type of files are you transferring?

Is it 200MB/s in both directions?

Have you tested the 10Gb connection using iperf? (See the example run below.)

Have you tested your RAIDZ2 pool using dd to make sure there's nothing wrong?
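For reference, a typical iperf3 run between the two hosts would look something like this (IPs and options are just placeholders):

 # on the server end
 iperf3 -s
 # on the client end, run it in both directions
 iperf3 -c 10.1.20.1 -t 30
 iperf3 -c 10.1.20.1 -t 30 -R

The -R run reverses the direction, so you can see both send and receive speeds from the same machine.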


Direct peer link, static IPv4, jumbo frames of 9086 with matching MTU on both ends. Yes on dd, and yes on iperf.

 


5 minutes ago, Bmoney said:

Direct peer link, static IPv4, jumbo frames of 9086 with matching MTU on both ends. Yes on dd, and yes on iperf.

 

Funny story: I'm getting my speeds without jumbo frames and through a switch. Try reverting to the normal 1500 MTU and see what happens.
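If you want to try that without permanently touching your netplan/interfaces config, something like this works for a quick test (interface name is just an example):

 sudo ip link set dev enp5s0 mtu 1500

on both ends, then rerun the transfer.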

 

What were the results for DD and Iperf? (curious)


10000+0 records in
10000+0 records out
20971520000 bytes (21 GB, 20 GiB) copied, 83.0949 s, 252 MB/s

That can't be right, can it? That's from the dd test.

 


3 minutes ago, Bmoney said:

10000+0 records in
10000+0 records out
20971520000 bytes (21 GB, 20 GiB) copied, 83.0949 s, 252 MB/s

That can't be right, can it? That's from the dd test.

 

A slow disk in a parity array can cripple the whole array, since the other disks and the parity calculation rely on every disk. If you're getting 252MB/s on a dd test of 10 disks in RAIDZ2, there's an issue.
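If you want to rule out a single slow disk, a quick read-only check per member disk is something like this (substitute your actual device names):

 for d in sdb sdc sdd sde sdf sdg sdh sdi sdj sdk; do sudo hdparm -t /dev/$d; done

Any one drive reading far slower than its siblings would explain a slow pool.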

 

Not being overly familiar with dd, I used somebody else's methodology: https://www.ixsystems.com/community/threads/notes-on-performance-benchmarks-and-cache.981/

Basically, disable compression and use the commands below.

 

to test write speeds:

 dd if=/dev/zero of=tmp.dat bs=2048k count=50k

to test read speeds: 

dd if=tmp.dat of=/dev/null bs=2048k count=50k
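For completeness, checking and toggling compression on the dataset you're testing in looks something like this on ZFS on Linux (pool/dataset name is a placeholder):

 zfs get compression yourpool/yourdataset
 sudo zfs set compression=off yourpool/yourdataset
 # ...run the dd write/read tests...
 sudo zfs set compression=lz4 yourpool/yourdataset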

 


 pool: Leyline
 state: ONLINE
  scan: scrub repaired 0B in 25h39m with 0 errors on Mon Mar 11 03:04:00 2019
config:

    NAME        STATE     READ WRITE CKSUM
    Leyline     ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        sdb     ONLINE       0     0     0
        sdc     ONLINE       0     0     0
        sdd     ONLINE       0     0     0
        sde     ONLINE       0     0     0
        sdf     ONLINE       0     0     0
        sdg     ONLINE       0     0     0
        sdh     ONLINE       0     0     0
        sdi     ONLINE       0     0     0
        sdj     ONLINE       0     0     0
        sdk     ONLINE       0     0     0
    logs
      sda2      ONLINE       0     0     0


Weird iperf results now:

Accepted connection from 10.1.20.2, port 55307
[  5] local 10.1.20.1 port 5201 connected to 10.1.20.2 port 55308
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   517 MBytes  4.34 Gbits/sec                  
[  5]   1.00-2.00   sec   600 MBytes  5.03 Gbits/sec                  
[  5]   2.00-3.00   sec   610 MBytes  5.12 Gbits/sec                  
[  5]   3.00-4.00   sec   625 MBytes  5.24 Gbits/sec                  
[  5]   4.00-5.00   sec   643 MBytes  5.40 Gbits/sec                  
[  5]   5.00-6.00   sec   616 MBytes  5.17 Gbits/sec                  
[  5]   6.00-7.00   sec   637 MBytes  5.34 Gbits/sec                  
[  5]   7.00-8.00   sec   635 MBytes  5.33 Gbits/sec                  
[  5]   8.00-9.00   sec   633 MBytes  5.31 Gbits/sec                  
[  5]   9.00-10.00  sec   658 MBytes  5.52 Gbits/sec                  
[  5]  10.00-10.04  sec  24.4 MBytes  5.52 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-10.04  sec  6.05 GBytes  5.18 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.1.20.2, port 55364
[  5] local 10.1.20.1 port 5201 connected to 10.1.20.2 port 55365
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   133 MBytes  1.11 Gbits/sec                  
[  5]   1.00-2.00   sec   141 MBytes  1.18 Gbits/sec                  
[  5]   2.00-3.00   sec   142 MBytes  1.19 Gbits/sec                  
[  5]   3.00-4.00   sec   142 MBytes  1.19 Gbits/sec                  
[  5]   4.00-5.00   sec   143 MBytes  1.20 Gbits/sec                  
[  5]   5.00-6.00   sec   139 MBytes  1.17 Gbits/sec                  
[  5]   6.00-7.00   sec   142 MBytes  1.19 Gbits/sec                  
[  5]   7.00-8.00   sec   143 MBytes  1.20 Gbits/sec                  
[  5]   8.00-9.00   sec   143 MBytes  1.20 Gbits/sec                  
[  5]   9.00-10.00  sec   142 MBytes  1.19 Gbits/sec                  
[  5]  10.00-10.04  sec  5.29 MBytes  1.20 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-10.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-10.04  sec  1.38 GBytes  1.18 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
 


dd: error writing 'tmp.dat': No space left on device
45651+0 records in
45650+0 records out
95735767040 bytes (96 GB, 89 GiB) copied, 573.522 s, 167 MB/s
 

Seems like it was only testing my boot drive, not my zpool.

 


16 minutes ago, Bmoney said:

dd: error writing 'tmp.dat': No space left on device
45651+0 records in
45650+0 records out
95735767040 bytes (96 GB, 89 GiB) copied, 573.522 s, 167 MB/s
 

Seems like it was only testing my boot drive, not my zpool.

 

You have to navigate to where you want to test, or specify the path, lol. Go ahead and delete the tmp.dat so your disk isn't full.

 

dd if=/dev/zero of=/your/zfs/pool/dataset/tmp.dat bs=2048k count=50k

 

Just make sure compression is off (by default it is on and set to lz4). You can create a new dataset with it off if you don't want to mess with any of your existing datasets.
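A minimal sketch of that throwaway-dataset approach, assuming the pool is mounted at /Leyline (the dataset name is just an example; run the dd as root or chown the mountpoint first):

 sudo zfs create -o compression=off Leyline/ddtest
 dd if=/dev/zero of=/Leyline/ddtest/tmp.dat bs=2048k count=50k
 dd if=/Leyline/ddtest/tmp.dat of=/dev/null bs=2048k
 sudo zfs destroy Leyline/ddtest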


 count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 300.323 s, 358 MB/s
 


Also of note: I have no system crashes over gigabit links, only over the 10Gb links. I have tried an ASUS ROG NIC and a Broadcom 10Gb NIC.

 


1 hour ago, Bmoney said:

 count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 300.323 s, 358 MB/s
 

At this speed your compression is disabled for sure. Something is odd; for 8-10 disks this is way too slow. Did you follow a guide to set up ZFS on Linux?
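If you want to sanity-check the pool itself, a couple of read-only property checks are worth a look (using the pool name from your zpool status):

 zpool get ashift Leyline        # 0 means auto-detected; an explicit 12 is typical for 4K-sector drives
 zfs get recordsize,sync,compression,atime Leyline

A wrong ashift, or sync=always in particular, can tank throughput on spinning disks.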


56 minutes ago, Bmoney said:

Also of note: I have no system crashes over gigabit links, only over the 10Gb links. I have tried an ASUS ROG NIC and a Broadcom 10Gb NIC.

 

I see, these are 10GBase-T cards then? Those might actually be the cause; they're semi-new to the game and you may need to play with the drivers a bit.
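A quick way to see exactly which driver and firmware each card is running (interface name is just an example):

 ethtool -i enp5s0
 lspci -nnk | grep -iA3 ethernet

Then compare against the newest Linux driver the NIC vendor publishes; dmesg will usually also show the driver complaining if it's unhappy with the card.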

