
wpirobotbuilder

Retired Staff
  • Posts: 2,501
  • Joined
  • Last visited

Reputation Activity

  1. Like
    wpirobotbuilder got a reaction from Vitalius for a blog entry, Best Buy, You're Not All That Bad   
    Okay, maybe you're bad at some things. Like charging exorbitant amounts for an HDMI cable (like most retailers do).

  2. Like
    wpirobotbuilder got a reaction from Alpha17x for a blog entry, What exactly is an "optimal" rig?   
    First, it would be great to know what "optimal" means.

  3. Like
    wpirobotbuilder got a reaction from wng_kingsley7 for a blog entry, What exactly is an "optimal" rig?   
    First, it would be great to know what "optimal" means.

  4. Like
    wpirobotbuilder got a reaction from alpenwasser for a blog entry, A Potential Dual-1366 Machine   
    In terms of raw performance, you can pick up some older dual-socket CPUs that'll beat out modern quad cores at the same price point (though they'll use more power).
    The build will probably be a ZFS storage box at first, and if I upgrade later on I might use it to host virtual machines or as a computing node or something.
    Specs:
    Intel Xeon X5650 @ 2.66 GHz (x2) - Each one consumes about the same amount of power as a 4670K and has the same raw performance (though much less performance per clock)
    24GB DDR3 1333 ECC Memory (3x8GB) (x2) - The motherboard I'm looking at has 6 slots, and supports up to 96GB of memory. However, 96GB of registered ECC memory would cost an arm and a leg, and I found 24GB kits for relatively little, so I'm going with 48GB to fully populate the board. For a ZFS build, a lot of RAM will make it speedy, and the extra memory will help with VM hosting if this becomes an ESXi machine.
    ASUS Z8NR-D12 - I ended up not being able to find a Z8NA-D6, but Newegg is selling this model for $200. What's more, it's being sold by a reputable reseller with some RMA support. Unfortunately, it limits the cases I can install it in.
    RM 850 - It's got two 8-pin CPU connectors, which the motherboard I'm using requires. I'm looking for one with a lower wattage, and the Seasonic 520W might be an option (I'm waiting on confirmation).
    NVidia GT 520 - The board doesn't come with a graphics adapter, and I have one lying around.
    I'll probably pick up an LSI 9211 if I end up needing more than 6 drives.
    The drives will probably be WD Red or Seagate NAS; my research shows little reason to choose an Se over a Red, with the exception of peak performance. However, my intention is to have most data hit the cache before going to disk, so peak performance doesn't matter as much.
    If the system performs well enough, I might pick up an Intel I350-T4 (4x 1Gb) or an Intel X540 (10Gb NIC). A network iSCSI target for game footage, video editing, and backups would help a good deal if the array can push upwards of 300 MB/s, which is roughly 2.4 Gb/s and more than a single gigabit link can carry.
    I need an SSI EEB case. A full tower ATX seems like a waste, so I'm looking at the GD07 from Silverstone. I might go rackmount, but I'm having trouble finding a rackmount case with SSI EEB compatibility. Here's a 2U one from Norco. A 1U would be ideal but is much harder to find power supplies for (and is also much louder).


  5. Like
    wpirobotbuilder got a reaction from Vitalius for a blog entry, Programmers: Get Your Act Together Before You Disrupt The Internet   
    Start by reading this technical post from CloudFlare.
    UDP itself has uses, but the 'monlist' command in the Network Time Protocol is entirely useless. It was probably a "feature" some genius programmer thought they'd need but didn't, and it just got left in the code base.
    In proper software design, most development strategies will keep unused code out of your codebase. If the programmers behind the NTP codebase had been developing properly, this command would never have been implemented, and we wouldn't be dealing with these sorts of DDoS attacks. Apparently SNMP is predicted to fall next, with almost three times the amplification potential of NTP, so we'd better start cleaning up those DDoS vectors.
    If this attack had used SNMP amplification, we would be seeing attacks on the terabit scale (1 Tb/s), which is large enough to disrupt the infrastructure of the internet.
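    As a rough back-of-the-envelope check (the ~400 Gb/s size of the recent NTP attack and the 3x multiplier are assumptions for illustration, not figures from this post), the scaling works out like this:
    % Rough scaling estimate; both input numbers are assumptions for illustration.
    ntp_attack_gbps  = 400;    % approximate size of the recent NTP amplification attack
    snmp_multiplier  = 3;      % SNMP amplification is reportedly almost 3x NTP's
    snmp_attack_gbps = ntp_attack_gbps * snmp_multiplier;
    fprintf('Estimated SNMP-based attack: %.1f Tb/s\n', snmp_attack_gbps / 1000);
    % Prints roughly 1.2 Tb/s -- right at the terabit scale mentioned above.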
  6. Like
    wpirobotbuilder got a reaction from Vitalius for a blog entry, A More Flexible Way To RAID   
    In traditional RAID setups, the data is spread out across all drives in the RAID array. If you were running RAID 5 with six drives, for example, you would have five drives' worth of storage space, with the remaining space used for parity data. If you lose one drive, you haven't yet lost all your data. Throw in a new drive and rebuild your RAID array (and pray you don't run into an unrecoverable read error or silent corruption, because then you can kiss your data goodbye).
    This has obvious advantages: it is space-efficient for the amount of redundancy it provides, and it can increase read/write performance with good hardware, since there are multiple drives to run I/O on. However, losing more drives than you have parity for will kill all of your data.
    What if you wanted to be able to choose how much overhead you wanted to use for parity calculations? Or what if you wanted a drive loss to not completely kill all of your data? Here's an approach.
    Here, we have a single giant parity RAID setup. Each color represents pieces of data belonging to a single data chunk (e.g. all the red blocks represent a chunk, spread out over all the drives). This is how traditional RAID works.

    The proposed "betterRAID" method is to have a fixed parity ratio in a RAID array. For instance, if I want a RAID volume with N parity drives' worth of space for every M data drives' worth of space, then my overhead is N/(N+M) (for N = 1 and M = 4, that's a five-drive RAID 5, which is pretty common). However, now let me use any number of drives with this setup, writing a given chunk of data across 5 of those drives, the next chunk across the next 5 drives, and so on, like this:
    Here, the red data is written across five drives (twice as much data is written to an individual drive) instead of across all ten drives. The orange data gets written to the next five drives, then the green, etc.

    Notice that if I kill any two drives, I am guaranteed to have 50% of my data survive in the worst case, and 100% of my data survive in the best case. To gain this advantage over traditional RAID 5, I sacrificed one additional drive's worth of space (one drive for every five, meaning two drives of the ten are reserved for parity). Obviously, very large files that span tons of data chunks will still end up corrupted. Smaller files (which fit inside a single data chunk) would survive if the chunk survived, and therefore would be recoverable.
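    As a minimal sketch of the mapping described above (the helper name and the zero-based drive numbering are my own assumptions), a chunk could be assigned to a group of drives round-robin like this:
    function drives = chunkToDrives(chunkIndex, totalDrives, stripeWidth)
    % chunkToDrives  Round-robin mapping of a data chunk onto a group of drives.
    %   chunkIndex  - zero-based index of the chunk being written
    %   totalDrives - number of drives in the array (e.g. 10)
    %   stripeWidth - drives per chunk, data plus parity (e.g. 5)
    % Returns the zero-based drive numbers the chunk (and its parity) lands on.
    groupsPerArray = floor(totalDrives / stripeWidth);       % e.g. 10/5 = 2 groups
    group  = mod(chunkIndex, groupsPerArray);                % which group this chunk uses
    drives = group*stripeWidth : (group+1)*stripeWidth - 1;  % consecutive drives in that group
    end
    With ten drives and a stripe width of five, chunk 0 (the red data) goes to drives 0-4, chunk 1 (orange) to drives 5-9, chunk 2 back to drives 0-4, and so on.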
    Here is a slightly more complicated example. Black lines indicate dead drives.

    Here, we write chunks of data across five drives (with 20% of that space used for single parity) and have 18 drives total in our array. Kill two drives and, in the worst case, we have lost only 25% of our data.
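    To check that kind of number for an arbitrary layout, here is a minimal sketch (the function name and inputs are assumptions for illustration): given which drives each chunk lives on and which drives died, it counts the chunks that lost more drives than they have parity.
    function lostFraction = fractionLost(chunkDrives, deadDrives, parityPerChunk)
    % fractionLost  Fraction of chunks that are unrecoverable after drive failures.
    %   chunkDrives    - cell array; chunkDrives{k} lists the drives chunk k lives on
    %   deadDrives     - vector of failed drive numbers
    %   parityPerChunk - parity drives per chunk (1 in the single-parity examples above)
    lost = 0;
    for k = 1:numel(chunkDrives)
        deadInChunk = numel(intersect(chunkDrives{k}, deadDrives));
        if deadInChunk > parityPerChunk   % more failures than parity means the chunk is gone
            lost = lost + 1;
        end
    end
    lostFraction = lost / numel(chunkDrives);
    end
    Sweeping this over every possible pair of dead drives for the layout in the picture is what gives the worst-case figure quoted above.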
    To clarify: A "chunk" is not a complete file. A chunk is just a chunk of data (say, 512KB). If I was writing a 10KB file, it would fit within that chunk, and the next file I wrote might also fit within that chunk. When the chunk is completely full of data, the next one would start to be filled with new incoming data. Writing a multi-gigabyte file would span thousands of chunks.
    There are obvious upsides to this, most notably that losing more drives than you have parity for will not destroy all of your data, though files that span many chunks would likely be corrupted. This also makes disaster recovery a little bit better, ensuring that a failure will not necessarily kill absolutely everything. In addition, if we used dual parity we could make it even harder to kill data.
    The downside is that now it is harder to manage the data for an individual file, since you have to find which drives the data lives on.
    This doesn't provide the same level of protection that dual-parity or triple-parity RAID does. It provides a measure of disaster recovery in case a RAID fails completely. I think it'd be really cool for a software RAID solution like ZFS to implement something like this for RAID Z1, Z2 and Z3.
  7. Like
    wpirobotbuilder got a reaction from Vitalius for a blog entry, On The Computational Complexity of Being Greedy   
    During my lunch break today, I wrote this MATLAB function:
    function [rate] = greedy(balance, periods, profit)
    % greedy  Takes in a numerical balance, an integer number of periods, and
    % the desired profit to be made from the loan holder. Returns the optimal
    % APR in decimal form (i.e. 3.94% = .0394)
    payment = (profit + balance)/periods;    % fixed monthly payment P = (B + R)/n
    syms x
    fun = 0;
    for i = 0:periods-1
        fun = fun - payment*(1 + x/12)^i;    % each payment accrues interest for i periods
    end
    fun = fun + balance*(1 + x/12)^periods;  % the balance accrues for all n periods
    solns = double(solve(fun == 0, x));           % all roots of the polynomial
    solns = real(solns(abs(imag(solns)) < 1e-9)); % discard the complex roots
    rate = solns(solns > 0);                      % the positive real root is the APR
    end
    The coding description is pretty brief; I'll outline it a little more:
    P is the desired monthly payment from the loan holder such that, over a given number of periods n, we will obtain the desired profit R, in dollars, from an initial loan balance B, also in dollars. It can be calculated as follows:
    P = (B + R)/n
    fun(x) is the amount of money left on the balance B after n periods of interest, which compounds monthly at annual rate x, assuming the loan holder makes a single payment of amount P every period of interest. It is generated as a symbolic expression of the variable x, which evaluates to:
    fun(x) = B*(1 + ax)^n - P*(1 + ax)^(n-1) - P*(1 + ax)^(n-2) - ... - P*(1 + ax) - P
    Where B is the starting loan balance, P is the required monthly payment, n is the number of periods over which interest compounds, and a = 1/12 converts the annual rate x into a monthly rate.
    The goal is to find a zero of the function fun(x), i.e. (x : fun(x) = 0). This will return the optimal APR such that the loan is paid off in full after n periods of interest compounding.
    This is not easy to solve. The polynomial order of fun(x) grows linearly with n, which means finding the roots of an nth-order polynomial. If your loan compounds over 5 years (60 months), you are solving for the roots of a polynomial that looks like this:
    f(x) = a0*x^60 + a1*x^59 + ... + a59*x + a60
    Yikes. Fortunately, MATLAB has the beautiful function solve, which you see in the script above. This allows us to solve for all the roots of the polynomial. As it turns out, most of them are complex numbers with a nonzero imaginary part, but most of the time there is one real root, which is the one you care about.
    For example:
    I have a $14,162 car loan with a 3.94% APR, to be paid back monthly over 5 years. According to calculator.net, I will give the bank approximately $1,463.89 in interest over those 5 years, paying $260.43 per month.
    Running the MATLAB function,
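    A minimal sketch of that call, with the expected result inferred from the calculator.net figures above:
    % $14,162 balance, 60 monthly payments, $1,463.89 of profit for the bank
    rate = greedy(14162, 60, 1463.89)
    % rate comes out around 0.0394, i.e. the 3.94% APR quoted by calculator.net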

    We can see that our function works. And if the bank wanted to make a large sum, say $3200:
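    Sketching that call the same way (the ballpark result is my own amortization check, not a figure from the original screenshot):
    % Same loan, but the bank wants $3,200 out of it instead
    rate = greedy(14162, 60, 3200)
    % this lands somewhere around an 8% APR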

    That is about the original APR I was offered before I got a cosigner with good credit.
    This algorithm doesn't take very long to run on modern computers, so it's incredibly easy to do. Of course, that never stopped lenders before, because you could always work in reverse, trying different APRs and calculating the profit from each one until you got something close to what you wanted.
    It was a good lunch break.