
A 1 Petabit DVD-like disc has been created

Uttamattamakin
44 minutes ago, leadeater said:

We have dual diverse 100Gbps connection at each campus as well as guaranteed 100Gbps from NZ to AUS and 100Gbps from NZ to North America. We'd be using that first before entertaining the idea of couriering physical media.

https://www.reannz.co.nz/the-network/reannz-network/

Yea I meant to ask you if you have rough pricing on 100 Gbps connections.  I know back in the day, like 8-10 years ago, the colocation I was using (which happened to sit on one of the fiber termination points for overseas links) offered around $2000/month for 10 Gbps (iirc with a contractual QoS guarantee...they offered those contracts to all existing clients after a 2 hour outage).  But that was like 8 years ago and I know pricing should be lower now (well, depending on what guarantees and such).  I don't really deal with that side of things anymore, so I'm a bit clueless on current market pricing for such connections (aside from knowing they exist, and they aren't massively expensive anymore...assuming the infrastructure is already there).

 

But yea, I agree...backend connections over physical media transfers any day.  There really isn't much of a use case for investing in additional equipment for what will likely be one-off transfers when you can just spend a bit of extra time using the connection.

 

40 minutes ago, tim0901 said:

Big science experiments generally don't send things via the public internet. They use academic networks, which are completely physically separate to public internet infrastructure.

Yea, but it's just easier writing "internet" than explaining the additional networking stuff 😛  Plus it's easier to grab pricing on more publicly facing stuff for comparisons.

 

It's just telling, though, when public internet options exceed the demands of even larger science experiments.

 

 

1 hour ago, leadeater said:

20 years from now consumer demands and computers won't be effectively 1 million times faster or using 1 million times the data capacity.

Honestly, it feels like for a lot of the demand we are almost reaching the trailing end of the S curve (barring some revolutionary technology or some new form of consumption).

 

8k, maybe 16k, will be about the limit for consumers, where the visual differences become a placebo.  Audio, we have essentially already hit the peak (we could switch to double-precision floats instead of single-precision, but that's greatly diminishing returns as well).
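To put rough numbers on the audio side, here's the standard sample-rate × bit-depth arithmetic (the rates below are just illustrative presets, not anyone's spec):

```python
# Uncompressed audio bitrate = sample rate * bit depth * channels.
def audio_mbps(sample_rate_hz, bits_per_sample, channels=2):
    return sample_rate_hz * bits_per_sample * channels / 1e6  # megabits per second

print(f"CD quality (44.1 kHz / 16-bit):        {audio_mbps(44_100, 16):.1f} Mbps")
print(f"Hi-res float (192 kHz / 32-bit float): {audio_mbps(192_000, 32):.1f} Mbps")
print(f"Double precision (192 kHz / 64-bit):   {audio_mbps(192_000, 64):.1f} Mbps")
```

Even the doubled-precision case is a rounding error next to video bitrates, which is why audio storage is basically a solved problem.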

 

Volumetric video might push storage a bit, but I don't really see that becoming a widespread thing.

 

Games still need frequent updates.

 

So those are the big things that have really pushed storage over the last few decades...so barring a new use case that consumes data, I wouldn't be surprised if most people don't have much more than 8 TB - 16 TB of data in 20 years' time.  [Aside from people like me who archive all the movies they purchase]

 

1 hour ago, Godlygamer23 said:

I clearly demonstrated the side that I was on, but having clear units being used shouldn't be a debate.

But the argument could be made that you chose the side that essentially tried to redefine things their way.

 

Confusion will happen no matter what, as long as some manufacturers insist on using 1000-based units instead of 1024-based ones.

 

Back in the day, if you bought a 700 MB CD, it was 700 * 1024 * 1024 bytes.  If you bought 512 MB of RAM, you had 512 * 1024 * 1024 bytes.  Windows displayed 700 MB and 512 MB respectively.  The biggest thing that wasn't using 1024 was hard drives.

 

Confusion will always exist as long as you insist on using 1000 for computer metrics...because computers are, as a whole, binary systems.  I'd argue that since 1024 is what's typically used throughout a computer, we should have stuck to that standard.

 

As an example, say an ad lists 8 GiB of RAM and 1 TB of storage.  By default Windows will show either 1000-based or 1024-based figures...but that still leads to consumers being confused, because they see 9xx GiB of storage when they were told TB...because let's face it, no user actually googles the difference.  What should happen is storage manufacturers should have just designed drives to actually hold 1 TiB, or slightly overprovisioned so they could round down to the nearest one.
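To make that 9xx figure concrete, here's the base-1000 vs base-1024 math (nothing vendor-specific, just the unit conversion):

```python
# Drives are sold in base-1000 bytes; Windows reports capacity in base-1024 (GiB/TiB).
def as_binary_units(bytes_decimal):
    return bytes_decimal / 2**30, bytes_decimal / 2**40  # (GiB, TiB)

for label, size_bytes in [("256 GB SSD", 256e9), ("1 TB HDD", 1e12), ("8 GiB RAM", 8 * 2**30)]:
    gib, tib = as_binary_units(size_bytes)
    print(f"{label}: {size_bytes/1e9:.2f} GB decimal = {gib:.1f} GiB ({tib:.3f} TiB)")
```

So the "1 TB" drive shows up as roughly 931 "GB" in Windows, which is exactly the gap that confuses people.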

 

Honestly, I think the insistence on GiB being the proper standard has actually hurt the general knowledge base...because it's no longer clear.  Before, it would be documented which system was used, but now it's not really (and because there were decades of use cases beforehand, there is overlap, which leads to people still calling 1024 bytes a kilobyte).

 

The rough part is hard drive makers still push to make it look like their metric is power-of-2 related.  SSDs commonly come in 64, 128, 256, 512... sizes...if you want to blame anyone, blame the ones who use base 1000 but still target numbers that look like powers of 2.



13 minutes ago, wanderingfool2 said:

Yea I meant to ask you if you have rough pricing on 100 Gbps connections.

I do yea but I'm actually not allowed to say and it's not really relevant anyway since it's a closed academic and research network so it doesn't really help to know how much you're getting screwed on commercial networks and pricing haha


11 hours ago, Doobeedoo said:

Heh, I mean, be it the Middle East or a bit further, some countries really have good internet. Meanwhile large parts of eastern Europe have crap internet.

Somalia. For a long time you could get better internet in Mogadishu than in Brooklyn. 

 

6 hours ago, leadeater said:

The progression curve is a lot less in the later half of that 20 years so we won't be having single CPUs or GPUs as fast as the fastest say 10 super computers of today, probably not even close.

That is very true.  It's not like it used to be.  A computer from 5 years ago is still VERY useful now.    In 1990 a computer from 1985 was junk, same with a computer from 1990 in 1995.  

 

 

6 hours ago, leadeater said:

 

None of this actually has much to do with how much data consumers will actually need to store and use though. All those super computers I mentioned, they still have vastly more storage than any desktop or laptop of today.

 

Not that I am saying this technology shouldn't have been developed or anything but it simply lacks a use case and demand for it. It's going to take a lot more than just a optical disc with a capacity of 100TB to make anyone interested in it since IBM tapes can already do 150TB. It's not even better than our best archive media of today.

Maybe you are right.  I'd just be very surprised if we don't 10x or 100x the size of the average media file, which could lead to a corresponding growth in the need for archival storage even for normal people, let alone scientists.

 

Also, I love that you have those numbers for CERN.  Take a guess at how much data they throw away but would totally store if they could?  Quite a bit.  Science is hungry for more and more data.  There is no such thing as too much. 🙂

 

5 hours ago, tim0901 said:

So no, while data is stored on magnetic tapes for long-term offline storage, it is sent around the world at least predominantly via GÉANT. CERN has access to multiple (I believe 4?) 100Gb connections onto GÉANT as well as connections to the public internet. 10TB of data would only take about 10 minutes to send at that speed and will most definitely be throttled by the connection at the other end. Maybe it's different to institutions on the other side of the world, but at least within Europe, the days of sending CERN data by FedEx are long gone.

 

Great post and very informative.  Referring to so many femtobarns of data comes from listening to lots of particle physicists speak, and for the purpose of this forum post I did post a link to a CERN publication that used it as a measure of data.  It is, as you point out, an imprecise way to talk about data, since they don't record all that occurs, just what is "interesting".  How do they determine what is interesting?  Based on our best theories of how things work at that level.

As for sending data across the ocean, there is this.

 

Remote Operations Center | CMS at Fermilab (fnal.gov)

 

Quote

The Remote Operations Center at Fermilab keeps U.S. scientists and technicians vitally connected to activities at the Large Hadron Collider at CERN in Geneva, Switzerland.

 

[Image credit: Fermilab]

 

Remote Operations Center at Fermilab

The ROC at Fermi National Accelerator Laboratory in Batavia, Illinois remotely supports the CMS experiment located 4,000 miles away in Cessy, France. The ROC allows US physicists to help operate the CMS detector and monitor the LHC accelerator.

Software developed at Fermilab and CERN coupled with fast data transfer between CERN and the US CMS computing center at Fermilab make remote operations possible. Real-time discussion among scientists at the ROC, CMS facilities at CERN, and the German laboratory DESY assist the seamless round-the-clock operation of the CMS experiment.

As for colleges and universities in the USA using a separate internet just between them: what we call the public internet grew out of just such a network, ARPANET.  Opening that up to the public by way of companies like AT&T and Sprint is what became the internet (which is not the same as the World Wide Web, which came later).

 

The institutions I worked and studied at contracted with private ISPs for services such as "Metro Ethernet", as they call it: basically a LAN that spans a city, between buildings on campus or, in the case of NIU, across campus.  All of that said, when CDs, DVDs and Zip disks were a thing, copying files to one and walking them to where they needed to be was still a thing.

 

I have no doubt in my mind they don't do that now, because WHY?  What storage media could possibly beat a download right now?  None.  It would take the write speed, read speed, and transit from place to place combined being significantly shorter than a download.
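Just to show how lopsided it is these days, a back-of-the-envelope comparison (the link speed, drive speeds and courier time below are made-up placeholders, not real figures from anywhere):

```python
# Back-of-the-envelope: network download vs. "write, ship, read" for the same dataset.
# Every number here is an illustrative assumption.
def download_hours(data_tb, link_gbps):
    return data_tb * 1e12 * 8 / (link_gbps * 1e9) / 3600

def sneakernet_hours(data_tb, write_mb_s, read_mb_s, transit_hours):
    write_h = data_tb * 1e12 / (write_mb_s * 1e6) / 3600
    read_h = data_tb * 1e12 / (read_mb_s * 1e6) / 3600
    return write_h + read_h + transit_hours

data_tb = 100  # hypothetical 100 TB dataset
print(f"100 Gbps link:             {download_hours(data_tb, 100):.1f} hours")
print(f"Media + overnight courier: {sneakernet_hours(data_tb, 300, 300, 24):.1f} hours")
```

With those assumed numbers the download wins by about two orders of magnitude, which is the point: the physical media would have to be written, shipped and read back faster than the wire can move the bits.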

 

At the same time, it's not logical, I'll admit.... but there is a flavor of "1.44 MB will be enough for most people for the next 50 years" to saying that 1 Pb will never be needed.


8 hours ago, leadeater said:

I do yea but I'm actually not allowed to say and it's not really relevant anyway since it's a closed academic and research network so it doesn't really help to know how much you're getting screwed on commercial networks and pricing haha

haha, yea fair enough [I always envied the prices for CALs that academic/gov't institutions got from Microsoft]

 

4 hours ago, Uttamattamakin said:

Maybe you are right.  I'd just be very surprised if we don't 10x or 100x the size of the average media file, which could lead to a corresponding growth in the need for archival storage even for normal people, let alone scientists.

Again, the growth trend has clearly shown a decline in growth; and you are ignoring what I said about general storage media running into issues with Moore's law (when applied to storage space).

 

Virtually no one wants to sacrifice the time to get up and put in a disc these days, when it's a simple search away; and fewer people would want to have to burn data to discs, especially when it would cost them a few hundred dollars to buy the burner anyway.

 

I've been collecting various forms of media for over 15 years now, and honestly the sizes really haven't increased drastically (or only when you go from 480p to 720p to 1080p to 4K).

 

The last realistic jump is to 8k...and even that might be a bit questionable in terms of whether consumers would accept it.  1080p to 4K was noticeable, but not to the point where, if I sat a normal user in front of a 1080p monitor and told them it was a 4K TV, the majority would realize.

 

Remember, archiving a DVD back in the day would be ~8 GB, but these days archiving a 4K Netflix stream is ~8 GB or less as well.  Again, if consumers really cared about quality that much, we wouldn't see the decline in disc media that we have.  The general issue with "archival storage" as well is that you still need the space initially to hold that large amount of data, which we don't really have anyway in your hypothetical.

 

We just develop better codecs to reduce file sizes while maintaining a certain quality.
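Those ~8 GB figures are just bitrate × runtime; rough sketch below (the bitrates are ballpark assumptions for the codecs involved, not anyone's published numbers):

```python
# File size (GB) ~= bitrate (Mbps) * runtime (h) * 3600 s/h / 8 bits per byte / 1000
def file_size_gb(bitrate_mbps, hours):
    return bitrate_mbps * 1e6 * hours * 3600 / 8 / 1e9

print(f"DVD, MPEG-2 @ ~8 Mbps, 2 h:       {file_size_gb(8, 2):.1f} GB")
print(f"4K stream, HEVC @ ~9 Mbps, 2 h:   {file_size_gb(9, 2):.1f} GB")
print(f"4K Blu-ray, HEVC @ ~60 Mbps, 2 h: {file_size_gb(60, 2):.1f} GB")
```

A 4K stream ends up roughly the same file size as an old DVD because the codec got that much better; only the disc formats actually ballooned.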



6 hours ago, wanderingfool2 said:

The last realistic jump is to 8k...and even that might be a bit questionable in terms of whether consumers would accept it.  1080p to 4K was noticeable, but not to the point where, if I sat a normal user in front of a 1080p monitor and told them it was a 4K TV, the majority would realize.

This I think is accurate; in fact, I recall saying things like this a few years ago when this channel was all about higher resolutions.  There is a limit to how much fidelity and angular resolution we need.  Controversially, I think frame rates over 30 are icing on the cake.  I'd gladly rather have a higher-fidelity simulation of how light, shadows and physics in general work ... than 16k resolution.

 

You know like this. 

 

 

I can see 8k per eye VR/AR being a thing at some point.   You know because the screens have to be SO very close to your eyes.   

 

6 hours ago, wanderingfool2 said:

 

Remember, archiving a DVD back in the day would be ~8 GB, but these days archiving a 4K Netflix stream is ~8 GB or less as well.  Again, if consumers really cared about quality that much, we wouldn't see the decline in disc media that we have.  The general issue with "archival storage" as well is that you still need the space initially to hold that large amount of data, which we don't really have anyway in your hypothetical.

 

We just develop better codecs to reduce file sizes while maintaining a certain quality.

Maybe.  I can see totally different types of content being created which will demand more storage IF the storage exists to use it.  That's where I'm coming from.

I'd be very surprised if in 20 years we are still consuming basically the same content as we are now.   On the other hand VR has been 10 years away for 40 years. 


12 hours ago, Uttamattamakin said:

How do they determine what is interesting?  Based on our best theories of how things work at that level.

Most of the data is junk because of a poor collision, as I described before with my bullet analogy.

 

When you collide particles in an accelerator, the aim is to produce a plasma of pure energy. This plasma then generates particles via pair production - essentially the conversion of pure energy into particle/antiparticle pairs. The more energy you give to the particles you're accelerating, the higher energy plasma you can create and so the more interesting particles can be created - and the more likely it is that the higher energy particles you're looking for are created.

 

But that's only the case if the interaction is head-on. If it's not, not all of that energy is contained in the collision, meaning the interesting stuff we're looking for can't be produced. It also adds a level of uncertainty to the data - we want the collision energy to be as fixed as possible for scientific accuracy, so it's best to remove these "bad" interactions as early as possible.

 

Categorising interactions based on what theories we're investigating is done later, because what's interesting to one group of researchers will be different from what's interesting to another. There are general directions as to what the field of physics is looking at on the whole - at the moment Higgs and Supersymmetry would probably be the big headlining topics - but each group will be looking for different decay mechanisms or signatures within that dataset, so we don't want to delete any of that data at the detector stage. This also allows them to look for interactions across the entire run's dataset, rather than only having access to whatever is generated after they pick something to look at.

 

 

 

Also, as a further note on CERN's compute capability since people seemed interested: everything I've discussed so far is what's based in Geneva, but there's a lot more than that. The Worldwide LHC Computing Grid is the largest distributed computing grid in the world, comprising 14 tier 1 sites (datacentres) and 160 tier 2 sites (universities and institutions), all of which contribute to the compute, storage and networking requirements of the experiments. These (and a few others) are all connected via the LHC Optical Private Network:

[Image: LHC Optical Private Network diagram]

 

There's also LHCONE which connects tier 1 sites to tier 2s.

 

Data from CERN isn't just sent directly from Geneva to the universities. Geneva (tier 0) keeps the "raw" (mostly junk-free) data - they now have more than 1 exabyte of storage - as well as first passes for event reconstruction, but this is also shared with the tier 1s, kinda like a CDN. The tier 1s then distribute data to tier 2s and the countless tier 3/local users, as well as providing storage for processed data from tier-2 institutions.

 

CERN being very open with this stuff, you can watch the network usage live here: https://monit-grafana-open.cern.ch/d/HreVOyc7z/all-lhcopn-traffic?orgId=16&refresh=5m

 

But it's not very interesting at the moment - only in the 100-200 Gb/s range - because the LHC is switched off for winter maintenance. They did run an experiment on the network over the last 2 weeks though:

 

[Image: LHCOPN traffic graph over the last two weeks]

I wonder when the test started??

 

That load is meant to be representative of ~25% of what will be generated when the upgraded HL-LHC starts operations in 2028 - and note that's only in one direction!

 

I also found some graphics and slides giving an overview of CERN's computing status at the moment:

Spoiler

[Image: main CERN Data Centre dashboard]

This is the main CERN Data Centre dashboard. As you can see - mostly hard disks! And remember - the LHC is switched off at the moment!!

 

Those 5.1K WiFi APs tho...

 

[Image: OpenStack dashboard]

And that's the OpenStack dashboard - OpenStack being what powers CERN's private cloud infrastructure service. 2.07PB of file shares!

 

[Image: network traffic statistics]

Found it rather interesting that the vast majority of traffic is IPv6! Compared to ~1/3 worldwide according to cloudflare.

 

[Image: storage statistics]

That's at the CERN campus in Geneva I believe. The majority is hard disk, but flash is also being used. They've also reached 1TB/s read speeds.

 




Just one question, as I haven't seen it mentioned... is this magneto-optical or nah?

 

If it isn't, longevity is a big question mark and it'll probably never come out of the concept stage.


