
Netflix makes its data collection tool open source

Netflix has open sourced its internal data collection tool.

The tool can be used to collect large volumes of server logs and send them to a central server.

 

The tool, named Suro, can handle different types of data. Netflix uses Suro to collect 1.5 million "server events" per second.

That covers server logs, user activity and other operational data. Suro collects these events so they can be sent to several destinations,

such as a Hadoop cluster or an ElasticSearch cluster, which makes it possible to spot trends and detect malfunctions.
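
To make the "several destinations" idea a bit more concrete, here is a rough Python sketch of routing events to more than one destination based on their type. This is not Suro's actual API: the `EventRouter` class, the event type names and the in-memory lists standing in for the Hadoop and ElasticSearch clusters are all made up for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Sink = Callable[[Event], None]


class EventRouter:
    """Hypothetical router: forwards each event to every sink registered for its type."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Sink]] = defaultdict(list)

    def add_route(self, event_type: str, sink: Sink) -> None:
        # One event type may fan out to several sinks.
        self._routes[event_type].append(sink)

    def send(self, event: Event) -> int:
        """Deliver the event and return the number of sinks that received it."""
        sinks = self._routes.get(event["type"], [])
        for sink in sinks:
            sink(event)
        return len(sinks)


# In-memory lists stand in for a Hadoop cluster and an ElasticSearch cluster (assumption).
hadoop_store: List[Event] = []
search_store: List[Event] = []

router = EventRouter()
router.add_route("server_log", hadoop_store.append)
router.add_route("server_log", search_store.append)
router.add_route("user_activity", hadoop_store.append)

router.send({"type": "server_log", "message": "disk almost full"})
router.send({"type": "user_activity", "message": "play pressed"})
```

In this sketch the server log ends up in both stores while the user activity event only goes to the first one; events of an unregistered type are simply dropped.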

 

Netflix has announced that the tool is open source as of today. Suro itself is actually based on an open source tool:

it is a modified and tuned version of Chukwa, a data collection tool from the Apache Software Foundation.

According to Netflix, a single Suro instance can process 60,000 messages per second.
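
As a back-of-the-envelope check on those two figures, the snippet below estimates the minimum number of instances needed for the quoted event rate. It assumes one event equals one message and leaves no headroom or redundancy, so treat it as a lower bound rather than a description of Netflix's real deployment.

```python
PEAK_EVENTS_PER_SECOND = 1_500_000  # figure quoted in the article
MESSAGES_PER_INSTANCE = 60_000      # figure quoted in the article


def instances_needed(total_rate: int, per_instance_rate: int) -> int:
    """Ceiling division: smallest number of instances that covers total_rate."""
    return -(-total_rate // per_instance_rate)


min_instances = instances_needed(PEAK_EVENTS_PER_SECOND, MESSAGES_PER_INSTANCE)
print(min_instances)  # prints 25
```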

 

Why Netflix is making the tool open source is unclear. Companies have open sourced their tools before,

as Facebook did with its PHP accelerator and the Cassandra database software.

 


 

 

Source: http://tweakers.net/nieuws/93085/netflix-maakt-tool-voor-verzamelen-data-opensource.html (Be warned, it's in Dutch)

Intel Core i7-3770k @4.4GHz. 1.170V - Noctua NH-D14 - 8GB (2x4) G.Skill RipjawsX @1866 MHz. CL8 - Gigabyte GA-Z77X-D3H - Samsung 840 250GB - XFX HD7970 DD GHz. @1125MHz - CM 690 II - Seagate Barracuda 2TB + WD RED 3TB - Corsair RM750X

 FiiO E10 - Sennheiser HD 598 - Edifier R1600T - Sony MHC-EX50 - Razer Blackwidow Ultimate 2013 - Razer Deathadder 2013 - LG IPS237L - Dell P2314H - Philips 237EQPH - LG 47LB561V

 LG G2 - HP Spectre X360 - Audio Technica ATH-M50X - Sennheiser IE 60


I'm going to stop using Netflix, now.

How do you think they compiled that list of "recommended" movies you should watch?

▶ Learn from yesterday, live for today, hope for tomorrow. The important thing is not to stop questioning. - Einstein◀

Please remember to mark a thread as solved if your issue has been fixed; it helps others who may stumble across the thread at a later point in time.


Now I'm glad that I stopped using NSAflix.

Case: Phanteks Evolve X with ITX mount  cpu: Ryzen 3900X 4.35ghz all cores Motherboard: MSI X570 Unify gpu: EVGA 1070 SC  psu: Phanteks revolt x 1200W Memory: 64GB Kingston Hyper X oc'd to 3600mhz ssd: Sabrent Rocket 4.0 1TB ITX System CPU: 4670k  Motherboard: some cheap asus h87 Ram: 16gb corsair vengeance 1600mhz

                                                                                                                                                                                                                                                          

 

 


I think people are confused about what has happened. They are not open sourcing your data; they are open sourcing the tool that collects data, so that other people can use it to collect OTHER data, not Netflix usage data. Netflix currently uses it for that purpose, but any website could use it to collect whatever data it wants. The main selling point is that it scales very well.

 

This doesn't mean I can install Suro on my PC and Netflix data will start showing up for people... absolutely nothing close to that.

 

http://gigaom.com/2013/12/09/netflix-open-sources-its-data-traffic-cop-suro/

http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflixs.html

 

Also, 'collect data' doesn't necessarily mean collect private data. Usage data can be used to manage servers to make them work better.

Edited by EricX2


Well, I'm sure you know who has access to this tool now.

